:mod:`pickle` --- Python object serialization
=============================================

.. module:: pickle
   :synopsis: Convert Python objects to streams of bytes and back.

.. sectionauthor:: Jim Kerr <jbkerr@sr.hp.com>.
.. sectionauthor:: Barry Warsaw <barry@python.org>

**Source code:** :source:`Lib/pickle.py`

.. index::
   single: persistence
   pair: persistent; objects
   pair: serializing; objects
   pair: marshalling; objects
   pair: flattening; objects
   pair: pickling; objects

--------------

The :mod:`pickle` module implements binary protocols for serializing and
de-serializing a Python object structure. *"Pickling"* is the process
whereby a Python object hierarchy is converted into a byte stream, and
*"unpickling"* is the inverse operation, whereby a byte stream
(from a :term:`binary file` or :term:`bytes-like object`) is converted
back into an object hierarchy. Pickling (and unpickling) is alternatively
known as "serialization", "marshalling," [#]_ or "flattening"; however, to
avoid confusion, the terms used here are "pickling" and "unpickling".

.. warning::

   The :mod:`pickle` module is not secure against erroneous or maliciously
   constructed data. Never unpickle data received from an untrusted or
   unauthenticated source.


Relationship to other Python modules
------------------------------------

Comparison with ``marshal``
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Python has a more primitive serialization module called :mod:`marshal`, but in
general :mod:`pickle` should always be the preferred way to serialize Python
objects. :mod:`marshal` exists primarily to support Python's :file:`.pyc`
files.

The :mod:`pickle` module differs from :mod:`marshal` in several significant ways:

* The :mod:`pickle` module keeps track of the objects it has already serialized,
  so that later references to the same object won't be serialized again.
  :mod:`marshal` doesn't do this.

  This has implications both for recursive objects and object sharing. Recursive
  objects are objects that contain references to themselves. These are not
  handled by marshal, and in fact, attempting to marshal recursive objects will
  crash your Python interpreter. Object sharing happens when there are multiple
  references to the same object in different places in the object hierarchy being
  serialized. :mod:`pickle` stores such objects only once, and ensures that all
  other references point to the master copy. Shared objects remain shared, which
  can be very important for mutable objects.

* :mod:`marshal` cannot be used to serialize user-defined classes and their
  instances. :mod:`pickle` can save and restore class instances transparently;
  however, the class definition must be importable and live in the same module
  as when the object was stored.

* The :mod:`marshal` serialization format is not guaranteed to be portable
  across Python versions. Because its primary job in life is to support
  :file:`.pyc` files, the Python implementers reserve the right to change the
  serialization format in non-backwards compatible ways should the need arise.
  The :mod:`pickle` serialization format is guaranteed to be backwards compatible
  across Python releases, provided that a compatible pickle protocol is chosen
  and that the pickling and unpickling code handles the Python 2 to Python 3
  type differences if your data crosses that uniquely breaking language boundary.
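
The sharing and recursion behaviour described in the first point can be
checked with a short sketch. The variable names are illustrative only and
are not part of the :mod:`pickle` API.

```python
import pickle

# One list referenced from two places, plus a list that contains itself.
shared = [1, 2, 3]
container = {"first": shared, "second": shared}
recursive = []
recursive.append(recursive)

restored = pickle.loads(pickle.dumps(container))
restored_recursive = pickle.loads(pickle.dumps(recursive))

# Sharing is preserved: both keys still refer to a single list object.
same_object = restored["first"] is restored["second"]
# The cycle is preserved as well: the restored list contains itself.
still_cyclic = restored_recursive[0] is restored_recursive
```

Because the restored list is shared, appending to ``restored["first"]`` is
also visible through ``restored["second"]``, just as it was before pickling.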

Comparison with ``json``
^^^^^^^^^^^^^^^^^^^^^^^^

There are fundamental differences between the pickle protocols and
`JSON (JavaScript Object Notation) <http://json.org>`_:

* JSON is a text serialization format (it outputs unicode text, although
  most of the time it is then encoded to ``utf-8``), while pickle is
  a binary serialization format;

* JSON is human-readable, while pickle is not;

* JSON is interoperable and widely used outside of the Python ecosystem,
  while pickle is Python-specific;

* JSON, by default, can only represent a subset of the Python built-in
  types, and no custom classes; pickle can represent an extremely large
  number of Python types (many of them automatically, by clever usage
  of Python's introspection facilities; complex cases can be tackled by
  implementing :ref:`specific object APIs <pickle-inst>`).
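
A minimal sketch of the last point, using only the standard :mod:`json` and
:mod:`pickle` modules. The sample dictionary is made up for illustration; it
holds a set, a complex number and a bytes object, none of which the default
JSON encoder can represent.

```python
import json
import pickle

data = {"tags": {"a", "b"}, "point": complex(1, 2), "raw": b"\x00\x01"}

try:
    json.dumps(data)
    json_ok = True
except TypeError:
    # The default JSON encoder rejects set, complex and bytes values.
    json_ok = False

# pickle handles the same built-in types without extra code.
restored = pickle.loads(pickle.dumps(data))
```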

.. seealso::
   The :mod:`json` module: a standard library module allowing JSON
   serialization and deserialization.


.. _pickle-protocols:

Data stream format
------------------

.. index::
   single: External Data Representation

The data format used by :mod:`pickle` is Python-specific. This has the
advantage that there are no restrictions imposed by external standards such as
JSON or XDR (which can't represent pointer sharing); however, it means that
non-Python programs may not be able to reconstruct pickled Python objects.

By default, the :mod:`pickle` data format uses a relatively compact binary
representation. If you need optimal size characteristics, you can efficiently
:doc:`compress <archiving>` pickled data.
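
As a sketch of the compression step, the pickled bytes can be passed through
one of the standard compression modules. :mod:`zlib` is chosen here only as
an example, and the sample records are invented.

```python
import pickle
import zlib

records = [{"id": i, "name": "item"} for i in range(1000)]

raw = pickle.dumps(records)
packed = zlib.compress(raw)

# Decompress first, then unpickle.
restored = pickle.loads(zlib.decompress(packed))
```

For repetitive data such as this list, ``packed`` is considerably smaller
than ``raw``; the gain for other data depends on its content.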

The module :mod:`pickletools` contains tools for analyzing data streams
generated by :mod:`pickle`. :mod:`pickletools` source code has extensive
comments about opcodes used by pickle protocols.
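
For instance, :func:`pickletools.dis` prints a symbolic listing of the
opcodes in a pickle. The sketch below captures that listing in a string; the
pickled value is arbitrary.

```python
import io
import pickle
import pickletools

data = pickle.dumps({"answer": 42}, protocol=2)

# Write the disassembly to an in-memory text buffer instead of stdout.
listing = io.StringIO()
pickletools.dis(data, out=listing)
text = listing.getvalue()
print(text)
```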

There are currently 5 different protocols which can be used for pickling.
The higher the protocol used, the more recent the version of Python needed
to read the pickle produced.

* Protocol version 0 is the original "human-readable" protocol and is
  backwards compatible with earlier versions of Python.

* Protocol version 1 is an old binary format which is also compatible with
  earlier versions of Python.

* Protocol version 2 was introduced in Python 2.3. It provides much more
  efficient pickling of :term:`new-style class`\es. Refer to :pep:`307` for
  information about improvements brought by protocol 2.

* Protocol version 3 was added in Python 3.0. It has explicit support for
  :class:`bytes` objects and cannot be unpickled by Python 2.x. This was
  the default protocol in Python 3.0--3.7.

* Protocol version 4 was added in Python 3.4. It adds support for very large
  objects, pickling more kinds of objects, and some data format
  optimizations. It is the default protocol starting with Python 3.8.
  Refer to :pep:`3154` for information about improvements brought by
  protocol 4.
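
The protocol in use is recorded in the stream itself, which the following
sketch makes visible. It relies on the fact that, from protocol 2 onwards, a
pickle begins with the ``PROTO`` opcode (byte ``0x80``) followed by the
protocol number; the ``headers`` dictionary is just a helper for this
example.

```python
import pickle

value = {"key": [1, 2, 3]}
headers = {}

for protocol in range(pickle.HIGHEST_PROTOCOL + 1):
    data = pickle.dumps(value, protocol=protocol)
    headers[protocol] = data[:2]
    # Every protocol can be read back without naming it explicitly.
    assert pickle.loads(data) == value

# Protocols 0 and 1 carry no header; later ones start with 0x80 + number.
for protocol, header in headers.items():
    if protocol >= 2:
        assert header == bytes([0x80, protocol])
```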

.. note::
   Serialization is a more primitive notion than persistence; although
   :mod:`pickle` reads and writes file objects, it does not handle the issue of
   naming persistent objects, nor the (even more complicated) issue of concurrent
   access to persistent objects. The :mod:`pickle` module can transform a complex
   object into a byte stream and it can transform the byte stream into an object
   with the same internal structure. Perhaps the most obvious thing to do with
   these byte streams is to write them to a file, but it is also conceivable to
   send them across a network or store them in a database. The :mod:`shelve`
   module provides a simple interface to pickle and unpickle objects on
   DBM-style database files.


Module Interface
----------------

To serialize an object hierarchy, you simply call the :func:`dumps` function.
Similarly, to de-serialize a data stream, you call the :func:`loads` function.
However, if you want more control over serialization and de-serialization,
you can create a :class:`Pickler` or an :class:`Unpickler` object, respectively.
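
Both routes are sketched below with an arbitrary sample record. The
convenience functions work on bytes in memory, while the explicit
:class:`Pickler` and :class:`Unpickler` objects work on a binary file object
(an :class:`io.BytesIO` buffer in this example).

```python
import io
import pickle

record = {"name": "example", "values": [1, 2.5, None]}

# Convenience functions: serialize to bytes and back.
via_functions = pickle.loads(pickle.dumps(record))

# Explicit objects: the same result, but they can be subclassed or
# configured before use.
buffer = io.BytesIO()
pickle.Pickler(buffer, protocol=pickle.HIGHEST_PROTOCOL).dump(record)
buffer.seek(0)
via_objects = pickle.Unpickler(buffer).load()
```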

The :mod:`pickle` module provides the following constants:


.. data:: HIGHEST_PROTOCOL

   An integer, the highest :ref:`protocol version <pickle-protocols>`
   available. This value can be passed as a *protocol* value to functions
   :func:`dump` and :func:`dumps` as well as the :class:`Pickler`
   constructor.

.. data:: DEFAULT_PROTOCOL

   An integer, the default :ref:`protocol version <pickle-protocols>` used
   for pickling. May be less than :data:`HIGHEST_PROTOCOL`. Currently the
   default protocol is 4, first introduced in Python 3.4 and incompatible
   with previous versions.

   .. versionchanged:: 3.0

      The default protocol is 3.

   .. versionchanged:: 3.8

      The default protocol is 4.

The :mod:`pickle` module provides the following functions to make the pickling
process more convenient:

.. function:: dump(obj, file, protocol=None, \*, fix_imports=True)

   Write a pickled representation of *obj* to the open :term:`file object` *file*.
   This is equivalent to ``Pickler(file, protocol).dump(obj)``.

   The optional *protocol* argument, an integer, tells the pickler to use
   the given protocol; supported protocols are 0 to :data:`HIGHEST_PROTOCOL`.
   If not specified, the default is :data:`DEFAULT_PROTOCOL`. If a negative
   number is specified, :data:`HIGHEST_PROTOCOL` is selected.

   The *file* argument must have a write() method that accepts a single bytes
   argument. It can thus be an on-disk file opened for binary writing, an
   :class:`io.BytesIO` instance, or any other custom object that meets this
   interface.

   If *fix_imports* is true and *protocol* is less than 3, pickle will try to
   map the new Python 3 names to the old module names used in Python 2, so
   that the pickle data stream is readable with Python 2.
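
The *file* requirement is small enough to meet with a few lines of code.
In the sketch below, ``ChunkCollector`` is a hypothetical class written for
this example; it only provides ``write()``. A negative *protocol* is passed
to select :data:`HIGHEST_PROTOCOL`.

```python
import pickle

class ChunkCollector:
    """Hypothetical file-like object: only write() is provided."""

    def __init__(self):
        self.chunks = []

    def write(self, data):
        # Keep a copy of every chunk handed over by the pickler.
        self.chunks.append(bytes(data))
        return len(data)

writer = ChunkCollector()
pickle.dump([1, 2, 3], writer, protocol=-1)   # negative: HIGHEST_PROTOCOL
payload = b"".join(writer.chunks)
```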

.. function:: dumps(obj, protocol=None, \*, fix_imports=True)

   Return the pickled representation of the object as a :class:`bytes` object,
   instead of writing it to a file.

   Arguments *protocol* and *fix_imports* have the same meaning as in
   :func:`dump`.

.. function:: load(file, \*, fix_imports=True, encoding="ASCII", errors="strict")

   Read a pickled object representation from the open :term:`file object`
   *file* and return the reconstituted object hierarchy specified therein.
   This is equivalent to ``Unpickler(file).load()``.

   The protocol version of the pickle is detected automatically, so no
   protocol argument is needed. Bytes past the pickled object's
   representation are ignored.

   The argument *file* must have two methods, a read() method that takes an
   integer argument, and a readline() method that requires no arguments. Both
   methods should return bytes. Thus *file* can be an on-disk file opened for
   binary reading, an :class:`io.BytesIO` object, or any other custom object
   that meets this interface.

   Optional keyword arguments are *fix_imports*, *encoding* and *errors*,
   which are used to control compatibility support for pickle streams generated
   by Python 2. If *fix_imports* is true, pickle will try to map the old
   Python 2 names to the new names used in Python 3. The *encoding* and
   *errors* tell pickle how to decode 8-bit string instances pickled by Python
   2; these default to 'ASCII' and 'strict', respectively. The *encoding* can
   be 'bytes' to read these 8-bit string instances as bytes objects.
   Using ``encoding='latin1'`` is required for unpickling NumPy arrays and
   instances of :class:`~datetime.datetime`, :class:`~datetime.date` and
   :class:`~datetime.time` pickled by Python 2.
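
Because :func:`load` reads a single pickle and leaves the rest of the file
alone, several objects can be written to one stream and read back one at a
time. This is a sketch using an in-memory buffer and invented sample values;
:exc:`EOFError` signals that no further pickle is available.

```python
import io
import pickle

stream = io.BytesIO()
for item in ("first", [2, 2], {"third": 3}):
    pickle.dump(item, stream)

stream.seek(0)
loaded = []
while True:
    try:
        # Each call consumes exactly one pickled object.
        loaded.append(pickle.load(stream))
    except EOFError:
        break
```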

.. function:: loads(bytes_object, \*, fix_imports=True, encoding="ASCII", errors="strict")

   Read a pickled object hierarchy from a :class:`bytes` object and return the
   reconstituted object hierarchy specified therein.

   The protocol version of the pickle is detected automatically, so no
   protocol argument is needed. Bytes past the pickled object's
   representation are ignored.

   Optional keyword arguments are *fix_imports*, *encoding* and *errors*,
   which are used to control compatibility support for pickle streams generated
   by Python 2. If *fix_imports* is true, pickle will try to map the old
   Python 2 names to the new names used in Python 3. The *encoding* and
   *errors* tell pickle how to decode 8-bit string instances pickled by Python
   2; these default to 'ASCII' and 'strict', respectively. The *encoding* can
   be 'bytes' to read these 8-bit string instances as bytes objects.
   Using ``encoding='latin1'`` is required for unpickling NumPy arrays and
   instances of :class:`~datetime.datetime`, :class:`~datetime.date` and
   :class:`~datetime.time` pickled by Python 2.
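
The effect of *encoding* can be seen with a small hand-written stream. The
bytes below are an assumption made for illustration: they imitate a
protocol 0 pickle of the kind Python 2 produced for an 8-bit string
containing the byte ``0xe9``.

```python
import pickle

# Hand-written protocol 0 stream for the Python 2 str 'caf\xe9'.
py2_stream = b"S'caf\\xe9'\n."

as_text = pickle.loads(py2_stream, encoding="latin1")   # decoded to str
as_bytes = pickle.loads(py2_stream, encoding="bytes")   # kept as bytes

try:
    pickle.loads(py2_stream)   # default encoding="ASCII", errors="strict"
    ascii_ok = True
except UnicodeDecodeError:
    # 0xe9 is not valid ASCII, so the default settings reject it.
    ascii_ok = False
```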


The :mod:`pickle` module defines three exceptions:

.. exception:: PickleError

   Common base class for the other pickling exceptions. It inherits
   :exc:`Exception`.

.. exception:: PicklingError

   Error raised when an unpicklable object is encountered by :class:`Pickler`.
   It inherits :exc:`PickleError`.

   Refer to :ref:`pickle-picklable` to learn what kinds of objects can be
   pickled.

.. exception:: UnpicklingError

   Error raised when there is a problem unpickling an object, such as data
   corruption or a security violation. It inherits :exc:`PickleError`.

   Note that other exceptions may also be raised during unpickling, including
   (but not necessarily limited to) :exc:`AttributeError`, :exc:`EOFError`,
   :exc:`ImportError`, and :exc:`IndexError`.
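
The sketch below triggers one failure of each kind and records the name of
the exception that was raised. The ``outcomes`` dictionary is only a helper
for this example. The first handler also accepts :exc:`AttributeError` as a
precaution, since the exact exception type for unpicklable callables has
varied between Python versions.

```python
import pickle

outcomes = {}

try:
    pickle.dumps(lambda x: x)      # a lambda cannot be pickled
except (pickle.PicklingError, AttributeError) as exc:
    outcomes["lambda"] = type(exc).__name__

try:
    pickle.loads(b"\x00")          # 0x00 is not a valid opcode
except pickle.UnpicklingError as exc:
    outcomes["bad opcode"] = type(exc).__name__

try:
    pickle.loads(b"")              # the stream ends immediately
except EOFError as exc:
    outcomes["empty"] = type(exc).__name__
```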


The :mod:`pickle` module exports two classes, :class:`Pickler` and
:class:`Unpickler`:

.. class:: Pickler(file, protocol=None, \*, fix_imports=True)

   This takes a binary file for writing a pickle data stream.

   The optional *protocol* argument, an integer, tells the pickler to use
   the given protocol; supported protocols are 0 to :data:`HIGHEST_PROTOCOL`.
   If not specified, the default is :data:`DEFAULT_PROTOCOL`. If a negative
   number is specified, :data:`HIGHEST_PROTOCOL` is selected.

   The *file* argument must have a write() method that accepts a single bytes
   argument. It can thus be an on-disk file opened for binary writing, an
   :class:`io.BytesIO` instance, or any other custom object that meets this
   interface.

   If *fix_imports* is true and *protocol* is less than 3, pickle will try to
   map the new Python 3 names to the old module names used in Python 2, so
   that the pickle data stream is readable with Python 2.

   .. method:: dump(obj)

      Write a pickled representation of *obj* to the open file object given in
      the constructor.

   .. method:: persistent_id(obj)

      Do nothing by default. This exists so a subclass can override it.

      If :meth:`persistent_id` returns ``None``, *obj* is pickled as usual. Any
      other value causes :class:`Pickler` to emit the returned value as a
      persistent ID for *obj*. The meaning of this persistent ID should be
      defined by :meth:`Unpickler.persistent_load`. Note that the value
      returned by :meth:`persistent_id` cannot itself have a persistent ID.

      See :ref:`pickle-persistent` for details and examples of uses.

   .. attribute:: dispatch_table

      A pickler object's dispatch table is a registry of *reduction
      functions* of the kind which can be declared using
      :func:`copyreg.pickle`. It is a mapping whose keys are classes
      and whose values are reduction functions. A reduction function
      takes a single argument of the associated class and should
      conform to the same interface as a :meth:`__reduce__`
      method.

      By default, a pickler object will not have a
      :attr:`dispatch_table` attribute, and it will instead use the
      global dispatch table managed by the :mod:`copyreg` module.
      However, to customize the pickling for a specific pickler object
      one can set the :attr:`dispatch_table` attribute to a dict-like
      object. Alternatively, if a subclass of :class:`Pickler` has a
      :attr:`dispatch_table` attribute then this will be used as the
      default dispatch table for instances of that class.

      See :ref:`pickle-dispatch` for usage examples.

      .. versionadded:: 3.3

   .. method:: reducer_override(self, obj)

      Special reducer that can be defined in :class:`Pickler` subclasses. This
      method has priority over any reducer in the :attr:`dispatch_table`. It
      should conform to the same interface as a :meth:`__reduce__` method, and
      can optionally return ``NotImplemented`` to fall back on
      :attr:`dispatch_table`-registered reducers to pickle ``obj``.

      For a detailed example, see :ref:`reducer_override`.

      .. versionadded:: 3.8

   .. attribute:: fast

      Deprecated. Enable fast mode if set to a true value. Fast mode
      disables the use of the memo, which speeds up pickling by not
      generating superfluous PUT opcodes. It should not be used with
      self-referential objects; doing so will cause :class:`Pickler` to
      recurse infinitely.

      Use :func:`pickletools.optimize` if you need more compact pickles.
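
The two customization hooks of :class:`Pickler` can be combined, as in the
following sketch. ``ViewPickler`` and ``reduce_mappingproxy`` are
hypothetical names chosen for this example, which assumes Python 3.8 or
later for :meth:`reducer_override`. Both reducers deliberately simplify the
object: a :class:`memoryview` is stored as :class:`bytes` and a
:class:`types.MappingProxyType` as a plain :class:`dict`.

```python
import copyreg
import io
import pickle
import types

def reduce_mappingproxy(proxy):
    # Reduction function: rebuild the read-only mapping as a plain dict.
    return dict, (dict(proxy),)

class ViewPickler(pickle.Pickler):
    """Hypothetical subclass used only for this illustration."""

    # Class-level table: the default for every instance of this subclass.
    dispatch_table = copyreg.dispatch_table.copy()
    dispatch_table[types.MappingProxyType] = reduce_mappingproxy

    def reducer_override(self, obj):
        if isinstance(obj, memoryview):
            return bytes, (obj.tobytes(),)
        # Fall back on dispatch_table and the regular machinery.
        return NotImplemented

buffer = io.BytesIO()
ViewPickler(buffer).dump({
    "proxy": types.MappingProxyType({"a": 1}),
    "view": memoryview(b"abc"),
})
restored = pickle.loads(buffer.getvalue())
```

Neither object can be pickled by a plain :class:`Pickler`; with the hooks in
place the stream can be read back by the ordinary :func:`loads` function,
because it only refers to the built-in ``dict`` and ``bytes`` types.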


.. class:: Unpickler(file, \*, fix_imports=True, encoding="ASCII", errors="strict")

   This takes a binary file for reading a pickle data stream.

   The protocol version of the pickle is detected automatically, so no
   protocol argument is needed.

   The argument *file* must have two methods, a read() method that takes an
   integer argument, and a readline() method that requires no arguments. Both
   methods should return bytes. Thus *file* can be an on-disk file object
   opened for binary reading, an :class:`io.BytesIO` object, or any other
   custom object that meets this interface.

   Optional keyword arguments are *fix_imports*, *encoding* and *errors*,
   which are used to control compatibility support for pickle streams generated
   by Python 2. If *fix_imports* is true, pickle will try to map the old
   Python 2 names to the new names used in Python 3. The *encoding* and
   *errors* tell pickle how to decode 8-bit string instances pickled by Python
   2; these default to 'ASCII' and 'strict', respectively. The *encoding* can
   be 'bytes' to read these 8-bit string instances as bytes objects.

   .. method:: load()

      Read a pickled object representation from the open file object given in
      the constructor, and return the reconstituted object hierarchy specified
      therein. Bytes past the pickled object's representation are ignored.

   .. method:: persistent_load(pid)

      Raise an :exc:`UnpicklingError` by default.

      If defined, :meth:`persistent_load` should return the object specified by
      the persistent ID *pid*. If an invalid persistent ID is encountered, an
      :exc:`UnpicklingError` should be raised.

      See :ref:`pickle-persistent` for details and examples of uses.

   .. method:: find_class(module, name)

      Import *module* if necessary and return the object called *name* from it,
      where the *module* and *name* arguments are :class:`str` objects. Note
      that, despite its name, :meth:`find_class` is also used for finding
      functions.

      Subclasses may override this to gain control over which types of objects
      can be loaded and how, potentially reducing security risks. Refer to
      :ref:`pickle-restrict` for details.

      .. audit-event:: pickle.find_class "module name"
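
A minimal sketch of such an override follows. ``RestrictedUnpickler``,
``SAFE_BUILTINS`` and ``restricted_loads`` are names chosen for this
example, and the allow-list is an arbitrary selection. It illustrates the
mechanism only and should not be taken as a complete security measure.

```python
import builtins
import io
import pickle

SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}

class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Only a short allow-list of built-ins may be looked up.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        # Everything else is refused.
        raise pickle.UnpicklingError(
            "global '%s.%s' is forbidden" % (module, name))

def restricted_loads(data):
    """Analogous to pickle.loads(), using the restricted unpickler."""
    return RestrictedUnpickler(io.BytesIO(data)).load()

allowed = restricted_loads(pickle.dumps([1, 2, range(15)]))

try:
    restricted_loads(pickle.dumps(len))   # builtins.len is not listed
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```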

.. _pickle-picklable:

What can be pickled and unpickled?
----------------------------------

The following types can be pickled:

* ``None``, ``True``, and ``False``

* integers, floating point numbers, complex numbers

* strings, bytes, bytearrays

* tuples, lists, sets, and dictionaries containing only picklable objects

* functions defined at the top level of a module (using :keyword:`def`, not
  :keyword:`lambda`)

* built-in functions defined at the top level of a module

* classes that are defined at the top level of a module

* instances of such classes whose :attr:`~object.__dict__` or the result of
  calling :meth:`__getstate__` is picklable (see section :ref:`pickle-inst` for
  details).

Attempts to pickle unpicklable objects will raise the :exc:`PicklingError`
exception; when this happens, an unspecified number of bytes may have already
been written to the underlying file. Trying to pickle a highly recursive data
structure may exceed the maximum recursion depth; a :exc:`RecursionError` will
be raised in this case. You can carefully raise this limit with
:func:`sys.setrecursionlimit`.

Note that functions (built-in and user-defined) are pickled by "fully qualified"
name reference, not by value. [#]_ This means that only the function name is
pickled, along with the name of the module the function is defined in. Neither
the function's code, nor any of its function attributes are pickled. Thus the
defining module must be importable in the unpickling environment, and the module
must contain the named object, otherwise an exception will be raised. [#]_
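
Pickling by reference can be observed with a built-in function. In this
sketch the built-in :func:`len` is used only because it is available
everywhere; the stream contains the module and function names, and
unpickling returns the very same function object.

```python
import pickle

data = pickle.dumps(len)

# The stream names the module and the function; no code is stored.
names_present = b"builtins" in data and b"len" in data

# Unpickling looks the name up again and returns the existing object.
same_function = pickle.loads(data) is len
```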

Similarly, classes are pickled by named reference, so the same restrictions in
the unpickling environment apply. Note that none of the class's code or data is
pickled, so in the following example the class attribute ``attr`` is not
restored in the unpickling environment::

   import pickle

   class Foo:
       attr = 'A class attribute'

   picklestring = pickle.dumps(Foo)

These restrictions are why picklable functions and classes must be defined in
the top level of a module.

Similarly, when class instances are pickled, their class's code and data are not
pickled along with them. Only the instance data are pickled. This is done on
purpose, so you can fix bugs in a class or add methods to the class and still
load objects that were created with an earlier version of the class. If you
plan to have long-lived objects that will see many versions of a class, it may
be worthwhile to put a version number in the objects so that suitable
conversions can be made by the class's :meth:`__setstate__` method.
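
One possible shape for such a version number is sketched below. The
``Account`` class, its ``_version`` key and the assumed old layout (a float
``balance`` instead of an integer ``balance_cents``) are all invented for
this example. The unpickling step is imitated by creating an instance
without calling :meth:`__init__` and handing it an old state directly.

```python
class Account:
    """Hypothetical class whose stored layout changed between versions."""

    VERSION = 2

    def __init__(self, owner, balance_cents):
        self.owner = owner
        self.balance_cents = balance_cents

    def __getstate__(self):
        # Store the layout version alongside the instance data.
        state = self.__dict__.copy()
        state["_version"] = self.VERSION
        return state

    def __setstate__(self, state):
        state = dict(state)
        version = state.pop("_version", 1)
        if version < 2:
            # Assumed old layout: a float "balance" in whole units.
            state["balance_cents"] = round(state.pop("balance") * 100)
        self.__dict__.update(state)

# Imitate unpickling a version 1 state.
legacy = Account.__new__(Account)
legacy.__setstate__({"owner": "ann", "balance": 12.5})

current_state = Account("bob", 300).__getstate__()
```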


.. _pickle-inst:

Pickling Class Instances
------------------------

.. currentmodule:: None

In this section, we describe the general mechanisms available to you to define,
customize, and control how class instances are pickled and unpickled.

In most cases, no additional code is needed to make instances picklable. By
default, pickle will retrieve the class and the attributes of an instance via
introspection. When a class instance is unpickled, its :meth:`__init__` method
is usually *not* invoked. The default behaviour first creates an uninitialized
instance and then restores the saved attributes. The following code shows an
implementation of this behaviour::

   def save(obj):
       return (obj.__class__, obj.__dict__)

   def load(cls, attributes):
       obj = cls.__new__(cls)
       obj.__dict__.update(attributes)
       return obj
Georg Brandl116aa622007-08-15 14:28:22 +0000518
Georg Brandl6faee4e2010-09-21 14:48:28 +0000519Classes can alter the default behaviour by providing one or several special
Georg Brandlc8148262010-10-17 11:13:37 +0000520methods:

.. method:: object.__getnewargs_ex__()

   In protocols 2 and newer, classes that implement the
   :meth:`__getnewargs_ex__` method can dictate the values passed to the
   :meth:`__new__` method upon unpickling. The method must return a pair
   ``(args, kwargs)`` where *args* is a tuple of positional arguments
   and *kwargs* a dictionary of named arguments for constructing the
   object. Those will be passed to the :meth:`__new__` method upon
   unpickling.

   You should implement this method if the :meth:`__new__` method of your
   class requires keyword-only arguments. Otherwise, it is recommended for
   compatibility to implement :meth:`__getnewargs__`.

   .. versionchanged:: 3.6
      :meth:`__getnewargs_ex__` is now used in protocols 2 and 3.
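
As an illustration of :meth:`__getnewargs_ex__`, here is a minimal sketch with a
hypothetical ``Temperature`` class whose :meth:`__new__` requires a keyword-only
argument. Without the method, unpickling would call :meth:`__new__` with no
arguments and fail::

   import pickle

   class Temperature:
       def __new__(cls, value, *, unit):
           self = super().__new__(cls)
           self.value = value
           self.unit = unit
           return self

       def __getnewargs_ex__(self):
           # (args, kwargs) passed to __new__ when the object is unpickled.
           return (self.value,), {"unit": self.unit}

   restored = pickle.loads(pickle.dumps(Temperature(21.5, unit="C")))
   assert (restored.value, restored.unit) == (21.5, "C")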


.. method:: object.__getnewargs__()

   This method serves a similar purpose as :meth:`__getnewargs_ex__`, but
   supports only positional arguments. It must return a tuple of arguments
   ``args`` which will be passed to the :meth:`__new__` method upon unpickling.

   :meth:`__getnewargs__` will not be called if :meth:`__getnewargs_ex__` is
   defined.

   .. versionchanged:: 3.6
      Before Python 3.6, :meth:`__getnewargs__` was called instead of
      :meth:`__getnewargs_ex__` in protocols 2 and 3.
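
A positional-only counterpart might look as follows. The ``Interval`` class is
a hypothetical example, not part of the module; its :meth:`__new__` validates
its arguments, so the check also runs when an instance is unpickled::

   import pickle

   class Interval:
       def __new__(cls, low, high):
           if low > high:
               raise ValueError("low must not exceed high")
           self = super().__new__(cls)
           self.low = low
           self.high = high
           return self

       def __getnewargs__(self):
           # Positional arguments handed back to __new__ when unpickling.
           return (self.low, self.high)

   restored = pickle.loads(pickle.dumps(Interval(1, 5)))
   assert (restored.low, restored.high) == (1, 5)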


.. method:: object.__getstate__()

   Classes can further influence how their instances are pickled; if the class
   defines the method :meth:`__getstate__`, it is called and the returned object
   is pickled as the contents for the instance, instead of the contents of the
   instance's dictionary. If the :meth:`__getstate__` method is absent, the
   instance's :attr:`~object.__dict__` is pickled as usual.


.. method:: object.__setstate__(state)

   Upon unpickling, if the class defines :meth:`__setstate__`, it is called with
   the unpickled state. In that case, there is no requirement for the state
   object to be a dictionary. Otherwise, the pickled state must be a dictionary
   and its items are assigned to the new instance's dictionary.

   .. note::

      If :meth:`__getstate__` returns a false value, the :meth:`__setstate__`
      method will not be called upon unpickling.
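
When both methods are defined, the state can be any picklable object. The
following sketch assumes a hypothetical ``Vector`` class that uses
``__slots__`` and a tuple as its state::

   import pickle

   class Vector:
       __slots__ = ("x", "y")

       def __init__(self, x, y):
           self.x = x
           self.y = y

       def __getstate__(self):
           # A non-empty tuple is enough; the state need not be a dictionary.
           return (self.x, self.y)

       def __setstate__(self, state):
           self.x, self.y = state

   v = pickle.loads(pickle.dumps(Vector(3, 4)))
   assert (v.x, v.y) == (3, 4)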


Refer to the section :ref:`pickle-state` for more information about how to use
the methods :meth:`__getstate__` and :meth:`__setstate__`.

.. note::

   At unpickling time, some methods like :meth:`__getattr__`,
   :meth:`__getattribute__`, or :meth:`__setattr__` may be called upon the
   instance. In case those methods rely on some internal invariant being
   true, the type should implement :meth:`__getnewargs__` or
   :meth:`__getnewargs_ex__` to establish such an invariant; otherwise,
   neither :meth:`__new__` nor :meth:`__init__` will be called.

.. index:: pair: copy; protocol

As we shall see, pickle does not directly use the methods described above. In
fact, these methods are part of the copy protocol which implements the
:meth:`__reduce__` special method. The copy protocol provides a unified
interface for retrieving the data necessary for pickling and copying
objects. [#]_

Although powerful, implementing :meth:`__reduce__` directly in your classes is
error prone. For this reason, class designers should use the high-level
interface (i.e., :meth:`__getnewargs_ex__`, :meth:`__getstate__` and
:meth:`__setstate__`) whenever possible. We will show, however, cases where
using :meth:`__reduce__` is the only option or leads to more efficient pickling
or both.

.. method:: object.__reduce__()

   The interface is currently defined as follows. The :meth:`__reduce__` method
   takes no argument and shall return either a string or preferably a tuple (the
   returned object is often referred to as the "reduce value").

   If a string is returned, the string should be interpreted as the name of a
   global variable. It should be the object's local name relative to its
   module; the pickle module searches the module namespace to determine the
   object's module. This behaviour is typically useful for singletons.

   When a tuple is returned, it must be between two and six items long.
   Optional items can either be omitted, or ``None`` can be provided as their
   value. The semantics of each item are, in order:

   .. XXX Mention __newobj__ special-case?

   * A callable object that will be called to create the initial version of the
     object.

   * A tuple of arguments for the callable object. An empty tuple must be given
     if the callable does not accept any argument.

   * Optionally, the object's state, which will be passed to the object's
     :meth:`__setstate__` method as previously described. If the object has no
     such method, then the value must be a dictionary and it will be added to
     the object's :attr:`~object.__dict__` attribute.

   * Optionally, an iterator (and not a sequence) yielding successive items.
     These items will be appended to the object either using
     ``obj.append(item)`` or, in batch, using ``obj.extend(list_of_items)``.
     This is primarily used for list subclasses, but may be used by other
     classes as long as they have :meth:`append` and :meth:`extend` methods with
     the appropriate signature. (Whether :meth:`append` or :meth:`extend` is
     used depends on which pickle protocol version is used as well as the number
     of items to append, so both must be supported.)

   * Optionally, an iterator (not a sequence) yielding successive key-value
     pairs. These items will be stored to the object using ``obj[key] =
     value``. This is primarily used for dictionary subclasses, but may be used
     by other classes as long as they implement :meth:`__setitem__`.

   * Optionally, a callable with a ``(obj, state)`` signature. This
     callable allows the user to programmatically control the state-updating
     behavior of a specific object, instead of using ``obj``'s static
     :meth:`__setstate__` method. If not ``None``, this callable will have
     priority over ``obj``'s :meth:`__setstate__`.

     .. versionadded:: 3.8
        The optional sixth tuple item, ``(obj, state)``, was added.
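
The first three items of a reduce value can be sketched as follows. The
``Tally`` class is a hypothetical example. Because the callable is the class
itself, :meth:`__init__` runs on unpickling here, unlike in the default
mechanism::

   import pickle

   class Tally:
       def __init__(self, start):
           self.start = start
           self.count = start

       def __reduce__(self):
           # Item 1: the callable; item 2: its arguments; item 3: state that is
           # merged into the new instance's __dict__ after construction.
           return (Tally, (self.start,), {"count": self.count})

   tally = Tally(10)
   tally.count = 13
   restored = pickle.loads(pickle.dumps(tally))
   assert (restored.start, restored.count) == (10, 13)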


.. method:: object.__reduce_ex__(protocol)

   Alternatively, a :meth:`__reduce_ex__` method may be defined. The only
   difference is this method should take a single integer argument, the protocol
   version. When defined, pickle will prefer it over the :meth:`__reduce__`
   method. In addition, :meth:`__reduce__` automatically becomes a synonym for
   the extended version. The main use for this method is to provide
   backwards-compatible reduce values for older Python releases.

.. currentmodule:: pickle

.. _pickle-persistent:

Persistence of External Objects
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. index::
   single: persistent_id (pickle protocol)
   single: persistent_load (pickle protocol)

For the benefit of object persistence, the :mod:`pickle` module supports the
notion of a reference to an object outside the pickled data stream. Such
objects are referenced by a persistent ID, which should be either a string of
alphanumeric characters (for protocol 0) [#]_ or just an arbitrary object (for
any newer protocol).

The resolution of such persistent IDs is not defined by the :mod:`pickle`
module; it will delegate this resolution to the user-defined methods on the
pickler and unpickler, :meth:`~Pickler.persistent_id` and
:meth:`~Unpickler.persistent_load` respectively.

To pickle objects that have an external persistent ID, the pickler must have a
custom :meth:`~Pickler.persistent_id` method that takes an object as an
argument and returns either ``None`` or the persistent ID for that object.
When ``None`` is returned, the pickler simply pickles the object as normal.
When a persistent ID is returned, the pickler pickles that ID in place of the
object, along with a marker so that the unpickler will recognize it as a
persistent ID.

To unpickle external objects, the unpickler must have a custom
:meth:`~Unpickler.persistent_load` method that takes a persistent ID object and
returns the referenced object.
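
The two hooks can be sketched with an in-memory store. Everything named here
(``REGISTRY``, ``Ref``, ``RefPickler``, ``RefUnpickler``) is a hypothetical
stand-in; a real application would more likely resolve IDs against a
database::

   import io
   import pickle

   # Hypothetical external store, keyed by persistent ID.
   REGISTRY = {"user:1": {"name": "alice"}, "user:2": {"name": "bob"}}

   class Ref:
       """Stand-in for an object that lives in REGISTRY."""
       def __init__(self, key):
           self.key = key

   class RefPickler(pickle.Pickler):
       def persistent_id(self, obj):
           if isinstance(obj, Ref):
               return obj.key  # pickled instead of the object itself
           return None         # everything else is pickled as usual

   class RefUnpickler(pickle.Unpickler):
       def persistent_load(self, pid):
           try:
               return REGISTRY[pid]
           except KeyError:
               raise pickle.UnpicklingError(f"unknown persistent ID {pid!r}")

   buffer = io.BytesIO()
   RefPickler(buffer).dump([Ref("user:2"), "plain value"])
   buffer.seek(0)
   loaded = RefUnpickler(buffer).load()
   assert loaded == [{"name": "bob"}, "plain value"]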

Here is a comprehensive example showing how persistent IDs can be used to
pickle external objects by reference.

.. literalinclude:: ../includes/dbpickle.py

.. _pickle-dispatch:

Dispatch Tables
^^^^^^^^^^^^^^^

If one wants to customize pickling of some classes without disturbing
any other code which depends on pickling, then one can create a
pickler with a private dispatch table.

The global dispatch table managed by the :mod:`copyreg` module is
available as :data:`copyreg.dispatch_table`. Therefore, one may
choose to use a modified copy of :data:`copyreg.dispatch_table` as a
private dispatch table.

For example ::

   f = io.BytesIO()
   p = pickle.Pickler(f)
   p.dispatch_table = copyreg.dispatch_table.copy()
   p.dispatch_table[SomeClass] = reduce_SomeClass

creates an instance of :class:`pickle.Pickler` with a private dispatch
table which handles the ``SomeClass`` class specially. Alternatively,
the code ::

   class MyPickler(pickle.Pickler):
       dispatch_table = copyreg.dispatch_table.copy()
       dispatch_table[SomeClass] = reduce_SomeClass
   f = io.BytesIO()
   p = MyPickler(f)

does the same, but all instances of ``MyPickler`` will by default
share the same dispatch table. The equivalent code using the
:mod:`copyreg` module is ::

   copyreg.pickle(SomeClass, reduce_SomeClass)
   f = io.BytesIO()
   p = pickle.Pickler(f)
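
The fragments above leave ``SomeClass`` and ``reduce_SomeClass`` undefined. A
self-contained sketch of the private-table variant, with illustrative
definitions for both, could look like this; the reduction function doubles the
value only to make its effect visible::

   import copyreg
   import io
   import pickle

   class SomeClass:
       def __init__(self, value):
           self.value = value

   def reduce_SomeClass(obj):
       # Returns a reduce value, in the same form as __reduce__().
       return SomeClass, (obj.value * 2,)

   f = io.BytesIO()
   p = pickle.Pickler(f)
   p.dispatch_table = copyreg.dispatch_table.copy()
   p.dispatch_table[SomeClass] = reduce_SomeClass
   p.dump(SomeClass(21))

   via_private_table = pickle.loads(f.getvalue())
   via_default = pickle.loads(pickle.dumps(SomeClass(21)))
   assert via_private_table.value == 42  # the custom reduction was used
   assert via_default.value == 21        # other picklers are unaffected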

.. _pickle-state:

Handling Stateful Objects
^^^^^^^^^^^^^^^^^^^^^^^^^

.. index::
   single: __getstate__() (copy protocol)
   single: __setstate__() (copy protocol)

Here's an example that shows how to modify pickling behavior for a class.
The :class:`TextReader` class opens a text file, and returns the line number and
line contents each time its :meth:`!readline` method is called. If a
:class:`TextReader` instance is pickled, all attributes *except* the file object
member are saved. When the instance is unpickled, the file is reopened, and
reading resumes from the last location. The :meth:`__setstate__` and
:meth:`__getstate__` methods are used to implement this behavior. ::

   class TextReader:
       """Print and number lines in a text file."""

       def __init__(self, filename):
           self.filename = filename
           self.file = open(filename)
           self.lineno = 0

       def readline(self):
           self.lineno += 1
           line = self.file.readline()
           if not line:
               return None
           if line.endswith('\n'):
               line = line[:-1]
           return "%i: %s" % (self.lineno, line)

       def __getstate__(self):
           # Copy the object's state from self.__dict__ which contains
           # all our instance attributes. Always use the dict.copy()
           # method to avoid modifying the original state.
           state = self.__dict__.copy()
           # Remove the unpicklable entries.
           del state['file']
           return state

       def __setstate__(self, state):
           # Restore instance attributes (i.e., filename and lineno).
           self.__dict__.update(state)
           # Restore the previously opened file's state. To do so, we need to
           # reopen it and read from it until the line count is restored.
           file = open(self.filename)
           for _ in range(self.lineno):
               file.readline()
           # Finally, save the file.
           self.file = file


A sample usage might be something like this::

   >>> reader = TextReader("hello.txt")
   >>> reader.readline()
   '1: Hello world!'
   >>> reader.readline()
   '2: I am line number two.'
   >>> new_reader = pickle.loads(pickle.dumps(reader))
   >>> new_reader.readline()
   '3: Goodbye!'

.. _reducer_override:

Custom Reduction for Types, Functions, and Other Objects
--------------------------------------------------------

.. versionadded:: 3.8

Sometimes, :attr:`~Pickler.dispatch_table` may not be flexible enough.
In particular we may want to customize pickling based on another criterion
than the object's type, or we may want to customize the pickling of
functions and classes.

For those cases, it is possible to subclass from the :class:`Pickler` class and
implement a :meth:`~Pickler.reducer_override` method. This method can return an
arbitrary reduction tuple (see :meth:`__reduce__`). It can alternatively return
``NotImplemented`` to fall back to the traditional behavior.

If both the :attr:`~Pickler.dispatch_table` and
:meth:`~Pickler.reducer_override` are defined, then the
:meth:`~Pickler.reducer_override` method takes priority.

.. note::
   For performance reasons, :meth:`~Pickler.reducer_override` may not be
   called for the following objects: ``None``, ``True``, ``False``, and
   exact instances of :class:`int`, :class:`float`, :class:`bytes`,
   :class:`str`, :class:`dict`, :class:`set`, :class:`frozenset`, :class:`list`
   and :class:`tuple`.

Here is a simple example where we allow pickling and reconstructing
a given class::

   import io
   import pickle

   class MyClass:
       my_attribute = 1

   class MyPickler(pickle.Pickler):
       def reducer_override(self, obj):
           """Custom reducer for MyClass."""
           if getattr(obj, "__name__", None) == "MyClass":
               return type, (obj.__name__, obj.__bases__,
                             {'my_attribute': obj.my_attribute})
           else:
               # For any other object, fall back to the usual reduction.
               return NotImplemented

   f = io.BytesIO()
   p = MyPickler(f)
   p.dump(MyClass)

   del MyClass

   unpickled_class = pickle.loads(f.getvalue())

   assert isinstance(unpickled_class, type)
   assert unpickled_class.__name__ == "MyClass"
   assert unpickled_class.my_attribute == 1


.. _pickle-restrict:

Restricting Globals
-------------------

.. index::
   single: find_class() (pickle protocol)

By default, unpickling will import any class or function that it finds in the
pickle data. For many applications, this behaviour is unacceptable as it
permits the unpickler to import and invoke arbitrary code. Just consider what
this hand-crafted pickle data stream does when loaded::

   >>> import pickle
   >>> pickle.loads(b"cos\nsystem\n(S'echo hello world'\ntR.")
   hello world
   0

In this example, the unpickler imports the :func:`os.system` function and then
applies it to the string argument "echo hello world". Although this example is
inoffensive, it is not difficult to imagine one that could damage your system.

For this reason, you may want to control what gets unpickled by customizing
:meth:`Unpickler.find_class`. Contrary to what its name suggests,
:meth:`Unpickler.find_class` is called whenever a global (i.e., a class or
a function) is requested. Thus it is possible to either completely forbid
globals or restrict them to a safe subset.

Here is an example of an unpickler allowing only a few safe classes from the
:mod:`builtins` module to be loaded::

   import builtins
   import io
   import pickle

   safe_builtins = {
       'range',
       'complex',
       'set',
       'frozenset',
       'slice',
   }

   class RestrictedUnpickler(pickle.Unpickler):

       def find_class(self, module, name):
           # Only allow safe classes from builtins.
           if module == "builtins" and name in safe_builtins:
               return getattr(builtins, name)
           # Forbid everything else.
           raise pickle.UnpicklingError("global '%s.%s' is forbidden" %
                                        (module, name))

   def restricted_loads(s):
       """Helper function analogous to pickle.loads()."""
       return RestrictedUnpickler(io.BytesIO(s)).load()

A sample usage of our unpickler working as intended::

   >>> restricted_loads(pickle.dumps([1, 2, range(15)]))
   [1, 2, range(0, 15)]
   >>> restricted_loads(b"cos\nsystem\n(S'echo hello world'\ntR.")
   Traceback (most recent call last):
     ...
   pickle.UnpicklingError: global 'os.system' is forbidden
   >>> restricted_loads(b'cbuiltins\neval\n'
   ...                  b'(S\'getattr(__import__("os"), "system")'
   ...                  b'("echo hello world")\'\ntR.')
   Traceback (most recent call last):
     ...
   pickle.UnpicklingError: global 'builtins.eval' is forbidden


.. XXX Add note about how extension codes could evade our protection
   mechanism (e.g. cached classes do not invoke find_class()).

As our examples show, you have to be careful with what you allow to be
unpickled. Therefore if security is a concern, you may want to consider
alternatives such as the marshalling API in :mod:`xmlrpc.client` or
third-party solutions.


Performance
-----------

Recent versions of the pickle protocol (from protocol 2 and upwards) feature
efficient binary encodings for several common features and built-in types.
Also, the :mod:`pickle` module has a transparent optimizer written in C.
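
The difference between the text-based protocol 0 and the binary protocols can
be observed with a quick measurement. This is only a rough sketch on an
arbitrary list of integers; actual sizes depend on the data and the Python
version::

   import pickle

   data = list(range(1000))
   sizes = {proto: len(pickle.dumps(data, protocol=proto))
            for proto in range(pickle.HIGHEST_PROTOCOL + 1)}

   # Protocol 2 encodes small integers in binary form, so it is more
   # compact than protocol 0 for this data.
   assert sizes[2] < sizes[0]

   for proto, size in sizes.items():
       print(f"protocol {proto}: {size} bytes")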


.. _pickle-example:

Examples
--------

For the simplest code, use the :func:`dump` and :func:`load` functions. ::

   import pickle

   # An arbitrary collection of objects supported by pickle.
   data = {
       'a': [1, 2.0, 3, 4+6j],
       'b': ("character string", b"byte string"),
       'c': {None, True, False}
   }

   with open('data.pickle', 'wb') as f:
       # Pickle the 'data' dictionary using the highest protocol available.
       pickle.dump(data, f, pickle.HIGHEST_PROTOCOL)


The following example reads the resulting pickled data. ::

   import pickle

   with open('data.pickle', 'rb') as f:
       # The protocol version used is detected automatically, so we do not
       # have to specify it.
       data = pickle.load(f)
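
Pickles can also be reduced in size after the fact. The sketch below combines
:func:`pickletools.optimize` with :func:`gzip.compress`; the sample data is
arbitrary, and how much space is saved depends heavily on the content::

   import gzip
   import pickle
   import pickletools

   data = {"values": list(range(100)), "label": "example" * 10}

   raw = pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL)
   # optimize() returns an equivalent pickle without unused PUT opcodes.
   optimized = pickletools.optimize(raw)
   compressed = gzip.compress(optimized)

   assert pickle.loads(optimized) == data
   assert pickle.loads(gzip.decompress(compressed)) == data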


.. XXX: Add examples showing how to optimize pickles for size (like using
.. pickletools.optimize() or the gzip module).


.. seealso::

   Module :mod:`copyreg`
      Pickle interface constructor registration for extension types.

   Module :mod:`pickletools`
      Tools for working with and analyzing pickled data.

   Module :mod:`shelve`
      Indexed databases of objects; uses :mod:`pickle`.

   Module :mod:`copy`
      Shallow and deep object copying.

   Module :mod:`marshal`
      High-performance serialization of built-in types.


.. rubric:: Footnotes

.. [#] Don't confuse this with the :mod:`marshal` module.

.. [#] This is why :keyword:`lambda` functions cannot be pickled: all
    :keyword:`!lambda` functions share the same name: ``<lambda>``.

.. [#] The exception raised will likely be an :exc:`ImportError` or an
    :exc:`AttributeError` but it could be something else.

.. [#] The :mod:`copy` module uses this protocol for shallow and deep copying
    operations.

.. [#] The limitation on alphanumeric characters is due to the fact
    that persistent IDs, in protocol 0, are delimited by the newline
    character. Therefore if any kind of newline character occurs in
    persistent IDs, the resulting pickle will become unreadable.
Alexandre Vassalotti5f3b63a2008-10-18 20:47:58 +00001026 persistent IDs, the resulting pickle will become unreadable.