:mod:`pickle` --- Python object serialization
=============================================

.. module:: pickle
   :synopsis: Convert Python objects to streams of bytes and back.

.. sectionauthor:: Jim Kerr <jbkerr@sr.hp.com>.
.. sectionauthor:: Barry Warsaw <barry@python.org>

**Source code:** :source:`Lib/pickle.py`

.. index::
   single: persistence
   pair: persistent; objects
   pair: serializing; objects
   pair: marshalling; objects
   pair: flattening; objects
   pair: pickling; objects

--------------

The :mod:`pickle` module implements binary protocols for serializing and
de-serializing a Python object structure. *"Pickling"* is the process
whereby a Python object hierarchy is converted into a byte stream, and
*"unpickling"* is the inverse operation, whereby a byte stream
(from a :term:`binary file` or :term:`bytes-like object`) is converted
back into an object hierarchy. Pickling (and unpickling) is alternatively
known as "serialization", "marshalling," [#]_ or "flattening"; however, to
avoid confusion, the terms used here are "pickling" and "unpickling".

.. warning::

   The :mod:`pickle` module is not secure against erroneous or maliciously
   constructed data. Never unpickle data received from an untrusted or
   unauthenticated source.


Relationship to other Python modules
------------------------------------

Comparison with ``marshal``
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Python has a more primitive serialization module called :mod:`marshal`, but
:mod:`pickle` should generally be the preferred way to serialize Python
objects. :mod:`marshal` exists primarily to support Python's :file:`.pyc`
files.

The :mod:`pickle` module differs from :mod:`marshal` in several significant ways:

* The :mod:`pickle` module keeps track of the objects it has already serialized,
  so that later references to the same object won't be serialized again.
  :mod:`marshal` doesn't do this.

  This has implications both for recursive objects and object sharing. Recursive
  objects are objects that contain references to themselves. These are not
  handled by marshal, and in fact, attempting to marshal recursive objects will
  crash your Python interpreter. Object sharing happens when there are multiple
  references to the same object in different places in the object hierarchy being
  serialized. :mod:`pickle` stores such objects only once, and ensures that all
  other references point to the master copy. Shared objects remain shared, which
  can be very important for mutable objects.

* :mod:`marshal` cannot be used to serialize user-defined classes and their
  instances. :mod:`pickle` can save and restore class instances transparently;
  however, the class definition must be importable and live in the same module as
  when the object was stored.

* The :mod:`marshal` serialization format is not guaranteed to be portable
  across Python versions. Because its primary job in life is to support
  :file:`.pyc` files, the Python implementers reserve the right to change the
  serialization format in non-backwards compatible ways should the need arise.
  The :mod:`pickle` serialization format is guaranteed to be backwards compatible
  across Python releases.
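
The reference tracking described in the first point can be observed directly.
The following is a minimal sketch (the variable names are illustrative only)
showing that a shared list stays shared and that a self-referential list
survives a round trip::

   import pickle

   # Two references to the same mutable list inside one container.
   shared = [1, 2, 3]
   container = {"a": shared, "b": shared}

   restored = pickle.loads(pickle.dumps(container))
   # The list was stored once, so both keys still refer to one object.
   same_object = restored["a"] is restored["b"]

   # A recursive object: the list contains a reference to itself.
   recursive = []
   recursive.append(recursive)

   restored_recursive = pickle.loads(pickle.dumps(recursive))
   points_to_itself = restored_recursive[0] is restored_recursive

Because sharing is preserved, appending to ``restored["a"]`` is also visible
through ``restored["b"]``.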

Comparison with ``json``
^^^^^^^^^^^^^^^^^^^^^^^^

There are fundamental differences between the pickle protocols and
`JSON (JavaScript Object Notation) <http://json.org>`_:

* JSON is a text serialization format (it outputs unicode text, although
  most of the time it is then encoded to ``utf-8``), while pickle is
  a binary serialization format;

* JSON is human-readable, while pickle is not;

* JSON is interoperable and widely used outside of the Python ecosystem,
  while pickle is Python-specific;

* JSON, by default, can only represent a subset of the Python built-in
  types, and no custom classes; pickle can represent an extremely large
  number of Python types (many of them automatically, by clever usage
  of Python's introspection facilities; complex cases can be tackled by
  implementing :ref:`specific object APIs <pickle-inst>`).
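
The difference in type coverage can be illustrated with a short sketch. The
sample data below is made up for the example; it only uses built-in types::

   import json
   import pickle

   data = {"ids": {1, 2, 3}, "origin": (0, 0), "z": 1 + 2j}

   # pickle round-trips the set, the tuple and the complex number unchanged.
   pickled_copy = pickle.loads(pickle.dumps(data))

   # json has no default representation for sets or complex numbers.
   try:
       json.dumps(data)
   except TypeError as exc:
       json_error = exc
   else:
       json_error = None

   # Tuples are accepted by json, but they come back as lists.
   json_copy = json.loads(json.dumps({"origin": (0, 0)}))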

.. seealso::
   The :mod:`json` module: a standard library module allowing JSON
   serialization and deserialization.


.. _pickle-protocols:

Data stream format
------------------

.. index::
   single: External Data Representation

The data format used by :mod:`pickle` is Python-specific. This has the
advantage that there are no restrictions imposed by external standards such as
JSON or XDR (which can't represent pointer sharing); however, it means that
non-Python programs may not be able to reconstruct pickled Python objects.

By default, the :mod:`pickle` data format uses a relatively compact binary
representation. If you need optimal size characteristics, you can efficiently
:doc:`compress <archiving>` pickled data.
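
As a rough sketch of the compression step, the pickled bytes can be passed
through :mod:`zlib` (any of the standard compression modules would work in the
same way). The sample records are invented and deliberately repetitive so that
the effect is visible::

   import pickle
   import zlib

   records = [{"id": i, "name": "item"} for i in range(1000)]

   raw = pickle.dumps(records)
   packed = zlib.compress(raw)

   # Decompress first, then unpickle, to get the original data back.
   unpacked = pickle.loads(zlib.decompress(packed))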

The module :mod:`pickletools` contains tools for analyzing data streams
generated by :mod:`pickle`. :mod:`pickletools` source code has extensive
comments about opcodes used by pickle protocols.
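
For instance, :func:`pickletools.dis` prints a symbolic disassembly of a
pickle and :func:`pickletools.genops` iterates over its opcodes. A small
sketch, writing the listing to an in-memory buffer rather than standard
output::

   import io
   import pickle
   import pickletools

   data = pickle.dumps((1, "two"), protocol=2)

   # Collect the disassembly in a string instead of printing it.
   listing = io.StringIO()
   pickletools.dis(data, out=listing)
   disassembly = listing.getvalue()

   # genops() yields (opcode, argument, position) triples.
   opcode_names = [op.name for op, arg, pos in pickletools.genops(data)]

With protocol 2 the opcode list starts with ``PROTO`` and ends with ``STOP``.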

There are currently 5 different protocols which can be used for pickling.
The higher the protocol used, the more recent the version of Python needed
to read the pickle produced.

* Protocol version 0 is the original "human-readable" protocol and is
  backwards compatible with earlier versions of Python.

* Protocol version 1 is an old binary format which is also compatible with
  earlier versions of Python.

* Protocol version 2 was introduced in Python 2.3. It provides much more
  efficient pickling of :term:`new-style class`\es. Refer to :pep:`307` for
  information about improvements brought by protocol 2.

* Protocol version 3 was added in Python 3.0. It has explicit support for
  :class:`bytes` objects and cannot be unpickled by Python 2.x. This was
  the default protocol in Python 3.0--3.7.

* Protocol version 4 was added in Python 3.4. It adds support for very large
  objects, pickling more kinds of objects, and some data format
  optimizations. It is the default protocol starting with Python 3.8.
  Refer to :pep:`3154` for information about improvements brought by
  protocol 4.
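
The following sketch pickles one object with every protocol the running
interpreter supports and records the size of each result. It relies only on
:data:`HIGHEST_PROTOCOL`, so it makes no assumption about how many protocols
exist::

   import pickle

   obj = {"key": [1, 2, 3]}

   sizes = {}
   for protocol in range(pickle.HIGHEST_PROTOCOL + 1):
       data = pickle.dumps(obj, protocol=protocol)
       sizes[protocol] = len(data)
       # Unpickling needs no protocol argument; it is detected from the data.
       assert pickle.loads(data) == obj
       if protocol >= 2:
           # Protocols 2 and newer start with a PROTO opcode (0x80)
           # followed by the protocol number.
           assert data[:2] == bytes([0x80, protocol])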

.. note::
   Serialization is a more primitive notion than persistence; although
   :mod:`pickle` reads and writes file objects, it does not handle the issue of
   naming persistent objects, nor the (even more complicated) issue of concurrent
   access to persistent objects. The :mod:`pickle` module can transform a complex
   object into a byte stream and it can transform the byte stream into an object
   with the same internal structure. Perhaps the most obvious thing to do with
   these byte streams is to write them onto a file, but it is also conceivable to
   send them across a network or store them in a database. The :mod:`shelve`
   module provides a simple interface to pickle and unpickle objects on
   DBM-style database files.


Module Interface
----------------

To serialize an object hierarchy, you simply call the :func:`dumps` function.
Similarly, to de-serialize a data stream, you call the :func:`loads` function.
However, if you want more control over serialization and de-serialization,
you can create a :class:`Pickler` or an :class:`Unpickler` object, respectively.

The :mod:`pickle` module provides the following constants:


.. data:: HIGHEST_PROTOCOL

   An integer, the highest :ref:`protocol version <pickle-protocols>`
   available. This value can be passed as a *protocol* value to functions
   :func:`dump` and :func:`dumps` as well as the :class:`Pickler`
   constructor.

.. data:: DEFAULT_PROTOCOL

   An integer, the default :ref:`protocol version <pickle-protocols>` used
   for pickling. May be less than :data:`HIGHEST_PROTOCOL`. Currently the
   default protocol is 4, first introduced in Python 3.4 and incompatible
   with previous versions.

   .. versionchanged:: 3.0

      The default protocol is 3.

   .. versionchanged:: 3.8

      The default protocol is 4.

The :mod:`pickle` module provides the following functions to make the pickling
process more convenient:

.. function:: dump(obj, file, protocol=None, \*, fix_imports=True)

   Write a pickled representation of *obj* to the open :term:`file object` *file*.
   This is equivalent to ``Pickler(file, protocol).dump(obj)``.

   The optional *protocol* argument, an integer, tells the pickler to use
   the given protocol; supported protocols are 0 to :data:`HIGHEST_PROTOCOL`.
   If not specified, the default is :data:`DEFAULT_PROTOCOL`. If a negative
   number is specified, :data:`HIGHEST_PROTOCOL` is selected.

   The *file* argument must have a ``write()`` method that accepts a single bytes
   argument. It can thus be an on-disk file opened for binary writing, an
   :class:`io.BytesIO` instance, or any other custom object that meets this
   interface.

   If *fix_imports* is true and *protocol* is less than 3, pickle will try to
   map the new Python 3 names to the old module names used in Python 2, so
   that the pickle data stream is readable with Python 2.
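
   A minimal sketch of calling :func:`dump`, using an :class:`io.BytesIO`
   buffer in place of an on-disk file. The second call passes a negative
   *protocol* to select :data:`HIGHEST_PROTOCOL`::

      import io
      import pickle

      payload = {"answer": 42}

      # Default protocol.
      buffer = io.BytesIO()
      pickle.dump(payload, buffer)
      default_bytes = buffer.getvalue()

      # A negative protocol number selects HIGHEST_PROTOCOL.
      buffer = io.BytesIO()
      pickle.dump(payload, buffer, protocol=-1)
      highest_bytes = buffer.getvalue()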

.. function:: dumps(obj, protocol=None, \*, fix_imports=True)

   Return the pickled representation of the object as a :class:`bytes` object,
   instead of writing it to a file.

   Arguments *protocol* and *fix_imports* have the same meaning as in
   :func:`dump`.

.. function:: load(file, \*, fix_imports=True, encoding="ASCII", errors="strict")

   Read a pickled object representation from the open :term:`file object`
   *file* and return the reconstituted object hierarchy specified therein.
   This is equivalent to ``Unpickler(file).load()``.

   The protocol version of the pickle is detected automatically, so no
   protocol argument is needed. Bytes past the pickled object's
   representation are ignored.

   The argument *file* must have two methods, a ``read()`` method that takes an
   integer argument, and a ``readline()`` method that requires no arguments. Both
   methods should return bytes. Thus *file* can be an on-disk file opened for
   binary reading, an :class:`io.BytesIO` object, or any other custom object
   that meets this interface.

   Optional keyword arguments are *fix_imports*, *encoding* and *errors*,
   which are used to control compatibility support for pickle streams generated
   by Python 2. If *fix_imports* is true, pickle will try to map the old
   Python 2 names to the new names used in Python 3. The *encoding* and
   *errors* tell pickle how to decode 8-bit string instances pickled by Python
   2; these default to 'ASCII' and 'strict', respectively. The *encoding* can
   be 'bytes' to read these 8-bit string instances as bytes objects.
   Using ``encoding='latin1'`` is required for unpickling NumPy arrays and
   instances of :class:`~datetime.datetime`, :class:`~datetime.date` and
   :class:`~datetime.time` pickled by Python 2.
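
   Because each call reads exactly one pickled object, several objects written
   to the same stream can be read back one at a time. The following sketch
   uses an in-memory buffer and stops when :exc:`EOFError` signals that no
   further pickle is available::

      import io
      import pickle

      stream = io.BytesIO()
      for item in ("first", [2, 2], {"third": 3}):
          pickle.dump(item, stream)

      stream.seek(0)
      loaded = []
      while True:
          try:
              loaded.append(pickle.load(stream))
          except EOFError:
              # Nothing left to read in the stream.
              break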

.. function:: loads(bytes_object, \*, fix_imports=True, encoding="ASCII", errors="strict")

   Read a pickled object hierarchy from a :class:`bytes` object and return the
   reconstituted object hierarchy specified therein.

   The protocol version of the pickle is detected automatically, so no
   protocol argument is needed. Bytes past the pickled object's
   representation are ignored.

   Optional keyword arguments are *fix_imports*, *encoding* and *errors*,
   which are used to control compatibility support for pickle streams generated
   by Python 2. If *fix_imports* is true, pickle will try to map the old
   Python 2 names to the new names used in Python 3. The *encoding* and
   *errors* tell pickle how to decode 8-bit string instances pickled by Python
   2; these default to 'ASCII' and 'strict', respectively. The *encoding* can
   be 'bytes' to read these 8-bit string instances as bytes objects.
   Using ``encoding='latin1'`` is required for unpickling NumPy arrays and
   instances of :class:`~datetime.datetime`, :class:`~datetime.date` and
   :class:`~datetime.time` pickled by Python 2.
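
   The sketch below illustrates two of the points above. The byte string
   ``py2_pickle`` is hand-written for the example and stands in for a
   protocol 0 pickle of an 8-bit string as Python 2 would produce it::

      import pickle

      data = pickle.dumps([1, 2, 3])
      # Bytes after the end of the pickle are ignored.
      with_trailer = pickle.loads(data + b"trailing bytes")

      # Hand-written protocol 0 pickle of the Python 2 str 'abc'.
      py2_pickle = b"S'abc'\n."
      as_text = pickle.loads(py2_pickle)                     # decoded as ASCII
      as_bytes = pickle.loads(py2_pickle, encoding="bytes")  # kept as bytes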


The :mod:`pickle` module defines three exceptions:

.. exception:: PickleError

   Common base class for the other pickling exceptions. It inherits
   :exc:`Exception`.

.. exception:: PicklingError

   Error raised when an unpicklable object is encountered by :class:`Pickler`.
   It inherits :exc:`PickleError`.

   Refer to :ref:`pickle-picklable` to learn what kinds of objects can be
   pickled.

.. exception:: UnpicklingError

   Error raised when there is a problem unpickling an object, such as data
   corruption or a security violation. It inherits :exc:`PickleError`.

   Note that other exceptions may also be raised during unpickling, including
   (but not necessarily limited to) AttributeError, EOFError, ImportError, and
   IndexError.
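
Code that reads pickles from files that may be damaged usually has to catch
more than :exc:`UnpicklingError`. The helper below is a sketch only;
``safe_loads`` is a hypothetical name, not part of the module, and catching
these exceptions does not make it safe to unpickle untrusted data::

   import pickle

   def safe_loads(data):
       """Return the unpickled object, or None if *data* cannot be read."""
       try:
           return pickle.loads(data)
       except (pickle.UnpicklingError, AttributeError, EOFError,
               ImportError, IndexError):
           return None

   empty_result = safe_loads(b"")        # EOFError: no data at all
   invalid_result = safe_loads(b"\x00")  # UnpicklingError: not a valid opcode
   good_result = safe_loads(pickle.dumps(5))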


The :mod:`pickle` module exports two classes, :class:`Pickler` and
:class:`Unpickler`:

.. class:: Pickler(file, protocol=None, \*, fix_imports=True)

   This takes a binary file for writing a pickle data stream.

   The optional *protocol* argument, an integer, tells the pickler to use
   the given protocol; supported protocols are 0 to :data:`HIGHEST_PROTOCOL`.
   If not specified, the default is :data:`DEFAULT_PROTOCOL`. If a negative
   number is specified, :data:`HIGHEST_PROTOCOL` is selected.

   The *file* argument must have a ``write()`` method that accepts a single bytes
   argument. It can thus be an on-disk file opened for binary writing, an
   :class:`io.BytesIO` instance, or any other custom object that meets this
   interface.

   If *fix_imports* is true and *protocol* is less than 3, pickle will try to
   map the new Python 3 names to the old module names used in Python 2, so
   that the pickle data stream is readable with Python 2.

   .. method:: dump(obj)

      Write a pickled representation of *obj* to the open file object given in
      the constructor.

   .. method:: persistent_id(obj)

      Do nothing by default. This exists so a subclass can override it.

      If :meth:`persistent_id` returns ``None``, *obj* is pickled as usual. Any
      other value causes :class:`Pickler` to emit the returned value as a
      persistent ID for *obj*. The meaning of this persistent ID should be
      defined by :meth:`Unpickler.persistent_load`. Note that the value
      returned by :meth:`persistent_id` cannot itself have a persistent ID.

      See :ref:`pickle-persistent` for details and examples of uses.

   .. attribute:: dispatch_table

      A pickler object's dispatch table is a registry of *reduction
      functions* of the kind which can be declared using
      :func:`copyreg.pickle`. It is a mapping whose keys are classes
      and whose values are reduction functions. A reduction function
      takes a single argument of the associated class and should
      conform to the same interface as a :meth:`__reduce__`
      method.

      By default, a pickler object will not have a
      :attr:`dispatch_table` attribute, and it will instead use the
      global dispatch table managed by the :mod:`copyreg` module.
      However, to customize the pickling for a specific pickler object
      one can set the :attr:`dispatch_table` attribute to a dict-like
      object. Alternatively, if a subclass of :class:`Pickler` has a
      :attr:`dispatch_table` attribute then this will be used as the
      default dispatch table for instances of that class.

      See :ref:`pickle-dispatch` for usage examples.
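
      As a brief sketch, the per-pickler table can be seeded from the global
      one and then extended. Here ``reduce_fraction`` is a hypothetical
      reduction function written for this example; it records each call so
      that its use can be verified::

         import copyreg
         import io
         import pickle
         from fractions import Fraction

         calls = []

         def reduce_fraction(value):
             # Return a (callable, args) pair, as __reduce__() would.
             calls.append(value)
             return Fraction, (value.numerator, value.denominator)

         buffer = io.BytesIO()
         pickler = pickle.Pickler(buffer)
         # Copy the global table so existing registrations still apply.
         pickler.dispatch_table = copyreg.dispatch_table.copy()
         pickler.dispatch_table[Fraction] = reduce_fraction
         pickler.dump(Fraction(3, 4))

         restored = pickle.loads(buffer.getvalue())

      Only this pickler instance uses ``reduce_fraction``; other picklers
      continue to use the global table.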

      .. versionadded:: 3.3

   .. attribute:: fast

      Deprecated. Enable fast mode if set to a true value. The fast mode
      disables the usage of memo, therefore speeding the pickling process by not
      generating superfluous PUT opcodes. It should not be used with
      self-referential objects; doing otherwise will cause :class:`Pickler` to
      recurse infinitely.

      Use :func:`pickletools.optimize` if you need more compact pickles.


.. class:: Unpickler(file, \*, fix_imports=True, encoding="ASCII", errors="strict")

   This takes a binary file for reading a pickle data stream.

   The protocol version of the pickle is detected automatically, so no
   protocol argument is needed.

   The argument *file* must have two methods, a ``read()`` method that takes an
   integer argument, and a ``readline()`` method that requires no arguments. Both
   methods should return bytes. Thus *file* can be an on-disk file object
   opened for binary reading, an :class:`io.BytesIO` object, or any other
   custom object that meets this interface.

   Optional keyword arguments are *fix_imports*, *encoding* and *errors*,
   which are used to control compatibility support for pickle streams generated
   by Python 2. If *fix_imports* is true, pickle will try to map the old
   Python 2 names to the new names used in Python 3. The *encoding* and
   *errors* tell pickle how to decode 8-bit string instances pickled by Python
   2; these default to 'ASCII' and 'strict', respectively. The *encoding* can
   be 'bytes' to read these 8-bit string instances as bytes objects.

   .. method:: load()

      Read a pickled object representation from the open file object given in
      the constructor, and return the reconstituted object hierarchy specified
      therein. Bytes past the pickled object's representation are ignored.

   .. method:: persistent_load(pid)

      Raise an :exc:`UnpicklingError` by default.

      If defined, :meth:`persistent_load` should return the object specified by
      the persistent ID *pid*. If an invalid persistent ID is encountered, an
      :exc:`UnpicklingError` should be raised.

      See :ref:`pickle-persistent` for details and examples of uses.
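
      The following sketch pairs :meth:`Pickler.persistent_id` with
      :meth:`persistent_load`. The ``REGISTRY`` dictionary and both subclass
      names are hypothetical; they stand in for any store of objects that
      should be referenced by key instead of being pickled::

         import io
         import pickle
         from types import SimpleNamespace

         # Objects that live outside the pickle and are looked up by key.
         REGISTRY = {"db": SimpleNamespace(name="db", dsn="sqlite://")}

         class RegistryPickler(pickle.Pickler):
             def persistent_id(self, obj):
                 # Registry members are stored as a persistent ID only.
                 if isinstance(obj, SimpleNamespace):
                     key = getattr(obj, "name", None)
                     if REGISTRY.get(key) is obj:
                         return ("registry", key)
                 return None  # everything else is pickled as usual

         class RegistryUnpickler(pickle.Unpickler):
             def persistent_load(self, pid):
                 kind, key = pid
                 if kind == "registry" and key in REGISTRY:
                     return REGISTRY[key]
                 raise pickle.UnpicklingError(
                     "unsupported persistent ID: %r" % (pid,))

         buffer = io.BytesIO()
         RegistryPickler(buffer).dump({"connection": REGISTRY["db"], "retries": 3})
         buffer.seek(0)
         restored = RegistryUnpickler(buffer).load()

      After loading, ``restored["connection"]`` is the very object held in
      ``REGISTRY``, not a copy.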

   .. method:: find_class(module, name)

      Import *module* if necessary and return the object called *name* from it,
      where the *module* and *name* arguments are :class:`str` objects. Note
      that, despite its name, :meth:`find_class` is also used for finding
      functions.

      Subclasses may override this to gain control over what type of objects and
      how they can be loaded, potentially reducing security risks. Refer to
      :ref:`pickle-restrict` for details.
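
      A sketch of such an override, allowing only a few built-in names. The
      names ``SAFE_BUILTINS``, ``RestrictedUnpickler`` and
      ``restricted_loads`` are chosen for this example, and the allowed set
      is an assumption that a real application would have to review; this
      reduces the risk but is not a complete security measure::

         import builtins
         import io
         import pickle

         SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}

         class RestrictedUnpickler(pickle.Unpickler):
             def find_class(self, module, name):
                 # Only names on the allow-list may be resolved.
                 if module == "builtins" and name in SAFE_BUILTINS:
                     return getattr(builtins, name)
                 raise pickle.UnpicklingError(
                     "global '%s.%s' is forbidden" % (module, name))

         def restricted_loads(data):
             """Like pickle.loads(), but refusing unknown globals."""
             return RestrictedUnpickler(io.BytesIO(data)).load()

         allowed = restricted_loads(pickle.dumps([1, 2, range(15)]))

      A pickle that refers to any other global, for example ``os.system``,
      makes ``restricted_loads`` raise :exc:`UnpicklingError`.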


.. _pickle-picklable:

What can be pickled and unpickled?
----------------------------------

The following types can be pickled:

* ``None``, ``True``, and ``False``

* integers, floating point numbers, complex numbers

* strings, bytes, bytearrays

* tuples, lists, sets, and dictionaries containing only picklable objects

* functions defined at the top level of a module (using :keyword:`def`, not
  :keyword:`lambda`)

* built-in functions defined at the top level of a module

* classes that are defined at the top level of a module

* instances of such classes whose :attr:`~object.__dict__` or the result of
  calling :meth:`__getstate__` is picklable (see section :ref:`pickle-inst` for
  details).

Attempts to pickle unpicklable objects will raise the :exc:`PicklingError`
exception; when this happens, an unspecified number of bytes may have already
been written to the underlying file. Trying to pickle a highly recursive data
structure may exceed the maximum recursion depth; a :exc:`RecursionError` will be
raised in this case. You can carefully raise this limit with
:func:`sys.setrecursionlimit`.

Note that functions (built-in and user-defined) are pickled by "fully qualified"
name reference, not by value. [#]_ This means that only the function name is
pickled, along with the name of the module the function is defined in. Neither
the function's code, nor any of its function attributes are pickled. Thus the
defining module must be importable in the unpickling environment, and the module
must contain the named object, otherwise an exception will be raised. [#]_

Similarly, classes are pickled by named reference, so the same restrictions in
the unpickling environment apply. Note that none of the class's code or data is
pickled, so in the following example the class attribute ``attr`` is not
restored in the unpickling environment::

   import pickle

   class Foo:
       attr = 'A class attribute'

   # Only the names of the module and the class are stored, not ``attr``.
   picklestring = pickle.dumps(Foo)

These restrictions are why picklable functions and classes must be defined in
the top level of a module.

Similarly, when class instances are pickled, their class's code and data are not
pickled along with them. Only the instance data are pickled. This is done on
purpose, so you can fix bugs in a class or add methods to the class and still
load objects that were created with an earlier version of the class. If you
plan to have long-lived objects that will see many versions of a class, it may
be worthwhile to put a version number in the objects so that suitable
conversions can be made by the class's :meth:`__setstate__` method.


.. _pickle-inst:

Pickling Class Instances
------------------------

.. currentmodule:: None

In this section, we describe the general mechanisms available to you to define,
customize, and control how class instances are pickled and unpickled.

In most cases, no additional code is needed to make instances picklable. By
default, pickle will retrieve the class and the attributes of an instance via
introspection. When a class instance is unpickled, its :meth:`__init__` method
is usually *not* invoked. The default behaviour first creates an uninitialized
instance and then restores the saved attributes. The following code shows an
implementation of this behaviour::

   def save(obj):
       return (obj.__class__, obj.__dict__)

   def load(cls, attributes):
       obj = cls.__new__(cls)
       obj.__dict__.update(attributes)
       return obj
Georg Brandl116aa622007-08-15 14:28:22 +0000503
Georg Brandl6faee4e2010-09-21 14:48:28 +0000504Classes can alter the default behaviour by providing one or several special
Georg Brandlc8148262010-10-17 11:13:37 +0000505methods:

.. method:: object.__getnewargs_ex__()

   In protocols 2 and newer, classes that implement the
   :meth:`__getnewargs_ex__` method can dictate the values passed to the
   :meth:`__new__` method upon unpickling. The method must return a pair
   ``(args, kwargs)`` where *args* is a tuple of positional arguments
   and *kwargs* a dictionary of named arguments for constructing the
   object. Those will be passed to the :meth:`__new__` method upon
   unpickling.

   You should implement this method if the :meth:`__new__` method of your
   class requires keyword-only arguments. Otherwise, it is recommended for
   compatibility to implement :meth:`__getnewargs__`.

   .. versionchanged:: 3.6
      :meth:`__getnewargs_ex__` is now used in protocols 2 and 3.


.. method:: object.__getnewargs__()

   This method serves a similar purpose as :meth:`__getnewargs_ex__`, but
   supports only positional arguments. It must return a tuple of arguments
   ``args`` which will be passed to the :meth:`__new__` method upon unpickling.

   :meth:`__getnewargs__` will not be called if :meth:`__getnewargs_ex__` is
   defined.

   .. versionchanged:: 3.6
      Before Python 3.6, :meth:`__getnewargs__` was called instead of
      :meth:`__getnewargs_ex__` in protocols 2 and 3.


.. method:: object.__getstate__()

   Classes can further influence how their instances are pickled; if the class
   defines the method :meth:`__getstate__`, it is called and the returned object
   is pickled as the contents for the instance, instead of the contents of the
   instance's dictionary. If the :meth:`__getstate__` method is absent, the
   instance's :attr:`~object.__dict__` is pickled as usual.


.. method:: object.__setstate__(state)

   Upon unpickling, if the class defines :meth:`__setstate__`, it is called with
   the unpickled state. In that case, there is no requirement for the state
   object to be a dictionary. Otherwise, the pickled state must be a dictionary
   and its items are assigned to the new instance's dictionary.

   .. note::

      If :meth:`__getstate__` returns a false value, the :meth:`__setstate__`
      method will not be called upon unpickling.


Refer to the section :ref:`pickle-state` for more information about how to use
the methods :meth:`__getstate__` and :meth:`__setstate__`.
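
The following sketch shows one way :meth:`__getnewargs_ex__` might be used when
:meth:`__new__` takes a keyword-only argument. The ``Celsius`` class and its
``label`` argument are hypothetical and exist only for this illustration.

```python
import pickle


class Celsius(float):
    # Hypothetical float subclass whose __new__ requires a keyword-only
    # argument, so positional arguments alone cannot rebuild it.
    def __new__(cls, value, *, label):
        self = super().__new__(cls, value)
        self.label = label
        return self

    def __getnewargs_ex__(self):
        # Return (args, kwargs); both are passed to __new__ on unpickling.
        return (float(self),), {"label": self.label}


temp = Celsius(21.5, label="office")
restored = pickle.loads(pickle.dumps(temp))
```

Here ``restored`` is rebuilt by calling ``Celsius.__new__`` with the positional
and keyword arguments returned by :meth:`__getnewargs_ex__`.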

.. note::

   At unpickling time, some methods like :meth:`__getattr__`,
   :meth:`__getattribute__`, or :meth:`__setattr__` may be called upon the
   instance. In case those methods rely on some internal invariant being
   true, the type should implement :meth:`__getnewargs__` or
   :meth:`__getnewargs_ex__` to establish such an invariant; otherwise,
   neither :meth:`__new__` nor :meth:`__init__` will be called.

.. index:: pair: copy; protocol

As we shall see, pickle does not directly use the methods described above. In
fact, these methods are part of the copy protocol which implements the
:meth:`__reduce__` special method. The copy protocol provides a unified
interface for retrieving the data necessary for pickling and copying
objects. [#]_

Although powerful, implementing :meth:`__reduce__` directly in your classes is
error prone. For this reason, class designers should use the high-level
interface (i.e., :meth:`__getnewargs_ex__`, :meth:`__getstate__` and
:meth:`__setstate__`) whenever possible. We will show, however, cases where
using :meth:`__reduce__` is the only option or leads to more efficient pickling
or both.

.. method:: object.__reduce__()

   The interface is currently defined as follows. The :meth:`__reduce__` method
   takes no argument and shall return either a string or preferably a tuple (the
   returned object is often referred to as the "reduce value").

   If a string is returned, the string should be interpreted as the name of a
   global variable. It should be the object's local name relative to its
   module; the pickle module searches the module namespace to determine the
   object's module. This behaviour is typically useful for singletons.

   When a tuple is returned, it must be between two and five items long.
   Optional items can either be omitted, or ``None`` can be provided as their
   value. The semantics of each item are in order:

   .. XXX Mention __newobj__ special-case?

   * A callable object that will be called to create the initial version of the
     object.

   * A tuple of arguments for the callable object. An empty tuple must be given
     if the callable does not accept any argument.

   * Optionally, the object's state, which will be passed to the object's
     :meth:`__setstate__` method as previously described. If the object has no
     such method, then the value must be a dictionary and it will be added to
     the object's :attr:`~object.__dict__` attribute.

   * Optionally, an iterator (and not a sequence) yielding successive items.
     These items will be appended to the object either using
     ``obj.append(item)`` or, in batch, using ``obj.extend(list_of_items)``.
     This is primarily used for list subclasses, but may be used by other
     classes as long as they have :meth:`append` and :meth:`extend` methods with
     the appropriate signature. (Whether :meth:`append` or :meth:`extend` is
     used depends on which pickle protocol version is used as well as the number
     of items to append, so both must be supported.)

   * Optionally, an iterator (not a sequence) yielding successive key-value
     pairs. These items will be stored to the object using ``obj[key] =
     value``. This is primarily used for dictionary subclasses, but may be used
     by other classes as long as they implement :meth:`__setitem__`.


.. method:: object.__reduce_ex__(protocol)

   Alternatively, a :meth:`__reduce_ex__` method may be defined. The only
   difference is this method should take a single integer argument, the protocol
   version. When defined, pickle will prefer it over the :meth:`__reduce__`
   method. In addition, :meth:`__reduce__` automatically becomes a synonym for
   the extended version. The main use for this method is to provide
   backwards-compatible reduce values for older Python releases.
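
As an illustrative sketch of a three-item reduce value, the hypothetical
``Interval`` class below returns a callable, its arguments, and a state
dictionary from :meth:`__reduce__`. The class and attribute names are
assumptions chosen for this example only.

```python
import pickle


class Interval:
    # Hypothetical class used only to illustrate the reduce value.
    def __init__(self, low, high):
        if low > high:
            raise ValueError("low must not exceed high")
        self.low = low
        self.high = high
        self.note = None

    def __reduce__(self):
        # (callable, args, state): Interval(low, high) is called when
        # unpickling, then the state dictionary is added to __dict__
        # because the class defines no __setstate__.
        return (Interval, (self.low, self.high), {"note": self.note})


span = Interval(1, 5)
span.note = "inclusive"
restored = pickle.loads(pickle.dumps(span))
```

Because the callable here is the class itself, :meth:`__init__` runs again on
unpickling, so its validation is re-applied to the restored values.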

.. currentmodule:: pickle

.. _pickle-persistent:

Persistence of External Objects
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. index::
   single: persistent_id (pickle protocol)
   single: persistent_load (pickle protocol)

For the benefit of object persistence, the :mod:`pickle` module supports the
notion of a reference to an object outside the pickled data stream. Such
objects are referenced by a persistent ID, which should be either a string of
alphanumeric characters (for protocol 0) [#]_ or just an arbitrary object (for
any newer protocol).

The resolution of such persistent IDs is not defined by the :mod:`pickle`
module; it will delegate this resolution to the user-defined methods on the
pickler and unpickler, :meth:`~Pickler.persistent_id` and
:meth:`~Unpickler.persistent_load` respectively.

To pickle objects that have an external persistent ID, the pickler must have a
custom :meth:`~Pickler.persistent_id` method that takes an object as an
argument and returns either ``None`` or the persistent ID for that object.
When ``None`` is returned, the pickler simply pickles the object as normal.
When a persistent ID is returned, the pickler pickles that ID instead of the
object, along with a marker so that the unpickler will recognize it as a
persistent ID.

To unpickle external objects, the unpickler must have a custom
:meth:`~Unpickler.persistent_load` method that takes a persistent ID object and
returns the referenced object.

Here is a comprehensive example presenting how persistent IDs can be used to
pickle external objects by reference.

.. literalinclude:: ../includes/dbpickle.py
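
A smaller, self-contained sketch of the same mechanism follows. It assumes a
hypothetical in-memory ``REGISTRY`` that plays the role of the external store,
and a hypothetical ``Handle`` class whose instances are pickled by reference.
None of these names come from :mod:`pickle` itself.

```python
import io
import pickle

# Hypothetical external store: these objects live outside the pickle stream.
REGISTRY = {"cfg-1": {"debug": True}}


class Handle:
    """Hypothetical reference to an object kept in REGISTRY."""

    def __init__(self, key):
        self.key = key


class RegistryPickler(pickle.Pickler):
    def persistent_id(self, obj):
        if isinstance(obj, Handle):
            # With a binary protocol the persistent ID may be any
            # picklable object; a tagged tuple is used here.
            return ("Handle", obj.key)
        # None tells the pickler to pickle the object as usual.
        return None


class RegistryUnpickler(pickle.Unpickler):
    def persistent_load(self, pid):
        tag, key = pid
        if tag == "Handle":
            return REGISTRY[key]
        raise pickle.UnpicklingError("unsupported persistent object")


buffer = io.BytesIO()
RegistryPickler(buffer).dump(["payload", Handle("cfg-1")])
buffer.seek(0)
result = RegistryUnpickler(buffer).load()
```

After loading, the second list element is the object held in ``REGISTRY``
rather than a copy, since only its key travelled through the pickle stream.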

.. _pickle-dispatch:

Dispatch Tables
^^^^^^^^^^^^^^^

If one wants to customize pickling of some classes without disturbing
any other code which depends on pickling, then one can create a
pickler with a private dispatch table.

The global dispatch table managed by the :mod:`copyreg` module is
available as :data:`copyreg.dispatch_table`. Therefore, one may
choose to use a modified copy of :data:`copyreg.dispatch_table` as a
private dispatch table.

For example ::

   f = io.BytesIO()
   p = pickle.Pickler(f)
   p.dispatch_table = copyreg.dispatch_table.copy()
   p.dispatch_table[SomeClass] = reduce_SomeClass

creates an instance of :class:`pickle.Pickler` with a private dispatch
table which handles the ``SomeClass`` class specially. Alternatively,
the code ::

   class MyPickler(pickle.Pickler):
       dispatch_table = copyreg.dispatch_table.copy()
       dispatch_table[SomeClass] = reduce_SomeClass
   f = io.BytesIO()
   p = MyPickler(f)

does the same, but all instances of ``MyPickler`` will by default
share the same dispatch table. The equivalent code using the
:mod:`copyreg` module is ::

   copyreg.pickle(SomeClass, reduce_SomeClass)
   f = io.BytesIO()
   p = pickle.Pickler(f)
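
The snippets above leave ``SomeClass`` and ``reduce_SomeClass`` undefined. As a
runnable sketch of the first variant, the example below substitutes a
hypothetical ``Point`` class and a hypothetical reduction function
``reduce_point``; both names are assumptions made for illustration.

```python
import copyreg
import io
import pickle


class Point:
    # Hypothetical class standing in for SomeClass.
    def __init__(self, x, y):
        self.x = x
        self.y = y


def reduce_point(point):
    # Reduction function: returns the same kind of value as __reduce__().
    return (Point, (point.x, point.y))


buffer = io.BytesIO()
pickler = pickle.Pickler(buffer)
# Private table: a copy of the global one plus the entry for Point.
pickler.dispatch_table = copyreg.dispatch_table.copy()
pickler.dispatch_table[Point] = reduce_point
pickler.dump(Point(3, 4))

restored = pickle.loads(buffer.getvalue())
```

Only this pickler instance uses ``reduce_point``; the global
:data:`copyreg.dispatch_table` is left unchanged.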

.. _pickle-state:

Handling Stateful Objects
^^^^^^^^^^^^^^^^^^^^^^^^^

.. index::
   single: __getstate__() (copy protocol)
   single: __setstate__() (copy protocol)

Here's an example that shows how to modify pickling behavior for a class.
The :class:`TextReader` class opens a text file, and returns the line number and
line contents each time its :meth:`!readline` method is called. If a
:class:`TextReader` instance is pickled, all attributes *except* the file object
member are saved. When the instance is unpickled, the file is reopened, and
reading resumes from the last location. The :meth:`__setstate__` and
:meth:`__getstate__` methods are used to implement this behavior. ::

   class TextReader:
       """Print and number lines in a text file."""

       def __init__(self, filename):
           self.filename = filename
           self.file = open(filename)
           self.lineno = 0

       def readline(self):
           self.lineno += 1
           line = self.file.readline()
           if not line:
               return None
           if line.endswith('\n'):
               line = line[:-1]
           return "%i: %s" % (self.lineno, line)

       def __getstate__(self):
           # Copy the object's state from self.__dict__ which contains
           # all our instance attributes. Always use the dict.copy()
           # method to avoid modifying the original state.
           state = self.__dict__.copy()
           # Remove the unpicklable entries.
           del state['file']
           return state

       def __setstate__(self, state):
           # Restore instance attributes (i.e., filename and lineno).
           self.__dict__.update(state)
           # Restore the previously opened file's state. To do so, we need to
           # reopen it and read from it until the line count is restored.
           file = open(self.filename)
           for _ in range(self.lineno):
               file.readline()
           # Finally, save the file.
           self.file = file


A sample usage might be something like this::

   >>> reader = TextReader("hello.txt")
   >>> reader.readline()
   '1: Hello world!'
   >>> reader.readline()
   '2: I am line number two.'
   >>> new_reader = pickle.loads(pickle.dumps(reader))
   >>> new_reader.readline()
   '3: Goodbye!'


.. _pickle-restrict:

Restricting Globals
-------------------

.. index::
   single: find_class() (pickle protocol)

By default, unpickling will import any class or function that it finds in the
pickle data. For many applications, this behaviour is unacceptable as it
permits the unpickler to import and invoke arbitrary code. Just consider what
this hand-crafted pickle data stream does when loaded::

   >>> import pickle
   >>> pickle.loads(b"cos\nsystem\n(S'echo hello world'\ntR.")
   hello world
   0

In this example, the unpickler imports the :func:`os.system` function and then
applies it to the string argument "echo hello world". Although this example is
inoffensive, it is not difficult to imagine one that could damage your system.

For this reason, you may want to control what gets unpickled by customizing
:meth:`Unpickler.find_class`. Contrary to what its name suggests,
:meth:`Unpickler.find_class` is called whenever a global (i.e., a class or
a function) is requested. Thus it is possible to either completely forbid
globals or restrict them to a safe subset.

Here is an example of an unpickler allowing only a few safe classes from the
:mod:`builtins` module to be loaded::

   import builtins
   import io
   import pickle

   safe_builtins = {
       'range',
       'complex',
       'set',
       'frozenset',
       'slice',
   }

   class RestrictedUnpickler(pickle.Unpickler):

       def find_class(self, module, name):
           # Only allow safe classes from builtins.
           if module == "builtins" and name in safe_builtins:
               return getattr(builtins, name)
           # Forbid everything else.
           raise pickle.UnpicklingError("global '%s.%s' is forbidden" %
                                        (module, name))

   def restricted_loads(s):
       """Helper function analogous to pickle.loads()."""
       return RestrictedUnpickler(io.BytesIO(s)).load()

A sample usage of our unpickler working as intended::

   >>> restricted_loads(pickle.dumps([1, 2, range(15)]))
   [1, 2, range(0, 15)]
   >>> restricted_loads(b"cos\nsystem\n(S'echo hello world'\ntR.")
   Traceback (most recent call last):
     ...
   pickle.UnpicklingError: global 'os.system' is forbidden
   >>> restricted_loads(b'cbuiltins\neval\n'
   ...                  b'(S\'getattr(__import__("os"), "system")'
   ...                  b'("echo hello world")\'\ntR.')
   Traceback (most recent call last):
     ...
   pickle.UnpicklingError: global 'builtins.eval' is forbidden


.. XXX Add note about how extension codes could evade our protection
   mechanism (e.g. cached classes do not invoke find_class()).

As our examples show, you have to be careful with what you allow to be
unpickled. Therefore, if security is a concern, you may want to consider
alternatives such as the marshalling API in :mod:`xmlrpc.client` or
third-party solutions.


Performance
-----------

Recent versions of the pickle protocol (from protocol 2 and upwards) feature
efficient binary encodings for several common features and built-in types.
Also, the :mod:`pickle` module has a transparent optimizer written in C.
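
One rough way to observe the effect of the binary encodings is to compare the
size of the same object pickled with each protocol. The sketch below uses an
arbitrary list of integers as sample data; actual sizes depend on the data and
the Python version, so no specific figures are implied.

```python
import pickle

# Arbitrary sample data chosen for illustration.
data = list(range(1000))

# Map each available protocol number to the size of the resulting pickle.
sizes = {
    protocol: len(pickle.dumps(data, protocol=protocol))
    for protocol in range(pickle.HIGHEST_PROTOCOL + 1)
}

for protocol, size in sizes.items():
    print(f"protocol {protocol}: {size} bytes")
```

For this kind of data, the text-based protocol 0 typically produces a
noticeably larger pickle than protocol 2 or newer.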


.. _pickle-example:

Examples
--------

For the simplest code, use the :func:`dump` and :func:`load` functions. ::

   import pickle

   # An arbitrary collection of objects supported by pickle.
   data = {
       'a': [1, 2.0, 3, 4+6j],
       'b': ("character string", b"byte string"),
       'c': {None, True, False}
   }

   with open('data.pickle', 'wb') as f:
       # Pickle the 'data' dictionary using the highest protocol available.
       pickle.dump(data, f, pickle.HIGHEST_PROTOCOL)


The following example reads the resulting pickled data. ::

   import pickle

   with open('data.pickle', 'rb') as f:
       # The protocol version used is detected automatically, so we do not
       # have to specify it.
       data = pickle.load(f)


.. XXX: Add examples showing how to optimize pickles for size (like using
.. pickletools.optimize() or the gzip module).


.. seealso::

   Module :mod:`copyreg`
      Pickle interface constructor registration for extension types.

   Module :mod:`pickletools`
      Tools for working with and analyzing pickled data.

   Module :mod:`shelve`
      Indexed databases of objects; uses :mod:`pickle`.

   Module :mod:`copy`
      Shallow and deep object copying.

   Module :mod:`marshal`
      High-performance serialization of built-in types.


.. rubric:: Footnotes

.. [#] Don't confuse this with the :mod:`marshal` module.

.. [#] This is why :keyword:`lambda` functions cannot be pickled: all
    :keyword:`lambda` functions share the same name: ``<lambda>``.

.. [#] The exception raised will likely be an :exc:`ImportError` or an
    :exc:`AttributeError` but it could be something else.

.. [#] The :mod:`copy` module uses this protocol for shallow and deep copying
    operations.

.. [#] The limitation on alphanumeric characters is due to the fact
    that persistent IDs, in protocol 0, are delimited by the newline
    character. Therefore, if any kind of newline character occurs in
    a persistent ID, the resulting pickle will become unreadable.
Alexandre Vassalotti5f3b63a2008-10-18 20:47:58 +0000943 persistent IDs, the resulting pickle will become unreadable.