:mod:`pickle` --- Python object serialization
=============================================

.. index::
   single: persistence
   pair: persistent; objects
   pair: serializing; objects
   pair: marshalling; objects
   pair: flattening; objects
   pair: pickling; objects

.. module:: pickle
   :synopsis: Convert Python objects to streams of bytes and back.
.. sectionauthor:: Jim Kerr <jbkerr@sr.hp.com>.
.. sectionauthor:: Barry Warsaw <barry@zope.com>

The :mod:`pickle` module implements a fundamental but powerful algorithm for
serializing and de-serializing a Python object structure.  "Pickling" is the
process whereby a Python object hierarchy is converted into a byte stream, and
"unpickling" is the inverse operation, whereby a byte stream is converted back
into an object hierarchy.  Pickling (and unpickling) is alternatively known as
"serialization", "marshalling," [#]_ or "flattening"; however, to avoid
confusion, the terms used here are "pickling" and "unpickling".

This documentation describes both the :mod:`pickle` module and the
:mod:`cPickle` module.

.. warning::

   The :mod:`pickle` module is not intended to be secure against erroneous or
   maliciously constructed data.  Never unpickle data received from an untrusted
   or unauthenticated source.


Relationship to other Python modules
------------------------------------

The :mod:`pickle` module has an optimized cousin called the :mod:`cPickle`
module.  As its name implies, :mod:`cPickle` is written in C, so it can be up to
1000 times faster than :mod:`pickle`.  However, it does not support subclassing
of the :func:`Pickler` and :func:`Unpickler` classes, because in :mod:`cPickle`
these are functions, not classes.  Most applications have no need for this
functionality, and can benefit from the improved performance of :mod:`cPickle`.
Other than that, the interfaces of the two modules are nearly identical; the
common interface is described in this manual and differences are pointed out
where necessary.  In the following discussions, we use the term "pickle" to
collectively describe the :mod:`pickle` and :mod:`cPickle` modules.

The data streams the two modules produce are guaranteed to be interchangeable.

Python has a more primitive serialization module called :mod:`marshal`, but
:mod:`pickle` should be the preferred way to serialize Python objects.
:mod:`marshal` exists primarily to support Python's :file:`.pyc` files.

The :mod:`pickle` module differs from :mod:`marshal` in several significant ways:

* The :mod:`pickle` module keeps track of the objects it has already serialized,
  so that later references to the same object won't be serialized again.
  :mod:`marshal` doesn't do this.

  This has implications both for recursive objects and object sharing.  Recursive
  objects are objects that contain references to themselves.  These are not
  handled by marshal, and in fact, attempting to marshal recursive objects will
  crash your Python interpreter.  Object sharing happens when there are multiple
  references to the same object in different places in the object hierarchy being
  serialized.  :mod:`pickle` stores such objects only once, and ensures that all
  other references point to the master copy.  Shared objects remain shared, which
  can be very important for mutable objects.

* :mod:`marshal` cannot be used to serialize user-defined classes and their
  instances.  :mod:`pickle` can save and restore class instances transparently;
  however, the class definition must be importable and live in the same module as
  when the object was stored.

* The :mod:`marshal` serialization format is not guaranteed to be portable
  across Python versions.  Because its primary job in life is to support
  :file:`.pyc` files, the Python implementers reserve the right to change the
  serialization format in non-backwards compatible ways should the need arise.
  The :mod:`pickle` serialization format is guaranteed to be backwards compatible
  across Python releases.

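The object-tracking behaviour described in the first bullet above can be
checked with a short sketch.  The variable names are illustrative only, and the
code is written on the assumption that it should run unchanged under both
:mod:`pickle` and :mod:`cPickle`:

```python
import pickle

# One list referenced from two places in the same hierarchy.
shared = ['shared', 'payload']
container = {'first': shared, 'second': shared}

# A list that contains a reference to itself.
recursive = [1, 2]
recursive.append(recursive)

restored = pickle.loads(pickle.dumps(container))
loop = pickle.loads(pickle.dumps(recursive))

# Sharing survives the round trip: both keys name one list object,
# so a change made through one key is visible through the other.
restored['first'].append('changed')
assert restored['first'] is restored['second']
assert restored['second'][-1] == 'changed'

# The self-reference is rebuilt rather than expanded forever.
assert loop[2] is loop
```
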
Note that serialization is a more primitive notion than persistence; although
:mod:`pickle` reads and writes file objects, it does not handle the issue of
naming persistent objects, nor the (even more complicated) issue of concurrent
access to persistent objects.  The :mod:`pickle` module can transform a complex
object into a byte stream and it can transform the byte stream into an object
with the same internal structure.  Perhaps the most obvious thing to do with
these byte streams is to write them onto a file, but it is also conceivable to
send them across a network or store them in a database.  The module
:mod:`shelve` provides a simple interface to pickle and unpickle objects on
DBM-style database files.


Data stream format
------------------

.. index::
   single: XDR
   single: External Data Representation

The data format used by :mod:`pickle` is Python-specific.  This has the
advantage that there are no restrictions imposed by external standards such as
XDR (which can't represent pointer sharing); however, it means that non-Python
programs may not be able to reconstruct pickled Python objects.

By default, the :mod:`pickle` data format uses a printable ASCII representation.
This is slightly more voluminous than a binary representation.  The big
advantage of using printable ASCII (and of some other characteristics of
:mod:`pickle`'s representation) is that for debugging or recovery purposes it is
possible for a human to read the pickled file with a standard text editor.

There are currently 3 different protocols which can be used for pickling.

* Protocol version 0 is the original ASCII protocol and is backwards compatible
  with earlier versions of Python.

* Protocol version 1 is the old binary format which is also compatible with
  earlier versions of Python.

* Protocol version 2 was introduced in Python 2.3.  It provides much more
  efficient pickling of :term:`new-style class`\es.

Refer to :pep:`307` for more information.

If a *protocol* is not specified, protocol 0 is used. If *protocol* is specified
as a negative value or :const:`HIGHEST_PROTOCOL`, the highest protocol version
available will be used.

.. versionchanged:: 2.3
   Introduced the *protocol* parameter.

A binary format, which is slightly more efficient, can be chosen by specifying a
*protocol* version >= 1.


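As a rough illustration of protocol selection, the sketch below pickles the same
object with several *protocol* values through the :func:`dumps` function
documented later in this section.  The sample data is made up, and the only
properties checked are ones the text above states or that follow directly from
the protocol 2 stream header:

```python
import pickle

data = {'name': 'spam', 'values': [1, 2, 3]}

ascii_pickle = pickle.dumps(data, 0)    # original ASCII protocol
binary_pickle = pickle.dumps(data, 2)   # binary protocol from Python 2.3
newest_pickle = pickle.dumps(data, -1)  # negative value: highest available
highest_pickle = pickle.dumps(data, pickle.HIGHEST_PROTOCOL)

# Protocol 0 output is plain ASCII, so decoding it must not fail.
ascii_pickle.decode('ascii')

# A protocol 2 stream starts with the PROTO opcode followed by the version.
assert binary_pickle[:2] == b'\x80\x02'

# Whatever protocol was used to write, loading needs no protocol argument.
for stream in (ascii_pickle, binary_pickle, newest_pickle, highest_pickle):
    assert pickle.loads(stream) == data
```
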
Usage
-----

To serialize an object hierarchy, you first create a pickler, then you call the
pickler's :meth:`dump` method.  To de-serialize a data stream, you first create
an unpickler, then you call the unpickler's :meth:`load` method.  The
:mod:`pickle` module provides the following constant:


.. data:: HIGHEST_PROTOCOL

   The highest protocol version available.  This value can be passed as a
   *protocol* value.

   .. versionadded:: 2.3

.. note::

   Be sure to always open pickle files created with protocols >= 1 in binary mode.
   For the old ASCII-based pickle protocol 0 you can use either text mode or binary
   mode as long as you stay consistent.

   A pickle file written with protocol 0 in binary mode will contain lone linefeeds
   as line terminators and therefore will look "funny" when viewed in Notepad or
   other editors which do not support this format.

The :mod:`pickle` module provides the following functions to make the pickling
process more convenient:


.. function:: dump(obj, file[, protocol])

   Write a pickled representation of *obj* to the open file object *file*.  This is
   equivalent to ``Pickler(file, protocol).dump(obj)``.

   If the *protocol* parameter is omitted, protocol 0 is used. If *protocol* is
   specified as a negative value or :const:`HIGHEST_PROTOCOL`, the highest protocol
   version will be used.

   .. versionchanged:: 2.3
      Introduced the *protocol* parameter.

   *file* must have a :meth:`write` method that accepts a single string argument.
   It can thus be a file object opened for writing, a :mod:`StringIO` object, or
   any other custom object that meets this interface.


.. function:: load(file)

   Read a string from the open file object *file* and interpret it as a pickle data
   stream, reconstructing and returning the original object hierarchy.  This is
   equivalent to ``Unpickler(file).load()``.

   *file* must have two methods, a :meth:`read` method that takes an integer
   argument, and a :meth:`readline` method that requires no arguments.  Both
   methods should return a string.  Thus *file* can be a file object opened for
   reading, a :mod:`StringIO` object, or any other custom object that meets this
   interface.

   This function automatically determines whether the data stream was written in
   binary mode or not.


.. function:: dumps(obj[, protocol])

   Return the pickled representation of the object as a string, instead of writing
   it to a file.

   If the *protocol* parameter is omitted, protocol 0 is used. If *protocol* is
   specified as a negative value or :const:`HIGHEST_PROTOCOL`, the highest protocol
   version will be used.

   .. versionchanged:: 2.3
      The *protocol* parameter was added.


.. function:: loads(string)

   Read a pickled object hierarchy from a string.  Characters in the string past
   the pickled object's representation are ignored.

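The four convenience functions can be exercised together as in the following
sketch.  The temporary file, the ``record`` value and the variable names are
assumptions made for the illustration; the file is opened in binary mode, as
the note above recommends for protocols >= 1:

```python
import os
import pickle
import tempfile

record = {'id': 7, 'tags': ['a', 'b']}

# dump() and load() work on open file objects.
fd, path = tempfile.mkstemp(suffix='.pkl')
os.close(fd)
try:
    with open(path, 'wb') as f:
        pickle.dump(record, f, pickle.HIGHEST_PROTOCOL)
    with open(path, 'rb') as f:
        from_file = pickle.load(f)
finally:
    os.remove(path)

# dumps() and loads() work on in-memory strings instead of files.
from_string = pickle.loads(pickle.dumps(record))

# Data after the end of the pickled representation is ignored by loads().
with_trailing = pickle.loads(pickle.dumps(record) + b'trailing data')
```

All three results compare equal to ``record`` but are new objects, not
references to the original.
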
The :mod:`pickle` module also defines three exceptions:


.. exception:: PickleError

   A common base class for the other exceptions defined below.  This inherits from
   :exc:`Exception`.


.. exception:: PicklingError

   This exception is raised when an unpicklable object is passed to the
   :meth:`dump` method.


.. exception:: UnpicklingError

   This exception is raised when there is a problem unpickling an object. Note that
   other exceptions may also be raised during unpickling, including (but not
   necessarily limited to) :exc:`AttributeError`, :exc:`EOFError`,
   :exc:`ImportError`, and :exc:`IndexError`.

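A defensive caller might wrap pickling and unpickling as sketched below.  The
helper names ``try_dumps`` and ``try_loads`` are hypothetical, and the exact
exception class raised for a given failure can differ between objects,
implementations and Python versions, so the tuples caught here are an
assumption based on the list above rather than an exhaustive set:

```python
import pickle


def try_dumps(obj):
    """Return ('ok', pickled_string) or ('error', exception_name)."""
    try:
        return ('ok', pickle.dumps(obj, 2))
    except (pickle.PicklingError, AttributeError, TypeError) as exc:
        return ('error', type(exc).__name__)


def try_loads(data):
    """Return ('ok', object) or ('error', exception_name)."""
    try:
        return ('ok', pickle.loads(data))
    except (pickle.UnpicklingError, AttributeError, EOFError,
            ImportError, IndexError) as exc:
        return ('error', type(exc).__name__)


good = pickle.dumps([1, 2, 3], 2)
truncated = good[:-3]  # cut the stream off in the middle of an opcode

dump_failure = try_dumps(lambda: 0)  # a lambda has no importable name
load_success = try_loads(good)
load_failure = try_loads(truncated)
```
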
The :mod:`pickle` module also exports two callables [#]_, :class:`Pickler` and
:class:`Unpickler`:


.. class:: Pickler(file[, protocol])

   This takes a file-like object to which it will write a pickle data stream.

   If the *protocol* parameter is omitted, protocol 0 is used. If *protocol* is
   specified as a negative value or :const:`HIGHEST_PROTOCOL`, the highest
   protocol version will be used.

   .. versionchanged:: 2.3
      Introduced the *protocol* parameter.

   *file* must have a :meth:`write` method that accepts a single string argument.
   It can thus be an open file object, a :mod:`StringIO` object, or any other
   custom object that meets this interface.

   :class:`Pickler` objects define one (or two) public methods:


   .. method:: dump(obj)

      Write a pickled representation of *obj* to the open file object given in the
      constructor.  Either the binary or ASCII format will be used, depending on the
      value of the *protocol* argument passed to the constructor.


   .. method:: clear_memo()

      Clears the pickler's "memo".  The memo is the data structure that remembers
      which objects the pickler has already seen, so that shared or recursive objects
      are pickled by reference and not by value.  This method is useful when re-using
      picklers.

      .. note::

         Prior to Python 2.3, :meth:`clear_memo` was only available on the picklers
         created by :mod:`cPickle`.  In the :mod:`pickle` module, picklers have an
         instance variable called :attr:`memo` which is a Python dictionary.  So to clear
         the memo for a :mod:`pickle` module pickler, you could do the following::

            mypickler.memo.clear()

         Code that does not need to support older versions of Python should simply use
         :meth:`clear_memo`.

It is possible to make multiple calls to the :meth:`dump` method of the same
:class:`Pickler` instance.  These must then be matched to the same number of
calls to the :meth:`load` method of the corresponding :class:`Unpickler`
instance.  If the same object is pickled by multiple :meth:`dump` calls, the
:meth:`load` calls will all yield references to the same object. [#]_

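The pairing of :meth:`dump` and :meth:`load` calls, and the effect of
:meth:`clear_memo`, can be seen in a minimal sketch.  It assumes an in-memory
``io.BytesIO`` buffer in place of a real file, and the variable names are
illustrative:

```python
import io
import pickle

shared = ['spam', 'eggs']

buf = io.BytesIO()
pickler = pickle.Pickler(buf, 2)
pickler.dump(shared)   # written in full
pickler.dump(shared)   # already in the memo: written as a reference
pickler.clear_memo()
pickler.dump(shared)   # memo is empty again: written in full once more

# Three dump() calls must be matched by three load() calls
# on a single Unpickler reading the same stream.
buf.seek(0)
unpickler = pickle.Unpickler(buf)
first = unpickler.load()
second = unpickler.load()
third = unpickler.load()
```

Here ``first`` and ``second`` are the same object, while ``third`` is an equal
but separate list because the memo was cleared before it was written.
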
:class:`Unpickler` objects are defined as:


.. class:: Unpickler(file)

   This takes a file-like object from which it will read a pickle data stream.
   This class automatically determines whether the data stream was written in
   binary mode or not, so it does not need a flag as in the :class:`Pickler`
   factory.

   *file* must have two methods, a :meth:`read` method that takes an integer
   argument, and a :meth:`readline` method that requires no arguments.  Both
   methods should return a string.  Thus *file* can be a file object opened for
   reading, a :mod:`StringIO` object, or any other custom object that meets this
   interface.

   :class:`Unpickler` objects have one (or two) public methods:


   .. method:: load()

      Read a pickled object representation from the open file object given in
      the constructor, and return the reconstituted object hierarchy specified
      therein.

      This method automatically determines whether the data stream was written
      in binary mode or not.


   .. method:: noload()

      This is just like :meth:`load` except that it doesn't actually create any
      objects.  This is useful primarily for finding what's called "persistent
      ids" that may be referenced in a pickle data stream.  See section
      :ref:`pickle-protocol` below for more details.

      **Note:** the :meth:`noload` method is currently only available on
      :class:`Unpickler` objects created with the :mod:`cPickle` module.
      :mod:`pickle` module :class:`Unpickler`\ s do not have the :meth:`noload`
      method.


What can be pickled and unpickled?
----------------------------------

The following types can be pickled:

* ``None``, ``True``, and ``False``

* integers, long integers, floating point numbers, complex numbers

* normal and Unicode strings

* tuples, lists, sets, and dictionaries containing only picklable objects

* functions defined at the top level of a module

* built-in functions defined at the top level of a module

* classes that are defined at the top level of a module

* instances of such classes whose :attr:`__dict__` or the result of calling
  :meth:`__getstate__` is picklable  (see section :ref:`pickle-protocol` for
  details).

Attempts to pickle unpicklable objects will raise the :exc:`PicklingError`
exception; when this happens, an unspecified number of bytes may have already
been written to the underlying file. Trying to pickle a highly recursive data
structure may exceed the maximum recursion depth; a :exc:`RuntimeError` will be
raised in this case. You can carefully raise this limit with
:func:`sys.setrecursionlimit`.

Note that functions (built-in and user-defined) are pickled by "fully qualified"
name reference, not by value.  This means that only the function name is
pickled, along with the name of the module the function is defined in.  Neither
the function's code, nor any of its function attributes are pickled.  Thus the
defining module must be importable in the unpickling environment, and the module
must contain the named object, otherwise an exception will be raised. [#]_

Similarly, classes are pickled by named reference, so the same restrictions in
the unpickling environment apply.  Note that none of the class's code or data is
pickled, so in the following example the class attribute ``attr`` is not
restored in the unpickling environment::

   import pickle

   class Foo:
       attr = 'a class attr'

   # Only the names of the module and of the class are stored here,
   # not the class body or the value of ``attr``.
   picklestring = pickle.dumps(Foo)

These restrictions are why picklable functions and classes must be defined in
the top level of a module.

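Pickling by name reference can be observed directly.  The sketch below uses
``json.dumps`` purely as a convenient example of a function defined at the top
level of an importable module; any such function would do:

```python
import json
import pickle

# Protocol 0 is used so that the stored names are easy to spot.
pickled = pickle.dumps(json.dumps, 0)

# The stream holds the module name and the function name ...
assert b'json' in pickled
assert b'dumps' in pickled

# ... and unpickling imports the module and looks the name up again,
# so the result is the very same function object, not a copy of its code.
restored = pickle.loads(pickled)
assert restored is json.dumps
```
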
Similarly, when class instances are pickled, their class's code and data are not
pickled along with them.  Only the instance data are pickled.  This is done on
purpose, so you can fix bugs in a class or add methods to the class and still
load objects that were created with an earlier version of the class.  If you
plan to have long-lived objects that will see many versions of a class, it may
be worthwhile to put a version number in the objects so that suitable
conversions can be made by the class's :meth:`__setstate__` method.


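One possible shape for such a versioned class is sketched below.  The
``Account`` class, its ``VERSION`` attribute, the ``'version'`` state key and
the older ``'balance'`` field are all hypothetical; they only illustrate how a
:meth:`__setstate__` method might convert state written by an earlier layout:

```python
import pickle


class Account(object):
    VERSION = 2  # hypothetical layout number stored with every pickle

    def __init__(self, owner, balance_cents):
        self.owner = owner
        self.balance_cents = balance_cents

    def __getstate__(self):
        # Record which layout produced this state.
        state = self.__dict__.copy()
        state['version'] = self.VERSION
        return state

    def __setstate__(self, state):
        state = dict(state)
        version = state.pop('version', 1)
        if version < 2:
            # Assumed version 1 layout: a float 'balance' in whole units.
            state['balance_cents'] = int(round(state.pop('balance') * 100))
        self.__dict__.update(state)


# State written by the current layout round-trips unchanged.
current = pickle.loads(pickle.dumps(Account('ann', 1250)))

# State in the assumed old layout is converted while being restored.
legacy = Account.__new__(Account)
legacy.__setstate__({'owner': 'bob', 'balance': 12.5})
```
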
.. _pickle-protocol:

The pickle protocol
-------------------

.. currentmodule:: None

This section describes the "pickling protocol" that defines the interface
between the pickler/unpickler and the objects that are being serialized.  This
protocol provides a standard way for you to define, customize, and control how
your objects are serialized and de-serialized.  The description in this section
doesn't cover specific customizations that you can employ to make the unpickling
environment slightly safer from untrusted pickle data streams; see section
:ref:`pickle-sub` for more details.


.. _pickle-inst:

Pickling and unpickling normal class instances
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. method:: object.__getinitargs__()

   When a pickled class instance is unpickled, its :meth:`__init__` method is
   normally *not* invoked.  If it is desirable that the :meth:`__init__` method
   be called on unpickling, an old-style class can define a method
   :meth:`__getinitargs__`, which should return a *tuple* containing the
   arguments to be passed to the class constructor (:meth:`__init__` for
   example).  The :meth:`__getinitargs__` method is called at pickle time; the
   tuple it returns is incorporated in the pickle for the instance.

.. method:: object.__getnewargs__()

   New-style types can provide a :meth:`__getnewargs__` method that is used for
   protocol 2.  Implementing this method is needed if the type establishes some
   internal invariants when the instance is created, or if the memory allocation
   is affected by the values passed to the :meth:`__new__` method for the type
   (as it is for tuples and strings).  Instances of a :term:`new-style class`
   ``C`` are created using ::

      obj = C.__new__(C, *args)

   where *args* is the result of calling :meth:`__getnewargs__` on the original
   object; if there is no :meth:`__getnewargs__`, an empty tuple is assumed.

.. method:: object.__getstate__()

   Classes can further influence how their instances are pickled; if the class
   defines the method :meth:`__getstate__`, it is called and the return state is
   pickled as the contents for the instance, instead of the contents of the
   instance's dictionary.  If there is no :meth:`__getstate__` method, the
   instance's :attr:`__dict__` is pickled.

.. method:: object.__setstate__(state)

   Upon unpickling, if the class also defines the method :meth:`__setstate__`,
   it is called with the unpickled state. [#]_ If there is no
   :meth:`__setstate__` method, the pickled state must be a dictionary and its
   items are assigned to the new instance's dictionary.  If a class defines both
   :meth:`__getstate__` and :meth:`__setstate__`, the state object needn't be a
   dictionary and these methods can do what they want. [#]_

   .. note::

      For :term:`new-style class`\es, if :meth:`__getstate__` returns a false
      value, the :meth:`__setstate__` method will not be called.

.. note::

   At unpickling time, some methods like :meth:`__getattr__`,
   :meth:`__getattribute__`, or :meth:`__setattr__` may be called upon the
   instance.  In case those methods rely on some internal invariant being
   true, the type should implement either :meth:`__getinitargs__` or
   :meth:`__getnewargs__` to establish such an invariant; otherwise, neither
   :meth:`__new__` nor :meth:`__init__` will be called.

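The interplay of :meth:`__getstate__` and :meth:`__setstate__` can be sketched
with a class that holds one attribute that cannot be pickled.  The ``Ticker``
class and its attributes are hypothetical, and the sketch assumes a new-style
class pickled with protocol 2:

```python
import pickle


class Ticker(object):
    init_calls = 0  # counts how often __init__ actually runs

    def __init__(self, start):
        Ticker.init_calls += 1
        self.count = start
        self.step = lambda n: n + 1  # a lambda cannot be pickled

    def __getstate__(self):
        # Pickle everything in the instance dictionary except the lambda.
        state = self.__dict__.copy()
        del state['step']
        return state

    def __setstate__(self, state):
        # Restore the pickled attributes, then rebuild the dropped one.
        self.__dict__.update(state)
        self.step = lambda n: n + 1


original = Ticker(10)
clone = pickle.loads(pickle.dumps(original, 2))

# __init__ ran once, for ``original``; unpickling did not invoke it again.
assert Ticker.init_calls == 1
assert clone.count == 10
assert clone.step(clone.count) == 11
```
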
|  | 472 | Pickling and unpickling extension types | 
|  | 473 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | 
|  | 474 |  | 
| Georg Brandl | 66ef83b | 2008-07-04 17:22:53 +0000 | [diff] [blame] | 475 | .. method:: object.__reduce__() | 
| Georg Brandl | c62ef8b | 2009-01-03 20:55:06 +0000 | [diff] [blame] | 476 |  | 
| Georg Brandl | 66ef83b | 2008-07-04 17:22:53 +0000 | [diff] [blame] | 477 | When the :class:`Pickler` encounters an object of a type it knows nothing | 
|  | 478 | about --- such as an extension type --- it looks in two places for a hint of | 
|  | 479 | how to pickle it.  One alternative is for the object to implement a | 
|  | 480 | :meth:`__reduce__` method.  If provided, at pickling time :meth:`__reduce__` | 
|  | 481 | will be called with no arguments, and it must return either a string or a | 
|  | 482 | tuple. | 
| Andrew M. Kuchling | 8887e54 | 2008-02-23 16:39:43 +0000 | [diff] [blame] | 483 |  | 
| Georg Brandl | 66ef83b | 2008-07-04 17:22:53 +0000 | [diff] [blame] | 484 | If a string is returned, it names a global variable whose contents are | 
|  | 485 | pickled as normal.  The string returned by :meth:`__reduce__` should be the | 
|  | 486 | object's local name relative to its module; the pickle module searches the | 
|  | 487 | module namespace to determine the object's module. | 
| Georg Brandl | 8ec7f65 | 2007-08-15 14:28:01 +0000 | [diff] [blame] | 488 |  | 
| Georg Brandl | 66ef83b | 2008-07-04 17:22:53 +0000 | [diff] [blame] | 489 | When a tuple is returned, it must be between two and five elements long. | 
|  | 490 | Optional elements can either be omitted, or ``None`` can be provided as their | 
|  | 491 | value.  The contents of this tuple are pickled as normal and used to | 
|  | 492 | reconstruct the object at unpickling time.  The semantics of each element | 
|  | 493 | are: | 

   * A callable object that will be called to create the initial version of the
     object.  The next element of the tuple will provide arguments for this
     callable, and later elements provide additional state information that will
     subsequently be used to fully reconstruct the pickled data.

     In the unpickling environment this object must be either a class, a
     callable registered as a "safe constructor" (see below), or it must have an
     attribute :attr:`__safe_for_unpickling__` with a true value.  Otherwise, an
     :exc:`UnpicklingError` will be raised in the unpickling environment.  Note
     that as usual, the callable itself is pickled by name.

   * A tuple of arguments for the callable object.

     .. versionchanged:: 2.5
        Formerly, this argument could also be ``None``.

   * Optionally, the object's state, which will be passed to the object's
     :meth:`__setstate__` method as described in section :ref:`pickle-inst`.  If
     the object has no :meth:`__setstate__` method, then, as above, the value
     must be a dictionary and it will be added to the object's :attr:`__dict__`.

   * Optionally, an iterator (and not a sequence) yielding successive list
     items.  These list items will be pickled, and appended to the object using
     either ``obj.append(item)`` or ``obj.extend(list_of_items)``.  This is
     primarily used for list subclasses, but may be used by other classes as
     long as they have :meth:`append` and :meth:`extend` methods with the
     appropriate signature.  (Whether :meth:`append` or :meth:`extend` is used
     depends on which pickle protocol version is used as well as the number of
     items to append, so both must be supported.)

   * Optionally, an iterator (not a sequence) yielding successive dictionary
     items, which should be tuples of the form ``(key, value)``.  These items
     will be pickled and stored to the object using ``obj[key] = value``.  This
     is primarily used for dictionary subclasses, but may be used by other
     classes as long as they implement :meth:`__setitem__`.
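
As a minimal sketch of the tuple form, the hypothetical ``Point`` class below
returns the first three elements described above: the callable, its argument
tuple, and a state dictionary.  The class and its attribute names are
illustrative assumptions, not part of the :mod:`pickle` API.

```python
import pickle


class Point(object):
    """Hypothetical class used only for illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y
        self.label = None  # extra state that is not a constructor argument

    def __reduce__(self):
        # Element 1: the callable that creates the initial object.
        # Element 2: the tuple of arguments for that callable.
        # Element 3: state; with no __setstate__ it is merged into __dict__.
        return (Point, (self.x, self.y), {'label': self.label})


p = Point(1, 2)
p.label = 'corner'
q = pickle.loads(pickle.dumps(p))
```

After the round trip, ``q`` is a new ``Point`` rebuilt by calling
``Point(1, 2)`` and then restoring ``label`` from the state dictionary.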

.. method:: object.__reduce_ex__(protocol)

   It is sometimes useful to know the protocol version when implementing
   :meth:`__reduce__`.  This can be done by implementing a method named
   :meth:`__reduce_ex__` instead of :meth:`__reduce__`.  :meth:`__reduce_ex__`,
   when it exists, is called in preference over :meth:`__reduce__` (you may
   still provide :meth:`__reduce__` for backwards compatibility).  The
   :meth:`__reduce_ex__` method will be called with a single integer argument,
   the protocol version.

   The :class:`object` class implements both :meth:`__reduce__` and
   :meth:`__reduce_ex__`; however, if a subclass overrides :meth:`__reduce__`
   but not :meth:`__reduce_ex__`, the :meth:`__reduce_ex__` implementation
   detects this and calls :meth:`__reduce__`.
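
The following sketch shows the protocol number arriving in
:meth:`__reduce_ex__`.  The ``Version`` class and its ``pickled_with``
attribute are hypothetical and exist only to make the argument visible after
unpickling.

```python
import pickle


class Version(object):
    """Hypothetical class that records which protocol pickled it."""

    def __init__(self, number, pickled_with=None):
        self.number = number
        self.pickled_with = pickled_with

    def __reduce_ex__(self, protocol):
        # Called in preference to __reduce__, with the protocol as an integer.
        return (Version, (self.number, protocol))


v = Version(3)
old = pickle.loads(pickle.dumps(v, 0))  # pickled with protocol 0
new = pickle.loads(pickle.dumps(v, 2))  # pickled with protocol 2
```

Each restored object carries the protocol number that was passed in when it
was pickled, so ``old.pickled_with`` and ``new.pickled_with`` differ.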

An alternative to implementing a :meth:`__reduce__` method on the object to be
pickled is to register the callable with the :mod:`copy_reg` module.  This
module provides a way for programs to register "reduction functions" and
constructors for user-defined types.  Reduction functions have the same
semantics and interface as the :meth:`__reduce__` method described above, except
that they are called with a single argument, the object to be pickled.

The registered constructor is deemed a "safe constructor" for purposes of
unpickling as described above.
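
A minimal sketch of registering a reduction function follows.  The
``Celsius`` class, the ``reduce_celsius`` function, and the ``calls`` list are
hypothetical names.  The fallback import is there only because the module is
named ``copyreg`` on Python 3.

```python
import pickle

try:
    import copy_reg                  # Python 2 name, as used in this document
except ImportError:
    import copyreg as copy_reg       # the same module under its Python 3 name


class Celsius(object):
    """Hypothetical type that defines no pickling methods of its own."""

    def __init__(self, degrees):
        self.degrees = degrees


calls = []  # records each object handed to the reduction function


def reduce_celsius(obj):
    # Same return value as __reduce__, but the object is passed as an argument.
    calls.append(obj.degrees)
    return (Celsius, (obj.degrees,))


copy_reg.pickle(Celsius, reduce_celsius)

restored = pickle.loads(pickle.dumps(Celsius(21.5)))
```

Registration keeps the pickling logic outside the class, which helps when the
type comes from an extension module that cannot be modified.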


Pickling and unpickling external objects
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. index::
   single: persistent_id (pickle protocol)
   single: persistent_load (pickle protocol)

For the benefit of object persistence, the :mod:`pickle` module supports the
notion of a reference to an object outside the pickled data stream.  Such
objects are referenced by a "persistent id", which is just an arbitrary string
of printable ASCII characters.  The resolution of such names is not defined by
the :mod:`pickle` module; it will delegate this resolution to user defined
functions on the pickler and unpickler. [#]_

To define external persistent id resolution, you need to set the
:attr:`persistent_id` attribute of the pickler object and the
:attr:`persistent_load` attribute of the unpickler object.

To pickle objects that have an external persistent id, the pickler must have a
custom :func:`persistent_id` method that takes an object as an argument and
returns either ``None`` or the persistent id for that object.  When ``None`` is
returned, the pickler simply pickles the object as normal.  When a persistent id
string is returned, the pickler will pickle that string, along with a marker so
that the unpickler will recognize the string as a persistent id.

To unpickle external objects, the unpickler must have a custom
:func:`persistent_load` function that takes a persistent id string and returns
the referenced object.
Here's a silly example that *might* shed more light::

   import pickle
   from cStringIO import StringIO

   src = StringIO()
   p = pickle.Pickler(src)

   def persistent_id(obj):
       # Objects with an ``x`` attribute are stored by reference only.
       if hasattr(obj, 'x'):
           return 'the value %d' % obj.x
       else:
           return None

   p.persistent_id = persistent_id

   class Integer:
       def __init__(self, x):
           self.x = x
       def __str__(self):
           return 'My name is integer %d' % self.x

   i = Integer(7)
   print(i)
   p.dump(i)

   datastream = src.getvalue()
   print(repr(datastream))
   dst = StringIO(datastream)

   up = pickle.Unpickler(dst)

   class FancyInteger(Integer):
       def __str__(self):
           return 'I am the integer %d' % self.x

   def persistent_load(persid):
       # Resolve the persistent id back into an object.
       if persid.startswith('the value '):
           value = int(persid.split()[2])
           return FancyInteger(value)
       else:
           raise pickle.UnpicklingError('Invalid persistent id')

   up.persistent_load = persistent_load

   j = up.load()
   print(j)
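
The same round trip can also be sketched with subclasses of
:class:`Pickler` and :class:`Unpickler` from the :mod:`pickle` module instead
of attribute assignment.  The ``Record`` class, the ``STORE`` dictionary, and
the ``'record <key>'`` id format are assumptions made for illustration, and
``io.BytesIO`` stands in for ``cStringIO`` so that the sketch does not depend
on that module.

```python
import io
import pickle


class Record(object):
    """Hypothetical object that lives in an external store."""

    def __init__(self, key):
        self.key = key


STORE = {'42': Record('42')}  # stand-in for a database


class StorePickler(pickle.Pickler):
    def persistent_id(self, obj):
        if isinstance(obj, Record):
            return 'record ' + obj.key  # pickle only a reference
        return None                     # pickle everything else as normal


class StoreUnpickler(pickle.Unpickler):
    def persistent_load(self, persid):
        kind, key = persid.split()
        if kind != 'record':
            raise pickle.UnpicklingError('Invalid persistent id')
        return STORE[key]               # resolve the reference


buf = io.BytesIO()
StorePickler(buf).dump(['payload', STORE['42']])
buf.seek(0)
result = StoreUnpickler(buf).load()
```

Because only the id is written to the stream, ``result[1]`` is the very object
held in ``STORE`` rather than a copy of it.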

In the :mod:`cPickle` module, the unpickler's :attr:`persistent_load` attribute
can also be set to a Python list, in which case, when the unpickler reaches a
persistent id, the persistent id string will simply be appended to this list.
This functionality exists so that a pickle data stream can be "sniffed" for
object references without actually instantiating all the objects in a pickle.
[#]_  Setting :attr:`persistent_load` to a list is usually used in conjunction
with the :meth:`noload` method on the Unpickler.

.. BAW: Both pickle and cPickle support something called inst_persistent_id()
   which appears to give unknown types a second shot at producing a persistent
   id.  Since Jim Fulton can't remember why it was added or what it's for, I'm
   leaving it undocumented.


.. _pickle-sub:

Subclassing Unpicklers
----------------------

.. index::
   single: load_global() (pickle protocol)
   single: find_global() (pickle protocol)

By default, unpickling will import any class that it finds in the pickle data.
You can control exactly what gets unpickled and what gets called by customizing
your unpickler.  Unfortunately, exactly how you do this is different depending
on whether you're using :mod:`pickle` or :mod:`cPickle`. [#]_

In the :mod:`pickle` module, you need to derive a subclass from
:class:`Unpickler`, overriding the :meth:`load_global` method.
:meth:`load_global` should read two lines from the pickle data stream where the
first line will be the name of the module containing the class and the second
line will be the name of the instance's class.  It then looks up the class,
possibly importing the module and digging out the attribute, then it appends
what it finds to the unpickler's stack.  Later on, this class will be assigned
to the :attr:`__class__` attribute of an empty class, as a way of magically
creating an instance without calling its class's :meth:`__init__`.  Your job
(should you choose to accept it) would be to have :meth:`load_global` push a
known safe version of any class you deem safe to unpickle onto the unpickler's
stack.  It is up to you to produce such a class.  Or you could raise an error if
you want to disallow all unpickling of instances.  If this sounds like a hack,
you're right.  Refer to the source code to make this work.
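
As a hedged sketch of this kind of restriction, the subclass below overrides
``find_class``, the helper that the pure-Python :meth:`load_global` calls to
look up a class, rather than :meth:`load_global` itself.  Treat that choice as
an assumption about an internal detail of :mod:`pickle`; it does not apply to
:mod:`cPickle`.  The ``ALLOWED`` set, ``RestrictedUnpickler``, and
``restricted_loads`` are hypothetical names.

```python
import io
import pickle

# Hypothetical whitelist of (module, name) pairs considered safe here.
ALLOWED = {('collections', 'OrderedDict')}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Refuse every global that is not explicitly whitelisted.
        if (module, name) not in ALLOWED:
            raise pickle.UnpicklingError(
                'global %s.%s is forbidden' % (module, name))
        return pickle.Unpickler.find_class(self, module, name)


def restricted_loads(data):
    """Unpickle a byte string using the restricted unpickler."""
    return RestrictedUnpickler(io.BytesIO(data)).load()
```

With this in place, a pickled ``collections.OrderedDict`` loads normally, while
a pickle that refers to any other class raises :exc:`UnpicklingError` before
the class is imported.  A whitelist narrows what can be loaded but does not
make untrusted pickles safe.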

Things are a little cleaner with :mod:`cPickle`, but not by much.  To control
what gets unpickled, you can set the unpickler's :attr:`find_global` attribute
to a function or ``None``.  If it is ``None`` then any attempts to unpickle
instances will raise an :exc:`UnpicklingError`.  If it is a function, then it
should accept a module name and a class name, and return the corresponding class
object.  It is responsible for looking up the class and performing any necessary
imports, and it may raise an error to prevent instances of the class from being
unpickled.

The moral of the story is that you should be really careful about the source of
the strings your application unpickles.


.. _pickle-example:

Example
-------

For the simplest code, use the :func:`dump` and :func:`load` functions.  Note
that a self-referencing list is pickled and restored correctly. ::

   import pickle

   data1 = {'a': [1, 2.0, 3, 4+6j],
            'b': ('string', u'Unicode string'),
            'c': None}

   selfref_list = [1, 2, 3]
   selfref_list.append(selfref_list)

   output = open('data.pkl', 'wb')

   # Pickle dictionary using protocol 0.
   pickle.dump(data1, output)

   # Pickle the list using the highest protocol available.
   pickle.dump(selfref_list, output, -1)

   output.close()

The following example reads the resulting pickled data.  When reading a
pickle-containing file, you should open the file in binary mode because you
can't be sure if the ASCII or binary format was used. ::

   import pprint, pickle

   pkl_file = open('data.pkl', 'rb')

   data1 = pickle.load(pkl_file)
   pprint.pprint(data1)

   data2 = pickle.load(pkl_file)
   pprint.pprint(data2)

   pkl_file.close()
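
To check the self-reference without touching the file system, the round trip
can be sketched in memory with :func:`dumps` and :func:`loads`.  The variable
name ``restored`` is illustrative only.

```python
import pickle

selfref_list = [1, 2, 3]
selfref_list.append(selfref_list)

# Round-trip through a byte string using the highest protocol available.
restored = pickle.loads(pickle.dumps(selfref_list, -1))

# The last element of the restored list is the restored list itself.
is_self_referencing = restored[3] is restored
```

The restored list is a new object, but its fourth element refers back to that
same new list, not to the original.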

Here's a larger example that shows how to modify pickling behavior for a class.
The :class:`TextReader` class opens a text file, and returns the line number and
line contents each time its :meth:`readline` method is called.  If a
:class:`TextReader` instance is pickled, all attributes *except* the file object
member are saved.  When the instance is unpickled, the file is reopened, and
reading resumes from the last location.  The :meth:`__setstate__` and
:meth:`__getstate__` methods are used to implement this behavior. ::

   #!/usr/local/bin/python

   class TextReader:
       """Print and number lines in a text file."""
       def __init__(self, file):
           self.file = file
           self.fh = open(file)
           self.lineno = 0

       def readline(self):
           self.lineno = self.lineno + 1
           line = self.fh.readline()
           if not line:
               return None
           if line.endswith("\n"):
               line = line[:-1]
           return "%d: %s" % (self.lineno, line)

       def __getstate__(self):
           odict = self.__dict__.copy() # copy the dict since we change it
           del odict['fh']              # remove filehandle entry
           return odict

       def __setstate__(self, state):
           fh = open(state['file'])     # reopen file
           count = state['lineno']      # read from file...
           while count:                 # until line count is restored
               fh.readline()
               count = count - 1
           self.__dict__.update(state)  # update attributes
           self.fh = fh                 # save the file object

A sample usage might be something like this::

   >>> import TextReader
   >>> obj = TextReader.TextReader("TextReader.py")
   >>> obj.readline()
   '1: #!/usr/local/bin/python'
   >>> obj.readline()
   '2: '
   >>> obj.readline()
   '3: class TextReader:'
   >>> import pickle
   >>> pickle.dump(obj, open('save.p', 'wb'))

If you want to see that :mod:`pickle` works across Python processes, start
another Python session before continuing.  What follows can happen from either
the same process or a new process. ::

   >>> import pickle
   >>> reader = pickle.load(open('save.p', 'rb'))
   >>> reader.readline()
   '4:     """Print and number lines in a text file."""'


.. seealso::

   Module :mod:`copy_reg`
      Pickle interface constructor registration for extension types.

   Module :mod:`shelve`
      Indexed databases of objects; uses :mod:`pickle`.

   Module :mod:`copy`
      Shallow and deep object copying.

   Module :mod:`marshal`
      High-performance serialization of built-in types.


:mod:`cPickle` --- A faster :mod:`pickle`
=========================================

.. module:: cPickle
   :synopsis: Faster version of pickle, but not subclassable.
.. moduleauthor:: Jim Fulton <jim@zope.com>
.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org>


.. index:: module: pickle

The :mod:`cPickle` module supports serialization and de-serialization of Python
objects, providing an interface and functionality nearly identical to the
:mod:`pickle` module.  There are several differences, the most important being
performance and subclassability.

First, :mod:`cPickle` can be up to 1000 times faster than :mod:`pickle` because
the former is implemented in C.  Second, in the :mod:`cPickle` module the
callables :func:`Pickler` and :func:`Unpickler` are functions, not classes.
This means that you cannot use them to derive custom pickling and unpickling
subclasses.  Most applications have no need for this functionality and should
benefit from the greatly improved performance of the :mod:`cPickle` module.

The :mod:`pickle` and :mod:`cPickle` modules use the same pickle data format,
so it is possible to use :mod:`pickle` and :mod:`cPickle` interchangeably with
existing pickles. [#]_

There are additional minor differences in API between :mod:`cPickle` and
:mod:`pickle`; however, for most applications they are interchangeable.  More
documentation is provided in the :mod:`pickle` module documentation, which
includes a list of the documented differences.
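
Because the two modules are interchangeable for most applications, a common
idiom is to prefer :mod:`cPickle` and fall back to :mod:`pickle` when it
cannot be imported.  The sketch below assumes the application uses only the
module-level functions; the ``data`` dictionary is an arbitrary example value.

```python
try:
    import cPickle as pickle   # C implementation, where available
except ImportError:
    import pickle              # fall back to the pickle module

data = {'a': [1, 2.0, 3], 'b': ('string', None)}

blob = pickle.dumps(data, 2)   # protocol 2, a binary format
restored = pickle.loads(blob)
```

Code that needs to subclass :class:`Pickler` or :class:`Unpickler` should
import :mod:`pickle` directly instead, since the :mod:`cPickle` callables are
functions.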

.. rubric:: Footnotes

.. [#] Don't confuse this with the :mod:`marshal` module.

.. [#] In the :mod:`pickle` module these callables are classes, which you could
   subclass to customize the behavior.  However, in the :mod:`cPickle` module these
   callables are factory functions and so cannot be subclassed.  One common reason
   to subclass is to control what objects can actually be unpickled.  See section
   :ref:`pickle-sub` for more details.

.. [#] *Warning*: this is intended for pickling multiple objects without intervening
   modifications to the objects or their parts.  If you modify an object and then
   pickle it again using the same :class:`Pickler` instance, the object is not
   pickled again; a reference to it is pickled and the :class:`Unpickler` will
   return the old value, not the modified one.  There are two problems here: (1)
   detecting changes, and (2) marshalling a minimal set of changes.  Garbage
   collection may also become a problem here.

.. [#] The exception raised will likely be an :exc:`ImportError` or an
   :exc:`AttributeError` but it could be something else.

.. [#] These methods can also be used to implement copying class instances.

.. [#] This protocol is also used by the shallow and deep copying operations defined in
   the :mod:`copy` module.

.. [#] The actual mechanism for associating these user defined functions is slightly
   different for :mod:`pickle` and :mod:`cPickle`.  The description given here
   works the same for both implementations.  Users of the :mod:`pickle` module
   could also use subclassing to effect the same results, overriding the
   :meth:`persistent_id` and :meth:`persistent_load` methods in the derived
   classes.

.. [#] We'll leave you with the image of Guido and Jim sitting around sniffing pickles
   in their living rooms.

.. [#] A word of caution: the mechanisms described here use internal attributes and
   methods, which are subject to change in future versions of Python.  We intend to
   someday provide a common interface for controlling this behavior, which will
   work in either :mod:`pickle` or :mod:`cPickle`.

.. [#] Since the pickle data format is actually a tiny stack-oriented programming
   language, and some freedom is taken in the encodings of certain objects, it is
   possible that the two modules produce different data streams for the same input
   objects.  However it is guaranteed that they will always be able to read each
   other's data streams.