\section{\module{pickle} ---
         Python object serialization}

\declaremodule{standard}{pickle}
\modulesynopsis{Convert Python objects to streams of bytes and back.}

\index{persistency}
\indexii{persistent}{objects}
\indexii{serializing}{objects}
\indexii{marshalling}{objects}
\indexii{flattening}{objects}
\indexii{pickling}{objects}


The \module{pickle} module implements a basic but powerful algorithm
for ``pickling'' (a.k.a.\ serializing, marshalling or flattening)
nearly arbitrary Python objects. This is the act of converting
objects to a stream of bytes (and back: ``unpickling''). This is a
more primitive notion than persistency --- although \module{pickle}
reads and writes file objects, it does not handle the issue of naming
persistent objects, nor the (even more complicated) area of concurrent
access to persistent objects. The \module{pickle} module can
transform a complex object into a byte stream and it can transform the
byte stream into an object with the same internal structure. The most
obvious thing to do with these byte streams is to write them onto a
file, but it is also conceivable to send them across a network or
store them in a database. The module
\refmodule{shelve}\refstmodindex{shelve} provides a simple interface
to pickle and unpickle objects on DBM-style database files.


\strong{Note:} The \module{pickle} module is rather slow. A
reimplementation of the same algorithm in C, which is up to 1000 times
faster, is available as the
\refmodule{cPickle}\refbimodindex{cPickle} module. This has the same
interface except that \class{Pickler} and \class{Unpickler} are
factory functions, not classes (so they cannot be used as base classes
for inheritance).

Unlike the built-in module \refmodule{marshal}\refbimodindex{marshal},
\module{pickle} handles the following correctly:


\begin{itemize}

\item recursive objects (objects containing references to themselves)

\item object sharing (references to the same object in different places)

\item user-defined classes and their instances

\end{itemize}
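
The first two points in the list above can be checked with a short round trip. This is only a sketch, written against the \module{pickle} module of current Python 3 releases rather than the version described here; the names \code{loop}, \code{shared} and \code{pair} are made up for the illustration.

```python
import pickle

# A list that contains itself (a recursive object).
loop = ["start"]
loop.append(loop)

# One list referenced from two places (object sharing).
shared = [1, 2, 3]
pair = {"left": shared, "right": shared}

loop_copy = pickle.loads(pickle.dumps(loop))
pair_copy = pickle.loads(pickle.dumps(pair))

# The copy refers to itself, not to the original list.
print(loop_copy[1] is loop_copy)
# Both keys still refer to one single list object.
print(pair_copy["left"] is pair_copy["right"])
```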

The data format used by \module{pickle} is Python-specific. This has
the advantage that there are no restrictions imposed by external
standards such as
XDR\index{XDR}\index{External Data Representation} (which can't
represent pointer sharing); however it means that non-Python programs
may not be able to reconstruct pickled Python objects.

By default, the \module{pickle} data format uses a printable \ASCII{}
representation. This is slightly more voluminous than a binary
representation. The big advantage of using printable \ASCII{} (and of
some other characteristics of \module{pickle}'s representation) is that
for debugging or recovery purposes it is possible for a human to read
the pickled file with a standard text editor.

A binary format, which is slightly more efficient, can be chosen by
specifying a nonzero (true) value for the \var{bin} argument to the
\class{Pickler} constructor or the \function{dump()} and \function{dumps()}
functions. The binary format is not the default because of backwards
compatibility with the Python 1.4 pickle module. In a future version,
the default may change to binary.
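
The size difference between the two formats can be measured directly. The sketch below is an assumption-laden illustration: it targets current Python 3, where the \var{bin} flag has been replaced by a \code{protocol} argument, and it uses protocol 0 as the printable text format and protocol 1 as the original binary format. The sample data is invented.

```python
import pickle

data = {"name": "spam", "values": list(range(100))}

# Protocol 0 stands in for the printable text format (assumption: Python 3).
text_pickle = pickle.dumps(data, protocol=0)
# Protocol 1 stands in for the original binary format.
binary_pickle = pickle.dumps(data, protocol=1)

# The text pickle decodes as plain ASCII, so it can be opened in an editor.
readable = text_pickle.decode("ascii")

# Compare the sizes of the two representations.
print(len(text_pickle), len(binary_pickle))
```

Both pickles load back to an equal object; only the encoding differs.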

The \module{pickle} module doesn't handle code objects, which the
\refmodule{marshal}\refbimodindex{marshal} module does. I suppose
\module{pickle} could, and maybe it should, but there's probably no
great need for it right now (as long as \refmodule{marshal} continues
to be used for reading and writing code objects), and at least this
avoids the possibility of smuggling Trojan horses into a program.

For the benefit of persistency modules written using \module{pickle}, it
supports the notion of a reference to an object outside the pickled
data stream. Such objects are referenced by a name, which is an
arbitrary string of printable \ASCII{} characters. The resolution of
such names is not defined by the \module{pickle} module --- the
persistent object module will have to implement a method
\method{persistent_load()}. To write references to persistent objects,
the persistent module must define a method \method{persistent_id()} which
returns either \code{None} or the persistent ID of the object.
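
A minimal sketch of the two hooks follows. It assumes current Python 3, where \method{persistent_id()} and \method{persistent_load()} are overridden in subclasses of \class{Pickler} and \class{Unpickler}; the \code{STORE} dictionary and the two subclass names are hypothetical stand-ins for a real persistent object module.

```python
import io
import pickle

# Hypothetical external store: objects that live outside the pickle.
STORE = {"user:1": {"name": "alice"}, "user:2": {"name": "bob"}}

class StorePickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Return a name for objects kept in STORE, or None to pickle normally.
        for name, stored in STORE.items():
            if obj is stored:
                return name
        return None

class StoreUnpickler(pickle.Unpickler):
    def persistent_load(self, name):
        # Resolve a name written by persistent_id() back to the stored object.
        return STORE[name]

buf = io.BytesIO()
StorePickler(buf).dump({"owner": STORE["user:1"], "note": "hello"})

buf.seek(0)
restored = StoreUnpickler(buf).load()
# The owner is the stored object itself, not a copy of it.
print(restored["owner"] is STORE["user:1"])
```

Only the name \code{"user:1"} travels in the byte stream; the object it refers to is looked up again when the stream is read.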

There are some restrictions on the pickling of class instances.

First of all, the class must be defined at the top level in a module.
Furthermore, all its instance variables must be picklable.

\setindexsubitem{(pickle protocol)}

When a pickled class instance is unpickled, its \method{__init__()} method
is normally \emph{not} invoked. \strong{Note:} This is a deviation
from previous versions of this module; the change was introduced in
Python 1.5b2. The reason for the change is that in many cases it is
desirable to have a constructor that requires arguments; it is a
(minor) nuisance to have to provide a \method{__getinitargs__()} method.

If it is desirable that the \method{__init__()} method be called on
unpickling, a class can define a method \method{__getinitargs__()},
which should return a \emph{tuple} containing the arguments to be
passed to the class constructor (\method{__init__()}). This method is
called at pickle time; the tuple it returns is incorporated in the
pickle for the instance.
\withsubitem{(copy protocol)}{\ttindex{__getinitargs__()}}
\withsubitem{(instance constructor)}{\ttindex{__init__()}}

Classes can further influence how their instances are pickled --- if
the class
\withsubitem{(copy protocol)}{
  \ttindex{__getstate__()}\ttindex{__setstate__()}}
\withsubitem{(instance attribute)}{
  \ttindex{__dict__}}
defines the method \method{__getstate__()}, it is called and the returned
state is pickled as the contents for the instance, and if the class
defines the method \method{__setstate__()}, it is called with the
unpickled state. (Note that these methods can also be used to
implement copying class instances.) If there is no
\method{__getstate__()} method, the instance's \member{__dict__} is
pickled. If there is no \method{__setstate__()} method, the pickled
object must be a dictionary and its items are assigned to the new
instance's dictionary. (If a class defines both \method{__getstate__()}
and \method{__setstate__()}, the state object needn't be a dictionary
--- these methods can do what they want.) This protocol is also used
by the shallow and deep copying operations defined in the
\refmodule{copy}\refstmodindex{copy} module.
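
As an illustration of \method{__getstate__()} and \method{__setstate__()} working as a pair, the sketch below pickles an instance whose state is a plain string rather than a dictionary. The \class{Connection} class and its attributes are hypothetical, and the code is written for current Python 3.

```python
import pickle

class Connection:
    def __init__(self, host):
        self.host = host
        self.cache = {}  # hypothetical scratch data, not worth saving

    def __getstate__(self):
        # Called at pickle time; the return value is what gets pickled.
        return self.host

    def __setstate__(self, state):
        # Called at unpickle time with the value __getstate__() returned.
        self.host = state
        self.cache = {}

original = Connection("example.org")
original.cache["page"] = "cached"

clone = pickle.loads(pickle.dumps(original))
print(clone.host, clone.cache)
```

The cache is rebuilt empty on unpickling because it was never part of the pickled state.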

Note that when class instances are pickled, their class's code and
data are not pickled along with them. Only the instance data are
pickled. This is done on purpose, so you can fix bugs in a class or
add methods and still load objects that were created with an earlier
version of the class. If you plan to have long-lived objects that
will see many versions of a class, it may be worthwhile to put a version
number in the objects so that suitable conversions can be made by the
class's \method{__setstate__()} method.
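
One possible shape for such a version number is sketched below for current Python 3. The \class{Point} class, its \code{VERSION} attribute and the older \code{coords} layout are all invented for the example; the module itself prescribes no particular scheme.

```python
import pickle

class Point:
    VERSION = 2  # hypothetical: bumped whenever the state layout changes

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __getstate__(self):
        # Store the instance dictionary plus the layout version.
        state = dict(self.__dict__)
        state["version"] = self.VERSION
        return state

    def __setstate__(self, state):
        state = dict(state)
        version = state.pop("version", 1)
        if version < 2:
            # Hypothetical version 1 stored a single "coords" tuple.
            state["x"], state["y"] = state.pop("coords")
        self.__dict__.update(state)

# Simulate state that was written by the older class version.
old = Point.__new__(Point)
old.__setstate__({"coords": (3, 4)})

# A round trip through the current layout.
new = pickle.loads(pickle.dumps(Point(1, 2)))
print((old.x, old.y), (new.x, new.y))
```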

When a class itself is pickled, only its name is pickled --- the class
definition is not pickled, but re-imported by the unpickling process.
Therefore, the restriction that the class must be defined at the top
level in a module applies to pickled classes as well.
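
This can be seen by pickling a class from the standard library. The sketch assumes current Python 3 and uses \class{OrderedDict} from \module{collections} purely as a convenient top-level class.

```python
import pickle
from collections import OrderedDict

payload = pickle.dumps(OrderedDict)

# The pickle carries the module and class names, not the class body.
print(payload)

# Unpickling imports the module again and looks the class up by name.
restored = pickle.loads(payload)
print(restored is OrderedDict)
```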

\setindexsubitem{(in module pickle)}

The interface can be summarized as follows.

To pickle an object \code{x} onto a file \code{f}, open for writing:

\begin{verbatim}
p = pickle.Pickler(f)
p.dump(x)
\end{verbatim}

A shorthand for this is:

\begin{verbatim}
pickle.dump(x, f)
\end{verbatim}

To unpickle an object \code{x} from a file \code{f}, open for reading:

\begin{verbatim}
u = pickle.Unpickler(f)
x = u.load()
\end{verbatim}

A shorthand is:

\begin{verbatim}
x = pickle.load(f)
\end{verbatim}

The \class{Pickler} class only calls the method \code{f.write()} with a
\withsubitem{(class in pickle)}{
  \ttindex{Unpickler}\ttindex{Pickler}}
string argument. The \class{Unpickler} calls the methods \code{f.read()}
(with an integer argument) and \code{f.readline()} (without argument),
both returning a string. It is explicitly allowed to pass non-file
objects here, as long as they have the right methods.
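
A sketch of such non-file objects is given below. It assumes current Python 3, where the data passed to \code{write()} and returned by \code{read()} and \code{readline()} is a bytes object rather than a string; the classes \class{ByteSink} and \class{ByteSource} are hypothetical.

```python
import pickle

class ByteSink:
    """Minimal write-only object: provides only write()."""
    def __init__(self):
        self.chunks = []

    def write(self, data):
        self.chunks.append(bytes(data))
        return len(data)

class ByteSource:
    """Minimal read-only object: provides only read(n) and readline()."""
    def __init__(self, data):
        self.data = data
        self.pos = 0

    def read(self, n):
        chunk = self.data[self.pos:self.pos + n]
        self.pos += len(chunk)
        return chunk

    def readline(self):
        end = self.data.find(b"\n", self.pos)
        end = len(self.data) if end < 0 else end + 1
        line = self.data[self.pos:end]
        self.pos = end
        return line

sink = ByteSink()
pickle.Pickler(sink).dump({"answer": 42})

source = ByteSource(b"".join(sink.chunks))
result = pickle.Unpickler(source).load()
print(result)
```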

The constructor for the \class{Pickler} class has an optional second
argument, \var{bin}. If this is present and nonzero, the binary
pickle format is used; if it is zero or absent, the (less efficient,
but backwards compatible) text pickle format is used. The
\class{Unpickler} class does not have an argument to distinguish
between binary and text pickle formats; it accepts either format.

The following types can be pickled:

\begin{itemize}

\item \code{None}

\item integers, long integers, floating point numbers

\item strings

\item tuples, lists and dictionaries containing only picklable objects

\item classes that are defined at the top level in a module

\item instances of such classes whose \member{__dict__} or
\method{__setstate__()} is picklable

\end{itemize}

Attempts to pickle unpicklable objects will raise the
\exception{PicklingError} exception; when this happens, an unspecified
number of bytes may have been written to the file.

It is possible to make multiple calls to the \method{dump()} method of
the same \class{Pickler} instance. These must then be matched to the
same number of calls to the \method{load()} method of the
corresponding \class{Unpickler} instance. If the same object is
pickled by multiple \method{dump()} calls, the \method{load()} calls
will all yield references to the same object. \emph{Warning}: this is
intended for pickling multiple objects without intervening modifications
to the objects or their parts. If you modify an object and then pickle it
again using the same \class{Pickler} instance, the object is not
pickled again --- a reference to it is pickled and the
\class{Unpickler} will return the old value, not the modified one.
(There are two problems here: (a) detecting changes, and (b)
marshalling a minimal set of changes. I have no answers. Garbage
collection may also become a problem here.)
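
The warning can be reproduced in a few lines. This sketch assumes current Python 3 and an in-memory \code{io.BytesIO} buffer in place of a real file; the variable names are arbitrary.

```python
import io
import pickle

buf = io.BytesIO()
pickler = pickle.Pickler(buf)

data = [1, 2]
pickler.dump(data)   # first dump writes the list contents
data.append(3)
pickler.dump(data)   # second dump writes only a reference to the first

buf.seek(0)
unpickler = pickle.Unpickler(buf)
first = unpickler.load()
second = unpickler.load()

# Both loads give the same object, holding the value from the first dump.
print(first, second is first)
```

The appended element is lost because the second \method{dump()} call never looks at the list's contents again.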

Apart from the \class{Pickler} and \class{Unpickler} classes, the
module defines the following functions, and an exception:

\begin{funcdesc}{dump}{object, file\optional{, bin}}
Write a pickled representation of \var{object} to the open file object
\var{file}. This is equivalent to
\samp{Pickler(\var{file}, \var{bin}).dump(\var{object})}.
If the optional \var{bin} argument is present and nonzero, the binary
pickle format is used; if it is zero or absent, the (less efficient)
text pickle format is used.
\end{funcdesc}

\begin{funcdesc}{load}{file}
Read a pickled object from the open file object \var{file}. This is
equivalent to \samp{Unpickler(\var{file}).load()}.
\end{funcdesc}

\begin{funcdesc}{dumps}{object\optional{, bin}}
Return the pickled representation of the object as a string, instead
of writing it to a file. If the optional \var{bin} argument is
present and nonzero, the binary pickle format is used; if it is zero
or absent, the (less efficient) text pickle format is used.
\end{funcdesc}

\begin{funcdesc}{loads}{string}
Read a pickled object from a string instead of a file. Characters in
the string past the pickled object's representation are ignored.
\end{funcdesc}
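
A short round trip through \function{dumps()} and \function{loads()}, including the rule about trailing data, might look as follows. The sketch assumes current Python 3, where the pickled representation is a bytes object rather than a string; the sample record is invented.

```python
import pickle

record = ("spam", [1, 2, 3], {"eggs": None})

# dumps() returns the pickle instead of writing it to a file.
as_bytes = pickle.dumps(record)
same = pickle.loads(as_bytes)

# Data after the end of the pickle is ignored by loads().
padded = pickle.loads(as_bytes + b"trailing data")

print(same == record, padded == record)
```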

\begin{excdesc}{PicklingError}
This exception is raised when an unpicklable object is passed to
\method{Pickler.dump()}.
\end{excdesc}


\begin{seealso}
  \seemodule[copyreg]{copy_reg}{pickle interface constructor
                                registration}

  \seemodule{shelve}{indexed databases of objects; uses \module{pickle}}

  \seemodule{copy}{shallow and deep object copying}

  \seemodule{marshal}{high-performance serialization of built-in types}
\end{seealso}


\section{\module{cPickle} ---
         Alternate implementation of \module{pickle}}

\declaremodule{builtin}{cPickle}
\modulesynopsis{Faster version of \module{pickle}, but not subclassable.}
\moduleauthor{Jim Fulton}{jfulton@digicool.com}
\sectionauthor{Fred L. Drake, Jr.}{fdrake@acm.org}


The \module{cPickle} module provides a similar interface to, and the
same functionality as, the \refmodule{pickle}\refstmodindex{pickle} module,
but can be up to 1000 times faster since it is implemented in C. The
only other important difference to note is that \function{Pickler()}
and \function{Unpickler()} are functions and not classes, and so
cannot be subclassed. This should not be an issue in most cases.

The format of the pickle data is identical to that produced using the
\refmodule{pickle} module, so it is possible to use \refmodule{pickle} and
\module{cPickle} interchangeably with existing pickles.

(Since the pickle data format is actually a tiny stack-oriented
programming language, and there are some freedoms in the encodings of
certain objects, it's possible that the two modules produce different
pickled data for the same input objects; however they will always be
able to read each other's pickles back in.)
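
The stack-oriented nature of the format can be inspected by listing the instructions in a pickle. The sketch below relies on the \module{pickletools} module, which belongs to later Python releases and is not part of the version described here; it is offered only as an assumption-based illustration using current Python 3.

```python
import pickle
import pickletools

# Protocol 0 is used so the pickle stays close to the text format.
data = pickle.dumps([1, 2], protocol=0)

# Each entry is one instruction for the unpickler's stack machine.
opcodes = [op.name for op, arg, pos in pickletools.genops(data)]
print(opcodes)
```

Two implementations may choose different instruction sequences for the same object, which is why their output can differ while remaining mutually readable.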