\section{\module{pickle} ---
         Python object serialization.}
\declaremodule{standard}{pickle}

\modulesynopsis{Convert Python objects to streams of bytes and back.}

\index{persistency}
\indexii{persistent}{objects}
\indexii{serializing}{objects}
\indexii{marshalling}{objects}
\indexii{flattening}{objects}
\indexii{pickling}{objects}

The \module{pickle} module implements a basic but powerful algorithm for
``pickling'' (a.k.a.\ serializing, marshalling or flattening) nearly
arbitrary Python objects. This is the act of converting objects to a
stream of bytes (and back: ``unpickling'').
This is a more primitive notion than
persistency: although \module{pickle} reads and writes file objects,
it does not handle the issue of naming persistent objects, nor the
(even more complicated) area of concurrent access to persistent
objects. The \module{pickle} module can transform a complex object into
a byte stream and it can transform the byte stream into an object with
the same internal structure. The most obvious thing to do with these
byte streams is to write them onto a file, but it is also conceivable
to send them across a network or store them in a database. The module
\module{shelve} provides a simple interface to pickle and unpickle
objects on ``dbm''-style database files.
\refstmodindex{shelve}

\strong{Note:} The \module{pickle} module is rather slow. A
reimplementation of the same algorithm in \C{}, which is up to 1000 times
faster, is available as the \module{cPickle}\refbimodindex{cPickle}
module. This has the same interface except that \code{Pickler} and
\code{Unpickler} are factory functions, not classes (so they cannot be
used as base classes for inheritance).

Unlike the built-in module \module{marshal}, \module{pickle} handles
the following correctly:
\refbimodindex{marshal}

\begin{itemize}

\item recursive objects (objects containing references to themselves)

\item object sharing (references to the same object in different places)

\item user-defined classes and their instances

\end{itemize}

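The first two items in the list above can be checked with a short
round trip. This is only a sketch: it assumes a current Python 3
interpreter rather than the release described in this section, and the
variable names are illustrative.

```python
import pickle

# A recursive object: a list that contains a reference to itself.
loop = [1, 2]
loop.append(loop)

# Object sharing: the same list referenced from two places.
shared = ["shared"]
pair = [shared, shared]

loop_copy = pickle.loads(pickle.dumps(loop))
pair_copy = pickle.loads(pickle.dumps(pair))

print(loop_copy[2] is loop_copy)       # the cycle survives the round trip
print(pair_copy[0] is pair_copy[1])    # so does the sharing
```

In both cases the unpickled structure refers to one object where the
original did, instead of producing independent copies.
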
The data format used by \module{pickle} is Python-specific. This has
the advantage that there are no restrictions imposed by external
standards such as XDR%
\index{XDR}
\index{External Data Representation}
(which can't represent pointer sharing); however
it means that non-Python programs may not be able to reconstruct
pickled Python objects.

By default, the \module{pickle} data format uses a printable \ASCII{}
representation. This is slightly more voluminous than a binary
representation. The big advantage of using printable \ASCII{} (and of
some other characteristics of \module{pickle}'s representation) is that
for debugging or recovery purposes it is possible for a human to read
the pickled file with a standard text editor.

A binary format, which is slightly more efficient, can be chosen by
specifying a nonzero (true) value for the \var{bin} argument to the
\class{Pickler} constructor or the \function{dump()} and \function{dumps()}
functions. The binary format is not the default because of backwards
compatibility with the Python 1.4 pickle module. In a future version,
the default may change to binary.

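The difference between the two formats can be inspected directly. The
sketch below rests on an assumption that goes beyond this section: in
current Python 3 releases the format is selected with a
\code{protocol} argument rather than \var{bin}, with protocol 0 being
the printable text format and protocol 2 used here as one example of a
binary format. The \code{record} value is made up for illustration.

```python
import pickle

record = {"name": "spam", "sizes": [1, 2, 3]}

# Assumption: protocol=0 selects the printable text format and
# protocol=2 a binary one (modern spelling of the bin argument).
text_pickle = pickle.dumps(record, protocol=0)
binary_pickle = pickle.dumps(record, protocol=2)

print(text_pickle.decode("ascii"))            # readable in a text editor
print(len(text_pickle), len(binary_pickle))   # compare the sizes

# The loader needs no hint about which format it is given.
assert pickle.loads(text_pickle) == record
assert pickle.loads(binary_pickle) == record
```
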
The \module{pickle} module doesn't handle code objects, which the
\module{marshal} module does. I suppose \module{pickle} could, and maybe
it should, but there's probably no great need for it right now (as
long as \module{marshal} continues to be used for reading and writing
code objects), and at least this avoids the possibility of smuggling
Trojan horses into a program.
\refbimodindex{marshal}

For the benefit of persistency modules written using \module{pickle}, the
module supports the notion of a reference to an object outside the pickled
data stream. Such objects are referenced by a name, which is an
arbitrary string of printable \ASCII{} characters. The resolution of
such names is not defined by the \module{pickle} module; the
persistent object module will have to implement a method
\method{persistent_load()}. To write references to persistent objects,
the persistent module must define a method \method{persistent_id()} which
returns either \code{None} or the persistent ID of the object.

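A minimal sketch of such external references follows. It assumes a
current Python 3 interpreter, where the two hooks are supplied by
subclassing \class{Pickler} and \class{Unpickler}. The names
\code{ExternalRecord}, \code{REGISTRY}, \code{RefPickler} and
\code{RefUnpickler} are hypothetical and stand in for whatever the
persistent object module provides.

```python
import io
import pickle

class ExternalRecord:
    """Hypothetical object that lives outside the pickle stream."""
    def __init__(self, key):
        self.key = key

# Hypothetical name-to-object table kept by the persistency layer.
REGISTRY = {"rec-1": ExternalRecord("rec-1")}

class RefPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Return a name for external objects, None for everything else.
        if isinstance(obj, ExternalRecord):
            return obj.key
        return None

class RefUnpickler(pickle.Unpickler):
    def persistent_load(self, pid):
        # Resolve the name written by persistent_id().
        return REGISTRY[pid]

buf = io.BytesIO()
RefPickler(buf).dump({"owner": REGISTRY["rec-1"], "count": 3})

buf.seek(0)
restored = RefUnpickler(buf).load()
print(restored["owner"] is REGISTRY["rec-1"])
```

Only the name \code{"rec-1"} travels in the byte stream; the record
itself is looked up again when the data is loaded.
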
There are some restrictions on the pickling of class instances.

First of all, the class must be defined at the top level in a module.
Furthermore, all its instance variables must be picklable.

\setindexsubitem{(pickle protocol)}

When a pickled class instance is unpickled, its \method{__init__()} method
is normally \emph{not} invoked. \strong{Note:} This is a deviation
from previous versions of this module; the change was introduced in
Python 1.5b2. The reason for the change is that in many cases it is
desirable to have a constructor that requires arguments; it is a
(minor) nuisance to have to provide a \method{__getinitargs__()} method.

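That \method{__init__()} is skipped can be observed by counting
constructor calls. The sketch assumes a current Python 3 interpreter and
a script run as the main module; \code{Counter} is a made-up class.

```python
import pickle

class Counter:
    init_calls = 0              # class-level tally of __init__() calls

    def __init__(self, start):
        Counter.init_calls += 1
        self.value = start

original = Counter(10)
clone = pickle.loads(pickle.dumps(original))

print(clone.value)              # the instance data was restored
print(Counter.init_calls)       # tally of constructor calls so far
```

The tally stays at one: the constructor ran for \code{original} only,
even though it requires an argument that unpickling never supplies.
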
If it is desirable that the \method{__init__()} method be called on
unpickling, a class can define a method \method{__getinitargs__()},
which should return a \emph{tuple} containing the arguments to be
passed to the class constructor (\method{__init__()}). This method is
called at pickle time; the tuple it returns is incorporated in the
pickle for the instance.
\ttindex{__getinitargs__()}
\ttindex{__init__()}

Classes can further influence how their instances are pickled. If the class
defines the method \method{__getstate__()}, it is called and the returned
state is pickled as the contents for the instance, and if the class
defines the method \method{__setstate__()}, it is called with the
unpickled state. (Note that these methods can also be used to
implement copying class instances.) If there is no
\method{__getstate__()} method, the instance's \member{__dict__} is
pickled. If there is no \method{__setstate__()} method, the pickled
object must be a dictionary and its items are assigned to the new
instance's dictionary. (If a class defines both \method{__getstate__()}
and \method{__setstate__()}, the state object needn't be a dictionary;
these methods can do what they want.) This protocol is also used
by the shallow and deep copying operations defined in the \module{copy}
module.\refstmodindex{copy}
\ttindex{__getstate__()}
\ttindex{__setstate__()}
\ttindex{__dict__}

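As an illustration of the two methods working together, the sketch
below saves a tuple instead of a dictionary and rebuilds a scratch
attribute on loading. It assumes a current Python 3 interpreter and a
script run as the main module; the \code{Connection} class and its
attributes are hypothetical.

```python
import copy
import pickle

class Connection:
    def __init__(self, host):
        self.host = host
        self.cache = {}         # scratch data not worth saving

    def __getstate__(self):
        # The state need not be a dictionary, because
        # __setstate__() below knows how to interpret it.
        return (self.host,)

    def __setstate__(self, state):
        (self.host,) = state
        self.cache = {}         # start with a fresh cache

conn = Connection("example.org")
conn.cache["greeting"] = "hello"

restored = pickle.loads(pickle.dumps(conn))
duplicate = copy.copy(conn)     # the copy module uses the same protocol

print(restored.host, restored.cache, duplicate.cache)
```
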
Note that when class instances are pickled, their class's code and
data are not pickled along with them. Only the instance data are
pickled. This is done on purpose, so you can fix bugs in a class or
add methods and still load objects that were created with an earlier
version of the class. If you plan to have long-lived objects that
will see many versions of a class, it may be worthwhile to put a version
number in the objects so that suitable conversions can be made by the
class's \method{__setstate__()} method.

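One possible shape for such a version number is sketched here, assuming
a current Python 3 interpreter and a script run as the main module. The
\code{Point} class, its \code{VERSION} attribute, and the layout of the
imagined version 1 state are all hypothetical.

```python
import pickle

class Point:
    VERSION = 2                 # bumped whenever the saved layout changes

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __getstate__(self):
        return {"version": self.VERSION, "x": self.x, "y": self.y}

    def __setstate__(self, state):
        if state.get("version", 1) < 2:
            # Hypothetical version 1 kept both coordinates in one tuple.
            state = {"x": state["pos"][0], "y": state["pos"][1]}
        self.x, self.y = state["x"], state["y"]

# Simulate loading a state that was saved by the older class version.
old_style = Point.__new__(Point)
old_style.__setstate__({"pos": (3, 4)})

new_style = pickle.loads(pickle.dumps(Point(5, 6)))
print(old_style.x, old_style.y, new_style.x, new_style.y)
```
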
When a class itself is pickled, only its name is pickled; the class
definition is not pickled, but re-imported by the unpickling process.
Therefore, the restriction that the class must be defined at the top
level in a module applies to pickled classes as well.

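A short sketch makes the ``name only'' behaviour visible. It assumes a
current Python 3 interpreter, a script run as the main module, and the
\code{protocol=0} spelling of the text format; \code{Widget} is a
made-up class.

```python
import pickle

class Widget:
    def describe(self):
        return "a widget"

data = pickle.dumps(Widget, protocol=0)

print(data)                            # module and class name, no code
print(pickle.loads(data) is Widget)    # the existing class is looked up
```
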
\setindexsubitem{(in module pickle)}

The interface can be summarized as follows.

To pickle an object \code{x} onto a file \code{f}, open for writing:

\begin{verbatim}
p = pickle.Pickler(f)
p.dump(x)
\end{verbatim}

A shorthand for this is:

\begin{verbatim}
pickle.dump(x, f)
\end{verbatim}

To unpickle an object \code{x} from a file \code{f}, open for reading:

\begin{verbatim}
u = pickle.Unpickler(f)
x = u.load()
\end{verbatim}

A shorthand is:

\begin{verbatim}
x = pickle.load(f)
\end{verbatim}

The \class{Pickler} class only calls the method \code{f.write()} with a
string argument. The \class{Unpickler} calls the methods \code{f.read()}
(with an integer argument) and \code{f.readline()} (without argument),
both returning a string. It is explicitly allowed to pass non-file
objects here, as long as they have the right methods.
\ttindex{Unpickler}
\ttindex{Pickler}

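The following sketch passes two non-file objects that offer only the
methods named above. It assumes a current Python 3 interpreter, where
the data handed to \code{write()} and returned by \code{read()} and
\code{readline()} are bytes objects rather than strings. The
\code{ByteSink} and \code{ByteSource} classes are hypothetical.

```python
import pickle

class ByteSink:
    """Hypothetical write-only target: collects what the pickler writes."""
    def __init__(self):
        self.chunks = []

    def write(self, data):
        self.chunks.append(bytes(data))

class ByteSource:
    """Hypothetical read-only source over an in-memory byte string."""
    def __init__(self, data):
        self.data = data
        self.pos = 0

    def read(self, n):
        chunk = self.data[self.pos:self.pos + n]
        self.pos += len(chunk)
        return chunk

    def readline(self):
        end = self.data.find(b"\n", self.pos)
        end = len(self.data) if end < 0 else end + 1
        chunk = self.data[self.pos:end]
        self.pos = end
        return chunk

sink = ByteSink()
pickle.Pickler(sink).dump({"a": [1, 2, 3]})

source = ByteSource(b"".join(sink.chunks))
result = pickle.Unpickler(source).load()
print(result)
```
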
The constructor for the \class{Pickler} class has an optional second
argument, \var{bin}. If this is present and nonzero, the binary
pickle format is used; if it is zero or absent, the (less efficient,
but backwards compatible) text pickle format is used. The
\class{Unpickler} class does not have an argument to distinguish
between binary and text pickle formats; it accepts either format.

The following types can be pickled:
\begin{itemize}

\item \code{None}

\item integers, long integers, floating point numbers

\item strings

\item tuples, lists and dictionaries containing only picklable objects

\item classes that are defined at the top level in a module

\item instances of such classes whose \member{__dict__} or
\method{__getstate__()} result is picklable

\end{itemize}

Attempts to pickle unpicklable objects will raise the
\exception{PicklingError} exception; when this happens, an unspecified
number of bytes may have been written to the file.

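A hedged sketch of handling such a failure is shown below. It assumes a
current Python 3 interpreter, where some unpicklable objects are
reported with \exception{AttributeError} or \exception{TypeError}
instead of \exception{PicklingError}, so all three are caught. The
choice of a lambda as the unpicklable object is illustrative.

```python
import pickle

# A function without an importable name cannot be pickled.
unpicklable = lambda: None

try:
    pickle.dumps(unpicklable)
    failed = False
except (pickle.PicklingError, AttributeError, TypeError) as exc:
    failed = True
    print("cannot pickle:", exc)
```

When writing to a real file, the partially written output should be
discarded after such an error, since its length is unspecified.
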
It is possible to make multiple calls to the \method{dump()} method of
the same \class{Pickler} instance. These must then be matched to the
same number of calls to the \method{load()} method of the
corresponding \class{Unpickler} instance. If the same object is
pickled by multiple \method{dump()} calls, the \method{load()} calls will
all yield references to the same object. \emph{Warning}: this is intended
for pickling multiple objects without intervening modifications to the
objects or their parts. If you modify an object and then pickle it
again using the same \class{Pickler} instance, the object is not
pickled again; a reference to it is pickled and the
\class{Unpickler} will return the old value, not the modified one.
(There are two problems here: (a) detecting changes, and (b)
marshalling a minimal set of changes. I have no answers. Garbage
collection may also become a problem here.)

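The warning can be reproduced in a few lines. The sketch assumes a
current Python 3 interpreter and uses an in-memory
\code{io.BytesIO} buffer in place of a file; the variable names are
illustrative.

```python
import io
import pickle

buf = io.BytesIO()
pickler = pickle.Pickler(buf)

items = ["a"]
pickler.dump(items)
items.append("b")           # modified between the two dump() calls
pickler.dump(items)         # writes only a reference to the first pickle

buf.seek(0)
unpickler = pickle.Unpickler(buf)
first = unpickler.load()
second = unpickler.load()

print(first, second, first is second)
```

Both \method{load()} calls return the same list holding only the old
contents; the appended element was never written. Using a fresh
\class{Pickler} for the second \method{dump()} would avoid this.
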
Apart from the \class{Pickler} and \class{Unpickler} classes, the
module defines the following functions, and an exception:

\begin{funcdesc}{dump}{object, file\optional{, bin}}
Write a pickled representation of \var{object} to the open file object
\var{file}. This is equivalent to
\samp{Pickler(\var{file}, \var{bin}).dump(\var{object})}.
If the optional \var{bin} argument is present and nonzero, the binary
pickle format is used; if it is zero or absent, the (less efficient)
text pickle format is used.
\end{funcdesc}

\begin{funcdesc}{load}{file}
Read a pickled object from the open file object \var{file}. This is
equivalent to \samp{Unpickler(\var{file}).load()}.
\end{funcdesc}

\begin{funcdesc}{dumps}{object\optional{, bin}}
Return the pickled representation of the object as a string, instead
of writing it to a file. If the optional \var{bin} argument is
present and nonzero, the binary pickle format is used; if it is zero
or absent, the (less efficient) text pickle format is used.
\end{funcdesc}

\begin{funcdesc}{loads}{string}
Read a pickled object from a string instead of a file. Characters in
the string past the pickled object's representation are ignored.
\end{funcdesc}

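The two string-based functions can be exercised without any file at
all. The sketch assumes a current Python 3 interpreter, where the
pickled representation is a bytes object; the sample value and the
trailing bytes are arbitrary.

```python
import pickle

value = ("spam", 42, [1.5, None])
data = pickle.dumps(value)

# A plain round trip through the in-memory representation.
round_trip = pickle.loads(data)

# Bytes past the end of the pickle are ignored by loads().
with_trailer = pickle.loads(data + b"trailing junk")

print(round_trip == value, with_trailer == value)
```
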
\begin{excdesc}{PicklingError}
This exception is raised when an unpicklable object is passed to
\code{Pickler.dump()}.
\end{excdesc}


\begin{seealso}
\seemodule[copyreg]{copy_reg}{pickle interface constructor
registration}

\seemodule{shelve}{indexed databases of objects; uses \module{pickle}}

\seemodule{copy}{shallow and deep object copying}

\seemodule{marshal}{high-performance serialization of built-in types}
\end{seealso}


\section{\module{cPickle} ---
         Alternate implementation of \module{pickle}.}
\declaremodule{builtin}{cPickle}

\modulesynopsis{Faster version of \module{pickle}, but not subclassable.}


% This section was written by Fred L. Drake, Jr. <fdrake@acm.org>

The \module{cPickle} module provides an interface similar to that of the
\module{pickle} module and identical functionality, but can be up to 1000
times faster since it is implemented in \C{}. The only other
important difference to note is that \function{Pickler()} and
\function{Unpickler()} are functions and not classes, and so cannot be
subclassed. This should not be an issue in most cases.

The format of the pickle data is identical to that produced using the
\module{pickle} module, so it is possible to use \module{pickle} and
\module{cPickle} interchangeably with existing pickles.