\section{Standard Module \module{pickle}}
\declaremodule{standard}{pickle}

\modulesynopsis{Convert Python objects to streams of bytes and back.}

\index{persistency}
\indexii{persistent}{objects}
\indexii{serializing}{objects}
\indexii{marshalling}{objects}
\indexii{flattening}{objects}
\indexii{pickling}{objects}


The \module{pickle} module implements a basic but powerful algorithm for
``pickling'' (a.k.a.\ serializing, marshalling or flattening) nearly
arbitrary Python objects. This is the act of converting objects to a
stream of bytes (and back: ``unpickling'').
This is a more primitive notion than
persistency --- although \module{pickle} reads and writes file objects,
it does not handle the issue of naming persistent objects, nor the
(even more complicated) area of concurrent access to persistent
objects. The \module{pickle} module can transform a complex object into
a byte stream and it can transform the byte stream into an object with
the same internal structure. The most obvious thing to do with these
byte streams is to write them to a file, but it is also conceivable
to send them across a network or store them in a database. The module
\module{shelve} provides a simple interface to pickle and unpickle
objects on ``dbm''-style database files.
\refstmodindex{shelve}

\strong{Note:} The \module{pickle} module is rather slow. A
reimplementation of the same algorithm in \C{}, which is up to 1000 times
faster, is available as the \module{cPickle}\refbimodindex{cPickle}
module. This has the same interface except that \code{Pickler} and
\code{Unpickler} are factory functions, not classes (so they cannot be
used as base classes for inheritance).

Unlike the built-in module \module{marshal}, \module{pickle} handles
the following correctly:
\refbimodindex{marshal}

\begin{itemize}

\item recursive objects (objects containing references to themselves)

\item object sharing (references to the same object in different places)

\item user-defined classes and their instances

\end{itemize}
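
The first two items in the list above can be checked with a short round
trip. The following is only a sketch, written for a current Python
interpreter (where pickles are byte strings); the variable names are
illustrative and not part of the module.

```python
import pickle

# A recursive object: the list contains a reference to itself.
recursive = [1, 2]
recursive.append(recursive)

# Object sharing: the same inner list is referenced from two places.
shared = ["spam"]
container = {"first": shared, "second": shared}

recursive_copy = pickle.loads(pickle.dumps(recursive))
container_copy = pickle.loads(pickle.dumps(container))

# The self-reference survives the round trip.
print(recursive_copy[2] is recursive_copy)
# Both keys still refer to one and the same list object.
print(container_copy["first"] is container_copy["second"])
```

In both cases the unpickled structure has the same internal shape as the
original, rather than containing independent copies of the shared parts.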

The data format used by \module{pickle} is Python-specific. This has
the advantage that there are no restrictions imposed by external
standards such as XDR%
\index{XDR}
\index{External Data Representation}
(which can't represent pointer sharing); however
it means that non-Python programs may not be able to reconstruct
pickled Python objects.

By default, the \module{pickle} data format uses a printable \ASCII{}
representation. This is slightly more voluminous than a binary
representation. The big advantage of using printable \ASCII{} (and of
some other characteristics of \module{pickle}'s representation) is that
for debugging or recovery purposes it is possible for a human to read
the pickled file with a standard text editor.

A binary format, which is slightly more efficient, can be chosen by
specifying a nonzero (true) value for the \var{bin} argument to the
\class{Pickler} constructor or the \function{dump()} and \function{dumps()}
functions. The binary format is not the default because of backwards
compatibility with the Python 1.4 pickle module. In a future version,
the default may change to binary.

The \module{pickle} module doesn't handle code objects, which the
\module{marshal} module does. I suppose \module{pickle} could, and maybe
it should, but there's probably no great need for it right now (as
long as \module{marshal} continues to be used for reading and writing
code objects), and at least this avoids the possibility of smuggling
Trojan horses into a program.
\refbimodindex{marshal}

For the benefit of persistency modules written using \module{pickle}, it
supports the notion of a reference to an object outside the pickled
data stream. Such objects are referenced by a name, which is an
arbitrary string of printable \ASCII{} characters. The resolution of
such names is not defined by the \module{pickle} module --- the
persistent object module will have to implement a method
\method{persistent_load()}. To write references to persistent objects,
the persistent module must define a method \method{persistent_id()} which
returns either \code{None} or the persistent ID of the object.
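
As a rough illustration of the two hooks, the sketch below supplies
\method{persistent_id()} and \method{persistent_load()} by subclassing,
which is how a current Python interpreter expects them to be provided;
the text above does not say how they are attached, so treat that as an
assumption. The names \code{EXTERNAL_STORE}, \code{Ref},
\code{RefPickler} and \code{RefUnpickler} are hypothetical.

```python
import io
import pickle

# Hypothetical external store: objects that live outside the pickle stream.
EXTERNAL_STORE = {"config-1": {"debug": True}}


class Ref:
    """Hypothetical stand-in for an object that is stored elsewhere."""

    def __init__(self, name):
        self.name = name


class RefPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Return a persistent ID for external objects, None for everything
        # else (None means "pickle this object in the usual way").
        if isinstance(obj, Ref):
            return obj.name
        return None


class RefUnpickler(pickle.Unpickler):
    def persistent_load(self, pid):
        # Resolving the name is entirely up to the application.
        return EXTERNAL_STORE[pid]


buffer = io.BytesIO()
RefPickler(buffer).dump(["header", Ref("config-1")])
buffer.seek(0)
result = RefUnpickler(buffer).load()
```

Only the name \code{"config-1"} is written to the stream; the object
itself is looked up again when the data is loaded.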

There are some restrictions on the pickling of class instances.

First of all, the class must be defined at the top level in a module.
Furthermore, all its instance variables must be picklable.

\setindexsubitem{(pickle protocol)}

When a pickled class instance is unpickled, its \method{__init__()} method
is normally \emph{not} invoked. \strong{Note:} This is a deviation
from previous versions of this module; the change was introduced in
Python 1.5b2. The reason for the change is that in many cases it is
desirable to have a constructor that requires arguments; it is a
(minor) nuisance to have to provide a \method{__getinitargs__()} method.

If it is desirable that the \method{__init__()} method be called on
unpickling, a class can define a method \method{__getinitargs__()},
which should return a \emph{tuple} containing the arguments to be
passed to the class constructor (\method{__init__()}). This method is
called at pickle time; the tuple it returns is incorporated in the
pickle for the instance.
\ttindex{__getinitargs__()}
\ttindex{__init__()}

Classes can further influence how their instances are pickled --- if the class
defines the method \method{__getstate__()}, it is called and the returned
state is pickled as the contents for the instance, and if the class
defines the method \method{__setstate__()}, it is called with the
unpickled state. (Note that these methods can also be used to
implement copying class instances.) If there is no
\method{__getstate__()} method, the instance's \member{__dict__} is
pickled. If there is no \method{__setstate__()} method, the pickled
object must be a dictionary and its items are assigned to the new
instance's dictionary. (If a class defines both \method{__getstate__()}
and \method{__setstate__()}, the state object needn't be a dictionary
--- these methods can do what they want.) This protocol is also used
by the shallow and deep copying operations defined in the \module{copy}
module.\refstmodindex{copy}
\ttindex{__getstate__()}
\ttindex{__setstate__()}
\ttindex{__dict__}
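
A minimal sketch of a class that defines both methods follows. The
class \code{Counter} and its attributes are hypothetical; the state is
deliberately a tuple rather than a dictionary, which is allowed when
both methods are defined.

```python
import pickle


class Counter:
    """Hypothetical class whose 'cache' attribute should not be pickled."""

    def __init__(self, start):
        self.value = start
        self.cache = {"expensive": "result"}

    def __getstate__(self):
        # Called at pickle time; the return value is what gets pickled.
        return (self.value,)

    def __setstate__(self, state):
        # Called at unpickle time with the unpickled state.
        (self.value,) = state
        # The cache is rebuilt empty instead of being restored.
        self.cache = {}


original = Counter(41)
restored = pickle.loads(pickle.dumps(original))
```

Here \code{restored.value} is taken from the pickled state, while
\code{restored.cache} is whatever \method{__setstate__()} decides to
create.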

Note that when class instances are pickled, their class's code and
data are not pickled along with them. Only the instance data are
pickled. This is done on purpose, so you can fix bugs in a class or
add methods and still load objects that were created with an earlier
version of the class. If you plan to have long-lived objects that
will see many versions of a class, it may be worthwhile to put a version
number in the objects so that suitable conversions can be made by the
class's \method{__setstate__()} method.
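
One possible way to apply the version-number advice is sketched below.
The class \code{Point}, its \code{VERSION} attribute and the old
\code{pos} layout are all invented for the example, and treating state
without a version stamp as version 1 is an assumption of this sketch.

```python
import pickle


class Point:
    """Hypothetical class; version 2 replaced a 'pos' tuple by 'x' and 'y'."""

    VERSION = 2

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __getstate__(self):
        state = dict(self.__dict__)
        state["version"] = self.VERSION  # stamp the version into the state
        return state

    def __setstate__(self, state):
        state = dict(state)
        version = state.pop("version", 1)  # unstamped state: assume version 1
        if version == 1:
            # Convert the old layout, a single 'pos' tuple, to the new one.
            state["x"], state["y"] = state.pop("pos")
        self.__dict__.update(state)


# A current instance round-trips unchanged.
current = pickle.loads(pickle.dumps(Point(3, 4)))

# Simulate state written by the older class version.
legacy = Point.__new__(Point)
legacy.__setstate__({"pos": (1, 2)})
```

The conversion lives in one place, so pickles written by either version
of the class load into objects with the current attribute layout.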

When a class itself is pickled, only its name is pickled --- the class
definition is not pickled, but re-imported by the unpickling process.
Therefore, the restriction that the class must be defined at the top
level in a module applies to pickled classes as well.

\setindexsubitem{(in module pickle)}

The interface can be summarized as follows.

To pickle an object \code{x} onto a file \code{f}, open for writing:

\begin{verbatim}
p = pickle.Pickler(f)
p.dump(x)
\end{verbatim}

A shorthand for this is:

\begin{verbatim}
pickle.dump(x, f)
\end{verbatim}

To unpickle an object \code{x} from a file \code{f}, open for reading:

\begin{verbatim}
u = pickle.Unpickler(f)
x = u.load()
\end{verbatim}

A shorthand is:

\begin{verbatim}
x = pickle.load(f)
\end{verbatim}

The \class{Pickler} class only calls the method \code{f.write()} with a
string argument. The \class{Unpickler} calls the methods \code{f.read()}
(with an integer argument) and \code{f.readline()} (without argument),
both returning a string. It is explicitly allowed to pass non-file
objects here, as long as they have the right methods.
\ttindex{Unpickler}
\ttindex{Pickler}
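
To make the ``right methods'' requirement concrete, here is a sketch
using two hand-written objects in place of files. The classes
\code{ChunkCollector} and \code{BytesSource} are hypothetical, and the
example assumes a current Python interpreter, where the data passed to
\code{write()} and returned by \code{read()} and \code{readline()} are
byte strings.

```python
import pickle


class ChunkCollector:
    """Hypothetical write-only sink: it only offers write()."""

    def __init__(self):
        self.chunks = []

    def write(self, data):
        self.chunks.append(bytes(data))
        return len(data)


class BytesSource:
    """Hypothetical read-only source: it only offers read() and readline()."""

    def __init__(self, data):
        self.data = data
        self.pos = 0

    def read(self, size=-1):
        # A negative or missing size means "read everything that is left".
        end = len(self.data) if size is None or size < 0 else self.pos + size
        chunk = self.data[self.pos:end]
        self.pos += len(chunk)
        return chunk

    def readline(self):
        # Return data up to and including the next newline, if there is one.
        newline = self.data.find(b"\n", self.pos)
        end = len(self.data) if newline < 0 else newline + 1
        chunk = self.data[self.pos:end]
        self.pos = end
        return chunk


sink = ChunkCollector()
pickle.Pickler(sink).dump({"answer": 42})

source = BytesSource(b"".join(sink.chunks))
value = pickle.Unpickler(source).load()
```

Neither object is a real file, yet the round trip works because each
provides exactly the methods that the corresponding class calls.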

The constructor for the \class{Pickler} class has an optional second
argument, \var{bin}. If this is present and nonzero, the binary
pickle format is used; if it is zero or absent, the (less efficient,
but backwards compatible) text pickle format is used. The
\class{Unpickler} class does not have an argument to distinguish
between binary and text pickle formats; it accepts either format.

The following types can be pickled:
\begin{itemize}

\item \code{None}

\item integers, long integers, floating point numbers

\item strings

\item tuples, lists and dictionaries containing only picklable objects

\item classes that are defined at the top level in a module

\item instances of such classes whose \member{__dict__} or
\method{__setstate__()} is picklable

\end{itemize}

Attempts to pickle unpicklable objects will raise the
\exception{PicklingError} exception; when this happens, an unspecified
number of bytes may have been written to the file.

It is possible to make multiple calls to the \method{dump()} method of
the same \class{Pickler} instance. These must then be matched to the
same number of calls to the \method{load()} method of the
corresponding \class{Unpickler} instance. If the same object is
pickled by multiple \method{dump()} calls, the \method{load()} calls will all
yield references to the same object. \emph{Warning}: this is intended
for pickling multiple objects without intervening modifications to the
objects or their parts. If you modify an object and then pickle it
again using the same \class{Pickler} instance, the object is not
pickled again --- a reference to it is pickled and the
\class{Unpickler} will return the old value, not the modified one.
(There are two problems here: (a) detecting changes, and (b)
marshalling a minimal set of changes. I have no answers. Garbage
collection may also become a problem here.)
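
The warning can be reproduced with a few lines. This sketch uses an
in-memory \code{io.BytesIO} object as the file, which is a choice made
for the example only.

```python
import io
import pickle

stream = io.BytesIO()
pickler = pickle.Pickler(stream)

items = [1, 2]
pickler.dump(items)   # the first dump writes the full list
items.append(3)       # modify the object in between
pickler.dump(items)   # the second dump only writes a reference to it

stream.seek(0)
unpickler = pickle.Unpickler(stream)
first = unpickler.load()
second = unpickler.load()

# Both loads yield the same object, holding the value from the first dump.
print(first, second is first)
```

The appended element never reaches the stream, so code that needs the
modified value has to pickle it with a fresh \class{Pickler} instance.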

Apart from the \class{Pickler} and \class{Unpickler} classes, the
module defines the following functions, and an exception:

\begin{funcdesc}{dump}{object, file\optional{, bin}}
Write a pickled representation of \var{object} to the open file object
\var{file}. This is equivalent to
\samp{Pickler(\var{file}, \var{bin}).dump(\var{object})}.
If the optional \var{bin} argument is present and nonzero, the binary
pickle format is used; if it is zero or absent, the (less efficient)
text pickle format is used.
\end{funcdesc}

\begin{funcdesc}{load}{file}
Read a pickled object from the open file object \var{file}. This is
equivalent to \samp{Unpickler(\var{file}).load()}.
\end{funcdesc}

\begin{funcdesc}{dumps}{object\optional{, bin}}
Return the pickled representation of the object as a string, instead
of writing it to a file. If the optional \var{bin} argument is
present and nonzero, the binary pickle format is used; if it is zero
or absent, the (less efficient) text pickle format is used.
\end{funcdesc}

\begin{funcdesc}{loads}{string}
Read a pickled object from a string instead of a file. Characters in
the string past the pickled object's representation are ignored.
\end{funcdesc}
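
A short sketch of the string-based functions, including the handling of
trailing data, is given below. It assumes a current Python interpreter,
where the pickled representation is a byte string; the sample data is
made up.

```python
import pickle

data = pickle.dumps({"name": "spam", "count": 3})
restored = pickle.loads(data)

# Anything after the end of the pickled representation is not examined.
padded = pickle.loads(data + b"trailing bytes")
```

Both calls return an equal dictionary, since reading stops at the end of
the first pickled object.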

\begin{excdesc}{PicklingError}
This exception is raised when an unpicklable object is passed to
\code{Pickler.dump()}.
\end{excdesc}


\begin{seealso}
\seemodule[copyreg]{copy_reg}{pickle interface constructor
registration}

\seemodule{shelve}{indexed databases of objects; uses \module{pickle}}

\seemodule{copy}{shallow and deep object copying}

\seemodule{marshal}{high-performance serialization of built-in types}
\end{seealso}


\section{Built-in Module \module{cPickle}}
\declaremodule{builtin}{cPickle}

\modulesynopsis{Faster version of \module{pickle}, but not subclassable.}


% This section was written by Fred L. Drake, Jr. <fdrake@acm.org>

The \module{cPickle} module provides a similar interface and identical
functionality to the \module{pickle} module, but can be up to 1000
times faster since it is implemented in \C{}. The only other
important difference to note is that \function{Pickler()} and
\function{Unpickler()} are functions and not classes, and so cannot be
subclassed. This should not be an issue in most cases.

The format of the pickle data is identical to that produced using the
\module{pickle} module, so it is possible to use \module{pickle} and
\module{cPickle} interchangeably with existing pickles.