\section{Standard Module \sectcode{pickle}}
\label{module-pickle}
\stmodindex{pickle}
\index{persistency}
\indexii{persistent}{objects}
\indexii{serializing}{objects}
\indexii{marshalling}{objects}
\indexii{flattening}{objects}
\indexii{pickling}{objects}


The \module{pickle} module implements a basic but powerful algorithm for
``pickling'' (a.k.a.\ serializing, marshalling or flattening) nearly
arbitrary Python objects. This is the act of converting objects to a
stream of bytes (and back: ``unpickling'').
This is a more primitive notion than
persistence --- although \module{pickle} reads and writes file objects,
it does not handle the issue of naming persistent objects, nor the
(even more complicated) area of concurrent access to persistent
objects. The \module{pickle} module can transform a complex object into
a byte stream and it can transform the byte stream into an object with
the same internal structure. The most obvious thing to do with these
byte streams is to write them to a file, but it is also conceivable
to send them across a network or store them in a database. The module
\module{shelve} provides a simple interface to pickle and unpickle
objects on ``dbm''-style database files.
\refstmodindex{shelve}

\strong{Note:} The \module{pickle} module is rather slow. A
reimplementation of the same algorithm in \C{}, which is up to 1000 times
faster, is available as the \module{cPickle}\refbimodindex{cPickle}
module. This has the same interface except that \code{Pickler} and
\code{Unpickler} are factory functions, not classes (so they cannot be
used as base classes for inheritance).

Unlike the built-in module \module{marshal}, \module{pickle} handles
the following correctly:
\refbimodindex{marshal}

\begin{itemize}

\item recursive objects (objects containing references to themselves)

\item object sharing (references to the same object in different places)

\item user-defined classes and their instances

\end{itemize}
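
The first two properties can be checked with a short round trip. The
following is only a sketch, assuming a present-day Python~3 interpreter
(where pickles are byte strings); the names \code{shared} and
\code{container} are made up for the illustration.

\begin{verbatim}
import pickle

shared = [1, 2, 3]
container = {"a": shared, "b": shared}   # two references to one list
container["self"] = container            # the dictionary contains itself

restored = pickle.loads(pickle.dumps(container))

# Sharing is preserved: both keys still refer to a single list object.
assert restored["a"] is restored["b"]
# Recursion is preserved: the copy refers to itself, not to the original.
assert restored["self"] is restored
assert restored is not container
\end{verbatim}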

The data format used by \module{pickle} is Python-specific. This has
the advantage that there are no restrictions imposed by external
standards such as XDR%
\index{XDR}
\index{External Data Representation}
(which can't represent pointer sharing); however
it means that non-Python programs may not be able to reconstruct
pickled Python objects.

By default, the \module{pickle} data format uses a printable \ASCII{}
representation. This is slightly more voluminous than a binary
representation. The big advantage of using printable \ASCII{} (and of
some other characteristics of \module{pickle}'s representation) is that
for debugging or recovery purposes it is possible for a human to read
the pickled file with a standard text editor.

A binary format, which is slightly more efficient, can be chosen by
specifying a nonzero (true) value for the \var{bin} argument to the
\class{Pickler} constructor or the \function{dump()} and \function{dumps()}
functions. The binary format is not the default, in order to stay backwards
compatible with the Python 1.4 \module{pickle} module. In a future version,
the default may change to binary.

The \module{pickle} module doesn't handle code objects, which the
\module{marshal} module does. I suppose \module{pickle} could, and maybe
it should, but there's probably no great need for it right now (as
long as \module{marshal} continues to be used for reading and writing
code objects), and at least this avoids the possibility of smuggling
Trojan horses into a program.
\refbimodindex{marshal}

For the benefit of persistence modules written using \module{pickle}, it
supports the notion of a reference to an object outside the pickled
data stream. Such objects are referenced by a name, which is an
arbitrary string of printable \ASCII{} characters. The resolution of
such names is not defined by the \module{pickle} module --- the
persistent object module will have to implement a method
\method{persistent_load()}. To write references to persistent objects,
the persistent module must define a method \method{persistent_id()} which
returns either \code{None} or the persistent ID of the object.
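
A minimal sketch of such external references follows. It assumes a
present-day Python~3 interpreter, in which the two hooks are written as
methods of subclasses of \code{Pickler} and \code{Unpickler}; the names
\code{Record}, \code{STORE}, \code{RefPickler} and \code{RefUnpickler}
are hypothetical and stand in for a real persistence layer.

\begin{verbatim}
import io
import pickle

class Record:
    """Hypothetical object that lives in an external store."""
    def __init__(self, key):
        self.key = key

STORE = {"rec-1": Record("rec-1")}       # stand-in for a database

class RefPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Records are written as a name only; None means "pickle normally".
        if isinstance(obj, Record):
            return obj.key
        return None

class RefUnpickler(pickle.Unpickler):
    def persistent_load(self, pid):
        # Resolving the name is entirely up to the application.
        return STORE[pid]

buf = io.BytesIO()
RefPickler(buf).dump(["payload", STORE["rec-1"]])
buf.seek(0)
result = RefUnpickler(buf).load()

assert result[0] == "payload"
assert result[1] is STORE["rec-1"]       # the stored object, not a copy
\end{verbatim}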

There are some restrictions on the pickling of class instances.

First of all, the class must be defined at the top level in a module.
Furthermore, all its instance variables must be picklable.

\setindexsubitem{(pickle protocol)}

When a pickled class instance is unpickled, its \method{__init__()} method
is normally \emph{not} invoked. \strong{Note:} This is a deviation
from previous versions of this module; the change was introduced in
Python 1.5b2. The reason for the change is that in many cases it is
desirable to have a constructor that requires arguments; it is a
(minor) nuisance to have to provide a \method{__getinitargs__()} method.
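
That the constructor is bypassed can be observed by counting calls. This
is a sketch only, assuming a present-day Python~3 interpreter and a
script run as the main module; the class \code{Temperature} and its
counter are invented for the example.

\begin{verbatim}
import pickle

class Temperature:
    init_calls = 0                 # counts how often __init__ runs

    def __init__(self, degrees):
        Temperature.init_calls += 1
        self.degrees = degrees

t = Temperature(21)
copy_of_t = pickle.loads(pickle.dumps(t))

# The instance data came back, but __init__ ran only for the original.
assert copy_of_t.degrees == 21
assert Temperature.init_calls == 1
\end{verbatim}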

If it is desirable that the \method{__init__()} method be called on
unpickling, a class can define a method \method{__getinitargs__()},
which should return a \emph{tuple} containing the arguments to be
passed to the class constructor (\method{__init__()}). This method is
called at pickle time; the tuple it returns is incorporated in the
pickle for the instance.
\ttindex{__getinitargs__()}
\ttindex{__init__()}

Classes can further influence how their instances are pickled. If the class
defines the method \method{__getstate__()}, it is called and the returned
state is pickled as the contents of the instance. If the class
defines the method \method{__setstate__()}, it is called with the
unpickled state. (Note that these methods can also be used to
implement copying class instances.) If there is no
\method{__getstate__()} method, the instance's \member{__dict__} is
pickled. If there is no \method{__setstate__()} method, the pickled
object must be a dictionary and its items are assigned to the new
instance's dictionary. (If a class defines both \method{__getstate__()}
and \method{__setstate__()}, the state object needn't be a dictionary
--- these methods can do what they want.) This protocol is also used
by the shallow and deep copying operations defined in the \module{copy}
module.\refstmodindex{copy}
\ttindex{__getstate__()}
\ttindex{__setstate__()}
\ttindex{__dict__}
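
As an illustration of the two state methods, the sketch below stores a
tuple instead of the instance dictionary and leaves out a transient
attribute. It assumes a present-day Python~3 interpreter; the class
\code{Account} and its attributes are hypothetical.

\begin{verbatim}
import pickle

class Account:
    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance
        self.cache = {}            # transient; not worth storing

    def __getstate__(self):
        # The state need not be a dictionary when both methods exist.
        return (self.owner, self.balance)

    def __setstate__(self, state):
        self.owner, self.balance = state
        self.cache = {}            # rebuilt rather than unpickled

original = Account("guido", 100)
original.cache["report"] = "expensive result"
restored = pickle.loads(pickle.dumps(original))

assert (restored.owner, restored.balance) == ("guido", 100)
assert restored.cache == {}        # the cache was not pickled
\end{verbatim}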

Note that when class instances are pickled, their class's code and
data are not pickled along with them. Only the instance data are
pickled. This is done on purpose, so you can fix bugs in a class or
add methods and still load objects that were created with an earlier
version of the class. If you plan to have long-lived objects that
will see many versions of a class, it may be worthwhile to put a version
number in the objects so that suitable conversions can be made by the
class's \method{__setstate__()} method.
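
One possible shape for such a version number is sketched below, assuming
a present-day Python~3 interpreter. The class \code{Point}, the
\code{"version"} key and the old \code{"coords"} layout are all
hypothetical; they only illustrate the conversion step.

\begin{verbatim}
import pickle

class Point:
    VERSION = 2                    # bumped whenever the layout changes

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __getstate__(self):
        state = self.__dict__.copy()
        state["version"] = self.VERSION
        return state

    def __setstate__(self, state):
        state = dict(state)
        version = state.pop("version", 1)
        if version == 1:
            # Hypothetical old layout: one "coords" tuple, no x and y.
            state["x"], state["y"] = state.pop("coords")
        self.__dict__.update(state)

# A current object round-trips unchanged.
p = pickle.loads(pickle.dumps(Point(3, 4)))

# State written by the hypothetical version 1 is converted on load.
old = Point.__new__(Point)
old.__setstate__({"coords": (5, 6)})
\end{verbatim}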

When a class itself is pickled, only its name is pickled --- the class
definition is not pickled, but re-imported by the unpickling process.
Therefore, the restriction that the class must be defined at the top
level in a module applies to pickled classes as well.
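
The effect can be seen by pickling a class object directly. The sketch
assumes a present-day Python~3 interpreter and borrows
\code{fractions.Fraction} from the standard library merely as a
convenient class defined at the top level of a module.

\begin{verbatim}
import pickle
from fractions import Fraction

data = pickle.dumps(Fraction)        # the class itself, not an instance

assert b"Fraction" in data           # the name is stored...
assert len(data) < 100               # ...but none of the class's code
assert pickle.loads(data) is Fraction   # found again by importing
\end{verbatim}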

\setindexsubitem{(in module pickle)}

The interface can be summarized as follows.

To pickle an object \code{x} onto a file \code{f}, open for writing:

\begin{verbatim}
p = pickle.Pickler(f)
p.dump(x)
\end{verbatim}

A shorthand for this is:

\begin{verbatim}
pickle.dump(x, f)
\end{verbatim}

To unpickle an object \code{x} from a file \code{f}, open for reading:

\begin{verbatim}
u = pickle.Unpickler(f)
x = u.load()
\end{verbatim}

A shorthand is:

\begin{verbatim}
x = pickle.load(f)
\end{verbatim}

The \class{Pickler} class only calls the method \code{f.write()} with a
string argument. The \class{Unpickler} calls the methods \code{f.read()}
(with an integer argument) and \code{f.readline()} (without arguments),
both returning a string. It is explicitly allowed to pass non-file
objects here, as long as they have the right methods.
\ttindex{Unpickler}
\ttindex{Pickler}
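
The following sketch passes objects that are not files but provide just
these methods. It assumes a present-day Python~3 interpreter, where the
data handed to \code{write()} and returned by \code{read()} and
\code{readline()} are byte strings; the classes \code{ChunkSink} and
\code{ByteSource} are invented for the example.

\begin{verbatim}
import pickle

class ChunkSink:
    """Collects whatever the Pickler writes; only write() is provided."""
    def __init__(self):
        self.chunks = []

    def write(self, data):
        self.chunks.append(bytes(data))

class ByteSource:
    """Serves bytes to the Unpickler; only read() and readline()."""
    def __init__(self, data):
        self.data = data
        self.pos = 0

    def read(self, n):
        chunk = self.data[self.pos:self.pos + n]
        self.pos += len(chunk)
        return chunk

    def readline(self):
        end = self.data.find(b"\n", self.pos)
        end = len(self.data) if end < 0 else end + 1
        chunk = self.data[self.pos:end]
        self.pos = end
        return chunk

sink = ChunkSink()
pickle.Pickler(sink).dump({"spam": [1, 2, 3]})

source = ByteSource(b"".join(sink.chunks))
x = pickle.Unpickler(source).load()
assert x == {"spam": [1, 2, 3]}
\end{verbatim}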

The constructor for the \class{Pickler} class has an optional second
argument, \var{bin}. If this is present and nonzero, the binary
pickle format is used; if it is zero or absent, the (less efficient,
but backwards compatible) text pickle format is used. The
\class{Unpickler} class does not have an argument to distinguish
between binary and text pickle formats; it accepts either format.
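
That the reading side needs no format argument can be sketched as
follows. The example assumes a present-day Python~3 interpreter, where
the \var{bin} flag is no longer available and a numeric \code{protocol}
argument plays its role (protocol 0 being the text format); the variable
names are made up.

\begin{verbatim}
import pickle

value = {"name": "spam", "sizes": [1, 2, 3]}

text_form = pickle.dumps(value, protocol=0)     # printable text format
binary_form = pickle.dumps(value, protocol=2)   # a binary format

text_form.decode("ascii")    # succeeds: the text format is plain ASCII

# The same call reads both; the format is recognized from the data.
assert pickle.loads(text_form) == value
assert pickle.loads(binary_form) == value
\end{verbatim}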

The following types can be pickled:
\begin{itemize}

\item \code{None}

\item integers, long integers, floating point numbers

\item strings

\item tuples, lists and dictionaries containing only picklable objects

\item classes that are defined at the top level in a module

\item instances of such classes whose \member{__dict__} or
\method{__getstate__()} result is picklable

\end{itemize}

Attempts to pickle unpicklable objects will raise the
\exception{PicklingError} exception; when this happens, an unspecified
number of bytes may have been written to the file.
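
One way to avoid a partially written file is to pickle to a string first
and write only on success. This is a sketch under the assumption of a
present-day Python~3 interpreter; the helper \code{dump_or_nothing()} is
hypothetical and not part of the module.

\begin{verbatim}
import io
import pickle

def dump_or_nothing(obj, fileobj):
    """Hypothetical helper: write a complete pickle or nothing at all."""
    data = pickle.dumps(obj)   # any error is raised here, before writing
    fileobj.write(data)

out = io.BytesIO()
try:
    dump_or_nothing(lambda: None, out)   # a lambda cannot be pickled
    failed = False
except (pickle.PicklingError, AttributeError, TypeError):
    # Present-day interpreters report some unpicklable objects with
    # AttributeError or TypeError rather than PicklingError.
    failed = True

assert failed
assert out.getvalue() == b""             # nothing reached the file
\end{verbatim}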

It is possible to make multiple calls to the \method{dump()} method of
the same \class{Pickler} instance. These must then be matched to the
same number of calls to the \method{load()} method of the
corresponding \class{Unpickler} instance. If the same object is
pickled by multiple \method{dump()} calls, the \method{load()} calls will all
yield references to the same object. \emph{Warning}: this is intended
for pickling multiple objects without intervening modifications to the
objects or their parts. If you modify an object and then pickle it
again using the same \class{Pickler} instance, the object is not
pickled again --- a reference to it is pickled and the
\class{Unpickler} will return the old value, not the modified one.
(There are two problems here: (a) detecting changes, and (b)
marshalling a minimal set of changes. I have no answers. Garbage
collection may also become a problem here.)
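
The warning can be reproduced in a few lines. The sketch assumes a
present-day Python~3 interpreter and uses an in-memory
\code{io.BytesIO} buffer in place of a file; the variable names are
made up.

\begin{verbatim}
import io
import pickle

buf = io.BytesIO()
p = pickle.Pickler(buf)

data = [1, 2]
p.dump(data)
data.append(3)        # modified between the two dump() calls
p.dump(data)          # only a reference to the first pickle is written

buf.seek(0)
u = pickle.Unpickler(buf)
first = u.load()
second = u.load()

assert first is second        # both loads yield the same object
assert second == [1, 2]       # the old value; the append was not stored
\end{verbatim}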

Apart from the \class{Pickler} and \class{Unpickler} classes, the
module defines the following functions, and an exception:

\begin{funcdesc}{dump}{object, file\optional{, bin}}
Write a pickled representation of \var{object} to the open file object
\var{file}. This is equivalent to
\samp{Pickler(\var{file}, \var{bin}).dump(\var{object})}.
If the optional \var{bin} argument is present and nonzero, the binary
pickle format is used; if it is zero or absent, the (less efficient)
text pickle format is used.
\end{funcdesc}

\begin{funcdesc}{load}{file}
Read a pickled object from the open file object \var{file}. This is
equivalent to \samp{Unpickler(\var{file}).load()}.
\end{funcdesc}

\begin{funcdesc}{dumps}{object\optional{, bin}}
Return the pickled representation of \var{object} as a string, instead
of writing it to a file. If the optional \var{bin} argument is
present and nonzero, the binary pickle format is used; if it is zero
or absent, the (less efficient) text pickle format is used.
\end{funcdesc}

\begin{funcdesc}{loads}{string}
Read a pickled object from \var{string} instead of a file. Characters in
the string past the pickled object's representation are ignored.
\end{funcdesc}
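
The string-based pair can be tried without any file. This is a sketch
assuming a present-day Python~3 interpreter, where the ``string'' is a
byte string; the sample value is arbitrary.

\begin{verbatim}
import pickle

s = pickle.dumps(("spam", 42))

assert pickle.loads(s) == ("spam", 42)
# Anything after the end of the pickle is ignored.
assert pickle.loads(s + b"trailing junk") == ("spam", 42)
\end{verbatim}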

\begin{excdesc}{PicklingError}
This exception is raised when an unpicklable object is passed to
\code{Pickler.dump()}.
\end{excdesc}


\begin{seealso}
\seemodule{copy}{shallow and deep object copying}

\seemodule[copyreg]{copy_reg}{pickle interface constructor
registration}

\seemodule{marshal}{high-performance serialization of built-in types}

\seemodule{shelve}{indexed databases of objects; uses \module{pickle}}
\end{seealso}