\section{\module{pickle} ---
         Python object serialization}

\declaremodule{standard}{pickle}
\modulesynopsis{Convert Python objects to streams of bytes and back.}

\index{persistency}
\indexii{persistent}{objects}
\indexii{serializing}{objects}
\indexii{marshalling}{objects}
\indexii{flattening}{objects}
\indexii{pickling}{objects}


The \module{pickle} module implements a basic but powerful algorithm for
``pickling'' (a.k.a.\ serializing, marshalling or flattening) nearly
arbitrary Python objects. This is the act of converting objects to a
stream of bytes (and back: ``unpickling'').
This is a more primitive notion than
persistency: although \module{pickle} reads and writes file objects,
it does not handle the issue of naming persistent objects, nor the
(even more complicated) area of concurrent access to persistent
objects. The \module{pickle} module can transform a complex object into
a byte stream and it can transform the byte stream into an object with
the same internal structure. The most obvious thing to do with these
byte streams is to write them onto a file, but it is also conceivable
to send them across a network or store them in a database. The module
\refmodule{shelve}\refstmodindex{shelve} provides a simple interface
to pickle and unpickle objects on DBM-style database files.


\strong{Note:} The \module{pickle} module is rather slow. A
reimplementation of the same algorithm in C, which is up to 1000 times
faster, is available as the \refmodule{cPickle}\refbimodindex{cPickle}
module. This has the same interface except that \code{Pickler} and
\code{Unpickler} are factory functions, not classes (so they cannot be
used as base classes for inheritance).

Unlike the built-in module \refmodule{marshal}\refbimodindex{marshal},
\module{pickle} handles the following correctly:


\begin{itemize}

\item recursive objects (objects containing references to themselves)

\item object sharing (references to the same object in different places)

\item user-defined classes and their instances

\end{itemize}

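The first two points can be illustrated with a minimal sketch. It is
written for a current Python 3 interpreter (an assumption; there the
pickled form is a byte string), and the variable names are hypothetical:

```python
import pickle

# A recursive object: the list contains a reference to itself.
recursive = [1, 2]
recursive.append(recursive)

# Object sharing: the same list appears twice in the container.
shared = ["spam"]
container = [shared, shared]

recursive_copy = pickle.loads(pickle.dumps(recursive))
container_copy = pickle.loads(pickle.dumps(container))

# The copy refers to itself, not to the original list.
print(recursive_copy[2] is recursive_copy)
# Both slots still refer to one single object after unpickling.
print(container_copy[0] is container_copy[1])
```

Both identity checks are expected to hold, because the pickle records
each object once and refers back to it afterwards.
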
The data format used by \module{pickle} is Python-specific. This has
the advantage that there are no restrictions imposed by external
standards such as
XDR\index{XDR}\index{External Data Representation} (which can't
represent pointer sharing); however it means that non-Python programs
may not be able to reconstruct pickled Python objects.

By default, the \module{pickle} data format uses a printable \ASCII{}
representation. This is slightly more voluminous than a binary
representation. The big advantage of using printable \ASCII{} (and of
some other characteristics of \module{pickle}'s representation) is that
for debugging or recovery purposes it is possible for a human to read
the pickled file with a standard text editor.

A binary format, which is slightly more efficient, can be chosen by
specifying a nonzero (true) value for the \var{bin} argument to the
\class{Pickler} constructor or the \function{dump()} and \function{dumps()}
functions. The binary format is not the default because of backwards
compatibility with the Python 1.4 pickle module. In a future version,
the default may change to binary.

The \module{pickle} module doesn't handle code objects, which the
\refmodule{marshal} module does. I suppose \module{pickle} could, and maybe
it should, but there's probably no great need for it right now (as
long as \refmodule{marshal} continues to be used for reading and writing
code objects), and at least this avoids the possibility of smuggling
Trojan horses into a program.
\refbimodindex{marshal}

For the benefit of persistency modules written using \module{pickle}, the
module supports the notion of a reference to an object outside the pickled
data stream. Such objects are referenced by a name, which is an
arbitrary string of printable \ASCII{} characters. The resolution of
such names is not defined by the \module{pickle} module; the
persistent object module will have to implement a method
\method{persistent_load()}. To write references to persistent objects,
the persistent module must define a method \method{persistent_id()} which
returns either \code{None} or the persistent ID of the object.

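As a rough sketch of how these two hooks fit together, the example below
supplies them by subclassing. It assumes a current Python 3 interpreter,
with \code{io.BytesIO} standing in for a file; \code{REGISTRY},
\code{RefPickler} and \code{RefUnpickler} are hypothetical names, not
part of the module:

```python
import io
import pickle

# Hypothetical external store: objects that live outside the pickle.
REGISTRY = {"config": {"debug": True}}


class RefPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Return a name for registered objects, None for everything else.
        for name, value in REGISTRY.items():
            if value is obj:
                return name
        return None


class RefUnpickler(pickle.Unpickler):
    def persistent_load(self, name):
        # Resolve the name back to the externally stored object.
        return REGISTRY[name]


buffer = io.BytesIO()
RefPickler(buffer).dump(["payload", REGISTRY["config"]])
buffer.seek(0)
result = RefUnpickler(buffer).load()
```

After the round trip, \code{result[1]} should be the very object held in
\code{REGISTRY}, since only its name was written to the stream.
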
There are some restrictions on the pickling of class instances.

First of all, the class must be defined at the top level in a module.
Furthermore, all its instance variables must be picklable.

\setindexsubitem{(pickle protocol)}

When a pickled class instance is unpickled, its \method{__init__()} method
is normally \emph{not} invoked. \strong{Note:} This is a deviation
from previous versions of this module; the change was introduced in
Python 1.5b2. The reason for the change is that in many cases it is
desirable to have a constructor that requires arguments; it is a
(minor) nuisance to have to provide a \method{__getinitargs__()} method.

If it is desirable that the \method{__init__()} method be called on
unpickling, a class can define a method \method{__getinitargs__()},
which should return a \emph{tuple} containing the arguments to be
passed to the class constructor (\method{__init__()}). This method is
called at pickle time; the tuple it returns is incorporated in the
pickle for the instance.
\ttindex{__getinitargs__()}
\ttindex{__init__()}

Classes can further influence how their instances are pickled. If the class
defines the method \method{__getstate__()}, it is called and the returned
state is pickled as the contents for the instance, and if the class
defines the method \method{__setstate__()}, it is called with the
unpickled state. (Note that these methods can also be used to
implement copying class instances.) If there is no
\method{__getstate__()} method, the instance's \member{__dict__} is
pickled. If there is no \method{__setstate__()} method, the pickled
object must be a dictionary and its items are assigned to the new
instance's dictionary. (If a class defines both \method{__getstate__()}
and \method{__setstate__()}, the state object needn't be a dictionary;
these methods can do what they want.) This protocol is also used
by the shallow and deep copying operations defined in the
\refmodule{copy}\refstmodindex{copy} module.
\ttindex{__getstate__()}
\ttindex{__setstate__()}
\ttindex{__dict__}

Note that when class instances are pickled, their class's code and
data are not pickled along with them. Only the instance data are
pickled. This is done on purpose, so you can fix bugs in a class or
add methods and still load objects that were created with an earlier
version of the class. If you plan to have long-lived objects that
will see many versions of a class, it may be worthwhile to put a version
number in the objects so that suitable conversions can be made by the
class's \method{__setstate__()} method.

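One possible way to carry such a version number is sketched below, using
\method{__getstate__()} and \method{__setstate__()} as described above.
The class \code{Account}, the key \code{state_version} and the renamed
attribute \code{amount} are all hypothetical, and the sketch assumes a
current Python 3 interpreter:

```python
import pickle


class Account:
    """Hypothetical class whose pickled state carries a version number."""

    STATE_VERSION = 2

    def __init__(self, owner, balance):
        self.owner = owner
        self.balance = balance

    def __getstate__(self):
        # Pickle a copy of the instance dictionary plus a version tag.
        state = dict(self.__dict__)
        state["state_version"] = self.STATE_VERSION
        return state

    def __setstate__(self, state):
        state = dict(state)
        version = state.pop("state_version", 1)
        if version < 2:
            # Assumed history: version 1 called the attribute 'amount'.
            state["balance"] = state.pop("amount")
        self.__dict__.update(state)


if __name__ == "__main__":
    # Round trip through pickle; the class must be importable by name.
    restored = pickle.loads(pickle.dumps(Account("ann", 10)))
    print(restored.owner, restored.balance)
```

State written by the assumed older class version lacks the version key,
so \method{__setstate__()} treats it as version 1 and converts it.
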
When a class itself is pickled, only its name is pickled; the class
definition is not pickled, but re-imported by the unpickling process.
Therefore, the restriction that the class must be defined at the top
level in a module applies to pickled classes as well.

\setindexsubitem{(in module pickle)}

The interface can be summarized as follows.

To pickle an object \code{x} onto a file \code{f}, open for writing:

\begin{verbatim}
p = pickle.Pickler(f)
p.dump(x)
\end{verbatim}

A shorthand for this is:

\begin{verbatim}
pickle.dump(x, f)
\end{verbatim}

To unpickle an object \code{x} from a file \code{f}, open for reading:

\begin{verbatim}
u = pickle.Unpickler(f)
x = u.load()
\end{verbatim}

A shorthand is:

\begin{verbatim}
x = pickle.load(f)
\end{verbatim}

The \class{Pickler} class only calls the method \code{f.write()} with a
string argument. The \class{Unpickler} calls the methods \code{f.read()}
(with an integer argument) and \code{f.readline()} (without argument),
both returning a string. It is explicitly allowed to pass non-file
objects here, as long as they have the right methods.
\ttindex{Unpickler}
\ttindex{Pickler}

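A minimal sketch of such non-file objects follows. It assumes a current
Python 3 interpreter, where the data passed to \code{write()} and
returned by \code{read()} and \code{readline()} are byte strings;
\code{ChunkSink} and \code{ByteSource} are hypothetical helper classes:

```python
import pickle


class ChunkSink:
    """Hypothetical write-only target: collects whatever write() receives."""

    def __init__(self):
        self.chunks = []

    def write(self, data):
        self.chunks.append(bytes(data))


class ByteSource:
    """Hypothetical read-only source offering read(n) and readline()."""

    def __init__(self, data):
        self.data = data
        self.pos = 0

    def read(self, size):
        chunk = self.data[self.pos:self.pos + size]
        self.pos += len(chunk)
        return chunk

    def readline(self):
        end = self.data.find(b"\n", self.pos)
        end = len(self.data) if end < 0 else end + 1
        chunk = self.data[self.pos:end]
        self.pos = end
        return chunk


sink = ChunkSink()
pickle.Pickler(sink).dump({"a": [1, 2, 3]})

source = ByteSource(b"".join(sink.chunks))
restored = pickle.Unpickler(source).load()
```

Neither helper is a real file, yet the round trip should reproduce the
original dictionary.
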
The constructor for the \class{Pickler} class has an optional second
argument, \var{bin}. If this is present and nonzero, the binary
pickle format is used; if it is zero or absent, the (less efficient,
but backwards compatible) text pickle format is used. The
\class{Unpickler} class does not have an argument to distinguish
between binary and text pickle formats; it accepts either format.

The following types can be pickled:
\begin{itemize}

\item \code{None}

\item integers, long integers, floating point numbers

\item strings

\item tuples, lists and dictionaries containing only picklable objects

\item classes that are defined at the top level in a module

\item instances of such classes whose \member{__dict__} or
\method{__setstate__()} is picklable

\end{itemize}

Attempts to pickle unpicklable objects will raise the
\exception{PicklingError} exception; when this happens, an unspecified
number of bytes may have been written to the file.

It is possible to make multiple calls to the \method{dump()} method of
the same \class{Pickler} instance. These must then be matched to the
same number of calls to the \method{load()} method of the
corresponding \class{Unpickler} instance. If the same object is
pickled by multiple \method{dump()} calls, the corresponding
\method{load()} calls will all yield references to the same object.
\emph{Warning}: this is intended
for pickling multiple objects without intervening modifications to the
objects or their parts. If you modify an object and then pickle it
again using the same \class{Pickler} instance, the object is not
pickled again; a reference to it is pickled and the
\class{Unpickler} will return the old value, not the modified one.
(There are two problems here: (a) detecting changes, and (b)
marshalling a minimal set of changes. I have no answers. Garbage
collection may also become a problem here.)

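The warning can be made concrete with a short sketch. It assumes a
current Python 3 interpreter and uses \code{io.BytesIO} as a stand-in
for a file opened for writing and then reading; the variable names are
hypothetical:

```python
import io
import pickle

buffer = io.BytesIO()
pickler = pickle.Pickler(buffer)

data = [1, 2]
pickler.dump(data)   # the list contents are written here
data.append(3)       # modification between the two dumps
pickler.dump(data)   # only a reference to the remembered list is written

buffer.seek(0)
unpickler = pickle.Unpickler(buffer)
first = unpickler.load()
second = unpickler.load()

print(first, second is first)
```

Both \method{load()} calls should return the same list holding the old
value; the appended element is not stored anywhere in the stream.
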
Apart from the \class{Pickler} and \class{Unpickler} classes, the
module defines the following functions, and an exception:

\begin{funcdesc}{dump}{object, file\optional{, bin}}
Write a pickled representation of \var{object} to the open file object
\var{file}. This is equivalent to
\samp{Pickler(\var{file}, \var{bin}).dump(\var{object})}.
If the optional \var{bin} argument is present and nonzero, the binary
pickle format is used; if it is zero or absent, the (less efficient)
text pickle format is used.
\end{funcdesc}

\begin{funcdesc}{load}{file}
Read a pickled object from the open file object \var{file}. This is
equivalent to \samp{Unpickler(\var{file}).load()}.
\end{funcdesc}

\begin{funcdesc}{dumps}{object\optional{, bin}}
Return the pickled representation of the object as a string, instead
of writing it to a file. If the optional \var{bin} argument is
present and nonzero, the binary pickle format is used; if it is zero
or absent, the (less efficient) text pickle format is used.
\end{funcdesc}

\begin{funcdesc}{loads}{string}
Read a pickled object from a string instead of a file. Characters in
the string past the pickled object's representation are ignored.
\end{funcdesc}

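A short sketch of the string-based functions, assuming a current
Python 3 interpreter (where the pickled representation is a byte string
and the \var{bin} argument is left out); the variable names are
hypothetical:

```python
import pickle

original = {"name": "spam", "sizes": (1, 2.5, 3)}

# Pickle to an in-memory string instead of a file, then read it back.
pickled = pickle.dumps(original)
restored = pickle.loads(pickled)

# Data following the pickled representation is ignored by loads().
padded = pickle.loads(pickled + b"trailing data")
```

Both \code{restored} and \code{padded} should compare equal to
\code{original}.
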
\begin{excdesc}{PicklingError}
This exception is raised when an unpicklable object is passed to
\code{Pickler.dump()}.
\end{excdesc}


\begin{seealso}
  \seemodule[copyreg]{copy_reg}{pickle interface constructor
                                registration}

  \seemodule{shelve}{indexed databases of objects; uses \module{pickle}}

  \seemodule{copy}{shallow and deep object copying}

  \seemodule{marshal}{high-performance serialization of built-in types}
\end{seealso}


\section{\module{cPickle} ---
         Alternate implementation of \module{pickle}}

\declaremodule{builtin}{cPickle}
\modulesynopsis{Faster version of \module{pickle}, but not subclassable.}
\moduleauthor{Jim Fulton}{jfulton@digicool.com}
\sectionauthor{Fred L. Drake, Jr.}{fdrake@acm.org}


The \module{cPickle} module provides an interface similar to, and
functionality identical to, that of the \refmodule{pickle} module, but
can be up to 1000 times faster since it is implemented in C. The only other
important difference to note is that \function{Pickler()} and
\function{Unpickler()} are functions and not classes, and so cannot be
subclassed. This should not be an issue in most cases.

The format of the pickle data is identical to that produced using the
\refmodule{pickle} module, so it is possible to use \refmodule{pickle} and
\module{cPickle} interchangeably with existing pickles.

(Since the pickle data format is actually a tiny stack-oriented
programming language, and there are some freedoms in the encodings of
certain objects, it's possible that the two modules produce different
pickled data for the same input objects; however they will always be
able to read each other's pickles back in.)