\section{Built-in Module \sectcode{struct}}
\bimodindex{struct}
\indexii{C}{structures}

This module performs conversions between Python values and C
structs represented as Python strings. It uses \dfn{format strings}
(explained below) as compact descriptions of the lay-out of the C
structs and the intended conversion to/from Python values.

See also built-in module \code{array}.
\bimodindex{array}

The module defines the following exception and functions:

\renewcommand{\indexsubitem}{(in module struct)}
\begin{excdesc}{error}
  Exception raised on various occasions; argument is a string
  describing what is wrong.
\end{excdesc}

\begin{funcdesc}{pack}{fmt\, v1\, v2\, {\rm \ldots}}
  Return a string containing the values
  \code{\var{v1}, \var{v2}, {\rm \ldots}} packed according to the given
  format. The arguments must match the values required by the format
  exactly.
\end{funcdesc}

\begin{funcdesc}{unpack}{fmt\, string}
  Unpack the string (presumably packed by \code{pack(\var{fmt}, {\rm \ldots})})
  according to the given format. The result is a tuple even if it
  contains exactly one item. The string must contain exactly the
  amount of data required by the format (i.e. \code{len(\var{string})} must
  equal \code{calcsize(\var{fmt})}).
\end{funcdesc}

\begin{funcdesc}{calcsize}{fmt}
  Return the size of the struct (and hence of the string)
  corresponding to the given format.
\end{funcdesc}

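As a minimal sketch of how the three functions relate, the following
assumes a modern Python~3 interpreter (where packed data is a
\code{bytes} object rather than a string) and uses the \code{'<'}
prefix described later in this section so that the result does not
depend on the host machine. The variable names are illustrative only.

\bcode\begin{verbatim}
import struct

# '<' selects little-endian byte order with standard sizes
# (h = 2 bytes, l = 4 bytes), so the result is portable.
fmt = '<hhl'

packed = struct.pack(fmt, 1, 2, 3)    # b'\x01\x00\x02\x00\x03\x00\x00\x00'
size = struct.calcsize(fmt)           # 8, the same as len(packed)
values = struct.unpack(fmt, packed)   # (1, 2, 3)

# unpack() returns a tuple even when the format holds a single item.
single = struct.unpack('<h', struct.pack('<h', 7))   # (7,)

# Data whose length differs from calcsize(fmt) raises struct.error.
try:
    struct.unpack(fmt, packed[:-1])
    length_error = None
except struct.error as exc:
    length_error = str(exc)
\end{verbatim}\ecode
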
Format characters have the following meaning; the conversion between C
and Python values should be obvious given their types:

\begin{tableiii}{|c|l|l|}{samp}{Format}{C}{Python}
  \lineiii{x}{pad byte}{no value}
  \lineiii{c}{char}{string of length 1}
  \lineiii{b}{signed char}{integer}
  \lineiii{B}{unsigned char}{integer}
  \lineiii{h}{short}{integer}
  \lineiii{H}{unsigned short}{integer}
  \lineiii{i}{int}{integer}
  \lineiii{I}{unsigned int}{integer}
  \lineiii{l}{long}{integer}
  \lineiii{L}{unsigned long}{integer}
  \lineiii{f}{float}{float}
  \lineiii{d}{double}{float}
  \lineiii{s}{char[]}{string}
\end{tableiii}

A format character may be preceded by an integral repeat count; e.g.\
the format string \code{'4h'} means exactly the same as \code{'hhhh'}.

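The equivalence of a repeat count and a repeated format character can
be checked directly. This is a small sketch assuming a Python~3
interpreter; the \code{'<'} prefix (described later in this section) is
used only to make the sizes independent of the host machine.

\bcode\begin{verbatim}
import struct

counted = struct.pack('<4h', 1, 2, 3, 4)
spelled_out = struct.pack('<hhhh', 1, 2, 3, 4)

# Both formats describe four consecutive 2-byte shorts.
same_bytes = (counted == spelled_out)
sizes = (struct.calcsize('<4h'), struct.calcsize('<hhhh'))   # (8, 8)
\end{verbatim}\ecode
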
For the \code{'s'} format character, the count is interpreted as the
size of the string, not as a repeat count as it is for the other format
characters; e.g. \code{'10s'} means a single 10-byte string, while
\code{'10c'} means 10 characters. For packing, the string is
truncated or padded with null bytes as appropriate to make it fit.
For unpacking, the resulting string always has exactly the specified
number of bytes. As a special case, \code{'0s'} means a single, empty
string (while \code{'0c'} means 0 characters).

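The difference between \code{'s'} and \code{'c'} counts can be
illustrated as follows. This is a sketch assuming a Python~3
interpreter, where the values for these format characters are
\code{bytes} objects rather than strings; the variable names are
illustrative only.

\bcode\begin{verbatim}
import struct

# Packing: the value is padded with null bytes or truncated to fit.
padded = struct.pack('10s', b'spam')     # b'spam' followed by 6 null bytes
truncated = struct.pack('3s', b'spam')   # b'spa'

# Unpacking '10s' yields one item that is always exactly 10 bytes long.
(field,) = struct.unpack('10s', padded)

# A count on 'c' is an ordinary repeat count: four separate items.
chars = struct.unpack('4c', b'spam')     # (b's', b'p', b'a', b'm')

# Special case: '0s' is one empty item, '0c' is no items at all.
empty_string = struct.unpack('0s', b'')  # (b'',)
no_chars = struct.unpack('0c', b'')      # ()
\end{verbatim}\ecode
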
For the \code{'I'} and \code{'L'} format characters, the return
value is a Python long integer.

By default, C numbers are represented in the machine's native format
and byte order, and properly aligned by skipping pad bytes if
necessary (according to the rules used by the C compiler).

Alternatively, the first character of the format string can be used to
indicate the byte order, size and alignment of the packed data,
according to the following table:

\begin{tableiii}{|c|l|l|}{samp}{Character}{Byte order}{Size and alignment}
  \lineiii{@}{native}{native}
  \lineiii{=}{native}{standard}
  \lineiii{<}{little-endian}{standard}
  \lineiii{>}{big-endian}{standard}
  \lineiii{!}{network (= big-endian)}{standard}
\end{tableiii}

If the first character is not one of these, \code{'@'} is assumed.

Native byte order is big-endian or little-endian, depending on the
host system (e.g. Motorola and Sun are big-endian; Intel and DEC are
little-endian).

Native size and alignment are determined using the C compiler's sizeof
expression. This is always combined with native byte order.

Standard size and alignment are as follows: no alignment is required
for any type (so you have to use pad bytes); short is 2 bytes; int and
long are 4 bytes. Float and double are 32-bit and 64-bit IEEE floating
point numbers, respectively.

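The standard sizes listed above can be confirmed with
\code{calcsize()}. The following sketch assumes a Python~3 interpreter
and also shows that, because standard mode inserts no alignment, any
padding has to be requested explicitly with \code{'x'}.

\bcode\begin{verbatim}
import struct

# Size in bytes of each numeric format character in standard mode.
standard_sizes = {code: struct.calcsize('=' + code)
                  for code in 'bBhHiIlLfd'}

# No alignment in standard mode: a char followed by a long is 1 + 4 bytes.
unaligned = struct.calcsize('=bl')        # 5

# Three explicit pad bytes put the long on a 4-byte boundary.
hand_aligned = struct.calcsize('=b3xl')   # 8
\end{verbatim}\ecode
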
Note the difference between \code{'@'} and \code{'='}: both use native
byte order, but the size and alignment of the latter are standardized.

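A short sketch of this difference, assuming a Python~3 interpreter:
the byte order of \code{'='} follows the host (checked here against
\code{sys.byteorder}), while its size is fixed. The native size is
platform dependent, so no particular value is shown for it.

\bcode\begin{verbatim}
import struct
import sys

# '=' uses the host's byte order ...
host_prefix = '<' if sys.byteorder == 'little' else '>'
same_order = struct.pack('=h', 1) == struct.pack(host_prefix + 'h', 1)

# ... but standard sizes and no alignment: always 1 + 4 bytes.
standard_size = struct.calcsize('=bl')

# '@' uses the C compiler's sizes and alignment, so this value varies
# between platforms; it is never smaller than the standard size here.
native_size = struct.calcsize('@bl')
\end{verbatim}\ecode
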
The form \code{'!'} is available for those poor souls who claim they
can't remember whether network byte order is big-endian or
little-endian.

There is no way to indicate non-native byte order (i.e. force
byte-swapping); use the appropriate choice of \code{'<'} or
\code{'>'}.

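The effect of the explicit byte order prefixes can be sketched as
follows, assuming a Python~3 interpreter. Since there is no
``swap'' prefix, reading data in the opposite order is done by naming
that order explicitly.

\bcode\begin{verbatim}
import struct

little = struct.pack('<l', 1)    # b'\x01\x00\x00\x00'
big = struct.pack('>l', 1)       # b'\x00\x00\x00\x01'
network = struct.pack('!l', 1)   # identical to the big-endian result

# Interpreting little-endian data as big-endian reverses the bytes:
# 0x01000000 == 16777216.
misread = struct.unpack('>l', little)[0]
\end{verbatim}\ecode
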
Examples (all using native byte order, size and alignment, on a
big-endian machine with 2-byte shorts and 4-byte longs):

\bcode\begin{verbatim}
>>> from struct import *
>>> pack('hhl', 1, 2, 3)
'\000\001\000\002\000\000\000\003'
>>> unpack('hhl', '\000\001\000\002\000\000\000\003')
(1, 2, 3)
>>> calcsize('hhl')
8
>>> 
\end{verbatim}\ecode

Hint: to align the end of a structure to the alignment requirement of
a particular type, end the format with the code for that type with a
repeat count of zero, e.g.\ the format \code{'llh0l'} specifies two
pad bytes at the end, assuming longs are aligned on 4-byte boundaries.
(This only works when native size and alignment are in effect;
standard size and alignment do not enforce any alignment.)
135standard size and alignment does not enforce any alignment.)