\section{Built-in Module \sectcode{struct}}
\bimodindex{struct}
\indexii{C}{structures}

This module performs conversions between Python values and C
structs represented as Python strings. It uses \dfn{format strings}
(explained below) as compact descriptions of the layout of the C
structs and the intended conversion to/from Python values.

See also built-in module \code{array}.
\bimodindex{array}

The module defines the following exception and functions:

\renewcommand{\indexsubitem}{(in module struct)}
\begin{excdesc}{error}
 Exception raised on various occasions; argument is a string
 describing what is wrong.
\end{excdesc}

\begin{funcdesc}{pack}{fmt\, v1\, v2\, {\rm \ldots}}
 Return a string containing the values
 \code{\var{v1}, \var{v2}, {\rm \ldots}} packed according to the given
 format. The arguments must match the values required by the format
 exactly.
\end{funcdesc}

\begin{funcdesc}{unpack}{fmt\, string}
 Unpack the string (presumably packed by \code{pack(\var{fmt}, {\rm \ldots})})
 according to the given format. The result is a tuple even if it
 contains exactly one item. The string must contain exactly the
 amount of data required by the format (i.e.\ \code{len(\var{string})} must
 equal \code{calcsize(\var{fmt})}).
\end{funcdesc}

\begin{funcdesc}{calcsize}{fmt}
 Return the size of the struct (and hence of the string)
 corresponding to the given format.
\end{funcdesc}
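
As a minimal sketch of how the three functions fit together, the
following assumes a current Python 3 interpreter, where the packed data
is a \code{bytes} object rather than the string described above. The
variable names are illustrative only.

```python
import struct

fmt = 'hhl'                          # two C shorts followed by a C long
packed = struct.pack(fmt, 1, 2, 3)

# The packed data is exactly calcsize(fmt) bytes long.
size_matches = (len(packed) == struct.calcsize(fmt))

# unpack() always returns a tuple, even for a single item.
values = struct.unpack(fmt, packed)
single = struct.unpack('h', struct.pack('h', 7))

# Data of the wrong length is rejected with struct.error.
try:
    struct.unpack(fmt, packed[:-1])
    wrong_size_rejected = False
except struct.error:
    wrong_size_rejected = True
```

Checking \code{len(\var{string})} against \code{calcsize(\var{fmt})}
before unpacking is one way to avoid the exception.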

Format characters have the following meaning; the conversion between C
and Python values should be obvious given their types:

\begin{tableiii}{|c|l|l|}{samp}{Format}{C}{Python}
 \lineiii{x}{pad byte}{no value}
 \lineiii{c}{char}{string of length 1}
 \lineiii{b}{signed char}{integer}
 \lineiii{B}{unsigned char}{integer}
 \lineiii{h}{short}{integer}
 \lineiii{H}{unsigned short}{integer}
 \lineiii{i}{int}{integer}
 \lineiii{I}{unsigned int}{integer}
 \lineiii{l}{long}{integer}
 \lineiii{L}{unsigned long}{integer}
 \lineiii{f}{float}{float}
 \lineiii{d}{double}{float}
 \lineiii{s}{char[]}{string}
\end{tableiii}

A format character may be preceded by an integral repeat count; e.g.\
the format string \code{'4h'} means exactly the same as \code{'hhhh'}.
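
The equivalence of the two spellings can be checked directly. This is
a small sketch that assumes a current Python 3 interpreter (where
\code{pack} returns a \code{bytes} object); the variable names are
made up for the illustration.

```python
import struct

# '4h' and 'hhhh' describe the same layout, so size and packed data agree.
same_size = struct.calcsize('4h') == struct.calcsize('hhhh')
same_data = struct.pack('4h', 1, 2, 3, 4) == struct.pack('hhhh', 1, 2, 3, 4)

# A repeat count of 4 still means four separate values.
four_values = struct.unpack('4h', struct.pack('hhhh', 1, 2, 3, 4))
```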

For the \code{'s'} format character, the count is interpreted as the
size of the string, not as a repeat count as it is for the other format
characters; e.g.\ \code{'10s'} means a single 10-byte string, while
\code{'10c'} means 10 characters. For packing, the string is
truncated or padded with null bytes as appropriate to make it fit.
For unpacking, the resulting string always has exactly the specified
number of bytes. As a special case, \code{'0s'} means a single, empty
string (while \code{'0c'} means 0 characters).
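
The difference between \code{'s'} and \code{'c'} may be easier to see
in a short sketch. It assumes a current Python 3 interpreter, in which
the strings discussed here are \code{bytes} objects; all variable names
are illustrative.

```python
import struct

# Packing: the value is padded with null bytes or truncated to fit.
padded = struct.pack('10s', b'abc')
truncated = struct.pack('4s', b'abcdefgh')

# Unpacking: '10s' gives one 10-byte string, '10c' gives ten 1-byte strings.
one_string = struct.unpack('10s', b'0123456789')
ten_chars = struct.unpack('10c', b'0123456789')

# Special case: '0s' is a single empty string, '0c' is no values at all.
empty_string = struct.unpack('0s', b'')
no_values = struct.unpack('0c', b'')
```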

For the \code{'I'} and \code{'L'} format characters, the return
value is a Python long integer if a Python plain integer can't
represent the required range (note: this depends only on the size of
the relevant C types, not on the sign of the actual value).

By default, C numbers are represented in the machine's native format
and byte order, and properly aligned by skipping pad bytes if
necessary (according to the rules used by the C compiler).

Alternatively, the first character of the format string can be used to
indicate the byte order, size and alignment of the packed data,
according to the following table:

\begin{tableiii}{|c|l|l|}{samp}{Character}{Byte order}{Size and alignment}
 \lineiii{@}{native}{native}
 \lineiii{=}{native}{standard}
 \lineiii{<}{little-endian}{standard}
 \lineiii{>}{big-endian}{standard}
 \lineiii{!}{network (= big-endian)}{standard}
\end{tableiii}

If the first character is not one of these, \code{'@'} is assumed.
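
The effect of the prefix character on byte order can be sketched as
follows, assuming a current Python 3 interpreter (so the packed data is
shown as \code{bytes}); the variable names are illustrative.

```python
import struct

little = struct.pack('<h', 1)     # least significant byte first
big = struct.pack('>h', 1)        # most significant byte first
network = struct.pack('!h', 1)    # network order, same as '>'

# With no prefix character, '@' (native) is assumed.
default_is_native = struct.pack('h', 1) == struct.pack('@h', 1)
```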

Native byte order is big-endian or little-endian, depending on the
host system (e.g.\ Motorola and Sun are big-endian; Intel and DEC are
little-endian).

Native size and alignment are determined using the C compiler's sizeof
expression. This is always combined with native byte order.

Standard size and alignment are as follows: no alignment is required
for any type (so you have to use pad bytes); short is 2 bytes; int and
long are 4 bytes. In this mode, there is no support for float and
double (\code{'f'} and \code{'d'}).

Note the difference between \code{'@'} and \code{'='}: both use native
byte order, but the size and alignment of the latter are standardized.
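
A rough illustration of the difference between \code{'@'} and
\code{'='}, assuming a current Python 3 interpreter. The native size is
deliberately not asserted to be any particular number, since it depends
on the platform's C compiler; the variable names are illustrative.

```python
import struct

# Standard mode: short is 2 bytes, long is 4 bytes, no alignment.
standard_size = struct.calcsize('=hl')

# Padding must be requested explicitly with 'x' in standard mode.
explicit_pad_size = struct.calcsize('=h2xl')

# Native mode: sizes and padding follow the C compiler's rules.
native_size = struct.calcsize('@hl')

# Both forms use the machine's own byte order for the data itself.
same_order = struct.pack('=h', 1) == struct.pack('@h', 1)
```

On many machines \code{native_size} is larger than
\code{standard_size} because of the pad bytes inserted before the long.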

The form \code{'!'} is available for those poor souls who claim they
can't remember whether network byte order is big-endian or
little-endian.

There is no way to indicate non-native byte order (i.e.\ force
byte-swapping); use the appropriate choice of \code{'<'} or
\code{'>'}.
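
One possible way to make that choice at run time is sketched below. It
relies on \code{sys.byteorder} from the standard \code{sys} module,
which is not part of this module and is used here as an assumption
about a current Python 3 interpreter; the names \code{native} and
\code{swapped} are illustrative.

```python
import struct
import sys

# Pick the prefix that matches the host, and the one that does not.
native = '<' if sys.byteorder == 'little' else '>'
swapped = '>' if sys.byteorder == 'little' else '<'

data_native = struct.pack(native + 'H', 0x0102)
data_swapped = struct.pack(swapped + 'H', 0x0102)

# Reading native data with the other prefix yields the byte-swapped value.
misread = struct.unpack(swapped + 'H', data_native)
```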

Examples (all using native byte order, size and alignment, on a
big-endian machine):

\bcode\begin{verbatim}
from struct import *
pack('hhl', 1, 2, 3) == '\000\001\000\002\000\000\000\003'
unpack('hhl', '\000\001\000\002\000\000\000\003') == (1, 2, 3)
calcsize('hhl') == 8
\end{verbatim}\ecode

Hint: to align the end of a structure to the alignment requirement of
a particular type, end the format with the code for that type with a
repeat count of zero, e.g.\ the format \code{'llh0l'} specifies two
pad bytes at the end, assuming longs are aligned on 4-byte boundaries.
(This only works when native size and alignment are in effect;
standard size and alignment do not enforce any alignment.)
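
The hint can be tried out with \code{calcsize}. This sketch assumes a
current Python 3 interpreter; the native sizes depend on the platform
(for instance on whether a long is 4 or 8 bytes), so only their
relationship is of interest. The variable names are illustrative.

```python
import struct

# Native mode: the trailing '0l' rounds the size up to a long boundary.
unpadded = struct.calcsize('llh')
padded = struct.calcsize('llh0l')

# Standard mode enforces no alignment, so '0l' adds nothing there.
standard_unpadded = struct.calcsize('=llh')
standard_padded = struct.calcsize('=llh0l')
```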