\section{Built-in Module \sectcode{struct}}
\bimodindex{struct}
\indexii{C}{structures}

This module performs conversions between Python values and C
structs represented as Python strings. It uses \dfn{format strings}
(explained below) as compact descriptions of the layout of the C
structs and the intended conversion to/from Python values.

See also built-in module \code{array}.
\bimodindex{array}

The module defines the following exception and functions:

\renewcommand{\indexsubitem}{(in module struct)}
\begin{excdesc}{error}
Exception raised on various occasions; argument is a string
describing what is wrong.
\end{excdesc}

\begin{funcdesc}{pack}{fmt\, v1\, v2\, {\rm \ldots}}
Return a string containing the values
\code{\var{v1}, \var{v2}, {\rm \ldots}} packed according to the given
format. The arguments must match the values required by the format
exactly.
\end{funcdesc}

\begin{funcdesc}{unpack}{fmt\, string}
Unpack the string (presumably packed by \code{pack(\var{fmt}, {\rm \ldots})})
according to the given format. The result is a tuple even if it
contains exactly one item. The string must contain exactly the
amount of data required by the format (i.e.\ \code{len(\var{string})} must
equal \code{calcsize(\var{fmt})}).
\end{funcdesc}

\begin{funcdesc}{calcsize}{fmt}
Return the size of the struct (and hence of the string)
corresponding to the given format.
\end{funcdesc}

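The relationship between the three functions can be sketched as
follows. This is only an illustration, assuming a current Python
interpreter; the format \code{'hhl'} and the values are arbitrary
choices, not taken from the module itself.

```python
from struct import calcsize, pack, unpack

fmt = 'hhl'                    # two shorts and a long, native layout
packed = pack(fmt, 1, 2, 3)    # one argument per format character

# The packed data is exactly as long as calcsize() reports ...
assert len(packed) == calcsize(fmt)
# ... and unpack() returns the original values as a tuple.
assert unpack(fmt, packed) == (1, 2, 3)
# Even a single value comes back wrapped in a tuple.
assert unpack('h', pack('h', 7)) == (7,)
```

The actual length of \code{packed} depends on the platform, which is
why the sketch compares it with \code{calcsize(fmt)} rather than with
a fixed number.
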
Format characters have the following meaning; the conversion between C
and Python values should be obvious given their types:

\begin{tableiii}{|c|l|l|}{samp}{Format}{C}{Python}
\lineiii{x}{pad byte}{no value}
\lineiii{c}{char}{string of length 1}
\lineiii{b}{signed char}{integer}
\lineiii{B}{unsigned char}{integer}
\lineiii{h}{short}{integer}
\lineiii{H}{unsigned short}{integer}
\lineiii{i}{int}{integer}
\lineiii{I}{unsigned int}{integer}
\lineiii{l}{long}{integer}
\lineiii{L}{unsigned long}{integer}
\lineiii{f}{float}{float}
\lineiii{d}{double}{float}
\lineiii{s}{char[]}{string}
\end{tableiii}

A format character may be preceded by an integral repeat count; e.g.\
the format string \code{'4h'} means exactly the same as \code{'hhhh'}.

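A short check of the repeat-count rule, offered as a sketch that
assumes a current Python interpreter (the format \code{'2h3b'} is an
arbitrary example):

```python
from struct import calcsize, pack

# A repeat count is shorthand for writing the character that many times.
assert calcsize('4h') == calcsize('hhhh')
assert pack('4h', 1, 2, 3, 4) == pack('hhhh', 1, 2, 3, 4)

# Counts apply per character, so '2h3b' expects five values in total.
five = pack('2h3b', 1, 2, 3, 4, 5)
assert len(five) == calcsize('2h3b')
```
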
For the \code{'s'} format character, the count is interpreted as the
size of the string, not as a repeat count as it is for the other format
characters; e.g.\ \code{'10s'} means a single 10-byte string, while
\code{'10c'} means 10 characters. For packing, the string is
truncated or padded with null bytes as appropriate to make it fit.
For unpacking, the resulting string always has exactly the specified
number of bytes. As a special case, \code{'0s'} means a single, empty
string (while \code{'0c'} means 0 characters).

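The difference between \code{'s'} and \code{'c'} might be illustrated
as below. The sketch assumes a current Python 3 interpreter, where
packed data and string fields are bytes objects (written
\code{b'...'}) rather than the plain strings described here; the
sample value \code{b'spam'} is arbitrary.

```python
from struct import pack, unpack

# '10s' takes one 10-byte string; '10c' takes ten 1-byte strings.
padded = pack('10s', b'spam')
assert padded == b'spam' + b'\0' * 6          # padded with null bytes
assert pack('3s', b'spam') == b'spa'          # truncated to fit
assert unpack('10s', padded) == (padded,)     # always exactly 10 bytes
assert len(unpack('10c', b'0123456789')) == 10

# '0s' yields one empty string, while '0c' yields no values at all.
assert unpack('0s', b'') == (b'',)
assert unpack('0c', b'') == ()
```
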
For the \code{'I'} and \code{'L'} format characters, the return
value is a Python long integer if a Python plain integer can't
represent the required range (note: this depends only on the size of
the relevant C types, not on the sign of the actual value).

By default, C numbers are represented in the machine's native format
and byte order, and properly aligned by skipping pad bytes if
necessary (according to the rules used by the C compiler).

Alternatively, the first character of the format string can be used to
indicate the byte order, size and alignment of the packed data,
according to the following table:

\begin{tableiii}{|c|l|l|}{samp}{Character}{Byte order}{Size and alignment}
\lineiii{@}{native}{native}
\lineiii{=}{native}{standard}
\lineiii{<}{little-endian}{standard}
\lineiii{>}{big-endian}{standard}
\lineiii{!}{network (= big-endian)}{standard}
\end{tableiii}

If the first character is not one of these, \code{'@'} is assumed.

Native byte order is big-endian or little-endian, depending on the
host system (e.g.\ Motorola and Sun are big-endian; Intel and DEC are
little-endian).

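The effect of the byte order characters can be seen on a single
short. This is a sketch assuming a current Python 3 interpreter, where
packed data is a bytes object; the value 1 is chosen only because its
two bytes differ.

```python
from struct import pack

little = pack('<h', 1)     # least significant byte first
big = pack('>h', 1)        # most significant byte first

assert little == b'\x01\x00'
assert big == b'\x00\x01'
assert pack('!h', 1) == big              # network order is big-endian
assert pack('=h', 1) in (little, big)    # native order is one of the two
```
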
Native size and alignment are determined using the C compiler's sizeof
expression. This is always combined with native byte order.

Standard size and alignment are as follows: no alignment is required
for any type (so you have to use pad bytes); short is 2 bytes; int and
long are 4 bytes. In this mode, there is no support for float and
double (\code{'f'} and \code{'d'}).

Note the difference between \code{'@'} and \code{'='}: both use native
byte order, but the size and alignment of the latter is standardized.

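The standard sizes, and the contrast with native mode, can be checked
with \code{calcsize}. The sketch assumes a current Python interpreter;
the last comparison further assumes a platform whose native short and
long are at least as large as the standard ones, which holds on common
systems but is not guaranteed by anything stated here.

```python
from struct import calcsize

# Standard sizes are fixed: short is 2 bytes, int and long are 4 bytes.
assert calcsize('=h') == 2
assert calcsize('=i') == 4
assert calcsize('=l') == 4

# No alignment in standard mode: the long follows the short directly.
assert calcsize('=hl') == 6
# Explicit pad bytes put the long back on a 4-byte boundary.
assert calcsize('=hxxl') == 8

# Native mode may insert padding and use a wider long (assumption:
# native types are no smaller than the standard ones).
native = calcsize('@hl')
assert native >= calcsize('=hl')
```
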
The form \code{'!'} is available for those poor souls who claim they
can't remember whether network byte order is big-endian or
little-endian.

There is no format character that selects non-native byte order (i.e.\
forces byte-swapping); use whichever of \code{'<'} or \code{'>'} is
appropriate.

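One way to pick the byte-swapped order explicitly is sketched below.
It assumes a current Python interpreter and relies on
\code{sys.byteorder}, which is not part of this module; the name
\code{swapped} and the sample value are hypothetical.

```python
import sys
from struct import pack

# Pick the prefix whose byte order is the opposite of the host's.
swapped = '>' if sys.byteorder == 'little' else '<'

native_bytes = pack('=l', 0x01020304)
swapped_bytes = pack(swapped + 'l', 0x01020304)

# Same four bytes, in reversed order.
assert swapped_bytes == native_bytes[::-1]
```
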
Examples (all using native byte order, size and alignment, on a
big-endian machine with 2-byte shorts and 4-byte longs):

\bcode\begin{verbatim}
from struct import *
pack('hhl', 1, 2, 3) == '\000\001\000\002\000\000\000\003'
unpack('hhl', '\000\001\000\002\000\000\000\003') == (1, 2, 3)
calcsize('hhl') == 8
\end{verbatim}\ecode

Hint: to align the end of a structure to the alignment requirement of
a particular type, end the format with the code for that type with a
repeat count of zero, e.g.\ the format \code{'llh0l'} specifies two
pad bytes at the end, assuming longs are aligned on 4-byte boundaries.
(This only works when native size and alignment are in effect;
standard size and alignment do not enforce any alignment.)
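
The effect of the trailing \code{'0l'} can be observed with
\code{calcsize}. This is a sketch assuming a current Python
interpreter and a platform on which a long is aligned to its own
size; the names \code{plain} and \code{padded} are only illustrative.

```python
from struct import calcsize

plain = calcsize('llh')      # ends right after the short
padded = calcsize('llh0l')   # '0l' pads the end to a long boundary

# Assumption: a long is aligned to its own size, as on common platforms.
assert padded >= plain
assert padded % calcsize('l') == 0

# In standard mode there is no alignment, so '0l' adds nothing.
assert calcsize('=llh0l') == calcsize('=llh') == 10
```

The exact values of \code{plain} and \code{padded} vary with the size
of a native long, so the sketch checks only their relationship.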