\section{Built-in Module \sectcode{struct}}
\label{module-struct}
\bimodindex{struct}
\indexii{C@\C{}}{structures}

This module performs conversions between Python values and C
structs represented as Python strings. It uses \dfn{format strings}
(explained below) as compact descriptions of the layout of the C
structs and the intended conversion to/from Python values.

The module defines the following exception and functions:

\begin{excdesc}{error}
Exception raised on various occasions, for example when the arguments
do not match the format; the argument is a string describing what is
wrong.
\end{excdesc}

\begin{funcdesc}{pack}{fmt\, v1\, v2\, {\rm \ldots}}
Return a string containing the values
\code{\var{v1}, \var{v2}, {\rm \ldots}} packed according to the given
format. The arguments must match the values required by the format
exactly.
\end{funcdesc}

\begin{funcdesc}{unpack}{fmt\, string}
Unpack the string (presumably packed by \code{pack(\var{fmt}, {\rm \ldots})})
according to the given format. The result is a tuple even if it
contains exactly one item. The string must contain exactly the
amount of data required by the format (i.e.\ \code{len(\var{string})} must
equal \code{calcsize(\var{fmt})}).
\end{funcdesc}

\begin{funcdesc}{calcsize}{fmt}
Return the size of the struct (and hence of the string)
corresponding to the given format.
\end{funcdesc}
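
The three functions fit together as in the following minimal sketch.
It assumes a present-day Python 3 interpreter, in which the packed data
is a bytes object rather than a string; the variable names are
illustrative only.

\begin{verbatim}
import struct

# Two shorts and a long, in the default (native) format.
fmt = 'hhl'
data = struct.pack(fmt, 1, 2, 3)

# The packed data is exactly calcsize(fmt) bytes long.
assert len(data) == struct.calcsize(fmt)

# unpack() returns a tuple, even for a single item.
values = struct.unpack(fmt, data)
single = struct.unpack('h', struct.pack('h', 7))

# Data of the wrong length is rejected with struct.error.
try:
    struct.unpack(fmt, data + b'\0')
    wrong_length_rejected = False
except struct.error:
    wrong_length_rejected = True
\end{verbatim}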

Format characters have the following meaning; the conversion between C
and Python values should be obvious given their types:

\begin{tableiii}{|c|l|l|}{samp}{Format}{C}{Python}
\lineiii{x}{pad byte}{no value}
\lineiii{c}{char}{string of length 1}
\lineiii{b}{signed char}{integer}
\lineiii{B}{unsigned char}{integer}
\lineiii{h}{short}{integer}
\lineiii{H}{unsigned short}{integer}
\lineiii{i}{int}{integer}
\lineiii{I}{unsigned int}{integer}
\lineiii{l}{long}{integer}
\lineiii{L}{unsigned long}{integer}
\lineiii{f}{float}{float}
\lineiii{d}{double}{float}
\lineiii{s}{char[]}{string}
\end{tableiii}

A format character may be preceded by an integral repeat count; e.g.\
the format string \code{'4h'} means exactly the same as \code{'hhhh'}.

Whitespace characters between formats are ignored; however, a count
and its format character must not be separated by whitespace.
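
The rules for repeat counts and whitespace can be checked with a short
sketch such as the one below (Python 3 syntax assumed; the variable
names are illustrative only).

\begin{verbatim}
import struct

# A repeat count is shorthand for repeating the format character.
assert struct.calcsize('4h') == struct.calcsize('hhhh')
same = struct.pack('4h', 1, 2, 3, 4) == struct.pack('hhhh', 1, 2, 3, 4)

# Whitespace between format codes is ignored ...
spaced = struct.pack('2h l', 1, 2, 3) == struct.pack('2hl', 1, 2, 3)

# ... but whitespace between a count and its format character is an error.
try:
    struct.calcsize('2 h')
    split_count_rejected = False
except struct.error:
    split_count_rejected = True
\end{verbatim}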

For the \code{'s'} format character, the count is interpreted as the
size of the string, not as a repeat count as it is for the other format
characters; e.g.\ \code{'10s'} means a single 10-byte string, while
\code{'10c'} means 10 characters. For packing, the string is
truncated or padded with null bytes as appropriate to make it fit.
For unpacking, the resulting string always has exactly the specified
number of bytes. As a special case, \code{'0s'} means a single, empty
string (while \code{'0c'} means 0 characters).
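
A minimal sketch of the \code{'s'} rules follows. It assumes Python 3,
where the values packed and unpacked by \code{'s'} are bytes objects;
the sample data is made up for illustration.

\begin{verbatim}
import struct

# '10s' is one 10-byte string; '10c' is ten separate characters.
assert struct.calcsize('10s') == struct.calcsize('10c') == 10

short_value = struct.pack('10s', b'spam')         # padded with null bytes
long_value = struct.pack('4s', b'spam and eggs')  # truncated to 4 bytes

# Unpacking always yields exactly the number of bytes given by the count.
(unpacked,) = struct.unpack('10s', short_value)

# '0s' consumes no data but still produces one (empty) value;
# '0c' produces no values at all.
empty_s = struct.unpack('0s', b'')
empty_c = struct.unpack('0c', b'')
\end{verbatim}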

For the \code{'I'} and \code{'L'} format characters, the value returned
by \code{unpack()} is a Python long integer.

By default, C numbers are represented in the machine's native format
and byte order, and properly aligned by skipping pad bytes if
necessary (according to the rules used by the C compiler).

Alternatively, the first character of the format string can be used to
indicate the byte order, size and alignment of the packed data,
according to the following table:

\begin{tableiii}{|c|l|l|}{samp}{Character}{Byte order}{Size and alignment}
\lineiii{@}{native}{native}
\lineiii{=}{native}{standard}
\lineiii{<}{little-endian}{standard}
\lineiii{>}{big-endian}{standard}
\lineiii{!}{network (= big-endian)}{standard}
\end{tableiii}

If the first character is not one of these, \code{'@'} is assumed.
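
The effect of the first character can be seen in a sketch like the
following. It assumes Python 3 and uses \code{sys.byteorder}, which is
not part of this module, to find out the byte order of the host.

\begin{verbatim}
import struct
import sys

little = struct.pack('<h', 1)   # least significant byte first
big = struct.pack('>h', 1)      # most significant byte first
network = struct.pack('!h', 1)  # network order, same as '>'

# '=' uses the byte order of the host, as '@' does.
native = struct.pack('=h', 1)
expected_native = little if sys.byteorder == 'little' else big

# With no prefix character, '@' is assumed.
default_is_at = struct.pack('hl', 1, 2) == struct.pack('@hl', 1, 2)
\end{verbatim}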

Native byte order is big-endian or little-endian, depending on the
host system (e.g.\ Motorola and Sun are big-endian; Intel and DEC are
little-endian).

Native size and alignment are determined using the C compiler's sizeof
expression. This is always combined with native byte order.

Standard size and alignment are as follows: no alignment is required
for any type (so you have to use pad bytes); short is 2 bytes; int and
long are 4 bytes. Float and double are 32-bit and 64-bit IEEE floating
point numbers, respectively.
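
The standard sizes can be confirmed with \code{calcsize()}, as in this
sketch (Python 3 syntax assumed). The prefix \code{'<'} is used here,
but any prefix that selects standard size and alignment would give the
same numbers.

\begin{verbatim}
import struct

# Standard sizes: short = 2, int = 4, long = 4, float = 4, double = 8.
sizes = {code: struct.calcsize('<' + code) for code in 'hilfd'}

# No alignment: a signed char followed by a long occupies 1 + 4 bytes.
unaligned = struct.calcsize('<bl')

# Padding has to be requested explicitly with 'x' pad bytes.
padded = struct.calcsize('<b3xl')
\end{verbatim}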

Note the difference between \code{'@'} and \code{'='}: both use native
byte order, but the size and alignment of the latter are standardized.
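
As a rough illustration of that difference, the sketch below compares
the two prefixes (Python 3 syntax assumed). The native size is
platform-dependent, so only the standard size is stated as a number;
the last comparison assumes that a native short is 2 bytes, as it is on
common platforms.

\begin{verbatim}
import struct

# A short followed by a double.
native_size = struct.calcsize('@hd')    # platform-dependent, may include padding
standard_size = struct.calcsize('=hd')  # always 2 + 8 = 10 bytes

# Both prefixes use the host's byte order.
same_order = struct.pack('@h', 1) == struct.pack('=h', 1)
\end{verbatim}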

The form \code{'!'} is available for those poor souls who claim they
can't remember whether network byte order is big-endian or
little-endian.

There is no format character that requests non-native byte order as
such (i.e.\ forces byte-swapping); use whichever of \code{'<'} or
\code{'>'} is appropriate.
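
One possible way to obtain byte-swapped data is to pick the prefix
opposite to the host's byte order, as sketched below. This assumes
Python 3 and relies on \code{sys.byteorder}, which is not part of this
module; the variable names are illustrative only.

\begin{verbatim}
import struct
import sys

# Pick the prefix that is the opposite of the host's byte order.
swapped_prefix = '>' if sys.byteorder == 'little' else '<'

native = struct.pack('=l', 1)
swapped = struct.pack(swapped_prefix + 'l', 1)

# The swapped form is the native form with its bytes reversed.
is_reversed = swapped == native[::-1]
\end{verbatim}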

Examples (all using native byte order, size and alignment, on a
big-endian machine with 2-byte shorts and 4-byte longs):

\begin{verbatim}
>>> from struct import *
>>> pack('hhl', 1, 2, 3)
'\000\001\000\002\000\000\000\003'
>>> unpack('hhl', '\000\001\000\002\000\000\000\003')
(1, 2, 3)
>>> calcsize('hhl')
8
>>>
\end{verbatim}
%
Hint: to align the end of a structure to the alignment requirement of
a particular type, end the format with the code for that type with a
repeat count of zero, e.g.\ the format \code{'llh0l'} specifies two
pad bytes at the end, assuming longs are aligned on 4-byte boundaries.
This only works when native size and alignment are in effect;
standard size and alignment do not enforce any alignment.
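
The hint can be tried out with a sketch such as the following (Python 3
syntax assumed). The native sizes depend on the platform, so no
absolute numbers are given; the comment about rounding assumes that a
long is aligned on a multiple of its own size, which holds on common
platforms.

\begin{verbatim}
import struct

without_pad = struct.calcsize('llh')
with_pad = struct.calcsize('llh0l')

# With native alignment the total is rounded up to the alignment of a
# long, so with_pad is a multiple of the size of a long.
long_size = struct.calcsize('l')

# With standard size and alignment, '0l' adds nothing.
standard_same = struct.calcsize('=llh0l') == struct.calcsize('=llh')
\end{verbatim}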

\begin{seealso}
\seemodule{array}{packed binary storage of homogeneous data}
\end{seealso}