\section{Built-in Module \sectcode{struct}}
\bimodindex{struct}
\indexii{C}{structures}

This module performs conversions between Python values and C
structs represented as Python strings. It uses \dfn{format strings}
(explained below) as compact descriptions of the layout of the C
structs and the intended conversion to/from Python values.

The module defines the following exception and functions:

\renewcommand{\indexsubitem}{(in module struct)}
\begin{excdesc}{error}
Exception raised on various occasions; argument is a string
describing what is wrong.
\end{excdesc}

\begin{funcdesc}{pack}{fmt\, v1\, v2\, {\rm \ldots}}
Return a string containing the values
\code{\var{v1}, \var{v2}, {\rm \ldots}} packed according to the given
format. The arguments must match the values required by the format
exactly.
\end{funcdesc}
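
As a rough illustration of the exact-match requirement, the sketch
below assumes a current Python 3 interpreter, where \code{pack}
returns a \code{bytes} object rather than a string. The helper
\code{try\_pack} is hypothetical and exists only to turn the
\code{error} exception into a \code{None} result.

```python
import struct

def try_pack(fmt, *values):
    """Hypothetical helper: return the packed data, or None on struct.error."""
    try:
        return struct.pack(fmt, *values)
    except struct.error:
        return None

packed = try_pack('hhl', 1, 2, 3)       # three values for three format codes
too_few = try_pack('hhl', 1, 2)         # one value missing
too_many = try_pack('hhl', 1, 2, 3, 4)  # one value too many
```

Only the first call is expected to succeed; the other two supply the
wrong number of values for the format.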

\begin{funcdesc}{unpack}{fmt\, string}
Unpack the string (presumably packed by \code{pack(\var{fmt}, {\rm \ldots})})
according to the given format. The result is a tuple even if it
contains exactly one item. The string must contain exactly the
amount of data required by the format (i.e.\ \code{len(\var{string})} must
equal \code{calcsize(\var{fmt})}).
\end{funcdesc}
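
The two rules above (a tuple result, and an exact length requirement)
can be sketched as follows. This again assumes a current Python 3
interpreter, where the packed data is a \code{bytes} object; the
variable names are illustrative only.

```python
import struct

fmt = 'i'
data = struct.pack(fmt, 42)

# The result is a tuple even though the format describes a single item.
result = struct.unpack(fmt, data)
(value,) = result

# Data whose length differs from calcsize(fmt) is rejected.
try:
    struct.unpack(fmt, data + b'\0')
    rejected = False
except struct.error:
    rejected = True
```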

\begin{funcdesc}{calcsize}{fmt}
Return the size of the struct (and hence of the string)
corresponding to the given format.
\end{funcdesc}

Format characters have the following meaning; the conversion between C
and Python values should be obvious given their types:

\begin{tableiii}{|c|l|l|}{samp}{Format}{C}{Python}
\lineiii{x}{pad byte}{no value}
\lineiii{c}{char}{string of length 1}
\lineiii{b}{signed char}{integer}
\lineiii{h}{short}{integer}
\lineiii{i}{int}{integer}
\lineiii{l}{long}{integer}
\lineiii{f}{float}{float}
\lineiii{d}{double}{float}
\end{tableiii}
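
To make the table concrete, the following sketch packs one value for
each format character and unpacks it again. It assumes a current
Python 3 interpreter, where the \code{c} code takes a \code{bytes}
object of length 1 instead of a string; the sample values were chosen
so that the floating point numbers survive the \code{f} conversion
exactly.

```python
import struct

# One code of each kind from the table; 'x' consumes no argument.
fmt = 'xcbhilfd'
values = (b'a', -1, 2, 3, 4, 0.5, 0.25)

packed = struct.pack(fmt, *values)
unpacked = struct.unpack(fmt, packed)
```

Seven values go in and seven come back, because the pad byte has no
corresponding Python value.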

A format character may be preceded by an integral repeat count; e.g.\
the format string \code{'4h'} means exactly the same as \code{'hhhh'}.
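
The equivalence of the two spellings can be checked directly; this
short sketch assumes a current Python 3 interpreter.

```python
import struct

with_count = struct.pack('4h', 1, 2, 3, 4)
spelled_out = struct.pack('hhhh', 1, 2, 3, 4)

# Both formats describe four shorts, so unpacking yields four items.
items = struct.unpack('4h', spelled_out)
```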

C numbers are represented in the machine's native format and byte
order, and properly aligned by skipping pad bytes if necessary
(according to the rules used by the C compiler).
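
The effect of alignment and native byte order can be observed without
knowing the platform in advance. The sketch below assumes a current
Python 3 interpreter; it uses \code{sys.byteorder} and
\code{int.to\_bytes} from the standard library as an independent
reference, and the number of pad bytes it reports depends on the
machine and compiler.

```python
import struct
import sys

# Sizes of the individual items, with no padding involved.
unpadded = struct.calcsize('b') + struct.calcsize('i')

# Size of the two-item struct, including any pad bytes before the int.
padded = struct.calcsize('bi')
pad_bytes = padded - unpadded

# A short packed natively matches the integer encoded in the
# machine's own byte order.
native_short = struct.pack('h', 1)
reference = (1).to_bytes(struct.calcsize('h'), sys.byteorder, signed=True)
```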

Examples (all on a big-endian machine with 2-byte shorts and 4-byte
longs):

\bcode\begin{verbatim}
pack('hhl', 1, 2, 3) == '\000\001\000\002\000\000\000\003'
unpack('hhl', '\000\001\000\002\000\000\000\003') == (1, 2, 3)
calcsize('hhl') == 8
\end{verbatim}\ecode

Hint: to align the end of a structure to the alignment requirement of
a particular type, end the format with the code for that type with a
repeat count of zero, e.g.\ the format \code{'llh0l'} specifies two
pad bytes at the end, assuming longs are aligned on 4-byte boundaries.
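
The hint can be tried out as follows. This is a sketch assuming a
current Python 3 interpreter; the amount of trailing padding it
computes depends on the size and alignment of a C long on the machine
at hand, so it need not be two.

```python
import struct

without_tail = struct.calcsize('llh')
with_tail = struct.calcsize('llh0l')

# Number of pad bytes added by the trailing '0l'.
tail_padding = with_tail - without_tail

# The zero-count code takes no value: three arguments are still enough.
packed = struct.pack('llh0l', 1, 2, 3)
```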

(More format characters are planned, e.g.\ \code{'s'} for character
arrays, upper case for unsigned variants, and a way to specify the
byte order, which is useful for [de]constructing network packets and
reading/writing portable binary file formats like TIFF and AIFF.)