:mod:`struct` --- Interpret bytes as packed binary data
=======================================================

.. module:: struct
   :synopsis: Interpret bytes as packed binary data.

**Source code:** :source:`Lib/struct.py`

.. index::
   pair: C; structures
   triple: packing; binary; data

--------------

This module performs conversions between Python values and C structs represented
as Python :class:`bytes` objects. This can be used in handling binary data
stored in files or from network connections, among other sources. It uses
:ref:`struct-format-strings` as compact descriptions of the layout of the C
structs and the intended conversion to/from Python values.

.. note::

   By default, the result of packing a given C struct includes pad bytes in
   order to maintain proper alignment for the C types involved; similarly,
   alignment is taken into account when unpacking. This behavior is chosen so
   that the bytes of a packed struct correspond exactly to the layout in memory
   of the corresponding C struct. To handle platform-independent data formats
   or omit implicit pad bytes, use ``standard`` size and alignment instead of
   ``native`` size and alignment: see :ref:`struct-alignment` for details.

Several :mod:`struct` functions (and methods of :class:`Struct`) take a *buffer*
argument. This refers to objects that implement the :ref:`bufferobjects` and
provide either a readable or read-writable buffer. The most common types used
for that purpose are :class:`bytes` and :class:`bytearray`, but many other types
that can be viewed as an array of bytes implement the buffer protocol, so that
they can be read/filled without additional copying from a :class:`bytes` object.
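
As a minimal sketch of the *buffer* argument (the variable names are
illustrative only), a :class:`bytearray` can be filled in place with
:func:`pack_into`, and a :class:`memoryview` over the same memory can be read
with :func:`unpack_from` without first copying the data into a :class:`bytes`
object::

   import struct

   # A writable buffer: pack_into() writes into it in place.
   buf = bytearray(8)
   struct.pack_into('<HH', buf, 2, 1, 2)   # two 16-bit values at offset 2

   # A view of the same memory: reading it does not copy the data.
   view = memoryview(buf)
   first, second = struct.unpack_from('<HH', view, 2)

   print(bytes(buf))       # b'\x00\x00\x01\x00\x02\x00\x00\x00'
   print(first, second)    # 1 2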


Functions and Exceptions
------------------------

The module defines the following exception and functions:


.. exception:: error

   Exception raised on various occasions; argument is a string describing what
   is wrong.


.. function:: pack(format, v1, v2, ...)

   Return a bytes object containing the values *v1*, *v2*, ... packed according
   to the format string *format*. The arguments must match the values required by
   the format exactly.


.. function:: pack_into(format, buffer, offset, v1, v2, ...)

   Pack the values *v1*, *v2*, ... according to the format string *format* and
   write the packed bytes into the writable buffer *buffer* starting at
   position *offset*. Note that *offset* is a required argument.


.. function:: unpack(format, buffer)

   Unpack from the buffer *buffer* (presumably packed by ``pack(format, ...)``)
   according to the format string *format*. The result is a tuple even if it
   contains exactly one item. The buffer's size in bytes must match the
   size required by the format, as reflected by :func:`calcsize`.


.. function:: unpack_from(format, buffer, offset=0)

   Unpack from *buffer* starting at position *offset*, according to the format
   string *format*. The result is a tuple even if it contains exactly one
   item. The buffer's size in bytes, starting at position *offset*, must be at
   least the size required by the format, as reflected by :func:`calcsize`.
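
   The different size requirements of :func:`unpack` and :func:`unpack_from`
   can be sketched as follows. The message layout (a small header followed by
   a payload) is a made-up example, not a format defined by this module::

      import struct

      HEADER = '<HB'    # hypothetical header: 16-bit length, 8-bit kind
      message = struct.pack(HEADER, 3, 7) + b'abc'

      # unpack() needs an exact size match, so the trailing payload is an error.
      try:
          struct.unpack(HEADER, message)
      except struct.error as exc:
          exact_match_error = str(exc)

      # unpack_from() only needs at least calcsize(HEADER) bytes at the offset.
      length, kind = struct.unpack_from(HEADER, message, 0)
      start = struct.calcsize(HEADER)
      payload = message[start:start + length]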


.. function:: iter_unpack(format, buffer)

   Iteratively unpack from the buffer *buffer* according to the format
   string *format*. This function returns an iterator which will read
   equally-sized chunks from the buffer until all its contents have been
   consumed. The buffer's size in bytes must be a multiple of the size
   required by the format, as reflected by :func:`calcsize`.

   Each iteration yields a tuple as specified by the format string.
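
   As an illustrative sketch (the record layout is hypothetical), a buffer
   holding several fixed-size records can be walked like this::

      import struct

      RECORD = '<HB'    # hypothetical record: 16-bit id, 8-bit flag
      data = b''.join(struct.pack(RECORD, i, i % 2) for i in range(3))

      # len(data) is 3 * calcsize(RECORD), a whole number of records.
      records = list(struct.iter_unpack(RECORD, data))

      # A buffer that is not a multiple of the record size is rejected.
      try:
          struct.iter_unpack(RECORD, data + b'\x00')
      except struct.error:
          rejected = True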

   .. versionadded:: 3.4


.. function:: calcsize(format)

   Return the size of the struct (and hence of the bytes object produced by
   ``pack(format, ...)``) corresponding to the format string *format*.
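
   A short sketch of how the result depends on the format. Only standard-size
   formats are checked, because native sizes vary between platforms::

      import struct

      # Standard sizes ('<', '>', '!' or '=') are fixed and unpadded.
      assert struct.calcsize('<hhl') == 2 + 2 + 4
      assert struct.calcsize('<ci') == 1 + 4

      # The packed result always has exactly calcsize(format) bytes.
      fmt = '>IHd'
      packed = struct.pack(fmt, 1, 2, 3.0)
      assert len(packed) == struct.calcsize(fmt) == 14

      # Native mode may add pad bytes, so this value is platform-dependent.
      native_size = struct.calcsize('@ci')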


.. _struct-format-strings:

Format Strings
--------------

Format strings are the mechanism used to specify the expected layout when
packing and unpacking data. They are built up from :ref:`format-characters`,
which specify the type of data being packed/unpacked. In addition, there are
special characters for controlling the :ref:`struct-alignment`.


.. _struct-alignment:

Byte Order, Size, and Alignment
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

By default, C types are represented in the machine's native format and byte
order, and properly aligned by skipping pad bytes if necessary (according to the
rules used by the C compiler).

Alternatively, the first character of the format string can be used to indicate
the byte order, size and alignment of the packed data, according to the
following table:

+-----------+------------------------+----------+-----------+
| Character | Byte order             | Size     | Alignment |
+===========+========================+==========+===========+
| ``@``     | native                 | native   | native    |
+-----------+------------------------+----------+-----------+
| ``=``     | native                 | standard | none      |
+-----------+------------------------+----------+-----------+
| ``<``     | little-endian          | standard | none      |
+-----------+------------------------+----------+-----------+
| ``>``     | big-endian             | standard | none      |
+-----------+------------------------+----------+-----------+
| ``!``     | network (= big-endian) | standard | none      |
+-----------+------------------------+----------+-----------+

If the first character is not one of these, ``'@'`` is assumed.

Native byte order is big-endian or little-endian, depending on the host
system. For example, Intel x86 and AMD64 (x86-64) are little-endian;
Motorola 68000 and PowerPC G5 are big-endian; ARM and Intel Itanium feature
switchable endianness (bi-endian). Use ``sys.byteorder`` to check the
endianness of your system.

Native size and alignment are determined using the C compiler's
``sizeof`` expression. This is always combined with native byte order.

Standard size depends only on the format character; see the table in
the :ref:`format-characters` section.

Note the difference between ``'@'`` and ``'='``: both use native byte order, but
the size and alignment of the latter is standardized.

The form ``'!'`` is available for those poor souls who claim they can't remember
whether network byte order is big-endian or little-endian.

There is no way to indicate non-native byte order (force byte-swapping); use the
appropriate choice of ``'<'`` or ``'>'``.

Notes:

(1) Padding is only automatically added between successive structure members.
    No padding is added at the beginning or the end of the encoded struct.

(2) No padding is added when using non-native size and alignment, e.g.
    with ``'<'``, ``'>'``, ``'='``, and ``'!'``.

(3) To align the end of a structure to the alignment requirement of a
    particular type, end the format with the code for that type with a repeat
    count of zero. See :ref:`struct-examples`.
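
The effect of the first character can be sketched as below. The native
results are not spelled out because they depend on the platform; only the
standard-mode results are fixed::

   import struct
   import sys

   # Explicit byte order: the same value, two byte layouts.
   little = struct.pack('<H', 1)     # b'\x01\x00'
   big = struct.pack('>H', 1)        # b'\x00\x01'
   network = struct.pack('!H', 1)    # same bytes as '>'

   # '=' uses the host byte order with standard sizes and no padding.
   host = struct.pack('=H', 1)
   same_as_host = little if sys.byteorder == 'little' else big

   # Standard modes never pad; native mode may pad between members and,
   # with a zero repeat count, after the last one.
   unpadded = struct.calcsize('<ci')         # always 5
   native = struct.calcsize('@ci')           # platform-dependent
   aligned_end = struct.calcsize('@llh0l')   # platform-dependent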


.. _format-characters:

Format Characters
^^^^^^^^^^^^^^^^^

Format characters have the following meaning; the conversion between C and
Python values should be obvious given their types. The 'Standard size' column
refers to the size of the packed value in bytes when using standard size; that
is, when the format string starts with one of ``'<'``, ``'>'``, ``'!'`` or
``'='``. When using native size, the size of the packed value is
platform-dependent.

+--------+--------------------------+--------------------+----------------+------------+
| Format | C Type                   | Python type        | Standard size  | Notes      |
+========+==========================+====================+================+============+
| ``x``  | pad byte                 | no value           |                |            |
+--------+--------------------------+--------------------+----------------+------------+
| ``c``  | :c:type:`char`           | bytes of length 1  | 1              |            |
+--------+--------------------------+--------------------+----------------+------------+
| ``b``  | :c:type:`signed char`    | integer            | 1              | \(1),\(3)  |
+--------+--------------------------+--------------------+----------------+------------+
| ``B``  | :c:type:`unsigned char`  | integer            | 1              | \(3)       |
+--------+--------------------------+--------------------+----------------+------------+
| ``?``  | :c:type:`_Bool`          | bool               | 1              | \(1)       |
+--------+--------------------------+--------------------+----------------+------------+
| ``h``  | :c:type:`short`          | integer            | 2              | \(3)       |
+--------+--------------------------+--------------------+----------------+------------+
| ``H``  | :c:type:`unsigned short` | integer            | 2              | \(3)       |
+--------+--------------------------+--------------------+----------------+------------+
| ``i``  | :c:type:`int`            | integer            | 4              | \(3)       |
+--------+--------------------------+--------------------+----------------+------------+
| ``I``  | :c:type:`unsigned int`   | integer            | 4              | \(3)       |
+--------+--------------------------+--------------------+----------------+------------+
| ``l``  | :c:type:`long`           | integer            | 4              | \(3)       |
+--------+--------------------------+--------------------+----------------+------------+
| ``L``  | :c:type:`unsigned long`  | integer            | 4              | \(3)       |
+--------+--------------------------+--------------------+----------------+------------+
| ``q``  | :c:type:`long long`      | integer            | 8              | \(2), \(3) |
+--------+--------------------------+--------------------+----------------+------------+
| ``Q``  | :c:type:`unsigned long   | integer            | 8              | \(2), \(3) |
|        | long`                    |                    |                |            |
+--------+--------------------------+--------------------+----------------+------------+
| ``n``  | :c:type:`ssize_t`        | integer            |                | \(4)       |
+--------+--------------------------+--------------------+----------------+------------+
| ``N``  | :c:type:`size_t`         | integer            |                | \(4)       |
+--------+--------------------------+--------------------+----------------+------------+
| ``e``  | \(7)                     | float              | 2              | \(5)       |
+--------+--------------------------+--------------------+----------------+------------+
| ``f``  | :c:type:`float`          | float              | 4              | \(5)       |
+--------+--------------------------+--------------------+----------------+------------+
| ``d``  | :c:type:`double`         | float              | 8              | \(5)       |
+--------+--------------------------+--------------------+----------------+------------+
| ``s``  | :c:type:`char[]`         | bytes              |                |            |
+--------+--------------------------+--------------------+----------------+------------+
| ``p``  | :c:type:`char[]`         | bytes              |                |            |
+--------+--------------------------+--------------------+----------------+------------+
| ``P``  | :c:type:`void \*`        | integer            |                | \(6)       |
+--------+--------------------------+--------------------+----------------+------------+

.. versionchanged:: 3.3
   Added support for the ``'n'`` and ``'N'`` formats.

.. versionchanged:: 3.6
   Added support for the ``'e'`` format.


Notes:

(1)
   The ``'?'`` conversion code corresponds to the :c:type:`_Bool` type defined by
   C99. If this type is not available, it is simulated using a :c:type:`char`. In
   standard mode, it is always represented by one byte.

(2)
   The ``'q'`` and ``'Q'`` conversion codes are available in native mode only if
   the platform C compiler supports C :c:type:`long long`, or, on Windows,
   :c:type:`__int64`. They are always available in standard modes.

(3)
   When attempting to pack a non-integer using any of the integer conversion
   codes, if the non-integer has a :meth:`__index__` method then that method is
   called to convert the argument to an integer before packing.

   .. versionchanged:: 3.2
      Use of the :meth:`__index__` method for non-integers is new in 3.2.

(4)
   The ``'n'`` and ``'N'`` conversion codes are only available for the native
   size (selected as the default or with the ``'@'`` byte order character).
   For the standard size, you can use whichever of the other integer formats
   fits your application.

(5)
   For the ``'f'``, ``'d'`` and ``'e'`` conversion codes, the packed
   representation uses the IEEE 754 binary32, binary64 or binary16 format (for
   ``'f'``, ``'d'`` or ``'e'`` respectively), regardless of the floating-point
   format used by the platform.

(6)
   The ``'P'`` format character is only available for the native byte ordering
   (selected as the default or with the ``'@'`` byte order character). The byte
   order character ``'='`` chooses to use little- or big-endian ordering based
   on the host system. The struct module does not interpret this as native
   ordering, so the ``'P'`` format is not available.

(7)
   The IEEE 754 binary16 "half precision" type was introduced in the 2008
   revision of the `IEEE 754 standard <ieee 754 standard_>`_. It has a sign
   bit, a 5-bit exponent and 11-bit precision (with 10 bits explicitly stored),
   and can represent numbers between approximately ``6.1e-05`` and ``6.5e+04``
   at full precision. This type is not widely supported by C compilers: on a
   typical machine, an unsigned short can be used for storage, but not for math
   operations. See the Wikipedia page on the `half-precision floating-point
   format <half precision format_>`_ for more information.
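
Some of the notes above can be checked directly. This is a sketch using only
standard-size formats; ``Index`` is a hypothetical class introduced for the
illustration::

   import struct

   class Index:
       """Hypothetical non-integer type that defines __index__ (note 3)."""
       def __index__(self):
           return 5

   # Note 3: __index__ converts the argument to an integer before packing.
   packed_index = struct.pack('<B', Index())    # b'\x05'

   # Note 4: 'n' and 'N' are rejected outside native mode.
   try:
       struct.pack('<n', 1)
   except struct.error:
       n_is_native_only = True

   # Notes 5 and 7: 'e' stores an IEEE 754 binary16 value, so most
   # decimal fractions are rounded when packed.
   half_one = struct.pack('<e', 1.0)            # 0x3C00, little-endian
   (roundtrip,) = struct.unpack('<e', struct.pack('<e', 0.1))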


A format character may be preceded by an integral repeat count. For example,
the format string ``'4h'`` means exactly the same as ``'hhhh'``.

Whitespace characters between formats are ignored; a count and its format must
not contain whitespace though.

For the ``'s'`` format character, the count is interpreted as the length of the
bytes, not a repeat count like for the other format characters; for example,
``'10s'`` means a single 10-byte string, while ``'10c'`` means 10 characters.
If a count is not given, it defaults to 1. For packing, the string is
truncated or padded with null bytes as appropriate to make it fit. For
unpacking, the resulting bytes object always has exactly the specified number
of bytes. As a special case, ``'0s'`` means a single, empty string (while
``'0c'`` means 0 characters).
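
A minimal sketch of the ``'s'`` rules described above::

   import struct

   padded = struct.pack('5s', b'ab')           # b'ab\x00\x00\x00'
   truncated = struct.pack('3s', b'abcdef')    # b'abc'
   default = struct.pack('s', b'xyz')          # count defaults to 1: b'x'

   # '10s' is one 10-byte value, while '10c' is ten 1-byte values.
   one_value = struct.unpack('10s', b'0123456789')
   ten_values = struct.unpack('10c', b'0123456789')

   # '0s' yields a single empty bytes object; '0c' yields nothing.
   empty_string = struct.unpack('0s', b'')     # (b'',)
   no_chars = struct.unpack('0c', b'')         # ()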

When packing a value ``x`` using one of the integer formats (``'b'``,
``'B'``, ``'h'``, ``'H'``, ``'i'``, ``'I'``, ``'l'``, ``'L'``,
``'q'``, ``'Q'``), if ``x`` is outside the valid range for that format
then :exc:`struct.error` is raised.

.. versionchanged:: 3.1
   In 3.0, some of the integer formats wrapped out-of-range values and
   raised :exc:`DeprecationWarning` instead of :exc:`struct.error`.
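
For instance, a sketch of the range check for the two one-byte integer
formats::

   import struct

   # 'B' accepts 0..255 and 'b' accepts -128..127.
   in_range = struct.pack('<Bb', 255, -128)    # b'\xff\x80'

   # Each of these values is just outside the range of its format.
   errors = []
   for fmt, value in (('<B', 256), ('<B', -1), ('<b', 128)):
       try:
           struct.pack(fmt, value)
       except struct.error:
           errors.append((fmt, value))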

The ``'p'`` format character encodes a "Pascal string", meaning a short
variable-length string stored in a *fixed number of bytes*, given by the count.
The first byte stored is the length of the string, or 255, whichever is
smaller. The bytes of the string follow. If the string passed in to
:func:`pack` is too long (longer than the count minus 1), only the leading
``count-1`` bytes of the string are stored. If the string is shorter than
``count-1``, it is padded with null bytes so that exactly count bytes in all
are used. Note that for :func:`unpack`, the ``'p'`` format character consumes
``count`` bytes, but that the string returned can never contain more than 255
bytes.
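
A small sketch of the ``'p'`` layout, with a count of 5 and of 3::

   import struct

   # Length byte, then the data, then null padding up to 5 bytes in all.
   short = struct.pack('5p', b'ab')            # b'\x02ab\x00\x00'

   # Only count - 1 = 2 bytes of the string fit after the length byte.
   too_long = struct.pack('3p', b'abcdef')     # b'\x02ab'

   # Unpacking consumes all 5 bytes but returns only the stored string.
   (value,) = struct.unpack('5p', short)       # b'ab'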

For the ``'?'`` format character, the return value is either :const:`True` or
:const:`False`. When packing, the truth value of the argument object is used.
Either 0 or 1 in the native or standard bool representation will be packed, and
any non-zero value will be ``True`` when unpacking.
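
A brief sketch of this behavior, using the standard one-byte representation::

   import struct

   # Packing uses the truth value of each argument.
   packed = struct.pack('<???', [], 'text', 0.0)     # b'\x00\x01\x00'

   # Unpacking maps any non-zero byte to True.
   flags = struct.unpack('<???', b'\x00\x01\x05')    # (False, True, True)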



.. _struct-examples:

Examples
^^^^^^^^

.. note::
   All examples assume a native byte order, size, and alignment with a
   big-endian machine.

A basic example of packing/unpacking three integers::

    >>> from struct import *
    >>> pack('hhl', 1, 2, 3)
    b'\x00\x01\x00\x02\x00\x00\x00\x03'
    >>> unpack('hhl', b'\x00\x01\x00\x02\x00\x00\x00\x03')
    (1, 2, 3)
    >>> calcsize('hhl')
    8

Unpacked fields can be named by assigning them to variables or by wrapping
the result in a named tuple::

    >>> record = b'raymond   \x32\x12\x08\x01\x08'
    >>> name, serialnum, school, gradelevel = unpack('<10sHHb', record)

    >>> from collections import namedtuple
    >>> Student = namedtuple('Student', 'name serialnum school gradelevel')
    >>> Student._make(unpack('<10sHHb', record))
    Student(name=b'raymond   ', serialnum=4658, school=264, gradelevel=8)

The ordering of format characters may have an impact on size since the padding
needed to satisfy alignment requirements is different::

    >>> pack('ci', b'*', 0x12131415)
    b'*\x00\x00\x00\x12\x13\x14\x15'
    >>> pack('ic', 0x12131415, b'*')
    b'\x12\x13\x14\x15*'
    >>> calcsize('ci')
    8
    >>> calcsize('ic')
    5

The following format ``'llh0l'`` specifies two pad bytes at the end, assuming
longs are aligned on 4-byte boundaries::

    >>> pack('llh0l', 1, 2, 3)
    b'\x00\x00\x00\x01\x00\x00\x00\x02\x00\x03\x00\x00'

This only works when native size and alignment are in effect; standard size and
alignment does not enforce any alignment.


.. seealso::

   Module :mod:`array`
      Packed binary storage of homogeneous data.

   Module :mod:`xdrlib`
      Packing and unpacking of XDR data.


.. _struct-objects:

Classes
-------

The :mod:`struct` module also defines the following type:


.. class:: Struct(format)

   Return a new Struct object which writes and reads binary data according to
   the format string *format*. Creating a Struct object once and calling its
   methods is more efficient than calling the :mod:`struct` functions with the
   same format since the format string only needs to be compiled once.


   Compiled Struct objects support the following methods and attributes:

   .. method:: pack(v1, v2, ...)

      Identical to the :func:`pack` function, using the compiled format.
      (``len(result)`` will equal :attr:`size`.)


   .. method:: pack_into(buffer, offset, v1, v2, ...)

      Identical to the :func:`pack_into` function, using the compiled format.


   .. method:: unpack(buffer)

      Identical to the :func:`unpack` function, using the compiled format.
      The buffer's size in bytes must equal :attr:`size`.


   .. method:: unpack_from(buffer, offset=0)

      Identical to the :func:`unpack_from` function, using the compiled format.
      The buffer's size in bytes, starting at position *offset*, must be at least
      :attr:`size`.


   .. method:: iter_unpack(buffer)

      Identical to the :func:`iter_unpack` function, using the compiled format.
      The buffer's size in bytes must be a multiple of :attr:`size`.

      .. versionadded:: 3.4

   .. attribute:: format

      The format string used to construct this Struct object.

      .. versionchanged:: 3.7
         The format string type is now :class:`str` instead of :class:`bytes`.

   .. attribute:: size

      The calculated size of the struct (and hence of the bytes object produced
      by the :meth:`pack` method) corresponding to :attr:`format`.
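
A brief sketch of reusing one compiled :class:`Struct` object. The ``point``
name and its layout are illustrative only::

   import struct

   point = struct.Struct('<hhI')    # hypothetical layout: x, y, colour

   # The format is compiled once and reused by every call below.
   packed = [point.pack(x, y, 0xFF) for x, y in ((1, 2), (3, 4))]
   assert all(len(p) == point.size for p in packed)

   # The methods mirror the module-level functions, minus the format argument.
   buf = bytearray(2 * point.size)
   point.pack_into(buf, point.size, 5, 6, 7)    # fill the second slot
   second = point.unpack_from(buf, point.size)
   points = list(point.iter_unpack(b''.join(packed)))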


.. _half precision format: https://en.wikipedia.org/wiki/Half-precision_floating-point_format

.. _ieee 754 standard: https://en.wikipedia.org/wiki/IEEE_floating_point#IEEE_754-2008