Chris Lattner | 3a1716d | 2007-05-12 05:37:42 +0000 | [diff] [blame] | 1 | <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" |
| 2 | "http://www.w3.org/TR/html4/strict.dtd"> |
Reid Spencer | 2c1ce4f | 2007-01-20 23:21:08 +0000 | [diff] [blame] | 3 | <html> |
| 4 | <head> |
| 5 | <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> |
| 6 | <title>LLVM Bitcode File Format</title> |
| 7 | <link rel="stylesheet" href="llvm.css" type="text/css"> |
Reid Spencer | 2c1ce4f | 2007-01-20 23:21:08 +0000 | [diff] [blame] | 8 | </head> |
| 9 | <body> |
| 10 | <div class="doc_title"> LLVM Bitcode File Format </div> |
| 11 | <ol> |
| 12 | <li><a href="#abstract">Abstract</a></li> |
Chris Lattner | e9ef457 | 2007-05-12 03:23:40 +0000 | [diff] [blame] | 13 | <li><a href="#overview">Overview</a></li> |
| 14 | <li><a href="#bitstream">Bitstream Format</a> |
| 15 | <ol> |
| 16 | <li><a href="#magic">Magic Numbers</a></li> |
Chris Lattner | 3a1716d | 2007-05-12 05:37:42 +0000 | [diff] [blame] | 17 | <li><a href="#primitives">Primitives</a></li> |
| 18 | <li><a href="#abbrevid">Abbreviation IDs</a></li> |
| 19 | <li><a href="#blocks">Blocks</a></li> |
| 20 | <li><a href="#datarecord">Data Records</a></li> |
Chris Lattner | daeb63c | 2007-05-12 07:49:15 +0000 | [diff] [blame] | 21 | <li><a href="#abbreviations">Abbreviations</a></li> |
Chris Lattner | 7300af5 | 2007-05-13 00:59:52 +0000 | [diff] [blame] | 22 | <li><a href="#stdblocks">Standard Blocks</a></li> |
Chris Lattner | e9ef457 | 2007-05-12 03:23:40 +0000 | [diff] [blame] | 23 | </ol> |
| 24 | </li> |
Chris Lattner | 6fa6a32 | 2008-07-09 05:14:23 +0000 | [diff] [blame] | 25 | <li><a href="#wrapper">Bitcode Wrapper Format</a> |
| 26 | </li> |
Chris Lattner | 69b3e40 | 2007-05-13 01:39:44 +0000 | [diff] [blame] | 27 | <li><a href="#llvmir">LLVM IR Encoding</a> |
| 28 | <ol> |
| 29 | <li><a href="#basics">Basics</a></li> |
Chris Lattner | 5c303e8 | 2009-10-29 04:25:46 +0000 | [diff] [blame] | 30 | <li><a href="#MODULE_BLOCK">MODULE_BLOCK Contents</a></li> |
| 31 | <li><a href="#PARAMATTR_BLOCK">PARAMATTR_BLOCK Contents</a></li> |
| 32 | <li><a href="#TYPE_BLOCK">TYPE_BLOCK Contents</a></li> |
| 33 | <li><a href="#CONSTANTS_BLOCK">CONSTANTS_BLOCK Contents</a></li> |
| 34 | <li><a href="#FUNCTION_BLOCK">FUNCTION_BLOCK Contents</a></li> |
| 35 | <li><a href="#TYPE_SYMTAB_BLOCK">TYPE_SYMTAB_BLOCK Contents</a></li> |
| 36 | <li><a href="#VALUE_SYMTAB_BLOCK">VALUE_SYMTAB_BLOCK Contents</a></li> |
| 37 | <li><a href="#METADATA_BLOCK">METADATA_BLOCK Contents</a></li> |
| 38 | <li><a href="#METADATA_ATTACHMENT">METADATA_ATTACHMENT Contents</a></li> |
Chris Lattner | 69b3e40 | 2007-05-13 01:39:44 +0000 | [diff] [blame] | 39 | </ol> |
| 40 | </li> |
Reid Spencer | 2c1ce4f | 2007-01-20 23:21:08 +0000 | [diff] [blame] | 41 | </ol> |
| 42 | <div class="doc_author"> |
Chris Lattner | f19b8e4 | 2007-10-08 18:42:45 +0000 | [diff] [blame] | 43 | <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a> |
| 44 | and <a href="http://www.reverberate.org">Joshua Haberman</a>. |
Reid Spencer | 2c1ce4f | 2007-01-20 23:21:08 +0000 | [diff] [blame] | 45 | </p> |
| 46 | </div> |
Chris Lattner | e9ef457 | 2007-05-12 03:23:40 +0000 | [diff] [blame] | 47 | |
Reid Spencer | 2c1ce4f | 2007-01-20 23:21:08 +0000 | [diff] [blame] | 48 | <!-- *********************************************************************** --> |
Chris Lattner | e9ef457 | 2007-05-12 03:23:40 +0000 | [diff] [blame] | 49 | <div class="doc_section"> <a name="abstract">Abstract</a></div> |
Reid Spencer | 2c1ce4f | 2007-01-20 23:21:08 +0000 | [diff] [blame] | 50 | <!-- *********************************************************************** --> |
Chris Lattner | e9ef457 | 2007-05-12 03:23:40 +0000 | [diff] [blame] | 51 | |
Reid Spencer | 2c1ce4f | 2007-01-20 23:21:08 +0000 | [diff] [blame] | 52 | <div class="doc_text"> |
Chris Lattner | e9ef457 | 2007-05-12 03:23:40 +0000 | [diff] [blame] | 53 | |
| 54 | <p>This document describes the LLVM bitstream file format and the encoding of |
| 55 | the LLVM IR into it.</p> |
| 56 | |
Reid Spencer | 2c1ce4f | 2007-01-20 23:21:08 +0000 | [diff] [blame] | 57 | </div> |
Chris Lattner | e9ef457 | 2007-05-12 03:23:40 +0000 | [diff] [blame] | 58 | |
Reid Spencer | 2c1ce4f | 2007-01-20 23:21:08 +0000 | [diff] [blame] | 59 | <!-- *********************************************************************** --> |
Chris Lattner | e9ef457 | 2007-05-12 03:23:40 +0000 | [diff] [blame] | 60 | <div class="doc_section"> <a name="overview">Overview</a></div> |
Reid Spencer | 2c1ce4f | 2007-01-20 23:21:08 +0000 | [diff] [blame] | 61 | <!-- *********************************************************************** --> |
Chris Lattner | e9ef457 | 2007-05-12 03:23:40 +0000 | [diff] [blame] | 62 | |
Reid Spencer | 2c1ce4f | 2007-01-20 23:21:08 +0000 | [diff] [blame] | 63 | <div class="doc_text"> |
Chris Lattner | e9ef457 | 2007-05-12 03:23:40 +0000 | [diff] [blame] | 64 | |
| 65 | <p> |
| 66 | What is commonly known as the LLVM bitcode file format (also sometimes |
| 67 | anachronistically called bytecode) is actually two things: a <a |
| 68 | href="#bitstream">bitstream container format</a> |
| 69 | and an <a href="#llvmir">encoding of LLVM IR</a> into the container format.</p> |
| 70 | |
| 71 | <p> |
Reid Spencer | 58d0547 | 2007-05-12 08:01:52 +0000 | [diff] [blame] | 72 | The bitstream format is an abstract encoding of structured data, very |
Chris Lattner | e9ef457 | 2007-05-12 03:23:40 +0000 | [diff] [blame] | 73 | similar to XML in some ways. Like XML, bitstream files contain tags and nested |
| 74 | structures, and you can parse the file without having to understand the tags. |
| 75 | Unlike XML, the bitstream format is a binary encoding, and it |
| 76 | provides a mechanism for the file to self-describe "abbreviations", which are |
| 77 | effectively size optimizations for the content.</p> |
| 78 | |
Chris Lattner | 6fa6a32 | 2008-07-09 05:14:23 +0000 | [diff] [blame] | 79 | <p>LLVM IR files may be optionally embedded into a <a |
| 80 | href="#wrapper">wrapper</a> structure that makes it easy to embed extra data |
| 81 | along with LLVM IR files.</p> |
| 82 | |
| 83 | <p>This document first describes the LLVM bitstream format, then the |
| 84 | wrapper format, and finally the record structure used by LLVM IR files. |
Chris Lattner | e9ef457 | 2007-05-12 03:23:40 +0000 | [diff] [blame] | 85 | </p> |
| 86 | |
Reid Spencer | 2c1ce4f | 2007-01-20 23:21:08 +0000 | [diff] [blame] | 87 | </div> |
Chris Lattner | e9ef457 | 2007-05-12 03:23:40 +0000 | [diff] [blame] | 88 | |
| 89 | <!-- *********************************************************************** --> |
| 90 | <div class="doc_section"> <a name="bitstream">Bitstream Format</a></div> |
| 91 | <!-- *********************************************************************** --> |
| 92 | |
| 93 | <div class="doc_text"> |
| 94 | |
| 95 | <p> |
| 96 | The bitstream format is literally a stream of bits, with a very simple |
| 97 | structure. This structure consists of the following concepts: |
| 98 | </p> |
| 99 | |
| 100 | <ul> |
Chris Lattner | 3a1716d | 2007-05-12 05:37:42 +0000 | [diff] [blame] | 101 | <li>A "<a href="#magic">magic number</a>" that identifies the contents of |
| 102 | the stream.</li> |
| 103 | <li>Encoding <a href="#primitives">primitives</a> like variable bit-rate |
| 104 | integers.</li> |
| 105 | <li><a href="#blocks">Blocks</a>, which define nested content.</li> |
| 106 | <li><a href="#datarecord">Data Records</a>, which describe entities within the |
| 107 | file.</li> |
Chris Lattner | e9ef457 | 2007-05-12 03:23:40 +0000 | [diff] [blame] | 108 | <li>Abbreviations, which specify compression optimizations for the file.</li> |
| 109 | </ul> |
| 110 | |
| 111 | <p>Note that the <a |
| 112 | href="CommandGuide/html/llvm-bcanalyzer.html">llvm-bcanalyzer</a> tool can be |
| 113 | used to dump and inspect arbitrary bitstreams, which is very useful for |
| 114 | understanding the encoding.</p> |
| 115 | |
| 116 | </div> |
| 117 | |
| 118 | <!-- ======================================================================= --> |
| 119 | <div class="doc_subsection"><a name="magic">Magic Numbers</a> |
| 120 | </div> |
| 121 | |
| 122 | <div class="doc_text"> |
| 123 | |
Chris Lattner | f19b8e4 | 2007-10-08 18:42:45 +0000 | [diff] [blame] | 124 | <p>The first two bytes of a bitcode file are 'BC' (0x42, 0x43). |
| 125 | The next two bytes are an application-specific magic number. Generic |
| 126 | bitcode tools can look at only the first two bytes to verify the file is |
| 127 | bitcode, while application-specific programs will want to look at all four.</p> |
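<p>The check described above can be sketched as follows. This is a minimal
illustration, not LLVM's implementation: the function name
<tt>classify_magic</tt> and the example application bytes are assumptions made
for this example.</p>

```python
def classify_magic(data: bytes):
    """Return (is_bitcode, app_magic) for the first four bytes of a file.

    Hypothetical helper: a generic tool only needs is_bitcode, while an
    application-specific tool would also compare app_magic to its own value.
    """
    if len(data) < 4:
        return (False, None)
    is_bitcode = data[0:2] == b"BC"  # 0x42, 0x43
    app_magic = data[2:4] if is_bitcode else None
    return (is_bitcode, app_magic)


# The bytes 0x12 0x34 are a made-up application magic number.
print(classify_magic(b"BC\x12\x34rest of stream"))
```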
Chris Lattner | e9ef457 | 2007-05-12 03:23:40 +0000 | [diff] [blame] | 128 | |
| 129 | </div> |
| 130 | |
Chris Lattner | 3a1716d | 2007-05-12 05:37:42 +0000 | [diff] [blame] | 131 | <!-- ======================================================================= --> |
| 132 | <div class="doc_subsection"><a name="primitives">Primitives</a> |
| 133 | </div> |
Chris Lattner | e9ef457 | 2007-05-12 03:23:40 +0000 | [diff] [blame] | 134 | |
| 135 | <div class="doc_text"> |
| 136 | |
Chris Lattner | 3a1716d | 2007-05-12 05:37:42 +0000 | [diff] [blame] | 137 | <p> |
Chris Lattner | f19b8e4 | 2007-10-08 18:42:45 +0000 | [diff] [blame] | 138 | A bitstream literally consists of a stream of bits, which are read in order |
| 139 | starting with the least significant bit of each byte. The stream is made up of a |
Chris Lattner | 69b3e40 | 2007-05-13 01:39:44 +0000 | [diff] [blame] | 140 | number of primitive values that encode a stream of unsigned integer values. |
| 141 | These |
Chris Lattner | 3a1716d | 2007-05-12 05:37:42 +0000 | [diff] [blame] | 142 | integers are encoded in two ways: either as <a href="#fixedwidth">Fixed |
| 143 | Width Integers</a> or as <a href="#variablewidth">Variable Width |
| 144 | Integers</a>. |
Chris Lattner | e9ef457 | 2007-05-12 03:23:40 +0000 | [diff] [blame] | 145 | </p> |
| 146 | |
| 147 | </div> |
| 148 | |
Chris Lattner | 3a1716d | 2007-05-12 05:37:42 +0000 | [diff] [blame] | 149 | <!-- _______________________________________________________________________ --> |
| 150 | <div class="doc_subsubsection"> <a name="fixedwidth">Fixed Width Integers</a> |
| 151 | </div> |
| 152 | |
| 153 | <div class="doc_text"> |
| 154 | |
| 155 | <p>Fixed-width integer values have their low bits emitted directly to the file. |
| 156 | For example, a 3-bit integer value encodes 1 as 001. Fixed-width integers |
| 157 | are used when there is a well-known number of options for a field. For |
| 158 | example, boolean values are usually encoded with a 1-bit wide integer. |
| 159 | </p> |
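<p>A minimal sketch of reading fixed-width fields, assuming (as in the
Primitives section) that bits are consumed starting with the least significant
bit of each byte, and additionally assuming that the first bit read becomes the
least significant bit of the decoded value. The class name
<tt>BitReader</tt> is hypothetical.</p>

```python
class BitReader:
    """Hypothetical LSB-first bit reader over a bytes object."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current position, measured in bits

    def read_fixed(self, width: int) -> int:
        """Read `width` bits and return them as an unsigned integer."""
        value = 0
        for i in range(width):
            byte = self.data[self.pos >> 3]      # byte holding the next bit
            bit = (byte >> (self.pos & 7)) & 1   # low bit of each byte comes first
            value |= bit << i                    # assumed: first bit read is the LSB
            self.pos += 1
        return value


reader = BitReader(bytes([0b00001001]))
three_bit_field = reader.read_fixed(3)  # the 3-bit value 001
boolean_field = reader.read_fixed(1)    # a 1-bit boolean
```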
| 160 | |
| 161 | </div> |
| 162 | |
| 163 | <!-- _______________________________________________________________________ --> |
| 164 | <div class="doc_subsubsection"> <a name="variablewidth">Variable Width |
| 165 | Integers</a></div> |
| 166 | |
| 167 | <div class="doc_text"> |
| 168 | |
| 169 | <p>Variable-width integer (VBR) values encode values of arbitrary size, |
| 170 | optimizing for the case where the values are small. Given a 4-bit VBR field, |
| 171 | any 3-bit value (0 through 7) is encoded directly, with the high bit set to |
| 172 | zero. In general, an N-bit VBR field holds N-1 data bits; larger values emit their bits in a series of N-1 bit |
| 173 | chunks, low-order chunk first, where all but the last chunk set the high bit.</p> |
| 174 | |
| 175 | <p>For example, the value 27 (0x1B) is encoded as 1011 0011 when emitted as a |
| 176 | vbr4 value. The first set of four bits indicates the value 3 (011) with a |
| 177 | continuation piece (indicated by a high bit of 1). The next chunk indicates a |
| 178 | value of 24 (011 &lt;&lt; 3) with no continuation. The sum (3+24) yields the value |
| 179 | 27. |
| 180 | </p> |
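<p>The chunking rule can be sketched at the level of chunk values, leaving the
bit packing aside. The helper names <tt>encode_vbr</tt> and
<tt>decode_vbr</tt> are assumptions for illustration; the example reproduces
the vbr4 encoding of 27 given above.</p>

```python
def encode_vbr(value: int, width: int) -> list[int]:
    """Split a non-negative value into width-bit chunks, low-order chunk first."""
    data_bits = width - 1
    cont = 1 << data_bits          # high bit marks "more chunks follow"
    mask = cont - 1
    chunks = []
    while value > mask:
        chunks.append((value & mask) | cont)
        value >>= data_bits
    chunks.append(value)           # last chunk has the high bit clear
    return chunks


def decode_vbr(chunks: list[int], width: int) -> int:
    """Reassemble a value from chunks produced by encode_vbr."""
    data_bits = width - 1
    cont = 1 << data_bits
    value = 0
    shift = 0
    for chunk in chunks:
        value |= (chunk & (cont - 1)) << shift
        shift += data_bits
        if not chunk & cont:
            break
    return value


print([format(c, "04b") for c in encode_vbr(27, 4)])  # ['1011', '0011']
```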
| 181 | |
| 182 | </div> |
| 183 | |
| 184 | <!-- _______________________________________________________________________ --> |
| 185 | <div class="doc_subsubsection"> <a name="char6">6-bit characters</a></div> |
| 186 | |
| 187 | <div class="doc_text"> |
| 188 | |
| 189 | <p>6-bit characters encode common characters into a fixed 6-bit field. They |
Chris Lattner | f1d64e9 | 2007-05-12 07:50:14 +0000 | [diff] [blame] | 190 | represent the following characters with the following 6-bit values:</p> |
Chris Lattner | 3a1716d | 2007-05-12 05:37:42 +0000 | [diff] [blame] | 191 | |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 192 | <div class="doc_code"> |
| 193 | <pre> |
| 194 | 'a' .. 'z' — 0 .. 25 |
| 195 | 'A' .. 'Z' — 26 .. 51 |
| 196 | '0' .. '9' — 52 .. 61 |
| 197 | '.' — 62 |
| 198 | '_' — 63 |
| 199 | </pre> |
| 200 | </div> |
Chris Lattner | 3a1716d | 2007-05-12 05:37:42 +0000 | [diff] [blame] | 201 | |
| 202 | <p>This encoding is only suitable for encoding characters and strings that |
| 203 | consist only of the above characters. It is completely incapable of encoding |
| 204 | characters not in the set.</p> |
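<p>The table above amounts to an index into a 64-character alphabet. A small
sketch, with the hypothetical helper names <tt>encode_char6</tt> and
<tt>decode_char6</tt>:</p>

```python
import string

# Alphabet order follows the table: a-z, A-Z, 0-9, '.', '_'
CHAR6_ALPHABET = string.ascii_lowercase + string.ascii_uppercase + string.digits + "._"
CHAR6_CODES = {ch: code for code, ch in enumerate(CHAR6_ALPHABET)}


def encode_char6(text: str) -> list[int]:
    """Map each character to its 6-bit code; reject anything outside the set."""
    codes = []
    for ch in text:
        if ch not in CHAR6_CODES:
            raise ValueError(f"character {ch!r} cannot be encoded as char6")
        codes.append(CHAR6_CODES[ch])
    return codes


def decode_char6(codes: list[int]) -> str:
    """Map 6-bit codes back to characters."""
    return "".join(CHAR6_ALPHABET[code] for code in codes)


print(encode_char6("zA09._"))  # [25, 26, 52, 61, 62, 63]
```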
| 205 | |
| 206 | </div> |
| 207 | |
| 208 | <!-- _______________________________________________________________________ --> |
| 209 | <div class="doc_subsubsection"> <a name="wordalign">Word Alignment</a></div> |
| 210 | |
| 211 | <div class="doc_text"> |
| 212 | |
| 213 | <p>Occasionally, it is useful to emit zero bits until the bitstream length is a |
| 214 | multiple of 32 bits. This ensures that the bit position in the stream can be |
| 215 | represented as a multiple of 32-bit words.</p> |
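<p>The number of zero bits needed is a simple modular calculation. The
function name <tt>padding_to_word</tt> is an assumption for this sketch.</p>

```python
def padding_to_word(bit_pos: int, word_bits: int = 32) -> int:
    """Number of zero bits to emit so that bit_pos becomes a multiple of word_bits."""
    return (-bit_pos) % word_bits


# A stream that is 37 bits long needs 27 zero bits to reach 64 bits.
print(padding_to_word(37))  # 27
```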
| 216 | |
| 217 | </div> |
| 218 | |
| 219 | |
| 220 | <!-- ======================================================================= --> |
| 221 | <div class="doc_subsection"><a name="abbrevid">Abbreviation IDs</a> |
| 222 | </div> |
| 223 | |
| 224 | <div class="doc_text"> |
| 225 | |
| 226 | <p> |
| 227 | A bitstream is a sequential series of <a href="#blocks">Blocks</a> and |
| 228 | <a href="#datarecord">Data Records</a>. Both of these start with an |
| 229 | abbreviation ID encoded as a fixed-bitwidth field. The width is specified by |
| 230 | the current block, as described below. The value of the abbreviation ID |
| 231 | specifies either a builtin ID (which has a special meaning, defined below) or |
Chris Lattner | 5c303e8 | 2009-10-29 04:25:46 +0000 | [diff] [blame] | 232 | one of the abbreviation IDs defined for the current block by the stream itself. |
Chris Lattner | 3a1716d | 2007-05-12 05:37:42 +0000 | [diff] [blame] | 233 | </p> |
| 234 | |
| 235 | <p> |
| 236 | The set of builtin abbrev IDs is: |
| 237 | </p> |
| 238 | |
| 239 | <ul> |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 240 | <li><tt>0 - <a href="#END_BLOCK">END_BLOCK</a></tt> — This abbrev ID marks |
| 241 | the end of the current block.</li> |
| 242 | <li><tt>1 - <a href="#ENTER_SUBBLOCK">ENTER_SUBBLOCK</a></tt> — This |
| 243 | abbrev ID marks the beginning of a new block.</li> |
| 244 | <li><tt>2 - <a href="#DEFINE_ABBREV">DEFINE_ABBREV</a></tt> — This defines |
| 245 | a new abbreviation.</li> |
| 246 | <li><tt>3 - <a href="#UNABBREV_RECORD">UNABBREV_RECORD</a></tt> — This ID |
| 247 | specifies the definition of an unabbreviated record.</li> |
Chris Lattner | 3a1716d | 2007-05-12 05:37:42 +0000 | [diff] [blame] | 248 | </ul> |
| 249 | |
Chris Lattner | daeb63c | 2007-05-12 07:49:15 +0000 | [diff] [blame] | 250 | <p>Abbreviation IDs 4 and above are defined by the stream itself, and specify |
| 251 | an <a href="#abbrev_records">abbreviated record encoding</a>.</p> |
Chris Lattner | 3a1716d | 2007-05-12 05:37:42 +0000 | [diff] [blame] | 252 | |
| 253 | </div> |
| 254 | |
| 255 | <!-- ======================================================================= --> |
| 256 | <div class="doc_subsection"><a name="blocks">Blocks</a> |
| 257 | </div> |
| 258 | |
| 259 | <div class="doc_text"> |
| 260 | |
| 261 | <p> |
| 262 | Blocks in a bitstream denote nested regions of the stream, and are identified by |
| 263 | a content-specific id number (for example, LLVM IR uses an ID of 12 to represent |
Chris Lattner | f19b8e4 | 2007-10-08 18:42:45 +0000 | [diff] [blame] | 264 | function bodies). Block IDs 0-7 are reserved for <a href="#stdblocks">standard blocks</a> |
| 265 | whose meaning is defined by Bitcode; block IDs 8 and greater are |
Benjamin Kramer | 8040cd3 | 2009-10-12 14:46:08 +0000 | [diff] [blame] | 266 | application-specific. Nested blocks capture the hierarchical structure of the data |
Chris Lattner | 3a1716d | 2007-05-12 05:37:42 +0000 | [diff] [blame] | 267 | encoded in the stream, and various properties are associated with blocks as the file is |
| 268 | parsed. Block definitions allow the reader to efficiently skip blocks |
| 269 | in constant time if the reader wants a summary of blocks, or if it wants to |
Chris Lattner | 5c303e8 | 2009-10-29 04:25:46 +0000 | [diff] [blame] | 270 | efficiently skip data it does not understand. The LLVM IR reader uses this |
Chris Lattner | 3a1716d | 2007-05-12 05:37:42 +0000 | [diff] [blame] | 271 | mechanism to skip function bodies, lazily reading them on demand. |
| 272 | </p> |
| 273 | |
| 274 | <p> |
| 275 | When reading and encoding the stream, several properties are maintained for the |
| 276 | block. In particular, each block maintains: |
| 277 | </p> |
| 278 | |
| 279 | <ol> |
Chris Lattner | 5c303e8 | 2009-10-29 04:25:46 +0000 | [diff] [blame] | 280 | <li>A current abbrev id width. This value starts at 2 at the beginning of |
| 281 | the stream, and is set every time a |
Chris Lattner | 3a1716d | 2007-05-12 05:37:42 +0000 | [diff] [blame] | 282 | block record is entered. The block entry specifies the abbrev id width for |
| 283 | the body of the block.</li> |
| 284 | |
Chris Lattner | f19b8e4 | 2007-10-08 18:42:45 +0000 | [diff] [blame] | 285 | <li>A set of abbreviations. Abbreviations may be defined within a block, in |
| 286 | which case they are only defined in that block (neither subblocks nor |
| 287 | enclosing blocks see the abbreviation). Abbreviations can also be defined |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 288 | inside a <tt><a href="#BLOCKINFO">BLOCKINFO</a></tt> block, in which case |
| 289 | they are defined in all blocks that match the ID that the BLOCKINFO block is |
| 290 | describing. |
Chris Lattner | 3a1716d | 2007-05-12 05:37:42 +0000 | [diff] [blame] | 291 | </li> |
| 292 | </ol> |
| 293 | |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 294 | <p> |
| 295 | As sub-blocks are entered, these properties are saved and the new sub-block has |
| 296 | its own set of abbreviations, and its own abbrev id width. When a sub-block is |
| 297 | popped, the saved values are restored. |
| 298 | </p> |
Chris Lattner | 3a1716d | 2007-05-12 05:37:42 +0000 | [diff] [blame] | 299 | |
| 300 | </div> |
| 301 | |
| 302 | <!-- _______________________________________________________________________ --> |
| 303 | <div class="doc_subsubsection"> <a name="ENTER_SUBBLOCK">ENTER_SUBBLOCK |
| 304 | Encoding</a></div> |
| 305 | |
| 306 | <div class="doc_text"> |
| 307 | |
| 308 | <p><tt>[ENTER_SUBBLOCK, blockid<sub>vbr8</sub>, newabbrevlen<sub>vbr4</sub>, |
| 309 | &lt;align32bits&gt;, blocklen<sub>32</sub>]</tt></p> |
| 310 | |
| 311 | <p> |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 312 | The <tt>ENTER_SUBBLOCK</tt> abbreviation ID specifies the start of a new block |
| 313 | record. The <tt>blockid</tt> value is encoded as an 8-bit VBR identifier, and |
| 314 | indicates the type of block being entered, which can be |
| 315 | a <a href="#stdblocks">standard block</a> or an application-specific block. |
| 316 | The <tt>newabbrevlen</tt> value is a 4-bit VBR, which specifies the abbrev id |
| 317 | width for the sub-block. The <tt>blocklen</tt> value is a 32-bit aligned value |
| 318 | that specifies the size of the subblock in 32-bit words. This value allows the |
| 319 | reader to skip over the entire block in one jump. |
Chris Lattner | 3a1716d | 2007-05-12 05:37:42 +0000 | [diff] [blame] | 320 | </p> |
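<p>A sketch of parsing this header and computing where the block ends, so a
reader can skip it. Everything here is illustrative: the class
<tt>BitCursor</tt> and the function <tt>read_enter_subblock</tt> are
hypothetical, and the sketch assumes that <tt>blocklen</tt> counts the 32-bit
words that follow the length field.</p>

```python
class BitCursor:
    """Hypothetical LSB-first reader with the primitives used by block headers."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # bit position

    def fixed(self, width: int) -> int:
        value = 0
        for i in range(width):
            bit = (self.data[self.pos >> 3] >> (self.pos & 7)) & 1
            value |= bit << i
            self.pos += 1
        return value

    def vbr(self, width: int) -> int:
        high = 1 << (width - 1)
        value, shift = 0, 0
        while True:
            chunk = self.fixed(width)
            value |= (chunk & (high - 1)) << shift
            shift += width - 1
            if not chunk & high:
                return value

    def align32(self) -> None:
        self.pos += (-self.pos) % 32


def read_enter_subblock(cur: BitCursor, abbrev_width: int):
    """Return (blockid, newabbrevlen, body_start_bit, body_end_bit)."""
    if cur.fixed(abbrev_width) != 1:  # 1 is the ENTER_SUBBLOCK abbrev ID
        raise ValueError("not an ENTER_SUBBLOCK record")
    block_id = cur.vbr(8)
    new_abbrev_len = cur.vbr(4)
    cur.align32()
    block_len_words = cur.fixed(32)
    body_start = cur.pos
    # Assumption: the length covers the words after the blocklen field.
    body_end = body_start + 32 * block_len_words
    return block_id, new_abbrev_len, body_start, body_end


# Header for block ID 12 with abbrev width 3, followed by a 2-word body.
header = (1 | (12 << 2) | (3 << 10)).to_bytes(4, "little")
stream = header + (2).to_bytes(4, "little") + bytes(8)
cursor = BitCursor(stream)
info = read_enter_subblock(cursor, abbrev_width=2)
cursor.pos = info[3]  # skip the whole block in one jump
```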
| 321 | |
| 322 | </div> |
| 323 | |
| 324 | <!-- _______________________________________________________________________ --> |
| 325 | <div class="doc_subsubsection"> <a name="END_BLOCK">END_BLOCK |
| 326 | Encoding</a></div> |
| 327 | |
| 328 | <div class="doc_text"> |
| 329 | |
| 330 | <p><tt>[END_BLOCK, &lt;align32bits&gt;]</tt></p> |
| 331 | |
| 332 | <p> |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 333 | The <tt>END_BLOCK</tt> abbreviation ID specifies the end of the current block |
| 334 | record. Its end is aligned to 32 bits to ensure that the size of the block is |
| 335 | an exact multiple of 32 bits. |
| 336 | </p> |
Chris Lattner | 3a1716d | 2007-05-12 05:37:42 +0000 | [diff] [blame] | 337 | |
| 338 | </div> |
| 339 | |
| 340 | |
| 341 | |
| 342 | <!-- ======================================================================= --> |
| 343 | <div class="doc_subsection"><a name="datarecord">Data Records</a> |
| 344 | </div> |
| 345 | |
| 346 | <div class="doc_text"> |
Chris Lattner | daeb63c | 2007-05-12 07:49:15 +0000 | [diff] [blame] | 347 | <p> |
Chris Lattner | 5c303e8 | 2009-10-29 04:25:46 +0000 | [diff] [blame] | 348 | Data records consist of a record code and a number of (up to) 64-bit |
| 349 | integer values. The interpretation of the code and values is |
| 350 | application specific and may vary between different block types. |
| 351 | Records can be encoded either using an unabbrev record, or with an |
| 352 | abbreviation. In the LLVM IR format, for example, there is a record |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 353 | which encodes the target triple of a module. The code is |
Chris Lattner | 5c303e8 | 2009-10-29 04:25:46 +0000 | [diff] [blame] | 354 | <tt>MODULE_CODE_TRIPLE</tt>, and the values of the record are the |
| 355 | ASCII codes for the characters in the string. |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 356 | </p> |
Chris Lattner | daeb63c | 2007-05-12 07:49:15 +0000 | [diff] [blame] | 357 | |
| 358 | </div> |
| 359 | |
| 360 | <!-- _______________________________________________________________________ --> |
| 361 | <div class="doc_subsubsection"> <a name="UNABBREV_RECORD">UNABBREV_RECORD |
| 362 | Encoding</a></div> |
| 363 | |
| 364 | <div class="doc_text"> |
| 365 | |
| 366 | <p><tt>[UNABBREV_RECORD, code<sub>vbr6</sub>, numops<sub>vbr6</sub>, |
| 367 | op0<sub>vbr6</sub>, op1<sub>vbr6</sub>, ...]</tt></p> |
| 368 | |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 369 | <p> |
| 370 | An <tt>UNABBREV_RECORD</tt> provides a default fallback encoding, which is both |
| 371 | completely general and extremely inefficient. It can describe an arbitrary |
Chris Lattner | 5c303e8 | 2009-10-29 04:25:46 +0000 | [diff] [blame] | 372 | record by emitting the code and operands as VBRs. |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 373 | </p> |
Chris Lattner | daeb63c | 2007-05-12 07:49:15 +0000 | [diff] [blame] | 374 | |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 375 | <p> |
| 376 | For example, emitting an LLVM IR target triple as an unabbreviated record |
| 377 | requires emitting the <tt>UNABBREV_RECORD</tt> abbrevid, a vbr6 for the |
| 378 | <tt>MODULE_CODE_TRIPLE</tt> code, a vbr6 for the length of the string, which is |
| 379 | equal to the number of operands, and a vbr6 for each character. Because a single |
| 380 | vbr6 chunk carries only 5 data bits (values 0 through 31) and no letter has a value less than 32, each letter would need to be emitted as |
| 381 | at least a two-part VBR, which means that each letter would require at least 12 |
| 382 | bits. This is not an efficient encoding, but it is fully general. |
| 383 | </p> |
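<p>The cost described above can be estimated with a short calculation. The
helper names are hypothetical, and the record code 2 and abbrev id width 2 used
in the example are placeholder values chosen for illustration.</p>

```python
def vbr_bits(value: int, width: int) -> int:
    """Number of bits a non-negative value occupies as a VBR of the given width."""
    data_bits = width - 1
    chunks = 1
    while value >= (1 << data_bits):
        value >>= data_bits
        chunks += 1
    return chunks * width


def unabbrev_record_bits(code: int, operands: list[int], abbrev_width: int) -> int:
    """Total size of [UNABBREV_RECORD, code vbr6, numops vbr6, ops vbr6...]."""
    return (
        abbrev_width
        + vbr_bits(code, 6)
        + vbr_bits(len(operands), 6)
        + sum(vbr_bits(op, 6) for op in operands)
    )


triple = [ord(ch) for ch in "abcd"]
print(vbr_bits(ord("a"), 6))                               # 12 bits per letter
print(unabbrev_record_bits(2, triple, abbrev_width=2))     # 62 bits in total
```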
Chris Lattner | daeb63c | 2007-05-12 07:49:15 +0000 | [diff] [blame] | 384 | |
| 385 | </div> |
| 386 | |
| 387 | <!-- _______________________________________________________________________ --> |
| 388 | <div class="doc_subsubsection"> <a name="abbrev_records">Abbreviated Record |
| 389 | Encoding</a></div> |
| 390 | |
| 391 | <div class="doc_text"> |
| 392 | |
| 393 | <p><tt>[&lt;abbrevid&gt;, fields...]</tt></p> |
| 394 | |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 395 | <p> |
| 396 | An abbreviated record is an abbreviation ID followed by a set of fields that are |
| 397 | encoded according to the <a href="#abbreviations">abbreviation definition</a>. |
| 398 | This allows records to be encoded significantly more densely than records |
| 399 | encoded with the <tt><a href="#UNABBREV_RECORD">UNABBREV_RECORD</a></tt> type, |
| 400 | and allows the abbreviation types to be specified in the stream itself, which |
| 401 | allows the files to be completely self-describing. The actual encoding of |
| 402 | abbreviations is defined below. |
Chris Lattner | daeb63c | 2007-05-12 07:49:15 +0000 | [diff] [blame] | 403 | </p> |
| 404 | |
Chris Lattner | 5c303e8 | 2009-10-29 04:25:46 +0000 | [diff] [blame] | 405 | <p>The record code, which is the first field of an abbreviated record, |
| 406 | may be encoded in the abbreviation definition (as a literal |
| 407 | operand) or supplied in the abbreviated record (as a Fixed or VBR |
| 408 | operand value).</p> |
| 409 | |
Chris Lattner | daeb63c | 2007-05-12 07:49:15 +0000 | [diff] [blame] | 410 | </div> |
| 411 | |
| 412 | <!-- ======================================================================= --> |
| 413 | <div class="doc_subsection"><a name="abbreviations">Abbreviations</a> |
| 414 | </div> |
| 415 | |
| 416 | <div class="doc_text"> |
| 417 | <p> |
| 418 | Abbreviations are an important form of compression for bitstreams. The idea is |
| 419 | to specify a dense encoding for a class of records once, then use that encoding |
| 420 | to emit many records. It takes space to emit the encoding into the file, but |
| 421 | the space is recouped (hopefully plus some) when the records that use it are |
| 422 | emitted. |
| 423 | </p> |
Chris Lattner | 3a1716d | 2007-05-12 05:37:42 +0000 | [diff] [blame] | 424 | |
| 425 | <p> |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 426 | Abbreviations can be determined dynamically per client, per file. Because the |
Chris Lattner | daeb63c | 2007-05-12 07:49:15 +0000 | [diff] [blame] | 427 | abbreviations are stored in the bitstream itself, different streams of the same |
Chris Lattner | 5c303e8 | 2009-10-29 04:25:46 +0000 | [diff] [blame] | 428 | format can contain different sets of abbreviations according to the needs |
| 429 | of the specific stream. |
| 430 | As a concrete example, LLVM IR files usually emit an abbreviation |
Chris Lattner | daeb63c | 2007-05-12 07:49:15 +0000 | [diff] [blame] | 431 | for binary operators. If a specific LLVM module contains few or no binary |
| 432 | operators, the abbreviation does not need to be emitted. |
Chris Lattner | 3a1716d | 2007-05-12 05:37:42 +0000 | [diff] [blame] | 433 | </p> |
Chris Lattner | daeb63c | 2007-05-12 07:49:15 +0000 | [diff] [blame] | 434 | </div> |
| 435 | |
| 436 | <!-- _______________________________________________________________________ --> |
| 437 | <div class="doc_subsubsection"><a name="DEFINE_ABBREV">DEFINE_ABBREV |
| 438 | Encoding</a></div> |
| 439 | |
| 440 | <div class="doc_text"> |
| 441 | |
| 442 | <p><tt>[DEFINE_ABBREV, numabbrevops<sub>vbr5</sub>, abbrevop0, abbrevop1, |
| 443 | ...]</tt></p> |
| 444 | |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 445 | <p> |
| 446 | A <tt>DEFINE_ABBREV</tt> record adds an abbreviation to the list of currently |
| 447 | defined abbreviations in the scope of this block. This definition only exists |
| 448 | inside this immediate block — it is not visible in subblocks or enclosing |
| 449 | blocks. Abbreviations are implicitly assigned IDs sequentially starting from 4 |
| 450 | (the first application-defined abbreviation ID). Any abbreviations defined in a |
Chris Lattner | 5c303e8 | 2009-10-29 04:25:46 +0000 | [diff] [blame] | 451 | <tt>BLOCKINFO</tt> record for the particular block type |
| 452 | receive IDs first, in order, followed by any |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 453 | abbreviations defined within the block itself. Abbreviated data records |
| 454 | reference this ID to indicate what abbreviation they are invoking. |
| 455 | </p> |
Chris Lattner | f19b8e4 | 2007-10-08 18:42:45 +0000 | [diff] [blame] | 456 | |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 457 | <p> |
| 458 | An abbreviation definition consists of the <tt>DEFINE_ABBREV</tt> abbrevid |
| 459 | followed by a VBR that specifies the number of abbrev operands, then the abbrev |
Chris Lattner | daeb63c | 2007-05-12 07:49:15 +0000 | [diff] [blame] | 460 | operands themselves. Abbreviation operands come in three forms. They all start |
| 461 | with a single bit that indicates whether the abbrev operand is a literal operand |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 462 | (when the bit is 1) or an encoding operand (when the bit is 0). |
| 463 | </p> |
Chris Lattner | daeb63c | 2007-05-12 07:49:15 +0000 | [diff] [blame] | 464 | |
| 465 | <ol> |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 466 | <li>Literal operands — <tt>[1<sub>1</sub>, litvalue<sub>vbr8</sub>]</tt> |
| 467 | — Literal operands specify that the value in the result is always a single |
| 468 | specific value. This specific value is emitted as a vbr8 after the bit |
| 469 | indicating that it is a literal operand.</li> |
| 470 | <li>Encoding info without data — <tt>[0<sub>1</sub>, |
| 471 | encoding<sub>3</sub>]</tt> — Operand encodings that do not have extra |
| 472 | data are just emitted as their code. |
Chris Lattner | daeb63c | 2007-05-12 07:49:15 +0000 | [diff] [blame] | 473 | </li> |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 474 | <li>Encoding info with data — <tt>[0<sub>1</sub>, encoding<sub>3</sub>, |
| 475 | value<sub>vbr5</sub>]</tt> — Operand encodings that do have extra data are |
Chris Lattner | 7300af5 | 2007-05-13 00:59:52 +0000 | [diff] [blame] | 476 | emitted as their code, followed by the extra data. |
Chris Lattner | daeb63c | 2007-05-12 07:49:15 +0000 | [diff] [blame] | 477 | </li> |
| 478 | </ol> |
Chris Lattner | 3a1716d | 2007-05-12 05:37:42 +0000 | [diff] [blame] | 479 | |
Chris Lattner | 7300af5 | 2007-05-13 00:59:52 +0000 | [diff] [blame] | 480 | <p>The possible operand encodings are:</p> |
| 481 | |
Chris Lattner | 5c303e8 | 2009-10-29 04:25:46 +0000 | [diff] [blame] | 482 | <ul> |
| 483 | <li>Fixed (code 1): The field should be emitted as |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 484 | a <a href="#fixedwidth">fixed-width value</a>, whose width is specified by |
| 485 | the operand's extra data.</li> |
Chris Lattner | 5c303e8 | 2009-10-29 04:25:46 +0000 | [diff] [blame] | 486 | <li>VBR (code 2): The field should be emitted as |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 487 | a <a href="#variablewidth">variable-width value</a>, whose width is |
| 488 | specified by the operand's extra data.</li> |
Chris Lattner | 5c303e8 | 2009-10-29 04:25:46 +0000 | [diff] [blame] | 489 | <li>Array (code 3): This field is an array of values. The array operand |
| 490 | has no extra data, but expects another operand to follow it, indicating |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 491 | the element type of the array. When reading an array in an abbreviated |
| 492 | record, the first integer is a vbr6 that indicates the array length, |
| 493 | followed by the encoded elements of the array. An array may only occur as |
| 494 | the last operand of an abbreviation (except for the one final operand that |
| 495 | gives the array's type).</li> |
Chris Lattner | 5c303e8 | 2009-10-29 04:25:46 +0000 | [diff] [blame] | 496 | <li>Char6 (code 4): This field should be emitted as |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 497 | a <a href="#char6">char6-encoded value</a>. This operand type takes no |
Chris Lattner | 5c303e8 | 2009-10-29 04:25:46 +0000 | [diff] [blame] | 498 | extra data. Char6 encoding is normally used as an array element type. |
| 499 | </li> |
| 500 | <li>Blob (code 5): This field is emitted as a vbr6, followed by padding to a |
Chris Lattner | dcd006b | 2009-04-06 21:50:39 +0000 | [diff] [blame] | 501 | 32-bit boundary (for alignment) and an array of 8-bit objects. The array of |
| 502 | bytes is further followed by tail padding to ensure that its total length is |
| 503 | a multiple of 4 bytes. This makes it very efficient for the reader to |
| 504 | decode the data without having to make a copy of it: it can use a pointer to |
the data in the memory-mapped file and poke directly at it. A blob may only
| 506 | occur as the last operand of an abbreviation.</li> |
Chris Lattner | 5c303e8 | 2009-10-29 04:25:46 +0000 | [diff] [blame] | 507 | </ul> |
Chris Lattner | 7300af5 | 2007-05-13 00:59:52 +0000 | [diff] [blame] | 508 | |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 509 | <p> |
For example, target triples in LLVM modules are encoded as a record of the
form <tt>[TRIPLE, 'a', 'b', 'c', 'd']</tt>. Suppose the bitstream had emitted
the following abbrev entry:
| 513 | </p> |
Chris Lattner | 7300af5 | 2007-05-13 00:59:52 +0000 | [diff] [blame] | 514 | |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 515 | <div class="doc_code"> |
| 516 | <pre> |
| 517 | [0, Fixed, 4] |
| 518 | [0, Array] |
| 519 | [0, Char6] |
| 520 | </pre> |
| 521 | </div> |
Chris Lattner | 7300af5 | 2007-05-13 00:59:52 +0000 | [diff] [blame] | 522 | |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 523 | <p> |
When emitting the <tt>TRIPLE</tt> record with this abbreviation, the record
would be emitted as:
| 526 | </p> |
Chris Lattner | 7300af5 | 2007-05-13 00:59:52 +0000 | [diff] [blame] | 527 | |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 528 | <div class="doc_code"> |
Bill Wendling | 903bcc4 | 2009-04-04 22:36:02 +0000 | [diff] [blame] | 529 | <p> |
| 530 | <tt>[4<sub>abbrevwidth</sub>, 2<sub>4</sub>, 4<sub>vbr6</sub>, 0<sub>6</sub>, |
| 531 | 1<sub>6</sub>, 2<sub>6</sub>, 3<sub>6</sub>]</tt> |
| 532 | </p> |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 533 | </div> |
Chris Lattner | 7300af5 | 2007-05-13 00:59:52 +0000 | [diff] [blame] | 534 | |
| 535 | <p>These values are:</p> |
| 536 | |
| 537 | <ol> |
| 538 | <li>The first value, 4, is the abbreviation ID for this abbreviation.</li> |
Chris Lattner | 5c303e8 | 2009-10-29 04:25:46 +0000 | [diff] [blame] | 539 | <li>The second value, 2, is the record code for <tt>TRIPLE</tt> records within LLVM IR file <tt>MODULE_BLOCK</tt> blocks.</li> |
Chris Lattner | 7300af5 | 2007-05-13 00:59:52 +0000 | [diff] [blame] | 540 | <li>The third value, 4, is the length of the array.</li> |
<li>The rest of the values are the char6-encoded values
for <tt>"abcd"</tt>.</li>
Chris Lattner | 7300af5 | 2007-05-13 00:59:52 +0000 | [diff] [blame] | 543 | </ol> |
| 544 | |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 545 | <p> |
With this abbreviation, the triple is emitted with only 37 bits (assuming an
abbrev ID width of 3): 3 bits for the abbreviation ID, 4 bits for the record
code, 6 bits for the array length, and 24 bits for the four char6 characters.
Without the abbreviation, significantly more space would
be required to emit the target triple. Also, because the <tt>TRIPLE</tt> value
is not emitted as a literal in the abbreviation, the abbreviation can also be
used for any other string value.
Chris Lattner | 7300af5 | 2007-05-13 00:59:52 +0000 | [diff] [blame] | 551 | </p> |
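<p>
The bit count above can be checked with a minimal Python sketch. The helper
names (<tt>fixed</tt>, <tt>vbr</tt>, <tt>char6</tt>, <tt>emit_triple</tt>) are
hypothetical and are not part of LLVM; the char6 alphabet is the one given in
the <a href="#char6">char6</a> description, and an abbrev ID width of 3 is
assumed, as in the text.
</p>

```python
# Sketch only: models each emitted field as a (value, bit_width) pair.
CHAR6 = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789._"


def fixed(value, width):
    """One fixed-width field; rejects values that do not fit."""
    if value >> width:
        raise ValueError("value does not fit in %d bits" % width)
    return [(value, width)]


def vbr(value, width):
    """Variable-width field: chunks of `width` bits, low chunk first.

    The high bit of every chunk except the last is set to signal that
    another chunk follows.
    """
    data_bits = width - 1
    mask = (1 << data_bits) - 1
    fields = []
    while value > mask:
        fields.append(((value & mask) | (1 << data_bits), width))
        value >>= data_bits
    fields.append((value, width))
    return fields


def char6(ch):
    """One char6 field: the index of `ch` in the 64-character alphabet."""
    return [(CHAR6.index(ch), 6)]


def emit_triple(text, abbrev_id=4, abbrev_width=3, code=2):
    """Emit [TRIPLE, ...text...] using the abbreviation shown above."""
    fields = fixed(abbrev_id, abbrev_width)  # abbreviation ID
    fields += fixed(code, 4)                 # operand [0, Fixed, 4]
    fields += vbr(len(text), 6)              # array length, always a vbr6
    for ch in text:                          # array elements, [0, Char6]
        fields += char6(ch)
    return fields


fields = emit_triple("abcd")
values = [value for value, _ in fields]
total_bits = sum(width for _, width in fields)
print(values, total_bits)
```

<p>
Under these assumptions the field values match the list above and the widths
sum to 3 + 4 + 6 + 4 * 6 bits.
</p>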
| 552 | |
Chris Lattner | 3a1716d | 2007-05-12 05:37:42 +0000 | [diff] [blame] | 553 | </div> |
| 554 | |
Chris Lattner | 7300af5 | 2007-05-13 00:59:52 +0000 | [diff] [blame] | 555 | <!-- ======================================================================= --> |
| 556 | <div class="doc_subsection"><a name="stdblocks">Standard Blocks</a> |
| 557 | </div> |
| 558 | |
| 559 | <div class="doc_text"> |
| 560 | |
| 561 | <p> |
In addition to the basic block structure and record encodings, the bitstream
also defines specific built-in block types. These block types either specify
how the stream is to be decoded or carry other metadata. In the future, new
standard blocks may be added. Block IDs 0-7 are reserved for standard blocks.
Chris Lattner | 7300af5 | 2007-05-13 00:59:52 +0000 | [diff] [blame] | 566 | </p> |
| 567 | |
| 568 | </div> |
| 569 | |
| 570 | <!-- _______________________________________________________________________ --> |
| 571 | <div class="doc_subsubsection"><a name="BLOCKINFO">#0 - BLOCKINFO |
| 572 | Block</a></div> |
| 573 | |
| 574 | <div class="doc_text"> |
| 575 | |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 576 | <p> |
| 577 | The <tt>BLOCKINFO</tt> block allows the description of metadata for other |
| 578 | blocks. The currently specified records are: |
| 579 | </p> |
| 580 | |
| 581 | <div class="doc_code"> |
| 582 | <pre> |
| 583 | [SETBID (#1), blockid] |
| 584 | [DEFINE_ABBREV, ...] |
Chris Lattner | f9a3ec8 | 2009-04-26 22:21:57 +0000 | [diff] [blame] | 585 | [BLOCKNAME, ...name...] |
| 586 | [SETRECORDNAME, RecordID, ...name...] |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 587 | </pre> |
| 588 | </div> |
Chris Lattner | 7300af5 | 2007-05-13 00:59:52 +0000 | [diff] [blame] | 589 | |
| 590 | <p> |
Chris Lattner | 5c303e8 | 2009-10-29 04:25:46 +0000 | [diff] [blame] | 591 | The <tt>SETBID</tt> record (code 1) indicates which block ID is being |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 592 | described. <tt>SETBID</tt> records can occur multiple times throughout the |
| 593 | block to change which block ID is being described. There must be |
| 594 | a <tt>SETBID</tt> record prior to any other records. |
Chris Lattner | f19b8e4 | 2007-10-08 18:42:45 +0000 | [diff] [blame] | 595 | </p> |
| 596 | |
| 597 | <p> |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 598 | Standard <tt>DEFINE_ABBREV</tt> records can occur inside <tt>BLOCKINFO</tt> |
| 599 | blocks, but unlike their occurrence in normal blocks, the abbreviation is |
| 600 | defined for blocks matching the block ID we are describing, <i>not</i> the |
| 601 | <tt>BLOCKINFO</tt> block itself. The abbreviations defined |
| 602 | in <tt>BLOCKINFO</tt> blocks receive abbreviation IDs as described |
| 603 | in <tt><a href="#DEFINE_ABBREV">DEFINE_ABBREV</a></tt>. |
Chris Lattner | f19b8e4 | 2007-10-08 18:42:45 +0000 | [diff] [blame] | 604 | </p> |
| 605 | |
Chris Lattner | 5c303e8 | 2009-10-29 04:25:46 +0000 | [diff] [blame] | 606 | <p>The <tt>BLOCKNAME</tt> record (code 2) can optionally occur in this block. The elements of |
| 607 | the record are the bytes of the string name of the block. llvm-bcanalyzer can use |
Chris Lattner | f9a3ec8 | 2009-04-26 22:21:57 +0000 | [diff] [blame] | 608 | this to dump out bitcode files symbolically.</p> |
| 609 | |
Chris Lattner | 5c303e8 | 2009-10-29 04:25:46 +0000 | [diff] [blame] | 610 | <p>The <tt>SETRECORDNAME</tt> record (code 3) can also optionally occur in this block. The |
| 611 | first operand value is a record ID number, and the rest of the elements of the record are |
| 612 | the bytes for the string name of the record. llvm-bcanalyzer can use |
Chris Lattner | f9a3ec8 | 2009-04-26 22:21:57 +0000 | [diff] [blame] | 613 | this to dump out bitcode files symbolically.</p> |
| 614 | |
Chris Lattner | f19b8e4 | 2007-10-08 18:42:45 +0000 | [diff] [blame] | 615 | <p> |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 616 | Note that although the data in <tt>BLOCKINFO</tt> blocks is described as |
| 617 | "metadata," the abbreviations they contain are essential for parsing records |
| 618 | from the corresponding blocks. It is not safe to skip them. |
Chris Lattner | 7300af5 | 2007-05-13 00:59:52 +0000 | [diff] [blame] | 619 | </p> |
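<p>
As an illustration of how a reader might track this state, here is a minimal
Python sketch. It assumes the block has already been decoded into a list of
items, each either <tt>("abbrev", definition)</tt> or
<tt>("record", code, operands)</tt>; that item shape and the function name
<tt>read_blockinfo</tt> are hypothetical and do not correspond to any LLVM API.
</p>

```python
# Record codes from the text above.
SETBID, BLOCKNAME, SETRECORDNAME = 1, 2, 3


def read_blockinfo(items):
    """Collect per-block-ID metadata from decoded BLOCKINFO contents."""
    info = {}
    current = None  # metadata for the block ID selected by the last SETBID
    for item in items:
        if item[0] == "record" and item[1] == SETBID:
            block_id = item[2][0]
            current = info.setdefault(
                block_id, {"abbrevs": [], "name": None, "record_names": {}}
            )
            continue
        if current is None:
            raise ValueError("SETBID must precede any other record")
        if item[0] == "abbrev":
            # Applies to blocks with the selected ID, not to BLOCKINFO itself.
            current["abbrevs"].append(item[1])
        elif item[1] == BLOCKNAME:
            current["name"] = bytes(item[2]).decode("ascii")
        elif item[1] == SETRECORDNAME:
            record_id, *name = item[2]
            current["record_names"][record_id] = bytes(name).decode("ascii")
        # Unknown record codes are ignored in this sketch.
    return info


items = [
    ("record", SETBID, [8]),
    ("abbrev", [("Fixed", 4), ("Array",), ("Char6",)]),
    ("record", BLOCKNAME, list(b"MODULE_BLOCK")),
    ("record", SETRECORDNAME, [2] + list(b"TRIPLE")),
]
info = read_blockinfo(items)
print(info[8]["name"], info[8]["record_names"])
```

<p>
A reader that later enters a block with ID 8 would install the collected
abbreviations before reading any records from that block.
</p>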
| 620 | |
| 621 | </div> |
Chris Lattner | 3a1716d | 2007-05-12 05:37:42 +0000 | [diff] [blame] | 622 | |
Chris Lattner | e9ef457 | 2007-05-12 03:23:40 +0000 | [diff] [blame] | 623 | <!-- *********************************************************************** --> |
Chris Lattner | 6fa6a32 | 2008-07-09 05:14:23 +0000 | [diff] [blame] | 624 | <div class="doc_section"> <a name="wrapper">Bitcode Wrapper Format</a></div> |
| 625 | <!-- *********************************************************************** --> |
| 626 | |
| 627 | <div class="doc_text"> |
| 628 | |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 629 | <p> |
| 630 | Bitcode files for LLVM IR may optionally be wrapped in a simple wrapper |
Chris Lattner | 6fa6a32 | 2008-07-09 05:14:23 +0000 | [diff] [blame] | 631 | structure. This structure contains a simple header that indicates the offset |
| 632 | and size of the embedded BC file. This allows additional information to be |
| 633 | stored alongside the BC file. The structure of this file header is: |
| 634 | </p> |
| 635 | |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 636 | <div class="doc_code"> |
Bill Wendling | 903bcc4 | 2009-04-04 22:36:02 +0000 | [diff] [blame] | 637 | <p> |
| 638 | <tt>[Magic<sub>32</sub>, Version<sub>32</sub>, Offset<sub>32</sub>, |
| 639 | Size<sub>32</sub>, CPUType<sub>32</sub>]</tt> |
| 640 | </p> |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 641 | </div> |
Chris Lattner | 6fa6a32 | 2008-07-09 05:14:23 +0000 | [diff] [blame] | 642 | |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 643 | <p> |
Each of the fields is a 32-bit field stored in little endian form (as with
Chris Lattner | 6fa6a32 | 2008-07-09 05:14:23 +0000 | [diff] [blame] | 645 | the rest of the bitcode file fields). The Magic number is always |
| 646 | <tt>0x0B17C0DE</tt> and the version is currently always <tt>0</tt>. The Offset |
| 647 | field is the offset in bytes to the start of the bitcode stream in the file, and |
Chris Lattner | 5c303e8 | 2009-10-29 04:25:46 +0000 | [diff] [blame] | 648 | the Size field is the size in bytes of the stream. CPUType is a target-specific |
Chris Lattner | 6fa6a32 | 2008-07-09 05:14:23 +0000 | [diff] [blame] | 649 | value that can be used to encode the CPU of the target. |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 650 | </p> |
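<p>
A minimal Python sketch of reading this header follows. The function name
<tt>parse_wrapper</tt> and the returned dictionary are illustrative
assumptions; only the field order, field widths, byte order and magic value
come from the description above.
</p>

```python
import struct

WRAPPER_MAGIC = 0x0B17C0DE


def parse_wrapper(data):
    """Split a wrapped bitcode file into header fields and the embedded stream."""
    # Five unsigned 32-bit little-endian fields: 20 bytes in total.
    magic, version, offset, size, cpu_type = struct.unpack_from("<5I", data, 0)
    if magic != WRAPPER_MAGIC:
        raise ValueError("not a bitcode wrapper")
    if offset + size > len(data):
        raise ValueError("embedded stream extends past end of file")
    return {
        "version": version,
        "cpu_type": cpu_type,
        "bitcode": data[offset:offset + size],
    }


# Build a tiny wrapped file: header, then a 4-byte stand-in payload.
payload = b"BC\xc0\xde"
header = struct.pack("<5I", WRAPPER_MAGIC, 0, 20, len(payload), 0)
parsed = parse_wrapper(header + payload)
print(parsed["version"], parsed["bitcode"])
```

<p>
Because Offset is stored explicitly, additional data may sit between the
header and the stream; the sketch places the stream directly after the
20-byte header only for simplicity.
</p>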
Chris Lattner | 6fa6a32 | 2008-07-09 05:14:23 +0000 | [diff] [blame] | 651 | |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 652 | </div> |
Chris Lattner | 6fa6a32 | 2008-07-09 05:14:23 +0000 | [diff] [blame] | 653 | |
| 654 | <!-- *********************************************************************** --> |
Chris Lattner | e9ef457 | 2007-05-12 03:23:40 +0000 | [diff] [blame] | 655 | <div class="doc_section"> <a name="llvmir">LLVM IR Encoding</a></div> |
| 656 | <!-- *********************************************************************** --> |
| 657 | |
| 658 | <div class="doc_text"> |
| 659 | |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 660 | <p> |
| 661 | LLVM IR is encoded into a bitstream by defining blocks and records. It uses |
Chris Lattner | 69b3e40 | 2007-05-13 01:39:44 +0000 | [diff] [blame] | 662 | blocks for things like constant pools, functions, symbol tables, etc. It uses |
| 663 | records for things like instructions, global variable descriptors, type |
| 664 | descriptions, etc. This document does not describe the set of abbreviations |
| 665 | that the writer uses, as these are fully self-described in the file, and the |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 666 | reader is not allowed to build in any knowledge of this. |
| 667 | </p> |
Chris Lattner | 69b3e40 | 2007-05-13 01:39:44 +0000 | [diff] [blame] | 668 | |
| 669 | </div> |
| 670 | |
| 671 | <!-- ======================================================================= --> |
| 672 | <div class="doc_subsection"><a name="basics">Basics</a> |
| 673 | </div> |
| 674 | |
| 675 | <!-- _______________________________________________________________________ --> |
| 676 | <div class="doc_subsubsection"><a name="ir_magic">LLVM IR Magic Number</a></div> |
| 677 | |
| 678 | <div class="doc_text"> |
| 679 | |
| 680 | <p> |
| 681 | The magic number for LLVM IR files is: |
| 682 | </p> |
| 683 | |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 684 | <div class="doc_code"> |
Bill Wendling | 903bcc4 | 2009-04-04 22:36:02 +0000 | [diff] [blame] | 685 | <p> |
| 686 | <tt>[0x0<sub>4</sub>, 0xC<sub>4</sub>, 0xE<sub>4</sub>, 0xD<sub>4</sub>]</tt> |
| 687 | </p> |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 688 | </div> |
Chris Lattner | 69b3e40 | 2007-05-13 01:39:44 +0000 | [diff] [blame] | 689 | |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 690 | <p> |
| 691 | When combined with the bitcode magic number and viewed as bytes, this is |
| 692 | <tt>"BC 0xC0DE"</tt>. |
| 693 | </p> |
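<p>
The byte view can be reproduced with a short Python sketch. It assumes that
fields are packed starting at the low bits of each byte, consistent with the
little endian layout used by the rest of the bitstream, and that the generic
bitcode magic number is the pair of 8-bit values <tt>'B'</tt> and
<tt>'C'</tt> as described under <a href="#magic">Magic Numbers</a>. The helper
<tt>pack_fields</tt> is hypothetical.
</p>

```python
def pack_fields(fields):
    """Pack (value, width) fields into bytes, filling low bits first."""
    acc = 0
    nbits = 0
    for value, width in fields:
        acc |= value << nbits
        nbits += width
    return acc.to_bytes((nbits + 7) // 8, "little")


bitcode_magic = [(ord("B"), 8), (ord("C"), 8)]
ir_magic = [(0x0, 4), (0xC, 4), (0xE, 4), (0xD, 4)]
magic_bytes = pack_fields(bitcode_magic + ir_magic)
print(magic_bytes.hex())
```

<p>
Each pair of 4-bit fields shares one byte, with the first field of the pair in
the low nibble, which is why <tt>0x0, 0xC</tt> reads back as <tt>0xC0</tt>.
</p>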
Chris Lattner | 69b3e40 | 2007-05-13 01:39:44 +0000 | [diff] [blame] | 694 | |
| 695 | </div> |
| 696 | |
| 697 | <!-- _______________________________________________________________________ --> |
| 698 | <div class="doc_subsubsection"><a name="ir_signed_vbr">Signed VBRs</a></div> |
| 699 | |
| 700 | <div class="doc_text"> |
| 701 | |
| 702 | <p> |
Chris Lattner | 5c303e8 | 2009-10-29 04:25:46 +0000 | [diff] [blame] | 703 | <a href="#variablewidth">Variable Width Integer</a> encoding is an efficient way to |
encode arbitrary sized unsigned values, but is extremely inefficient for
encoding signed values, as negative values are otherwise treated as maximally large
unsigned values.
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 707 | </p> |
Chris Lattner | 69b3e40 | 2007-05-13 01:39:44 +0000 | [diff] [blame] | 708 | |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 709 | <p> |
Chris Lattner | 5c303e8 | 2009-10-29 04:25:46 +0000 | [diff] [blame] | 710 | As such, signed VBR values of a specific width are emitted as follows: |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 711 | </p> |
Chris Lattner | 69b3e40 | 2007-05-13 01:39:44 +0000 | [diff] [blame] | 712 | |
| 713 | <ul> |
Chris Lattner | 5c303e8 | 2009-10-29 04:25:46 +0000 | [diff] [blame] | 714 | <li>Positive values are emitted as VBRs of the specified width, but with their |
Chris Lattner | 69b3e40 | 2007-05-13 01:39:44 +0000 | [diff] [blame] | 715 | value shifted left by one.</li> |
Chris Lattner | 5c303e8 | 2009-10-29 04:25:46 +0000 | [diff] [blame] | 716 | <li>Negative values are emitted as VBRs of the specified width, but the negated |
Chris Lattner | 69b3e40 | 2007-05-13 01:39:44 +0000 | [diff] [blame] | 717 | value is shifted left by one, and the low bit is set.</li> |
| 718 | </ul> |
| 719 | |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 720 | <p> |
Chris Lattner | 5c303e8 | 2009-10-29 04:25:46 +0000 | [diff] [blame] | 721 | With this encoding, small positive and small negative values can both |
| 722 | be emitted efficiently. Signed VBR encoding is used in |
| 723 | <tt>CST_CODE_INTEGER</tt> and <tt>CST_CODE_WIDE_INTEGER</tt> records |
| 724 | within <tt>CONSTANTS_BLOCK</tt> blocks. |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 725 | </p> |
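<p>
The two rules above can be sketched in Python as follows. The function names
are hypothetical, and the sketch deliberately ignores any special handling a
real reader or writer may need for the most negative value of a fixed-size
integer type.
</p>

```python
def encode_signed(value):
    """Map a signed integer to the unsigned value that is emitted as a VBR."""
    if value >= 0:
        return value << 1          # low bit clear: non-negative
    return ((-value) << 1) | 1     # low bit set: negative


def decode_signed(raw):
    """Inverse of encode_signed for the values it produces."""
    magnitude = raw >> 1
    return -magnitude if raw & 1 else magnitude


def vbr_chunk_count(value, width):
    """Number of `width`-bit VBR chunks needed for an unsigned value."""
    data_bits = width - 1
    return max(1, -(-value.bit_length() // data_bits))


# -1 reinterpreted as a 64-bit unsigned value versus its signed VBR form.
naive_chunks = vbr_chunk_count(2**64 - 1, 6)
signed_chunks = vbr_chunk_count(encode_signed(-1), 6)
print(naive_chunks, signed_chunks)
```

<p>
In this sketch, emitting -1 as a plain 64-bit unsigned vbr6 takes 13 chunks,
while the signed form takes a single chunk.
</p>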
Chris Lattner | 69b3e40 | 2007-05-13 01:39:44 +0000 | [diff] [blame] | 726 | |
| 727 | </div> |
| 728 | |
| 729 | |
| 730 | <!-- _______________________________________________________________________ --> |
| 731 | <div class="doc_subsubsection"><a name="ir_blocks">LLVM IR Blocks</a></div> |
| 732 | |
| 733 | <div class="doc_text"> |
| 734 | |
| 735 | <p> |
| 736 | LLVM IR is defined with the following blocks: |
| 737 | </p> |
| 738 | |
| 739 | <ul> |
Chris Lattner | 5c303e8 | 2009-10-29 04:25:46 +0000 | [diff] [blame] | 740 | <li>8 — <a href="#MODULE_BLOCK"><tt>MODULE_BLOCK</tt></a> — This is the top-level block that |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 741 | contains the entire module, and describes a variety of per-module |
| 742 | information.</li> |
Chris Lattner | 5c303e8 | 2009-10-29 04:25:46 +0000 | [diff] [blame] | 743 | <li>9 — <a href="#PARAMATTR_BLOCK"><tt>PARAMATTR_BLOCK</tt></a> — This enumerates the parameter |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 744 | attributes.</li> |
Chris Lattner | 5c303e8 | 2009-10-29 04:25:46 +0000 | [diff] [blame] | 745 | <li>10 — <a href="#TYPE_BLOCK"><tt>TYPE_BLOCK</tt></a> — This describes all of the types in |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 746 | the module.</li> |
Chris Lattner | 5c303e8 | 2009-10-29 04:25:46 +0000 | [diff] [blame] | 747 | <li>11 — <a href="#CONSTANTS_BLOCK"><tt>CONSTANTS_BLOCK</tt></a> — This describes constants for a |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 748 | module or function.</li> |
Chris Lattner | 5c303e8 | 2009-10-29 04:25:46 +0000 | [diff] [blame] | 749 | <li>12 — <a href="#FUNCTION_BLOCK"><tt>FUNCTION_BLOCK</tt></a> — This describes a function |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 750 | body.</li> |
Chris Lattner | 5c303e8 | 2009-10-29 04:25:46 +0000 | [diff] [blame] | 751 | <li>13 — <a href="#TYPE_SYMTAB_BLOCK"><tt>TYPE_SYMTAB_BLOCK</tt></a> — This describes the type symbol |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 752 | table.</li> |
Chris Lattner | 5c303e8 | 2009-10-29 04:25:46 +0000 | [diff] [blame] | 753 | <li>14 — <a href="#VALUE_SYMTAB_BLOCK"><tt>VALUE_SYMTAB_BLOCK</tt></a> — This describes a value symbol |
Bill Wendling | bb7425f | 2009-04-04 22:27:03 +0000 | [diff] [blame] | 754 | table.</li> |
Chris Lattner | 5c303e8 | 2009-10-29 04:25:46 +0000 | [diff] [blame] | 755 | <li>15 — <a href="#METADATA_BLOCK"><tt>METADATA_BLOCK</tt></a> — This describes metadata items.</li> |
| 756 | <li>16 — <a href="#METADATA_ATTACHMENT"><tt>METADATA_ATTACHMENT</tt></a> — This contains records associating metadata with function instruction values.</li> |
Chris Lattner | 69b3e40 | 2007-05-13 01:39:44 +0000 | [diff] [blame] | 757 | </ul> |
| 758 | |
| 759 | </div> |
| 760 | |
| 761 | <!-- ======================================================================= --> |
| 762 | <div class="doc_subsection"><a name="MODULE_BLOCK">MODULE_BLOCK Contents</a> |
| 763 | </div> |
| 764 | |
| 765 | <div class="doc_text"> |
| 766 | |
Chris Lattner | 5c303e8 | 2009-10-29 04:25:46 +0000 | [diff] [blame] | 767 | <p>The <tt>MODULE_BLOCK</tt> block (id 8) is the top-level block for LLVM |
| 768 | bitcode files, and each bitcode file must contain exactly one. In |
| 769 | addition to records (described below) containing information |
| 770 | about the module, a <tt>MODULE_BLOCK</tt> block may contain the |
| 771 | following sub-blocks: |
| 772 | </p> |
| 773 | |
| 774 | <ul> |
| 775 | <li><a href="#BLOCKINFO"><tt>BLOCKINFO</tt></a></li> |
| 776 | <li><a href="#PARAMATTR_BLOCK"><tt>PARAMATTR_BLOCK</tt></a></li> |
| 777 | <li><a href="#TYPE_BLOCK"><tt>TYPE_BLOCK</tt></a></li> |
| 778 | <li><a href="#TYPE_SYMTAB_BLOCK"><tt>TYPE_SYMTAB_BLOCK</tt></a></li> |
| 779 | <li><a href="#VALUE_SYMTAB_BLOCK"><tt>VALUE_SYMTAB_BLOCK</tt></a></li> |
| 780 | <li><a href="#CONSTANTS_BLOCK"><tt>CONSTANTS_BLOCK</tt></a></li> |
| 781 | <li><a href="#FUNCTION_BLOCK"><tt>FUNCTION_BLOCK</tt></a></li> |
| 782 | <li><a href="#METADATA_BLOCK"><tt>METADATA_BLOCK</tt></a></li> |
| 783 | </ul> |
| 784 | |
| 785 | </div> |
| 786 | |
| 787 | <!-- _______________________________________________________________________ --> |
| 788 | <div class="doc_subsubsection"><a name="MODULE_CODE_VERSION">MODULE_CODE_VERSION Record</a> |
| 789 | </div> |
| 790 | |
| 791 | <div class="doc_text"> |
| 792 | |
| 793 | <p><tt>[VERSION, version#]</tt></p> |
| 794 | |
| 795 | <p>The <tt>VERSION</tt> record (code 1) contains a single value |
| 796 | indicating the format version. Only version 0 is supported at this |
| 797 | time.</p> |
| 798 | </div> |
| 799 | |
| 800 | <!-- _______________________________________________________________________ --> |
| 801 | <div class="doc_subsubsection"><a name="MODULE_CODE_TRIPLE">MODULE_CODE_TRIPLE Record</a> |
| 802 | </div> |
| 803 | |
| 804 | <div class="doc_text"> |
| 805 | <p><tt>[TRIPLE, ...string...]</tt></p> |
| 806 | |
| 807 | <p>The <tt>TRIPLE</tt> record (code 2) contains a variable number of |
| 808 | values representing the bytes of the <tt>target triple</tt> |
| 809 | specification string.</p> |
| 810 | </div> |
| 811 | |
| 812 | <!-- _______________________________________________________________________ --> |
| 813 | <div class="doc_subsubsection"><a name="MODULE_CODE_DATALAYOUT">MODULE_CODE_DATALAYOUT Record</a> |
| 814 | </div> |
| 815 | |
| 816 | <div class="doc_text"> |
| 817 | <p><tt>[DATALAYOUT, ...string...]</tt></p> |
| 818 | |
| 819 | <p>The <tt>DATALAYOUT</tt> record (code 3) contains a variable number of |
| 820 | values representing the bytes of the <tt>target datalayout</tt> |
| 821 | specification string.</p> |
| 822 | </div> |
| 823 | |
| 824 | <!-- _______________________________________________________________________ --> |
| 825 | <div class="doc_subsubsection"><a name="MODULE_CODE_ASM">MODULE_CODE_ASM Record</a> |
| 826 | </div> |
| 827 | |
| 828 | <div class="doc_text"> |
| 829 | <p><tt>[ASM, ...string...]</tt></p> |
| 830 | |
| 831 | <p>The <tt>ASM</tt> record (code 4) contains a variable number of |
| 832 | values representing the bytes of <tt>module asm</tt> strings, with |
| 833 | individual assembly blocks separated by newline (ASCII 10) characters.</p> |
| 834 | </div> |
| 835 | |
| 836 | <!-- _______________________________________________________________________ --> |
| 837 | <div class="doc_subsubsection"><a name="MODULE_CODE_SECTIONNAME">MODULE_CODE_SECTIONNAME Record</a> |
| 838 | </div> |
| 839 | |
| 840 | <div class="doc_text"> |
| 841 | <p><tt>[SECTIONNAME, ...string...]</tt></p> |
| 842 | |
| 843 | <p>The <tt>SECTIONNAME</tt> record (code 5) contains a variable number |
| 844 | of values representing the bytes of a single section name |
| 845 | string. There should be one <tt>SECTIONNAME</tt> record for each |
| 846 | section name referenced (e.g., in global variable or function |
| 847 | <tt>section</tt> attributes) within the module. These records can be |
| 848 | referenced by the 1-based index in the <i>section</i> fields of |
| 849 | <tt>GLOBALVAR</tt> or <tt>FUNCTION</tt> records.</p> |
| 850 | </div> |
| 851 | |
| 852 | <!-- _______________________________________________________________________ --> |
| 853 | <div class="doc_subsubsection"><a name="MODULE_CODE_DEPLIB">MODULE_CODE_DEPLIB Record</a> |
| 854 | </div> |
| 855 | |
| 856 | <div class="doc_text"> |
| 857 | <p><tt>[DEPLIB, ...string...]</tt></p> |
| 858 | |
| 859 | <p>The <tt>DEPLIB</tt> record (code 6) contains a variable number of |
| 860 | values representing the bytes of a single dependent library name |
| 861 | string, one of the libraries mentioned in a <tt>deplibs</tt> |
| 862 | declaration. There should be one <tt>DEPLIB</tt> record for each |
| 863 | library name referenced.</p> |
| 864 | </div> |
| 865 | |
| 866 | <!-- _______________________________________________________________________ --> |
| 867 | <div class="doc_subsubsection"><a name="MODULE_CODE_GLOBALVAR">MODULE_CODE_GLOBALVAR Record</a> |
| 868 | </div> |
| 869 | |
| 870 | <div class="doc_text"> |
| 871 | <p><tt>[GLOBALVAR, pointer type, isconst, initid, linkage, alignment, section, visibility, threadlocal]</tt></p> |
| 872 | |
| 873 | <p>The <tt>GLOBALVAR</tt> record (code 7) marks the declaration or |
| 874 | definition of a global variable. The operand fields are:</p> |
| 875 | |
| 876 | <ul> |
| 877 | <li><i>pointer type</i>: The type index of the pointer type used to point to |
| 878 | this global variable</li> |
| 879 | |
| 880 | <li><i>isconst</i>: Non-zero if the variable is treated as constant within |
| 881 | the module, or zero if it is not</li> |
| 882 | |
| 883 | <li><i>initid</i>: If non-zero, the value index of the initializer for this |
| 884 | variable, plus 1.</li> |
| 885 | |
| 886 | <li><a name="linkage"><i>linkage</i></a>: An encoding of the linkage |
| 887 | type for this variable: |
| 888 | <ul> |
| 889 | <li><tt>external</tt>: code 0</li> |
| 890 | <li><tt>weak</tt>: code 1</li> |
| 891 | <li><tt>appending</tt>: code 2</li> |
| 892 | <li><tt>internal</tt>: code 3</li> |
| 893 | <li><tt>linkonce</tt>: code 4</li> |
| 894 | <li><tt>dllimport</tt>: code 5</li> |
| 895 | <li><tt>dllexport</tt>: code 6</li> |
| 896 | <li><tt>extern_weak</tt>: code 7</li> |
| 897 | <li><tt>common</tt>: code 8</li> |
| 898 | <li><tt>private</tt>: code 9</li> |
| 899 | <li><tt>weak_odr</tt>: code 10</li> |
| 900 | <li><tt>linkonce_odr</tt>: code 11</li> |
| 901 | <li><tt>available_externally</tt>: code 12</li> |
| 902 | <li><tt>linker_private</tt>: code 13</li> |
| 903 | </ul> |
| 904 | </li> |
| 905 | |
| 906 | <li><i>alignment</i>: The logarithm base 2 of the variable's requested |
| 907 | alignment, plus 1</li> |
| 908 | |
| 909 | <li><i>section</i>: If non-zero, the 1-based section index in the |
| 910 | table of <a href="#MODULE_CODE_SECTIONNAME">MODULE_CODE_SECTIONNAME</a> |
| 911 | entries.</li> |
| 912 | |
| 913 | <li><a name="visibility"><i>visibility</i></a>: If present, an |
| 914 | encoding of the visibility of this variable: |
| 915 | <ul> |
| 916 | <li><tt>default</tt>: code 0</li> |
| 917 | <li><tt>hidden</tt>: code 1</li> |
| 918 | <li><tt>protected</tt>: code 2</li> |
| 919 | </ul> |
| 920 | </li> |
| 921 | |
| 922 | <li><i>threadlocal</i>: If present and non-zero, indicates that the variable |
| 923 | is <tt>thread_local</tt></li> |
| 924 | |
| 925 | </ul> |
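<p>
The field encodings above can be illustrated with a minimal Python sketch that
turns the operand list of a <tt>GLOBALVAR</tt> record into readable values.
The function <tt>decode_globalvar</tt> and its result layout are hypothetical.
The sketch also assumes that an alignment field of zero means no alignment was
requested and that an absent <i>visibility</i> field means <tt>default</tt>;
neither assumption is stated explicitly above.
</p>

```python
LINKAGE = [
    "external", "weak", "appending", "internal", "linkonce", "dllimport",
    "dllexport", "extern_weak", "common", "private", "weak_odr",
    "linkonce_odr", "available_externally", "linker_private",
]
VISIBILITY = ["default", "hidden", "protected"]


def decode_globalvar(ops, section_names):
    """Interpret GLOBALVAR operands (everything after the record code)."""
    pointer_type, isconst, initid, linkage, alignment, section = ops[:6]
    # The last two fields are optional ("if present").
    visibility = ops[6] if len(ops) > 6 else 0
    threadlocal = ops[7] if len(ops) > 7 else 0
    return {
        "pointer_type": pointer_type,
        "isconst": isconst != 0,
        # Stored as value index plus 1; zero means no initializer.
        "initializer": initid - 1 if initid else None,
        "linkage": LINKAGE[linkage],
        # Stored as log2(alignment) plus 1; zero assumed to mean unspecified.
        "alignment": 1 << (alignment - 1) if alignment else None,
        # 1-based index into the SECTIONNAME records; zero means no section.
        "section": section_names[section - 1] if section else None,
        "visibility": VISIBILITY[visibility],
        "thread_local": threadlocal != 0,
    }


var = decode_globalvar([5, 1, 0, 3, 4, 1], [".rodata"])
print(var["linkage"], var["alignment"], var["section"])
```

<p>
Here the alignment field 4 decodes to an 8-byte alignment, and section index 1
selects the first <tt>SECTIONNAME</tt> record.
</p>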
| 926 | </div> |
| 927 | |
| 928 | <!-- _______________________________________________________________________ --> |
| 929 | <div class="doc_subsubsection"><a name="MODULE_CODE_FUNCTION">MODULE_CODE_FUNCTION Record</a> |
| 930 | </div> |
| 931 | |
| 932 | <div class="doc_text"> |
| 933 | |
| 934 | <p><tt>[FUNCTION, type, callingconv, isproto, linkage, paramattr, alignment, section, visibility, gc]</tt></p> |
| 935 | |
| 936 | <p>The <tt>FUNCTION</tt> record (code 8) marks the declaration or |
| 937 | definition of a function. The operand fields are:</p> |
| 938 | |
| 939 | <ul> |
| 940 | <li><i>type</i>: The type index of the function type describing this function</li> |
| 941 | |
| 942 | <li><i>callingconv</i>: The calling convention number: |
| 943 | <ul> |
| 944 | <li><tt>ccc</tt>: code 0</li> |
| 945 | <li><tt>fastcc</tt>: code 8</li> |
| 946 | <li><tt>coldcc</tt>: code 9</li> |
| 947 | <li><tt>x86_stdcallcc</tt>: code 64</li> |
| 948 | <li><tt>x86_fastcallcc</tt>: code 65</li> |
| 949 | <li><tt>arm_apcscc</tt>: code 66</li> |
| 950 | <li><tt>arm_aapcscc</tt>: code 67</li> |
| 951 | <li><tt>arm_aapcs_vfpcc</tt>: code 68</li> |
| 952 | </ul> |
| 953 | </li> |
| 954 | |
| 955 | <li><i>isproto</i>: Non-zero if this entry represents a declaration |
| 956 | rather than a definition</li> |
| 957 | |
| 958 | <li><i>linkage</i>: An encoding of the <a href="#linkage">linkage type</a> |
| 959 | for this function</li> |
| 960 | |
<li><i>paramattr</i>: If non-zero, the 1-based parameter attribute index
| 962 | into the table of <a href="#PARAMATTR_CODE_ENTRY">PARAMATTR_CODE_ENTRY</a> |
| 963 | entries.</li> |
| 964 | |
| 965 | <li><i>alignment</i>: The logarithm base 2 of the function's requested |
| 966 | alignment, plus 1</li> |
| 967 | |
| 968 | <li><i>section</i>: If non-zero, the 1-based section index in the |
| 969 | table of <a href="#MODULE_CODE_SECTIONNAME">MODULE_CODE_SECTIONNAME</a> |
| 970 | entries.</li> |
| 971 | |
| 972 | <li><i>visibility</i>: An encoding of the <a href="#visibility">visibility</a> |
| 973 | of this function</li> |
| 974 | |
<li><i>gc</i>: If present and non-zero, the 1-based garbage collector
| 976 | index in the table of |
| 977 | <a href="#MODULE_CODE_GCNAME">MODULE_CODE_GCNAME</a> entries.</li> |
| 978 | </ul> |
| 979 | </div> |
| 980 | |
| 981 | <!-- _______________________________________________________________________ --> |
| 982 | <div class="doc_subsubsection"><a name="MODULE_CODE_ALIAS">MODULE_CODE_ALIAS Record</a> |
| 983 | </div> |
| 984 | |
| 985 | <div class="doc_text"> |
| 986 | |
| 987 | <p><tt>[ALIAS, alias type, aliasee val#, linkage, visibility]</tt></p> |
| 988 | |
| 989 | <p>The <tt>ALIAS</tt> record (code 9) marks the definition of an |
alias. The operand fields are:</p>
| 991 | |
| 992 | <ul> |
| 993 | <li><i>alias type</i>: The type index of the alias</li> |
| 994 | |
| 995 | <li><i>aliasee val#</i>: The value index of the aliased value</li> |
| 996 | |
| 997 | <li><i>linkage</i>: An encoding of the <a href="#linkage">linkage type</a> |
| 998 | for this alias</li> |
| 999 | |
| 1000 | <li><i>visibility</i>: If present, an encoding of the |
| 1001 | <a href="#visibility">visibility</a> of the alias</li> |
| 1002 | |
| 1003 | </ul> |
| 1004 | </div> |
| 1005 | |
| 1006 | <!-- _______________________________________________________________________ --> |
| 1007 | <div class="doc_subsubsection"><a name="MODULE_CODE_PURGEVALS">MODULE_CODE_PURGEVALS Record</a> |
| 1008 | </div> |
| 1009 | |
| 1010 | <div class="doc_text"> |
| 1011 | <p><tt>[PURGEVALS, numvals]</tt></p> |
| 1012 | |
| 1013 | <p>The <tt>PURGEVALS</tt> record (code 10) resets the module-level |
| 1014 | value list to the size given by the single operand value. Module-level |
| 1015 | value list items are added by <tt>GLOBALVAR</tt>, <tt>FUNCTION</tt>, |
| 1016 | and <tt>ALIAS</tt> records. After a <tt>PURGEVALS</tt> record is seen, |
| 1017 | new value indices will start from the given <i>numvals</i> value.</p> |
| 1018 | </div> |
| 1019 | |
| 1020 | <!-- _______________________________________________________________________ --> |
| 1021 | <div class="doc_subsubsection"><a name="MODULE_CODE_GCNAME">MODULE_CODE_GCNAME Record</a> |
| 1022 | </div> |
| 1023 | |
| 1024 | <div class="doc_text"> |
| 1025 | <p><tt>[GCNAME, ...string...]</tt></p> |
| 1026 | |
| 1027 | <p>The <tt>GCNAME</tt> record (code 11) contains a variable number of |
| 1028 | values representing the bytes of a single garbage collector name |
| 1029 | string. There should be one <tt>GCNAME</tt> record for each garbage |
| 1030 | collector name referenced in function <tt>gc</tt> attributes within |
| 1031 | the module. These records can be referenced by 1-based index in the <i>gc</i> |
| 1032 | fields of <tt>FUNCTION</tt> records.</p> |
| 1033 | </div> |
| 1034 | |
| 1035 | <!-- ======================================================================= --> |
| 1036 | <div class="doc_subsection"><a name="PARAMATTR_BLOCK">PARAMATTR_BLOCK Contents</a> |
| 1037 | </div> |
| 1038 | |
| 1039 | <div class="doc_text"> |
| 1040 | |
| 1041 | <p>The <tt>PARAMATTR_BLOCK</tt> block (id 9) ... |
| 1042 | </p> |
| 1043 | |
| 1044 | </div> |
| 1045 | |
| 1046 | |
<!-- _______________________________________________________________________ -->
<div class="doc_subsubsection"><a name="PARAMATTR_CODE_ENTRY">PARAMATTR_CODE_ENTRY Record</a>
</div>

<div class="doc_text">

<p><tt>[ENTRY, paramidx0, attr0, paramidx1, attr1...]</tt></p>

<p>The <tt>ENTRY</tt> record (code 1) ...
</p>
</div>
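<div class="doc_text">
<p>The operand layout shown above alternates parameter indices and attribute
values. As a rough sketch only, the Python below groups the operands into
<tt>(paramidx, attr)</tt> pairs. The helper name
<tt>parse_paramattr_entry</tt> and the sample operand values are hypothetical,
and both fields are treated as opaque integers because their meaning is not
described here.</p>
</div>

```python
def parse_paramattr_entry(operands):
    """Hypothetical helper: group ENTRY operands into (paramidx, attr) pairs."""
    # Assumption: an odd operand count is treated as a malformed record.
    if len(operands) % 2 != 0:
        raise ValueError("ENTRY operands must come in (paramidx, attr) pairs")
    # Even positions hold paramidx values, odd positions hold attr values.
    return list(zip(operands[0::2], operands[1::2]))


pairs = parse_paramattr_entry([0, 32, 2, 1])
```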

<!-- ======================================================================= -->
<div class="doc_subsection"><a name="TYPE_BLOCK">TYPE_BLOCK Contents</a>
</div>

<div class="doc_text">

<p>The <tt>TYPE_BLOCK</tt> block (id 10) ...
</p>

</div>


<!-- ======================================================================= -->
<div class="doc_subsection"><a name="CONSTANTS_BLOCK">CONSTANTS_BLOCK Contents</a>
</div>

<div class="doc_text">

<p>The <tt>CONSTANTS_BLOCK</tt> block (id 11) ...
</p>

</div>


<!-- ======================================================================= -->
<div class="doc_subsection"><a name="FUNCTION_BLOCK">FUNCTION_BLOCK Contents</a>
</div>

<div class="doc_text">

<p>The <tt>FUNCTION_BLOCK</tt> block (id 12) ...
</p>

<p>In addition to the record types described below, a
<tt>FUNCTION_BLOCK</tt> block may contain the following sub-blocks:
</p>

<ul>
<li><a href="#CONSTANTS_BLOCK"><tt>CONSTANTS_BLOCK</tt></a></li>
<li><a href="#VALUE_SYMTAB_BLOCK"><tt>VALUE_SYMTAB_BLOCK</tt></a></li>
<li><a href="#METADATA_ATTACHMENT"><tt>METADATA_ATTACHMENT</tt></a></li>
</ul>

</div>


<!-- ======================================================================= -->
<div class="doc_subsection"><a name="TYPE_SYMTAB_BLOCK">TYPE_SYMTAB_BLOCK Contents</a>
</div>

<div class="doc_text">

<p>The <tt>TYPE_SYMTAB_BLOCK</tt> block (id 13) ...
</p>

</div>


<!-- ======================================================================= -->
<div class="doc_subsection"><a name="VALUE_SYMTAB_BLOCK">VALUE_SYMTAB_BLOCK Contents</a>
</div>

<div class="doc_text">

<p>The <tt>VALUE_SYMTAB_BLOCK</tt> block (id 14) ...
</p>

</div>


<!-- ======================================================================= -->
<div class="doc_subsection"><a name="METADATA_BLOCK">METADATA_BLOCK Contents</a>
</div>

<div class="doc_text">

<p>The <tt>METADATA_BLOCK</tt> block (id 15) ...
</p>

</div>


<!-- ======================================================================= -->
<div class="doc_subsection"><a name="METADATA_ATTACHMENT">METADATA_ATTACHMENT Contents</a>
</div>

<div class="doc_text">

<p>The <tt>METADATA_ATTACHMENT</tt> block (id 16) ...
</p>

</div>


<!-- *********************************************************************** -->
<hr>
<address> <a href="http://jigsaw.w3.org/css-validator/check/referer"><img
src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"></a>
<a href="http://validator.w3.org/check/referer"><img
src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a>
<a href="mailto:sabre@nondot.org">Chris Lattner</a><br>
<a href="http://llvm.org">The LLVM Compiler Infrastructure</a><br>
Last modified: $Date$
</address>
</body>
</html>