==============================
LLVM Language Reference Manual
==============================

.. contents::
   :local:
   :depth: 3

Abstract
========

This document is a reference manual for the LLVM assembly language. LLVM
is a Static Single Assignment (SSA) based representation that provides
type safety, low-level operations, flexibility, and the capability of
representing 'all' high-level languages cleanly. It is the common code
representation used throughout all phases of the LLVM compilation
strategy.

Introduction
============

The LLVM code representation is designed to be used in three different
forms: as an in-memory compiler IR, as an on-disk bitcode representation
(suitable for fast loading by a Just-In-Time compiler), and as a human
readable assembly language representation. This allows LLVM to provide a
powerful intermediate representation for efficient compiler
transformations and analysis, while providing a natural means to debug
and visualize the transformations. The three different forms of LLVM are
all equivalent. This document describes the human readable
representation and notation.

The LLVM representation aims to be light-weight and low-level while
being expressive, typed, and extensible at the same time. It aims to be
a "universal IR" of sorts, by being at a low enough level that
high-level ideas may be cleanly mapped to it (similar to how
microprocessors are "universal IRs", allowing many source languages to
be mapped to them). By providing type information, LLVM can be used as
the target of optimizations: for example, through pointer analysis, it
can be proven that a C automatic variable is never accessed outside of
the current function, allowing it to be promoted to a simple SSA value
instead of a memory location.

.. _wellformed:

Well-Formedness
---------------

It is important to note that this document describes 'well formed' LLVM
assembly language. There is a difference between what the parser accepts
and what is considered 'well formed'. For example, the following
instruction is syntactically okay, but not well formed:

.. code-block:: llvm

    %x = add i32 1, %x

because the definition of ``%x`` does not dominate all of its uses. The
LLVM infrastructure provides a verification pass that may be used to
verify that an LLVM module is well formed. This pass is automatically
run by the parser after parsing input assembly and by the optimizer
before it outputs bitcode. The violations pointed out by the verifier
pass indicate bugs in transformation passes or input to the parser.
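
For contrast, here is a minimal sketch of a well-formed counterpart. The
function name ``@increment`` and the argument ``%a`` are hypothetical and
chosen only for illustration. The definition of ``%x`` is computed from
the argument, so every use of ``%x`` is dominated by its definition:

.. code-block:: llvm

    define i32 @increment(i32 %a) {
    entry:
      %x = add i32 1, %a    ; %x is defined here...
      ret i32 %x            ; ...and only used afterwards
    }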

.. _identifiers:

Identifiers
===========

LLVM identifiers come in two basic types: global and local. Global
identifiers (functions, global variables) begin with the ``'@'``
character. Local identifiers (register names, types) begin with the
``'%'`` character. Additionally, there are three different formats for
identifiers, for different purposes:

#. Named values are represented as a string of characters with their
   prefix. For example, ``%foo``, ``@DivisionByZero``,
   ``%a.really.long.identifier``. The actual regular expression used is
   '``[%@][a-zA-Z$._][a-zA-Z$._0-9]*``'. Identifiers which require other
   characters in their names can be surrounded with quotes. Special
   characters may be escaped using ``"\xx"`` where ``xx`` is the ASCII
   code for the character in hexadecimal. In this way, any character can
   be used in a name value, even quotes themselves.
#. Unnamed values are represented as an unsigned numeric value with
   their prefix. For example, ``%12``, ``@2``, ``%44``.
#. Constants, which are described in the section Constants_ below.

LLVM requires that values start with a prefix for two reasons: Compilers
don't need to worry about name clashes with reserved words, and the set
of reserved words may be expanded in the future without penalty.
Additionally, unnamed identifiers allow a compiler to quickly come up
with a temporary variable without having to avoid symbol table
conflicts.
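
The identifier formats described above can be sketched together as
follows. All names in this snippet other than ``@DivisionByZero`` and
``%a.really.long.identifier`` are hypothetical and exist only for
illustration:

.. code-block:: llvm

    ; Named global, matching the regular expression given above.
    @DivisionByZero = global i32 0

    ; Quoted names may contain characters outside that expression.
    @"name with spaces" = global i32 0

    ; \22 is the hexadecimal ASCII code for a double quote.
    @"say \22hi\22" = global i32 0

    define i32 @example(i32 %a.really.long.identifier) {
    entry:
      ; %0 is an unnamed value; 1 is a constant.
      %0 = add i32 %a.really.long.identifier, 1
      ret i32 %0
    }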

Reserved words in LLVM are very similar to reserved words in other
languages. There are keywords for different opcodes ('``add``',
'``bitcast``', '``ret``', etc.), for primitive type names ('``void``',
'``i32``', etc.), and others. These reserved words cannot conflict
with variable names, because none of them start with a prefix character
(``'%'`` or ``'@'``).

Here is an example of LLVM code to multiply the integer variable
'``%X``' by 8:

The easy way:

.. code-block:: llvm

    %result = mul i32 %X, 8

After strength reduction:

.. code-block:: llvm

    %result = shl i32 %X, 3

And the hard way:

.. code-block:: llvm

    %0 = add i32 %X, %X           ; yields {i32}:%0
    %1 = add i32 %0, %0           ; yields {i32}:%1
    %result = add i32 %1, %1

This last way of multiplying ``%X`` by 8 illustrates several important
lexical features of LLVM:

#. Comments are delimited with a '``;``' and go until the end of line.
#. Unnamed temporaries are created when the result of a computation is
   not assigned to a named value.
#. Unnamed temporaries are numbered sequentially.

It also shows a convention that we follow in this document. When
demonstrating instructions, we will follow an instruction with a comment
that defines the type and name of the value produced.

High Level Structure
====================

Module Structure
----------------

LLVM programs are composed of ``Module``'s, each of which is a
translation unit of the input programs. Each module consists of
functions, global variables, and symbol table entries. Modules may be
combined together with the LLVM linker, which merges function (and
global variable) definitions, resolves forward declarations, and merges
symbol table entries. Here is an example of the "hello world" module:

.. code-block:: llvm

    ; Declare the string constant as a global constant.
    @.str = private unnamed_addr constant [13 x i8] c"hello world\0A\00"

    ; External declaration of the puts function
    declare i32 @puts(i8* nocapture) nounwind

    ; Definition of main function
    define i32 @main() {   ; i32()*
      ; Convert [13 x i8]* to i8 *...
      %cast210 = getelementptr [13 x i8]* @.str, i64 0, i64 0

      ; Call puts function to write out the string to stdout.
      call i32 @puts(i8* %cast210)
      ret i32 0
    }

    ; Named metadata
    !1 = metadata !{i32 42}
    !foo = !{!1, null}

This example is made up of a :ref:`global variable <globalvars>` named
"``.str``", an external declaration of the "``puts``" function, a
:ref:`function definition <functionstructure>` for "``main``" and
:ref:`named metadata <namedmetadatastructure>` "``foo``".

In general, a module is made up of a list of global values (where both
functions and global variables are global values). Global values are
represented by a pointer to a memory location (in this case, a pointer
to an array of char, and a pointer to a function), and have one of the
following :ref:`linkage types <linkage>`.

.. _linkage:

Linkage Types
-------------

All Global Variables and Functions have one of the following types of
linkage:

``private``
    Global values with "``private``" linkage are only directly
    accessible by objects in the current module. In particular, linking
    code into a module with a private global value may cause the
    private to be renamed as necessary to avoid collisions. Because the
    symbol is private to the module, all references can be updated. A
    private symbol does not show up in any symbol table in the object
    file.
``linker_private``
    Similar to ``private``, but the symbol is passed through the
    assembler and evaluated by the linker. Unlike normal strong symbols,
    they are removed by the linker from the final linked image
    (executable or dynamic library).
``linker_private_weak``
    Similar to "``linker_private``", but the symbol is weak. Note that
    ``linker_private_weak`` symbols are subject to coalescing by the
    linker. The symbols are removed by the linker from the final linked
    image (executable or dynamic library).
``internal``
    Similar to private, but the value shows as a local symbol
    (``STB_LOCAL`` in the case of ELF) in the object file. This
    corresponds to the notion of the '``static``' keyword in C.
``available_externally``
    Globals with "``available_externally``" linkage are never emitted
    into the object file corresponding to the LLVM module. They exist to
    allow inlining and other optimizations to take place given knowledge
    of the definition of the global, which is known to be somewhere
    outside the module. Globals with ``available_externally`` linkage
    are allowed to be discarded at will, and are otherwise the same as
    ``linkonce_odr``. This linkage type is only allowed on definitions,
    not declarations.
``linkonce``
    Globals with "``linkonce``" linkage are merged with other globals of
    the same name when linkage occurs. This can be used to implement
    some forms of inline functions, templates, or other code which must
    be generated in each translation unit that uses it, but where the
    body may be overridden with a more definitive definition later.
    Unreferenced ``linkonce`` globals are allowed to be discarded. Note
    that ``linkonce`` linkage does not actually allow the optimizer to
    inline the body of this function into callers because it doesn't
    know if this definition of the function is the definitive definition
    within the program or whether it will be overridden by a stronger
    definition. To enable inlining and other optimizations, use
    "``linkonce_odr``" linkage.
``weak``
    "``weak``" linkage has the same merging semantics as ``linkonce``
    linkage, except that unreferenced globals with ``weak`` linkage may
    not be discarded. This is used for globals that are declared "weak"
    in C source code.
``common``
    "``common``" linkage is most similar to "``weak``" linkage, but they
    are used for tentative definitions in C, such as "``int X;``" at
    global scope. Symbols with "``common``" linkage are merged in the
    same way as ``weak`` symbols, and they may not be deleted if
    unreferenced. ``common`` symbols may not have an explicit section,
    must have a zero initializer, and may not be marked
    ':ref:`constant <globalvars>`'. Functions and aliases may not have
    common linkage.

.. _linkage_appending:

``appending``
    "``appending``" linkage may only be applied to global variables of
    pointer to array type. When two global variables with appending
    linkage are linked together, the two global arrays are appended
    together. This is the LLVM, typesafe, equivalent of having the
    system linker append together "sections" with identical names when
    .o files are linked.
``extern_weak``
    The semantics of this linkage follow the ELF object file model: the
    symbol is weak until linked; if not linked, the symbol becomes null
    instead of being an undefined reference.
``linkonce_odr``, ``weak_odr``
    Some languages allow differing globals to be merged, such as two
    functions with different semantics. Other languages, such as
    ``C++``, ensure that only equivalent globals are ever merged (the
    "one definition rule", or "ODR"). Such languages can use the
    ``linkonce_odr`` and ``weak_odr`` linkage types to indicate that the
    global will only be merged with equivalent globals. These linkage
    types are otherwise the same as their non-``odr`` versions.
``linkonce_odr_auto_hide``
    Similar to "``linkonce_odr``", but nothing in the translation unit
    takes the address of this definition. For instance, functions that
    had an inline definition, but the compiler decided not to inline it.
    ``linkonce_odr_auto_hide`` may have only ``default`` visibility. The
    symbols are removed by the linker from the final linked image
    (executable or dynamic library).
``external``
    If none of the above identifiers are used, the global is externally
    visible, meaning that it participates in linkage and can be used to
    resolve external symbol references.

The next two types of linkage are targeted for Microsoft Windows
platform only. They are designed to support importing (exporting)
symbols from (to) DLLs (Dynamic Link Libraries).

``dllimport``
    "``dllimport``" linkage causes the compiler to reference a function
    or variable via a global pointer to a pointer that is set up by the
    DLL exporting the symbol. On Microsoft Windows targets, the pointer
    name is formed by combining ``__imp_`` and the function or variable
    name.
``dllexport``
    "``dllexport``" linkage causes the compiler to provide a global
    pointer to a pointer in a DLL, so that it can be referenced with the
    ``dllimport`` attribute. On Microsoft Windows targets, the pointer
    name is formed by combining ``__imp_`` and the function or variable
    name.

For example, since the "``.str``" variable in the "hello world" module
is defined to be private, if another module defined a "``.str``"
variable and was linked with this one, one of the two would be renamed,
preventing a collision. Since "``main``" and "``puts``" are external
(i.e., lacking any linkage declarations), they are accessible outside
of the current module.

It is illegal for a function *declaration* to have any linkage type
other than ``external``, ``dllimport`` or ``extern_weak``.

Aliases can have only ``external``, ``internal``, ``weak`` or
``weak_odr`` linkages.
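
As a rough illustration of how a few of these linkage types are
written, consider the following sketch. Every global and function name
in it is hypothetical; only the linkage keywords come from the list
above:

.. code-block:: llvm

    ; Only directly accessible from this module; may be renamed on linking.
    @.cache = private global i32 0

    ; Local symbol in the object file, like a C 'static' variable.
    @counter = internal global i32 0

    ; Tentative definition, as for 'int X;' at global scope in C.
    ; A common symbol must have a zero initializer.
    @X = common global i32 0

    ; May be merged with equivalent definitions from other modules.
    define linkonce_odr i32 @twice(i32 %v) {
    entry:
      %r = add i32 %v, %v
      ret i32 %r
    }

    ; Becomes null if no definition is linked in.
    declare extern_weak void @optional_hook()

    ; No linkage keyword, so this function has external linkage.
    define i32 @get_counter() {
    entry:
      %v = load i32* @counter
      ret i32 %v
    }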

.. _callingconv:

Calling Conventions
-------------------

LLVM :ref:`functions <functionstructure>`, :ref:`calls <i_call>` and
:ref:`invokes <i_invoke>` can all have an optional calling convention
specified for the call. The calling convention of any pair of dynamic
caller/callee must match, or the behavior of the program is undefined.
The following calling conventions are supported by LLVM, and more may be
added in the future:

"``ccc``" - The C calling convention
    This calling convention (the default if no other calling convention
    is specified) matches the target C calling conventions. This calling
    convention supports varargs function calls and tolerates some
    mismatch in the declared prototype and implemented declaration of
    the function (as does normal C).
"``fastcc``" - The fast calling convention
    This calling convention attempts to make calls as fast as possible
    (e.g. by passing things in registers). This calling convention
    allows the target to use whatever tricks it wants to produce fast
    code for the target, without having to conform to an externally
    specified ABI (Application Binary Interface). `Tail calls can only
    be optimized when this, the GHC or the HiPE convention is
    used. <CodeGenerator.html#id80>`_ This calling convention does not
    support varargs and requires the prototype of all callees to exactly
    match the prototype of the function definition.
"``coldcc``" - The cold calling convention
    This calling convention attempts to make code in the caller as
    efficient as possible under the assumption that the call is not
    commonly executed. As such, these calls often preserve all registers
    so that the call does not break any live ranges in the caller side.
    This calling convention does not support varargs and requires the
    prototype of all callees to exactly match the prototype of the
    function definition.
"``cc 10``" - GHC convention
    This calling convention has been implemented specifically for use by
    the `Glasgow Haskell Compiler (GHC) <http://www.haskell.org/ghc>`_.
    It passes everything in registers, going to extremes to achieve this
    by disabling callee save registers. This calling convention should
    not be used lightly but only for specific situations such as an
    alternative to the *register pinning* performance technique often
    used when implementing functional programming languages. At the
    moment only X86 supports this convention and it has the following
    limitations:

    - On *X86-32* only supports up to 4 bit type parameters. No
      floating point types are supported.
    - On *X86-64* only supports up to 10 bit type parameters and 6
      floating point parameters.

    This calling convention supports `tail call
    optimization <CodeGenerator.html#id80>`_ but requires that both the
    caller and the callee use it.
"``cc 11``" - The HiPE calling convention
    This calling convention has been implemented specifically for use by
    the `High-Performance Erlang
    (HiPE) <http://www.it.uu.se/research/group/hipe/>`_ compiler, *the*
    native code compiler of the `Ericsson's Open Source Erlang/OTP
    system <http://www.erlang.org/download.shtml>`_. It uses more
    registers for argument passing than the ordinary C calling
    convention and defines no callee-saved registers. The calling
    convention properly supports `tail call
    optimization <CodeGenerator.html#id80>`_ but requires that both the
    caller and the callee use it. It uses a *register pinning*
    mechanism, similar to GHC's convention, for keeping frequently
    accessed runtime components pinned to specific hardware registers.
    At the moment only X86 supports this convention (both 32 and 64
    bit).
"``cc <n>``" - Numbered convention
    Any calling convention may be specified by number, allowing
    target-specific calling conventions to be used. Target specific
    calling conventions start at 64.

More calling conventions can be added/defined on an as-needed basis, to
support Pascal conventions or any other well-known target-independent
convention.
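
The requirement that caller and callee agree can be sketched as
follows. The function names ``@add_one``, ``@caller`` and
``@stg_entry`` are hypothetical; the point is only that the convention
keyword appears both on the definition and at the call site:

.. code-block:: llvm

    ; Callee defined with the fast calling convention.
    define fastcc i32 @add_one(i32 %x) {
    entry:
      %r = add i32 %x, 1
      ret i32 %r
    }

    define i32 @caller() {
    entry:
      ; The call repeats 'fastcc' so that caller and callee match.
      %v = call fastcc i32 @add_one(i32 41)
      ret i32 %v
    }

    ; Numbered form: 'cc 10' selects the GHC convention.
    declare cc 10 void @stg_entry(i64)

Omitting ``fastcc`` on the call while keeping it on the definition
would produce the mismatch that the text above describes as undefined
behavior.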

Visibility Styles
-----------------

All Global Variables and Functions have one of the following visibility
styles:

"``default``" - Default style
    On targets that use the ELF object file format, default visibility
    means that the declaration is visible to other modules and, in
    shared libraries, means that the declared entity may be overridden.
    On Darwin, default visibility means that the declaration is visible
    to other modules. Default visibility corresponds to "external
    linkage" in the language.
"``hidden``" - Hidden style
    Two declarations of an object with hidden visibility refer to the
    same object if they are in the same shared object. Usually, hidden
    visibility indicates that the symbol will not be placed into the
    dynamic symbol table, so no other module (executable or shared
    library) can reference it directly.
"``protected``" - Protected style
    On ELF, protected visibility indicates that the symbol will be
    placed in the dynamic symbol table, but that references within the
    defining module will bind to the local symbol. That is, the symbol
    cannot be overridden by another module.
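
A short sketch of how the visibility styles are spelled, using
hypothetical names throughout:

.. code-block:: llvm

    ; No keyword: default visibility.
    @public_state = global i32 0

    ; Hidden: usually kept out of the dynamic symbol table.
    @library_state = hidden global i32 0

    ; Protected: exported, but cannot be overridden by another module.
    define protected i32 @stable_entry() {
    entry:
      %v = load i32* @library_state
      ret i32 %v
    }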

Named Types
-----------

LLVM IR allows you to specify name aliases for certain types. This can
make it easier to read the IR and make the IR more condensed
(particularly when recursive types are involved). An example of a name
specification is:

.. code-block:: llvm

    %mytype = type { %mytype*, i32 }

You may give a name to any :ref:`type <typesystem>` except
":ref:`void <t_void>`". Type name aliases may be used anywhere a type is
expected with the syntax "``%mytype``".

Note that type names are aliases for the structural type that they
indicate, and that you can therefore specify multiple names for the same
type. This often leads to confusing behavior when dumping out a .ll
file. Since LLVM IR uses structural typing, the name is not part of the
type. When printing out LLVM IR, the printer will pick *one name* to
render all types of a particular shape. This means that if you have code
where two different source types end up having the same LLVM type, the
dumper will sometimes print the "wrong" or unexpected type. This is
an important design point and isn't going to change.
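
To show a named type being used where a type is expected, the following
sketch treats a recursive type as a linked-list node. The names
``%node`` and ``@node_value`` are hypothetical:

.. code-block:: llvm

    ; A recursive type: a pointer to the next node, then a payload.
    %node = type { %node*, i32 }

    define i32 @node_value(%node* %n) {
    entry:
      ; Address of field 1 (the i32 payload) of the node %n points to.
      %p = getelementptr %node* %n, i64 0, i32 1
      %v = load i32* %p
      ret i32 %v
    }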

.. _globalvars:

Global Variables
----------------

Global variables define regions of memory allocated at compilation time
instead of run-time. Global variables may optionally be initialized, may
have an explicit section to be placed in, and may have an optional
explicit alignment specified.

A variable may be defined as ``thread_local``, which means that it will
not be shared by threads (each thread will have a separate copy of the
variable). Not all targets support thread-local variables. Optionally, a
TLS model may be specified:

``localdynamic``
    For variables that are only used within the current shared library.
``initialexec``
    For variables in modules that will not be loaded dynamically.
``localexec``
    For variables defined in the executable and only used within it.

The models correspond to the ELF TLS models; see `ELF Handling For
Thread-Local Storage <http://people.redhat.com/drepper/tls.pdf>`_ for
more information on the circumstances under which the different models
may be used. The target may choose a different TLS model if the
specified model is not supported, or if a better choice of model can be
made.

A variable may be defined as a global ``constant``, which indicates that
the contents of the variable will **never** be modified (enabling better
optimization, allowing the global data to be placed in the read-only
section of an executable, etc). Note that variables that need runtime
initialization cannot be marked ``constant`` as there is a store to the
variable.

LLVM explicitly allows *declarations* of global variables to be marked
constant, even if the final definition of the global is not. This
capability can be used to enable slightly better optimization of the
program, but requires the language definition to guarantee that
optimizations based on the 'constantness' are valid for the translation
units that do not include the definition.

As SSA values, global variables define pointer values that are in scope
in (i.e. they dominate) all basic blocks in the program. Global
variables always define a pointer to their "content" type because they
describe a region of memory, and all memory objects in LLVM are accessed
through pointers.

Global variables can be marked with ``unnamed_addr`` which indicates
that the address is not significant, only the content. Constants marked
like this can be merged with other constants if they have the same
initializer. Note that a constant with significant address *can* be
merged with an ``unnamed_addr`` constant, the result being a constant
whose address is significant.
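
As a sketch of the merging rule, consider three constants with the same
initializer. The names are hypothetical, and whether any merging
actually happens depends on which optimizations are run:

.. code-block:: llvm

    ; Address not significant: these two are candidates for merging
    ; with each other.
    @.msg.a = private unnamed_addr constant [3 x i8] c"ok\00"
    @.msg.b = private unnamed_addr constant [3 x i8] c"ok\00"

    ; Address significant: if merged with one of the constants above,
    ; the result is a constant whose address is significant.
    @tag = internal constant [3 x i8] c"ok\00"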

A global variable may be declared to reside in a target-specific
numbered address space. For targets that support them, address spaces
may affect how optimizations are performed and/or what target
instructions are used to access the variable. The default address space
is zero. The address space qualifier must precede any other attributes.

LLVM allows an explicit section to be specified for globals. If the
target supports it, it will emit globals to the section specified.

By default, global initializers are optimized by assuming that global
variables defined within the module are not modified from their
initial values before the start of the global initializer. This is
true even for variables potentially accessible from outside the
module, including those with external linkage or appearing in
``@llvm.used``. This assumption may be suppressed by marking the
variable with ``externally_initialized``.
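
A minimal sketch of the marker, assuming hypothetical variable names.
The keyword is written immediately before ``global``:

.. code-block:: llvm

    ; Assumed to still hold 0 when global initializers start running.
    @plain_state = global i32 0

    ; Something outside the module may have stored to this variable
    ; before the global initializers start, so its initial value must
    ; not be assumed.
    @device_state = externally_initialized global i32 0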

An explicit alignment may be specified for a global, which must be a
power of 2. If not present, or if the alignment is set to zero, the
alignment of the global is set by the target to whatever it feels
convenient. If an explicit alignment is specified, the global is forced
to have exactly that alignment. Targets and optimizers are not allowed
to over-align the global if the global has an assigned section. In this
case, the extra alignment could be observable: for example, code could
assume that the globals are densely packed in their section and try to
iterate over them as an array; alignment padding would break this
iteration.

For example, the following defines a global in a numbered address space
with an initializer, section, and alignment:

.. code-block:: llvm

    @G = addrspace(5) constant float 1.0, section "foo", align 4

The following example defines a thread-local global with the
``initialexec`` TLS model:

.. code-block:: llvm

    @G = thread_local(initialexec) global i32 0, align 4
| 536 | |
| 537 | .. _functionstructure: |
| 538 | |
| 539 | Functions |
| 540 | --------- |
| 541 | |
| 542 | LLVM function definitions consist of the "``define``" keyword, an |
| 543 | optional :ref:`linkage type <linkage>`, an optional :ref:`visibility |
| 544 | style <visibility>`, an optional :ref:`calling convention <callingconv>`, |
| 545 | an optional ``unnamed_addr`` attribute, a return type, an optional |
| 546 | :ref:`parameter attribute <paramattrs>` for the return type, a function |
| 547 | name, a (possibly empty) argument list (each with optional :ref:`parameter |
| 548 | attributes <paramattrs>`), optional :ref:`function attributes <fnattrs>`, |
| 549 | an optional section, an optional alignment, an optional :ref:`garbage |
| 550 | collector name <gc>`, an opening curly brace, a list of basic blocks, |
| 551 | and a closing curly brace. |
| 552 | |
| 553 | LLVM function declarations consist of the "``declare``" keyword, an |
| 554 | optional :ref:`linkage type <linkage>`, an optional :ref:`visibility |
| 555 | style <visibility>`, an optional :ref:`calling convention <callingconv>`, |
| 556 | an optional ``unnamed_addr`` attribute, a return type, an optional |
| 557 | :ref:`parameter attribute <paramattrs>` for the return type, a function |
| 558 | name, a possibly empty list of arguments, an optional alignment, and an |
| 559 | optional :ref:`garbage collector name <gc>`. |
| 560 | |
| 561 | A function definition contains a list of basic blocks, forming the CFG |
| 562 | (Control Flow Graph) for the function. Each basic block may optionally |
| 563 | start with a label (giving the basic block a symbol table entry), |
| 564 | contains a list of instructions, and ends with a |
| 565 | :ref:`terminator <terminators>` instruction (such as a branch or function |
| 566 | return). |
| 567 | |
| 568 | The first basic block in a function is special in two ways: it is |
| 569 | immediately executed on entrance to the function, and it is not allowed |
| 570 | to have predecessor basic blocks (i.e. there cannot be any branches to |
| 571 | the entry block of a function). Because the block can have no |
| 572 | predecessors, it also cannot have any :ref:`PHI nodes <i_phi>`. |
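
As a minimal sketch of the block structure described above (the function
name ``@abs_value`` and the block labels are hypothetical), the entry block
below has no predecessors and therefore no PHI node, while the final block
merges two incoming values:

.. code-block:: llvm

    define i32 @abs_value(i32 %x) {
    entry:                          ; entry block: runs first, no predecessors
      %is_neg = icmp slt i32 %x, 0
      br i1 %is_neg, label %negate, label %done   ; terminator

    negate:                         ; labeled block with one predecessor
      %neg = sub i32 0, %x
      br label %done

    done:                           ; two predecessors, so a phi is allowed
      %result = phi i32 [ %x, %entry ], [ %neg, %negate ]
      ret i32 %result               ; terminator
    }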
| 573 | |
| 574 | LLVM allows an explicit section to be specified for functions. If the |
| 575 | target supports it, it will emit functions to the section specified. |
| 576 | |
| 577 | An explicit alignment may be specified for a function. If not present, |
| 578 | or if the alignment is set to zero, the alignment of the function is set |
| 579 | by the target to whatever it finds convenient. If an explicit alignment |
| 580 | is specified, the function is forced to have at least that much |
| 581 | alignment. All alignments must be a power of 2. |
| 582 | |
| 583 | If the ``unnamed_addr`` attribute is given, the address is known not to |
| 584 | be significant and two identical functions can be merged. |
| 585 | |
| 586 | Syntax:: |
| 587 | |
| 588 | define [linkage] [visibility] |
| 589 | [cconv] [ret attrs] |
| 590 | <ResultType> @<FunctionName> ([argument list]) |
| 591 | [fn Attrs] [section "name"] [align N] |
| 592 | [gc] { ... } |
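
The following sketch fills in most of the optional slots of the syntax
above. The names ``@low_byte`` and ``@external_helper`` and the section name
``.text.hot`` are assumptions chosen for illustration only:

.. code-block:: llvm

    ; Definition: internal linkage, fastcc calling convention, a zeroext
    ; return attribute, two function attributes, a section and an alignment.
    define internal fastcc zeroext i8 @low_byte(i32 %x) nounwind readnone section ".text.hot" align 16 {
    entry:
      %t = trunc i32 %x to i8
      ret i8 %t
    }

    ; Declaration: no body, only the signature.
    declare i32 @external_helper(i32, i8*)

The order of the optional parts matters; here it follows the order given
in the syntax summary.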
| 593 | |
| 594 | Aliases |
| 595 | ------- |
| 596 | |
| 597 | Aliases act as a "second name" for the aliasee value (which can be either |
| 598 | a function, a global variable, another alias, or a bitcast of a global value). |
| 599 | Aliases may have an optional :ref:`linkage type <linkage>`, and an optional |
| 600 | :ref:`visibility style <visibility>`. |
| 601 | |
| 602 | Syntax:: |
| 603 | |
| 604 | @<Name> = alias [Linkage] [Visibility] <AliaseeTy> @<Aliasee> |
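
For instance, under the syntax above a function can be given additional
names as follows. This is only a sketch; ``@impl``, ``@api`` and
``@api_weak`` are hypothetical names:

.. code-block:: llvm

    define i32 @impl(i32 %x) {
    entry:
      ret i32 %x
    }

    ; A second name for @impl with default linkage.
    @api = alias i32 (i32)* @impl

    ; The same aliasee again, this time with an explicit linkage type.
    @api_weak = alias weak i32 (i32)* @impl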
| 605 | |
| 606 | .. _namedmetadatastructure: |
| 607 | |
| 608 | Named Metadata |
| 609 | -------------- |
| 610 | |
| 611 | Named metadata is a collection of metadata. :ref:`Metadata |
| 612 | nodes <metadata>` (but not metadata strings) are the only valid |
| 613 | operands for a named metadata. |
| 614 | |
| 615 | Syntax:: |
| 616 | |
| 617 | ; Some unnamed metadata nodes, which are referenced by the named metadata. |
| 618 | !0 = metadata !{metadata !"zero"} |
| 619 | !1 = metadata !{metadata !"one"} |
| 620 | !2 = metadata !{metadata !"two"} |
| 621 | ; A named metadata. |
| 622 | !name = !{!0, !1, !2} |
| 623 | |
| 624 | .. _paramattrs: |
| 625 | |
| 626 | Parameter Attributes |
| 627 | -------------------- |
| 628 | |
| 629 | The return type and each parameter of a function type may have a set of |
| 630 | *parameter attributes* associated with them. Parameter attributes are |
| 631 | used to communicate additional information about the result or |
| 632 | parameters of a function. Parameter attributes are considered to be part |
| 633 | of the function, not of the function type, so functions with different |
| 634 | parameter attributes can have the same function type. |
| 635 | |
| 636 | Parameter attributes are simple keywords that follow the type specified. |
| 637 | If multiple parameter attributes are needed, they are space separated. |
| 638 | For example: |
| 639 | |
| 640 | .. code-block:: llvm |
| 641 | |
| 642 | declare i32 @printf(i8* noalias nocapture, ...) |
| 643 | declare i32 @atoi(i8 zeroext) |
| 644 | declare signext i8 @returns_signed_char() |
| 645 | |
| 646 | Note that function attributes (such as ``nounwind`` and ``readonly``) |
| 647 | come immediately after the argument list. |
| 648 | |
| 649 | Currently, only the following parameter attributes are defined: |
| 650 | |
| 651 | ``zeroext`` |
| 652 | This indicates to the code generator that the parameter or return |
| 653 | value should be zero-extended to the extent required by the target's |
| 654 | ABI (which is usually 32 bits, but is 8 bits for an i1 on x86-64) by |
| 655 | the caller (for a parameter) or the callee (for a return value). |
| 656 | ``signext`` |
| 657 | This indicates to the code generator that the parameter or return |
| 658 | value should be sign-extended to the extent required by the target's |
| 659 | ABI (which is usually 32 bits) by the caller (for a parameter) or |
| 660 | the callee (for a return value). |
| 661 | ``inreg`` |
| 662 | This indicates that this parameter or return value should be treated |
| 663 | in a special target-dependent fashion while emitting code for |
| 664 | a function call or return (usually, by putting it in a register as |
| 665 | opposed to memory, though some targets use it to distinguish between |
| 666 | two different kinds of registers). Use of this attribute is |
| 667 | target-specific. |
| 668 | ``byval`` |
| 669 | This indicates that the pointer parameter should really be passed by |
| 670 | value to the function. The attribute implies that a hidden copy of |
| 671 | the pointee is made between the caller and the callee, so the callee |
| 672 | is unable to modify the value in the caller. This attribute is only |
| 673 | valid on LLVM pointer arguments. It is generally used to pass |
| 674 | structs and arrays by value, but is also valid on pointers to |
| 675 | scalars. The copy is considered to belong to the caller, not the |
| 676 | callee (for example, ``readonly`` functions should not write to |
| 677 | ``byval`` parameters). This is not a valid attribute for return |
| 678 | values. |
| 679 | |
| 680 | The ``byval`` attribute also supports specifying an alignment with the |
| 681 | ``align`` attribute. It indicates the alignment of the stack slot to |
| 682 | form and the known alignment of the pointer specified to the call |
| 683 | site. If the alignment is not specified, then the code generator |
| 684 | makes a target-specific assumption. |
| 685 | |
| 686 | ``sret`` |
| 687 | This indicates that the pointer parameter specifies the address of a |
| 688 | structure that is the return value of the function in the source |
| 689 | program. This pointer must be guaranteed by the caller to be valid: |
| 690 | loads and stores to the structure may be assumed by the callee |
| 691 | not to trap and to be properly aligned. This may only be applied to |
| 692 | the first parameter. This is not a valid attribute for return |
| 693 | values. |
| 694 | ``noalias`` |
| 695 | This indicates that pointer values :ref:`based <pointeraliasing>` on |
| 696 | the argument or return value do not alias pointer values which are |
| 697 | not *based* on it, ignoring certain "irrelevant" dependencies. For a |
| 698 | call to the parent function, dependencies between memory references |
| 699 | from before or after the call and from those during the call are |
| 700 | "irrelevant" to the ``noalias`` keyword for the arguments and return |
| 701 | value used in that call. The caller shares the responsibility with |
| 702 | the callee for ensuring that these requirements are met. For further |
| 703 | details, please see the discussion of the NoAlias response in `alias |
| 704 | analysis <AliasAnalysis.html#MustMayNo>`_. |
| 705 | |
| 706 | Note that this definition of ``noalias`` is intentionally similar |
| 707 | to the definition of ``restrict`` in C99 for function arguments, |
| 708 | though it is slightly weaker. |
| 709 | |
| 710 | For function return values, C99's ``restrict`` is not meaningful, |
| 711 | while LLVM's ``noalias`` is. |
| 712 | ``nocapture`` |
| 713 | This indicates that the callee does not make any copies of the |
| 714 | pointer that outlive the callee itself. This is not a valid |
| 715 | attribute for return values. |
| 716 | |
| 717 | .. _nest: |
| 718 | |
| 719 | ``nest`` |
| 720 | This indicates that the pointer parameter can be excised using the |
| 721 | :ref:`trampoline intrinsics <int_trampoline>`. This is not a valid |
| 722 | attribute for return values. |
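
To illustrate how several of these attributes combine, here is a small
sketch. The type ``%struct.Pair`` and all function names are hypothetical
and exist only for this example:

.. code-block:: llvm

    %struct.Pair = type { i32, i32 }

    ; The caller provides storage for the result through the sret pointer
    ; (first parameter) and passes the input by value through a byval
    ; pointer whose hidden copy is 4-byte aligned.
    declare void @swap_pair(%struct.Pair* noalias sret %result,
                            %struct.Pair* byval align 4 %input)

    ; Extension attributes on a parameter and on the return value.
    declare zeroext i8 @low_byte(i32 signext %value)

    ; The callee promises not to keep a copy of %buffer after it returns.
    declare i32 @count_bytes(i8* nocapture %buffer) nounwind readonly

Attributes on a parameter follow its type, while the attribute on the
return value precedes the return type.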
| 723 | |
| 724 | .. _gc: |
| 725 | |
| 726 | Garbage Collector Names |
| 727 | ----------------------- |
| 728 | |
| 729 | Each function may specify a garbage collector name, which is simply a |
| 730 | string: |
| 731 | |
| 732 | .. code-block:: llvm |
| 733 | |
| 734 | define void @f() gc "name" { ... } |
| 735 | |
| 736 | The compiler declares the supported values of *name*. Specifying a |
| 737 | collector will cause the compiler to alter its output in order to |
| 738 | support the named garbage collection algorithm. |
| 739 | |
| 740 | .. _attrgrp: |
| 741 | |
| 742 | Attribute Groups |
| 743 | ---------------- |
| 744 | |
| 745 | Attribute groups are groups of attributes that are referenced by objects within |
| 746 | the IR. They are important for keeping ``.ll`` files readable, because a lot of |
| 747 | functions will use the same set of attributes. In the degenerate case of a |
| 748 | ``.ll`` file that corresponds to a single ``.c`` file, the single attribute |
| 749 | group will capture the important command line flags used to build that file. |
| 750 | |
| 751 | An attribute group is a module-level object. To use an attribute group, an |
| 752 | object references the attribute group's ID (e.g. ``#37``). An object may refer |
| 753 | to more than one attribute group. In that situation, the attributes from the |
| 754 | different groups are merged. |
| 755 | |
| 756 | Here is an example of attribute groups for a function that should always be |
| 757 | inlined, has a stack alignment of 4, and which shouldn't use SSE instructions: |
| 758 | |
| 759 | .. code-block:: llvm |
| 760 | |
| 761 | ; Target-independent attributes: |
| 762 | #0 = attributes { alwaysinline alignstack=4 } |
| 763 | |
| 764 | ; Target-dependent attributes: |
| 765 | #1 = attributes { "no-sse" } |
| 766 | |
| 767 | ; Function @f has attributes: alwaysinline, alignstack=4, and "no-sse". |
| 768 | define void @f() #0 #1 { ... } |
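
The readability benefit shows up when several functions share one group.
The sketch below reuses the group syntax from the example above; the
function names ``@a`` and ``@b`` are hypothetical:

.. code-block:: llvm

    ; One group, written once.
    #0 = attributes { nounwind optsize }

    ; Both functions pick up nounwind and optsize by referencing #0.
    define void @a() #0 {
    entry:
      ret void
    }

    define void @b() #0 {
    entry:
      ret void
    }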
| 769 | |
| 770 | .. _fnattrs: |
| 771 | |
| 772 | Function Attributes |
| 773 | ------------------- |
| 774 | |
| 775 | Function attributes are set to communicate additional information about |
| 776 | a function. Function attributes are considered to be part of the |
| 777 | function, not of the function type, so functions with different function |
| 778 | attributes can have the same function type. |
| 779 | |
| 780 | Function attributes are simple keywords that follow the argument list. |
| 781 | If multiple attributes are needed, they are space separated. For |
| 782 | example: |
| 783 | |
| 784 | .. code-block:: llvm |
| 785 | |
| 786 | define void @f() noinline { ... } |
| 787 | define void @f() alwaysinline { ... } |
| 788 | define void @f() alwaysinline optsize { ... } |
| 789 | define void @f() optsize { ... } |
| 790 | |
| 791 | ``address_safety`` |
| 792 | This attribute indicates that the address safety analysis is enabled |
| 793 | for this function. |
| 794 | ``alignstack(<n>)`` |
| 795 | This attribute indicates that, when emitting the prologue and |
| 796 | epilogue, the backend should forcibly align the stack pointer. |
| 797 | Specify the desired alignment, which must be a power of two, in |
| 798 | parentheses. |
| 799 | ``alwaysinline`` |
| 800 | This attribute indicates that the inliner should attempt to inline |
| 801 | this function into callers whenever possible, ignoring any active |
| 802 | inlining size threshold for this caller. |
| 803 | ``nonlazybind`` |
| 804 | This attribute suppresses lazy symbol binding for the function. This |
| 805 | may make calls to the function faster, at the cost of extra program |
| 806 | startup time if the function is not called during program startup. |
| 807 | ``inlinehint`` |
| 808 | This attribute indicates that the source code contained a hint that |
| 809 | inlining this function is desirable (such as the "inline" keyword in |
| 810 | C/C++). It is just a hint; it imposes no requirements on the |
| 811 | inliner. |
| 812 | ``naked`` |
| 813 | This attribute disables prologue / epilogue emission for the |
| 814 | function. This can have very system-specific consequences. |
| 815 | ``noduplicate`` |
| 816 | This attribute indicates that calls to the function cannot be |
| 817 | duplicated. A call to a ``noduplicate`` function may be moved |
| 818 | within its parent function, but may not be duplicated within |
| 819 | its parent function. |
| 820 | |
| 821 | A function containing a ``noduplicate`` call may still |
| 822 | be an inlining candidate, provided that the call is not |
| 823 | duplicated by inlining. That implies that the function has |
| 824 | internal linkage and only has one call site, so the original |
| 825 | call is dead after inlining. |
| 826 | ``noimplicitfloat`` |
| 827 | This attribute disables implicit floating point instructions. |
| 828 | ``noinline`` |
| 829 | This attribute indicates that the inliner should never inline this |
| 830 | function in any situation. This attribute may not be used together |
| 831 | with the ``alwaysinline`` attribute. |
| 832 | ``noredzone`` |
| 833 | This attribute indicates that the code generator should not use a |
| 834 | red zone, even if the target-specific ABI normally permits it. |
| 835 | ``noreturn`` |
| 836 | This function attribute indicates that the function never returns |
| 837 | normally. This produces undefined behavior at runtime if the |
| 838 | function ever does dynamically return. |
| 839 | ``nounwind`` |
| 840 | This function attribute indicates that the function never returns |
| 841 | with an unwind or exceptional control flow. If the function does |
| 842 | unwind, its runtime behavior is undefined. |
| 843 | ``optsize`` |
| 844 | This attribute suggests that optimization passes and code generator |
| 845 | passes make choices that keep the code size of this function low, |
| 846 | and otherwise do optimizations specifically to reduce code size. |
| 847 | ``readnone`` |
| 848 | This attribute indicates that the function computes its result (or |
| 849 | decides to unwind an exception) based strictly on its arguments, |
| 850 | without dereferencing any pointer arguments or otherwise accessing |
| 851 | any mutable state (e.g. memory, control registers, etc) visible to |
| 852 | caller functions. It does not write through any pointer arguments |
| 853 | (including ``byval`` arguments) and never changes any state visible |
| 854 | to callers. This means that it cannot unwind exceptions by calling |
| 855 | the ``C++`` exception throwing methods. |
| 856 | ``readonly`` |
| 857 | This attribute indicates that the function does not write through |
| 858 | any pointer arguments (including ``byval`` arguments) or otherwise |
| 859 | modify any state (e.g. memory, control registers, etc) visible to |
| 860 | caller functions. It may dereference pointer arguments and read |
| 861 | state that may be set in the caller. A readonly function always |
| 862 | returns the same value (or unwinds an exception identically) when |
| 863 | called with the same set of arguments and global state. It cannot |
| 864 | unwind an exception by calling the ``C++`` exception throwing |
| 865 | methods. |
| 866 | ``returns_twice`` |
| 867 | This attribute indicates that this function can return twice. The C |
| 868 | ``setjmp`` is an example of such a function. The compiler disables |
| 869 | some optimizations (like tail calls) in the caller of these |
| 870 | functions. |
| 871 | ``ssp`` |
| 872 | This attribute indicates that the function should emit a stack |
| 873 | smashing protector. It is in the form of a "canary" --- a random value |
| 874 | placed on the stack before the local variables that's checked upon |
| 875 | return from the function to see if it has been overwritten. A |
| 876 | heuristic is used to determine if a function needs stack protectors |
| 877 | or not. The heuristic used will enable protectors for functions with: |
| 878 | |
| 879 | - Character arrays larger than ``ssp-buffer-size`` (default 8). |
| 880 | - Aggregates containing character arrays larger than ``ssp-buffer-size``. |
| 881 | - Calls to alloca() with variable sizes or constant sizes greater than |
| 882 | ``ssp-buffer-size``. |
| 883 | |
| 884 | If a function that has an ``ssp`` attribute is inlined into a |
| 885 | function that doesn't have an ``ssp`` attribute, then the resulting |
| 886 | function will have an ``ssp`` attribute. |
| 887 | ``sspreq`` |
| 888 | This attribute indicates that the function should *always* emit a |
| 889 | stack smashing protector. This overrides the ``ssp`` function |
| 890 | attribute. |
| 891 | |
| 892 | If a function that has an ``sspreq`` attribute is inlined into a |
| 893 | function that doesn't have an ``sspreq`` attribute or which has an |
| 894 | ``ssp`` or ``sspstrong`` attribute, then the resulting function will have |
| 895 | an ``sspreq`` attribute. |
| 896 | ``sspstrong`` |
| 897 | This attribute indicates that the function should emit a stack smashing |
| 898 | protector. This attribute causes a strong heuristic to be used when |
| 899 | determining if a function needs stack protectors. The strong heuristic |
| 900 | will enable protectors for functions with: |
| 901 | |
| 902 | - Arrays of any size and type |
| 903 | - Aggregates containing an array of any size and type. |
| 904 | - Calls to alloca(). |
| 905 | - Local variables that have had their address taken. |
| 906 | |
| 907 | This overrides the ``ssp`` function attribute. |
| 908 | |
| 909 | If a function that has an ``sspstrong`` attribute is inlined into a |
| 910 | function that doesn't have an ``sspstrong`` attribute, then the |
| 911 | resulting function will have an ``sspstrong`` attribute. |
| 912 | ``thread_safety`` |
| 913 | This attribute indicates that the thread safety analysis is enabled |
| 914 | for this function. |
| 915 | ``uninitialized_checks`` |
| 916 | This attribute indicates that the checks for uses of uninitialized |
| 917 | memory are enabled. |
| 918 | ``uwtable`` |
| 919 | This attribute indicates that the ABI being targeted requires that |
| 920 | an unwind table entry be produced for this function even if we can |
| 921 | show that no exceptions pass by it. This is normally the case for |
| 922 | the ELF x86-64 ABI, but it can be disabled for some compilation |
| 923 | units. |
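
A few of the attributes above can be seen together in the following
sketch. ``@checkpoint``, ``@first_element`` and ``@fatal`` are hypothetical
names; ``@abort`` is assumed to be the C library function:

.. code-block:: llvm

    declare void @abort() noreturn nounwind

    ; Hypothetical setjmp-like routine that may return more than once.
    declare i32 @checkpoint(i8*) returns_twice

    ; Reads memory through %p but never writes any state visible to the
    ; caller, so readonly applies; it cannot unwind, so nounwind applies.
    define i32 @first_element(i32* nocapture %p) nounwind readonly {
    entry:
      %v = load i32* %p, align 4
      ret i32 %v
    }

    ; Never returns normally; the stack pointer is forcibly aligned to 16.
    define void @fatal() noreturn nounwind alignstack(16) {
    entry:
      call void @abort() noreturn nounwind
      unreachable
    }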
| 924 | |
| 925 | .. _moduleasm: |
| 926 | |
| 927 | Module-Level Inline Assembly |
| 928 | ---------------------------- |
| 929 | |
| 930 | Modules may contain "module-level inline asm" blocks, which correspond |
| 931 | to the GCC "file scope inline asm" blocks. These blocks are internally |
| 932 | concatenated by LLVM and treated as a single unit, but may be separated |
| 933 | in the ``.ll`` file if desired. The syntax is very simple: |
| 934 | |
| 935 | .. code-block:: llvm |
| 936 | |
| 937 | module asm "inline asm code goes here" |
| 938 | module asm "more can go here" |
| 939 | |
| 940 | The strings can contain any character by escaping non-printable |
| 941 | characters. The escape sequence used is simply "\\xx" where "xx" is the |
| 942 | two digit hex code for the number. |
| 943 | |
| 944 | The inline asm code is simply printed to the machine code .s file when |
| 945 | assembly code is generated. |
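
As a sketch of the escape sequence in use, the second string below embeds
a newline (``\0A``) and a tab (``\09``). The assembly text assumes a
GNU-style assembler, and ``hypothetical_sym`` is a made-up symbol name:

.. code-block:: llvm

    ; Two separate blocks; LLVM concatenates them into a single unit.
    module asm ".globl hypothetical_sym"
    module asm "hypothetical_sym:\0A\09.long 42"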
| 946 | |
| 947 | Data Layout |
| 948 | ----------- |
| 949 | |
| 950 | A module may specify a target specific data layout string that specifies |
| 951 | how data is to be laid out in memory. The syntax for the data layout is |
| 952 | simply: |
| 953 | |
| 954 | .. code-block:: llvm |
| 955 | |
| 956 | target datalayout = "layout specification" |
| 957 | |
| 958 | The *layout specification* consists of a list of specifications |
| 959 | separated by the minus sign character ('-'). Each specification starts |
| 960 | with a letter and may include other information after the letter to |
| 961 | define some aspect of the data layout. The specifications accepted are |
| 962 | as follows: |
| 963 | |
| 964 | ``E`` |
| 965 | Specifies that the target lays out data in big-endian form. That is, |
| 966 | the bits with the most significance have the lowest address |
| 967 | location. |
| 968 | ``e`` |
| 969 | Specifies that the target lays out data in little-endian form. That |
| 970 | is, the bits with the least significance have the lowest address |
| 971 | location. |
| 972 | ``S<size>`` |
| 973 | Specifies the natural alignment of the stack in bits. Alignment |
| 974 | promotion of stack variables is limited to the natural stack |
| 975 | alignment to avoid dynamic stack realignment. The stack alignment |
| 976 | must be a multiple of 8 bits. If omitted, the natural stack |
| 977 | alignment defaults to "unspecified", which does not prevent any |
| 978 | alignment promotions. |
| 979 | ``p[n]:<size>:<abi>:<pref>`` |
| 980 | This specifies the *size* of a pointer and its ``<abi>`` and |
| 981 | ``<pref>``\erred alignments for address space ``n``. All sizes are in |
| 982 | bits. Specifying the ``<pref>`` alignment is optional. If omitted, the |
| 983 | preceding ``:`` should be omitted too. The address space, ``n`` is |
| 984 | optional, and if not specified, denotes the default address space 0. |
| 985 | The value of ``n`` must be in the range [1,2^23). |
| 986 | ``i<size>:<abi>:<pref>`` |
| 987 | This specifies the alignment for an integer type of a given bit |
| 988 | ``<size>``. The value of ``<size>`` must be in the range [1,2^23). |
| 989 | ``v<size>:<abi>:<pref>`` |
| 990 | This specifies the alignment for a vector type of a given bit |
| 991 | ``<size>``. |
| 992 | ``f<size>:<abi>:<pref>`` |
| 993 | This specifies the alignment for a floating point type of a given bit |
| 994 | ``<size>``. Only values of ``<size>`` that are supported by the target |
| 995 | will work. 32 (float) and 64 (double) are supported on all targets; 80 |
| 996 | or 128 (different flavors of long double) are also supported on some |
| 997 | targets. |
| 998 | ``a<size>:<abi>:<pref>`` |
| 999 | This specifies the alignment for an aggregate type of a given bit |
| 1000 | ``<size>``. |
| 1001 | ``s<size>:<abi>:<pref>`` |
| 1002 | This specifies the alignment for a stack object of a given bit |
| 1003 | ``<size>``. |
| 1004 | ``n<size1>:<size2>:<size3>...`` |
| 1005 | This specifies a set of native integer widths for the target CPU in |
| 1006 | bits. For example, it might contain ``n32`` for 32-bit PowerPC, |
| 1007 | ``n32:64`` for PowerPC 64, or ``n8:16:32:64`` for X86-64. Elements of |
| 1008 | this set are considered to support most general arithmetic operations |
| 1009 | efficiently. |
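
Putting several specifications together, a layout string might look like
the sketch below. The particular values are an assumption, chosen to
resemble a little-endian 64-bit x86 target, and are not taken from any
specific frontend:

.. code-block:: llvm

    ; e            little-endian
    ; p:64:64:64   64-bit pointers, 64-bit ABI and preferred alignment
    ; i64:64:64    i64 has 64-bit ABI and preferred alignment
    ; f80:128:128  80-bit floating point values are 128-bit aligned
    ; n8:16:32:64  native integer widths
    ; S128         natural stack alignment of 128 bits
    target datalayout = "e-p:64:64:64-i64:64:64-f80:128:128-n8:16:32:64-S128"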
| 1010 | |
| 1011 | When constructing the data layout for a given target, LLVM starts with a |
| 1012 | default set of specifications which are then (possibly) overridden by |
| 1013 | the specifications in the ``datalayout`` keyword. The default |
| 1014 | specifications are given in this list: |
| 1015 | |
| 1016 | - ``E`` - big endian |
| 1017 | - ``p:64:64:64`` - 64-bit pointers with 64-bit alignment |
| 1018 | - ``S0`` - natural stack alignment is unspecified |
| 1019 | - ``i1:8:8`` - i1 is 8-bit (byte) aligned |
| 1020 | - ``i8:8:8`` - i8 is 8-bit (byte) aligned |
| 1021 | - ``i16:16:16`` - i16 is 16-bit aligned |
| 1022 | - ``i32:32:32`` - i32 is 32-bit aligned |
| 1023 | - ``i64:32:64`` - i64 has ABI alignment of 32-bits but preferred |
| 1024 | alignment of 64-bits |
| 1025 | - ``f16:16:16`` - half is 16-bit aligned |
| 1026 | - ``f32:32:32`` - float is 32-bit aligned |
| 1027 | - ``f64:64:64`` - double is 64-bit aligned |
| 1028 | - ``f128:128:128`` - quad is 128-bit aligned |
| 1029 | - ``v64:64:64`` - 64-bit vector is 64-bit aligned |
| 1030 | - ``v128:128:128`` - 128-bit vector is 128-bit aligned |
| 1031 | - ``a0:0:64`` - aggregates are 64-bit aligned |
| 1032 | |
| 1033 | When LLVM is determining the alignment for a given type, it uses the |
| 1034 | following rules: |
| 1035 | |
| 1036 | #. If the type sought is an exact match for one of the specifications, |
| 1037 | that specification is used. |
| 1038 | #. If no match is found, and the type sought is an integer type, then |
| 1039 | the smallest integer type that is larger than the bitwidth of the |
| 1040 | sought type is used. If none of the specifications are larger than |
| 1041 | the bitwidth then the largest integer type is used. For example, |
| 1042 | given the default specifications above, the i7 type will use the |
| 1043 | alignment of i8 (next largest) while both i65 and i256 will use the |
| 1044 | alignment of i64 (largest specified). |
| 1045 | #. If no match is found, and the type sought is a vector type, then the |
| 1046 | largest vector type that is smaller than the sought vector type will |
| 1047 | be used as a fall back. This happens because <128 x double> can be |
| 1048 | implemented in terms of 64 <2 x double>, for example. |
| 1049 | |
| 1050 | The function of the data layout string may not be what you expect. |
| 1051 | Notably, this is not a specification from the frontend of what alignment |
| 1052 | the code generator should use. |
| 1053 | |
| 1054 | Instead, if specified, the target data layout is required to match what |
| 1055 | the ultimate *code generator* expects. This string is used by the |
| 1056 | mid-level optimizers to improve code, and this only works if it matches |
| 1057 | what the ultimate code generator uses. If you would like to generate IR |
| 1058 | that does not embed this target-specific detail into the IR, then you |
| 1059 | don't have to specify the string. This will disable some optimizations |
| 1060 | that require precise layout information, but this also prevents those |
| 1061 | optimizations from introducing target specificity into the IR. |
| 1062 | |
| 1063 | .. _pointeraliasing: |
| 1064 | |
| 1065 | Pointer Aliasing Rules |
| 1066 | ---------------------- |
| 1067 | |
| 1068 | Any memory access must be done through a pointer value associated with |
| 1069 | an address range of the memory access, otherwise the behavior is |
| 1070 | undefined. Pointer values are associated with address ranges according |
| 1071 | to the following rules: |
| 1072 | |
| 1073 | - A pointer value is associated with the addresses associated with any |
| 1074 | value it is *based* on. |
| 1075 | - An address of a global variable is associated with the address range |
| 1076 | of the variable's storage. |
| 1077 | - The result value of an allocation instruction is associated with the |
| 1078 | address range of the allocated storage. |
| 1079 | - A null pointer in the default address-space is associated with no |
| 1080 | address. |
| 1081 | - An integer constant other than zero or a pointer value returned from |
| 1082 | a function not defined within LLVM may be associated with address |
| 1083 | ranges allocated through mechanisms other than those provided by |
| 1084 | LLVM. Such ranges shall not overlap with any ranges of addresses |
| 1085 | allocated by mechanisms provided by LLVM. |
| 1086 | |
| 1087 | A pointer value is *based* on another pointer value according to the |
| 1088 | following rules: |
| 1089 | |
| 1090 | - A pointer value formed from a ``getelementptr`` operation is *based* |
| 1091 | on the first operand of the ``getelementptr``. |
| 1092 | - The result value of a ``bitcast`` is *based* on the operand of the |
| 1093 | ``bitcast``. |
| 1094 | - A pointer value formed by an ``inttoptr`` is *based* on all pointer |
| 1095 | values that contribute (directly or indirectly) to the computation of |
| 1096 | the pointer's value. |
| 1097 | - The "*based* on" relationship is transitive. |
| 1098 | |
| 1099 | Note that this definition of *"based"* is intentionally similar to the |
| 1100 | definition of *"based"* in C99, though it is slightly weaker. |
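
The "*based* on" rules can be traced through a short sketch. The function
name ``@based_example`` is hypothetical, and the caller is assumed to pass
a pointer to at least three valid ``i32`` elements:

.. code-block:: llvm

    define i32 @based_example(i32* %p) {
    entry:
      ; %q is based on %p (first operand of the getelementptr).
      %q = getelementptr i32* %p, i64 1
      ; %b is based on %q and, by transitivity, on %p.
      %b = bitcast i32* %q to i8*
      ; %r is based on %b, because %b contributes to the integer that
      ; the inttoptr converts back into a pointer.
      %i = ptrtoint i8* %b to i64
      %j = add i64 %i, 4
      %r = inttoptr i64 %j to i32*
      %v = load i32* %r, align 4
      ret i32 %v
    }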
| 1101 | |
| 1102 | LLVM IR does not associate types with memory. The result type of a |
| 1103 | ``load`` merely indicates the size and alignment of the memory from |
| 1104 | which to load, as well as the interpretation of the value. The first |
| 1105 | operand type of a ``store`` similarly only indicates the size and |
| 1106 | alignment of the store. |
| 1107 | |
| 1108 | Consequently, type-based alias analysis, aka TBAA, aka |
| 1109 | ``-fstrict-aliasing``, is not applicable to general unadorned LLVM IR. |
| 1110 | :ref:`Metadata <metadata>` may be used to encode additional information |
| 1111 | which specialized optimization passes may use to implement type-based |
| 1112 | alias analysis. |
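
Because memory itself carries no type, the same bytes may be written as
one type and read back as another. The following sketch (with the
hypothetical name ``@reinterpret``) stores an integer bit pattern and
loads it as a ``float``:

.. code-block:: llvm

    define float @reinterpret(i32* %p) {
    entry:
      ; 1065353216 is 0x3F800000, the IEEE-754 single-precision
      ; encoding of 1.0.
      store i32 1065353216, i32* %p, align 4
      %fp = bitcast i32* %p to float*
      ; The load reads the same four bytes and interprets them as a float.
      %f = load float* %fp, align 4
      ret float %f
    }

An optimizer working on unadorned IR must therefore assume that the
``i32`` store and the ``float`` load may refer to the same memory.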
| 1113 | |
| 1114 | .. _volatile: |
| 1115 | |
| 1116 | Volatile Memory Accesses |
| 1117 | ------------------------ |
| 1118 | |
| 1119 | Certain memory accesses, such as :ref:`loads <i_load>`, |
| 1120 | :ref:`stores <i_store>`, and :ref:`llvm.memcpy <int_memcpy>` calls, may be |
| 1121 | marked ``volatile``. The optimizers must not change the number of |
| 1122 | volatile operations or change their order of execution relative to other |
| 1123 | volatile operations. The optimizers *may* change the order of volatile |
| 1124 | operations relative to non-volatile operations. This is not Java's |
| 1125 | "volatile" and has no cross-thread synchronization behavior. |
| 1126 | |
| 1127 | IR-level volatile loads and stores cannot safely be optimized into |
| 1128 | llvm.memcpy or llvm.memmove intrinsics even when those intrinsics are |
| 1129 | flagged volatile. Likewise, the backend should never split or merge |
| 1130 | target-legal volatile load/store instructions. |
| 1131 | |
Andrew Trick | 946317d | 2013-01-31 00:49:39 +0000 | [diff] [blame] | 1132 | .. admonition:: Rationale |
| 1133 | |
| 1134 | Platforms may rely on volatile loads and stores of natively supported |
| 1135 | data width to be executed as single instruction. For example, in C |
| 1136 | this holds for an l-value of volatile primitive type with native |
| 1137 | hardware support, but not necessarily for aggregate types. The |
| 1138 | frontend upholds these expectations, which are intentionally |
| 1139 | unspecified in the IR. The rules above ensure that IR transformation |
| 1140 | do not violate the frontend's contract with the language. |
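
A minimal sketch of the ordering rules above, written in the typed-pointer syntax used throughout this document. The function and value names are made up for illustration.

.. code-block:: llvm

    define i32 @poll(i32* %status, i32* %scratch) {
      ; Both volatile loads must be kept, and kept in this order, even
      ; though they read the same address.
      %a = load volatile i32* %status
      %b = load volatile i32* %status
      ; This non-volatile store may be reordered relative to the
      ; volatile loads, provided ordinary data dependences are respected.
      store i32 0, i32* %scratch
      %r = add i32 %a, %b
      ret i32 %r
    }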
| 1141 | |
| 1142 | .. _memmodel: |
| 1143 | |
| 1144 | Memory Model for Concurrent Operations |
| 1145 | -------------------------------------- |
| 1146 | |
| 1147 | The LLVM IR does not define any way to start parallel threads of |
| 1148 | execution or to register signal handlers. Nonetheless, there are |
| 1149 | platform-specific ways to create them, and we define LLVM IR's behavior |
| 1150 | in their presence. This model is inspired by the C++0x memory model. |
| 1151 | |
| 1152 | For a more informal introduction to this model, see the :doc:`Atomics`. |
| 1153 | |
| 1154 | We define a *happens-before* partial order as the least partial order |
| 1155 | that |
| 1156 | |
| 1157 | - Is a superset of single-thread program order, and |
| 1158 | - When a *synchronizes-with* ``b``, includes an edge from ``a`` to |
| 1159 | ``b``. *Synchronizes-with* pairs are introduced by platform-specific |
| 1160 | techniques, like pthread locks, thread creation, thread joining, |
| 1161 | etc., and by atomic instructions. (See also :ref:`Atomic Memory Ordering |
| 1162 | Constraints <ordering>`). |
| 1163 | |
| 1164 | Note that program order does not introduce *happens-before* edges |
| 1165 | between a thread and signals executing inside that thread. |
| 1166 | |
| 1167 | Every (defined) read operation (load instructions, memcpy, atomic |
| 1168 | loads/read-modify-writes, etc.) R reads a series of bytes written by |
| 1169 | (defined) write operations (store instructions, atomic |
| 1170 | stores/read-modify-writes, memcpy, etc.). For the purposes of this |
| 1171 | section, initialized globals are considered to have a write of the |
| 1172 | initializer which is atomic and happens before any other read or write |
| 1173 | of the memory in question. For each byte of a read R, R\ :sub:`byte` |
| 1174 | may see any write to the same byte, except: |
| 1175 | |
| 1176 | - If write\ :sub:`1` happens before write\ :sub:`2`, and |
| 1177 | write\ :sub:`2` happens before R\ :sub:`byte`, then |
| 1178 | R\ :sub:`byte` does not see write\ :sub:`1`. |
| 1179 | - If R\ :sub:`byte` happens before write\ :sub:`3`, then |
| 1180 | R\ :sub:`byte` does not see write\ :sub:`3`. |
| 1181 | |
| 1182 | Given that definition, R\ :sub:`byte` is defined as follows: |
| 1183 | |
| 1184 | - If R is volatile, the result is target-dependent. (Volatile is |
| 1185 | supposed to give guarantees which can support ``sig_atomic_t`` in |
| 1186 | C/C++, and may be used for accesses to addresses which do not behave |
| 1187 | like normal memory. It does not generally provide cross-thread |
| 1188 | synchronization.) |
| 1189 | - Otherwise, if there is no write to the same byte that happens before |
| 1190 | R\ :sub:`byte`, R\ :sub:`byte` returns ``undef`` for that byte. |
| 1191 | - Otherwise, if R\ :sub:`byte` may see exactly one write, |
| 1192 | R\ :sub:`byte` returns the value written by that write. |
| 1193 | - Otherwise, if R is atomic, and all the writes R\ :sub:`byte` may |
| 1194 | see are atomic, it chooses one of the values written. See the :ref:`Atomic |
| 1195 | Memory Ordering Constraints <ordering>` section for additional |
| 1196 | constraints on how the choice is made. |
| 1197 | - Otherwise R\ :sub:`byte` returns ``undef``. |
| 1198 | |
| 1199 | R returns the value composed of the series of bytes it read. This |
| 1200 | implies that some bytes within the value may be ``undef`` **without** |
| 1201 | the entire value being ``undef``. Note that this only defines the |
| 1202 | semantics of the operation; it doesn't mean that targets will emit more |
| 1203 | than one instruction to read the series of bytes. |
| 1204 | |
| 1205 | Note that in cases where none of the atomic intrinsics are used, this |
| 1206 | model places only one restriction on IR transformations on top of what |
| 1207 | is required for single-threaded execution: introducing a store to a byte |
| 1208 | which might not otherwise be stored is not allowed in general. |
| 1209 | (Specifically, in the case where another thread might write to and read |
| 1210 | from an address, introducing a store can change a load that may see |
| 1211 | exactly one write into a load that may see multiple writes.) |
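
The restriction on introducing stores can be illustrated with a conditional store. This is a sketch with made-up names: ``@before`` writes through ``%p`` only when ``%cond`` is true, while ``@after`` is the kind of rewrite the model rules out in general, because it stores through ``%p`` on every path.

.. code-block:: llvm

    define void @before(i32* %p, i1 %cond) {
    entry:
      br i1 %cond, label %then, label %exit
    then:
      store i32 1, i32* %p
      br label %exit
    exit:
      ret void
    }

    ; Not allowed in general: when %cond is false this version still
    ; writes through %p, so a load in another thread that could
    ; previously see exactly one write may now see several and
    ; return undef.
    define void @after(i32* %p, i1 %cond) {
    entry:
      %old = load i32* %p
      %new = select i1 %cond, i32 1, i32 %old
      store i32 %new, i32* %p
      ret void
    }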
| 1212 | |
| 1213 | .. _ordering: |
| 1214 | |
| 1215 | Atomic Memory Ordering Constraints |
| 1216 | ---------------------------------- |
| 1217 | |
| 1218 | Atomic instructions (:ref:`cmpxchg <i_cmpxchg>`, |
| 1219 | :ref:`atomicrmw <i_atomicrmw>`, :ref:`fence <i_fence>`, |
| 1220 | :ref:`atomic load <i_load>`, and :ref:`atomic store <i_store>`) take |
| 1221 | an ordering parameter that determines which other atomic instructions on |
| 1222 | the same address they *synchronize with*. These semantics are borrowed |
| 1223 | from Java and C++0x, but are somewhat more colloquial. If these |
| 1224 | descriptions aren't precise enough, check those specs (see spec |
| 1225 | references in the :doc:`atomics guide <Atomics>`). |
| 1226 | :ref:`fence <i_fence>` instructions treat these orderings somewhat |
| 1227 | differently since they don't take an address. See that instruction's |
| 1228 | documentation for details. |
| 1229 | |
| 1230 | For a simpler introduction to the ordering constraints, see the |
| 1231 | :doc:`Atomics`. |
| 1232 | |
| 1233 | ``unordered`` |
| 1234 | The set of values that can be read is governed by the happens-before |
| 1235 | partial order. A value cannot be read unless some operation wrote |
| 1236 | it. This is intended to provide a guarantee strong enough to model |
| 1237 | Java's non-volatile shared variables. This ordering cannot be |
| 1238 | specified for read-modify-write operations; it is not strong enough |
| 1239 | to make them atomic in any interesting way. |
| 1240 | ``monotonic`` |
| 1241 | In addition to the guarantees of ``unordered``, there is a single |
| 1242 | total order for modifications by ``monotonic`` operations on each |
| 1243 | address. All modification orders must be compatible with the |
| 1244 | happens-before order. There is no guarantee that the modification |
| 1245 | orders can be combined to a global total order for the whole program |
| 1246 | (and this often will not be possible). The read in an atomic |
| 1247 | read-modify-write operation (:ref:`cmpxchg <i_cmpxchg>` and |
| 1248 | :ref:`atomicrmw <i_atomicrmw>`) reads the value in the modification |
| 1249 | order immediately before the value it writes. If one atomic read |
| 1250 | happens before another atomic read of the same address, the later |
| 1251 | read must see the same value or a later value in the address's |
| 1252 | modification order. This disallows reordering of ``monotonic`` (or |
| 1253 | stronger) operations on the same address. If an address is written |
| 1254 | ``monotonic``-ally by one thread, and other threads ``monotonic``-ally |
| 1255 | read that address repeatedly, the other threads must eventually see |
| 1256 | the write. This corresponds to the C++0x/C1x |
| 1257 | ``memory_order_relaxed``. |
| 1258 | ``acquire`` |
| 1259 | In addition to the guarantees of ``monotonic``, a |
| 1260 | *synchronizes-with* edge may be formed with a ``release`` operation. |
| 1261 | This is intended to model C++'s ``memory_order_acquire``. |
| 1262 | ``release`` |
| 1263 | In addition to the guarantees of ``monotonic``, if this operation |
| 1264 | writes a value which is subsequently read by an ``acquire`` |
| 1265 | operation, it *synchronizes-with* that operation. (This isn't a |
| 1266 | complete description; see the C++0x definition of a release |
| 1267 | sequence.) This corresponds to the C++0x/C1x |
| 1268 | ``memory_order_release``. |
| 1269 | ``acq_rel`` (acquire+release) |
| 1270 | Acts as both an ``acquire`` and ``release`` operation on its |
| 1271 | address. This corresponds to the C++0x/C1x ``memory_order_acq_rel``. |
| 1272 | ``seq_cst`` (sequentially consistent) |
| 1273 | In addition to the guarantees of ``acq_rel`` (``acquire`` for an |
| 1274 | operation which only reads, ``release`` for an operation which only |
| 1275 | writes), there is a global total order on all |
| 1276 | sequentially-consistent operations on all addresses, which is |
| 1277 | consistent with the *happens-before* partial order and with the |
| 1278 | modification orders of all the affected addresses. Each |
| 1279 | sequentially-consistent read sees the last preceding write to the |
| 1280 | same address in this global order. This corresponds to the C++0x/C1x |
| 1281 | ``memory_order_seq_cst`` and Java volatile. |
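
One common use of ``release`` and ``acquire`` is publishing data through a flag. The sketch below assumes two hypothetical globals, ``@data`` and ``@flag``, and two functions run by different threads; how those threads are created is platform-specific and outside the IR.

.. code-block:: llvm

    @data = global i32 0, align 4
    @flag = global i32 0, align 4

    define void @producer() {
      ; Plain store, ordered before the release store by program order.
      store i32 42, i32* @data
      store atomic i32 1, i32* @flag release, align 4
      ret void
    }

    define i32 @consumer() {
    entry:
      br label %spin
    spin:
      ; Once this acquire load reads the value written by the release
      ; store, the two operations synchronize-with each other.
      %f = load atomic i32* @flag acquire, align 4
      %ready = icmp eq i32 %f, 1
      br i1 %ready, label %done, label %spin
    done:
      ; The store to @data happens before this load, so it may see
      ; exactly one write and is not a data race.
      %v = load i32* @data
      ret i32 %v
    }

Weakening both atomic operations to ``monotonic`` would keep the flag itself well defined but would remove the *synchronizes-with* edge, so the final load of ``@data`` could return ``undef``.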
| 1282 | |
| 1283 | .. _singlethread: |
| 1284 | |
| 1285 | If an atomic operation is marked ``singlethread``, it only *synchronizes |
| 1286 | with* or participates in modification and seq\_cst total orderings with |
| 1287 | other operations running in the same thread (for example, in signal |
| 1288 | handlers). |
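
As a sketch of where ``singlethread`` fits, the fragment below shares a flag between a signal handler and the code it interrupts. The global and function names are hypothetical, and registering the handler is platform-specific.

.. code-block:: llvm

    @pending = global i32 0, align 4

    ; Runs as a signal handler inside the interrupted thread.
    define void @on_signal(i32 %signo) {
      store atomic i32 1, i32* @pending singlethread release, align 4
      ret void
    }

    ; Runs in the same thread as the handler above.
    define i32 @check_pending() {
      %v = load atomic i32* @pending singlethread acquire, align 4
      ; Orders only against code running in this thread, such as the
      ; signal handler.
      fence singlethread seq_cst
      ret i32 %v
    }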
| 1289 | |
| 1290 | .. _fastmath: |
| 1291 | |
| 1292 | Fast-Math Flags |
| 1293 | --------------- |
| 1294 | |
| 1295 | LLVM IR floating-point binary ops (:ref:`fadd <i_fadd>`, |
| 1296 | :ref:`fsub <i_fsub>`, :ref:`fmul <i_fmul>`, :ref:`fdiv <i_fdiv>`, |
| 1297 | :ref:`frem <i_frem>`) have the following flags that can be set to enable |
| 1298 | otherwise unsafe floating point optimizations: |
| 1299 | |
| 1300 | ``nnan`` |
| 1301 | No NaNs - Allow optimizations to assume the arguments and result are not |
| 1302 | NaN. Such optimizations are required to retain defined behavior over |
| 1303 | NaNs, but the value of the result is undefined. |
| 1304 | |
| 1305 | ``ninf`` |
| 1306 | No Infs - Allow optimizations to assume the arguments and result are not |
| 1307 | +/-Inf. Such optimizations are required to retain defined behavior over |
| 1308 | +/-Inf, but the value of the result is undefined. |
| 1309 | |
| 1310 | ``nsz`` |
| 1311 | No Signed Zeros - Allow optimizations to treat the sign of a zero |
| 1312 | argument or result as insignificant. |
| 1313 | |
| 1314 | ``arcp`` |
| 1315 | Allow Reciprocal - Allow optimizations to use the reciprocal of an |
| 1316 | argument rather than perform division. |
| 1317 | |
| 1318 | ``fast`` |
| 1319 | Fast - Allow algebraically equivalent transformations that may |
| 1320 | dramatically change results in floating point (e.g. reassociate). This |
| 1321 | flag implies all the others. |
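
The flags are written between the opcode and the type. The following sketch, with made-up names, shows a few combinations:

.. code-block:: llvm

    define float @scale(float %a, float %b, float %c) {
      ; Assumes neither the operands nor the result are NaN or +/-Inf.
      %m = fmul nnan ninf float %a, %b
      ; May be rewritten as a multiplication by the reciprocal of %c.
      %d = fdiv arcp float %m, %c
      ; Implies all of the other flags and also permits reassociation.
      %s = fadd fast float %d, %a
      ret float %s
    }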
| 1322 | |
| 1323 | .. _typesystem: |
| 1324 | |
| 1325 | Type System |
| 1326 | =========== |
| 1327 | |
| 1328 | The LLVM type system is one of the most important features of the |
| 1329 | intermediate representation. Being typed enables a number of |
| 1330 | optimizations to be performed on the intermediate representation |
| 1331 | directly, without having to do extra analyses on the side before the |
| 1332 | transformation. A strong type system makes it easier to read the |
| 1333 | generated code and enables novel analyses and transformations that are |
| 1334 | not feasible to perform on normal three address code representations. |
| 1335 | |
| 1336 | Type Classifications |
| 1337 | -------------------- |
| 1338 | |
| 1339 | The types fall into a few useful classifications: |
| 1340 | |
| 1341 | |
| 1342 | .. list-table:: |
| 1343 | :header-rows: 1 |
| 1344 | |
| 1345 | * - Classification |
| 1346 | - Types |
| 1347 | |
| 1348 | * - :ref:`integer <t_integer>` |
| 1349 | - ``i1``, ``i2``, ``i3``, ... ``i8``, ... ``i16``, ... ``i32``, ... |
| 1350 | ``i64``, ... |
| 1351 | |
| 1352 | * - :ref:`floating point <t_floating>` |
| 1353 | - ``half``, ``float``, ``double``, ``x86_fp80``, ``fp128``, |
| 1354 | ``ppc_fp128`` |
| 1355 | |
| 1356 | |
| 1357 | * - first class |
| 1358 | |
| 1359 | .. _t_firstclass: |
| 1360 | |
| 1361 | - :ref:`integer <t_integer>`, :ref:`floating point <t_floating>`, |
| 1362 | :ref:`pointer <t_pointer>`, :ref:`vector <t_vector>`, |
| 1363 | :ref:`structure <t_struct>`, :ref:`array <t_array>`, |
| 1364 | :ref:`label <t_label>`, :ref:`metadata <t_metadata>`. |
| 1365 | |
| 1366 | * - :ref:`primitive <t_primitive>` |
| 1367 | - :ref:`label <t_label>`, |
| 1368 | :ref:`void <t_void>`, |
| 1369 | :ref:`integer <t_integer>`, |
| 1370 | :ref:`floating point <t_floating>`, |
| 1371 | :ref:`x86mmx <t_x86mmx>`, |
| 1372 | :ref:`metadata <t_metadata>`. |
| 1373 | |
| 1374 | * - :ref:`derived <t_derived>` |
| 1375 | - :ref:`array <t_array>`, |
| 1376 | :ref:`function <t_function>`, |
| 1377 | :ref:`pointer <t_pointer>`, |
| 1378 | :ref:`structure <t_struct>`, |
| 1379 | :ref:`vector <t_vector>`, |
| 1380 | :ref:`opaque <t_opaque>`. |
| 1381 | |
| 1382 | The :ref:`first class <t_firstclass>` types are perhaps the most important. |
| 1383 | Values of these types are the only ones which can be produced by |
| 1384 | instructions. |
| 1385 | |
| 1386 | .. _t_primitive: |
| 1387 | |
| 1388 | Primitive Types |
| 1389 | --------------- |
| 1390 | |
| 1391 | The primitive types are the fundamental building blocks of the LLVM |
| 1392 | system. |
| 1393 | |
| 1394 | .. _t_integer: |
| 1395 | |
| 1396 | Integer Type |
| 1397 | ^^^^^^^^^^^^ |
| 1398 | |
| 1399 | Overview: |
| 1400 | """"""""" |
| 1401 | |
| 1402 | The integer type is a simple type that specifies an arbitrary bit |
| 1403 | width for the desired integer. Any bit width from 1 bit to |
| 1404 | 2\ :sup:`23`\ -1 (about 8 million) can be specified. |
| 1405 | |
| 1406 | Syntax: |
| 1407 | """"""" |
| 1408 | |
| 1409 | :: |
| 1410 | |
| 1411 | iN |
| 1412 | |
| 1413 | The number of bits the integer will occupy is specified by the ``N`` |
| 1414 | value. |
| 1415 | |
| 1416 | Examples: |
| 1417 | """"""""" |
| 1418 | |
| 1419 | +----------------+------------------------------------------------+ |
| 1420 | | ``i1`` | a single-bit integer. | |
| 1421 | +----------------+------------------------------------------------+ |
| 1422 | | ``i32`` | a 32-bit integer. | |
| 1423 | +----------------+------------------------------------------------+ |
| 1424 | | ``i1942652`` | a really big integer of over 1 million bits. | |
| 1425 | +----------------+------------------------------------------------+ |
| 1426 | |
| 1427 | .. _t_floating: |
| 1428 | |
| 1429 | Floating Point Types |
| 1430 | ^^^^^^^^^^^^^^^^^^^^ |
| 1431 | |
| 1432 | .. list-table:: |
| 1433 | :header-rows: 1 |
| 1434 | |
| 1435 | * - Type |
| 1436 | - Description |
| 1437 | |
| 1438 | * - ``half`` |
| 1439 | - 16-bit floating point value |
| 1440 | |
| 1441 | * - ``float`` |
| 1442 | - 32-bit floating point value |
| 1443 | |
| 1444 | * - ``double`` |
| 1445 | - 64-bit floating point value |
| 1446 | |
| 1447 | * - ``fp128`` |
| 1448 | - 128-bit floating point value (112-bit mantissa) |
| 1449 | |
| 1450 | * - ``x86_fp80`` |
| 1451 | - 80-bit floating point value (X87) |
| 1452 | |
| 1453 | * - ``ppc_fp128`` |
| 1454 | - 128-bit floating point value (two 64-bits) |
| 1455 | |
| 1456 | .. _t_x86mmx: |
| 1457 | |
| 1458 | X86mmx Type |
| 1459 | ^^^^^^^^^^^ |
| 1460 | |
| 1461 | Overview: |
| 1462 | """"""""" |
| 1463 | |
| 1464 | The x86mmx type represents a value held in an MMX register on an x86 |
| 1465 | machine. The operations allowed on it are quite limited: parameters and |
| 1466 | return values, load and store, and bitcast. User-specified MMX |
| 1467 | instructions are represented as intrinsic or asm calls with arguments |
| 1468 | and/or results of this type. There are no arrays, vectors or constants |
| 1469 | of this type. |
| 1470 | |
| 1471 | Syntax: |
| 1472 | """"""" |
| 1473 | |
| 1474 | :: |
| 1475 | |
| 1476 | x86mmx |
| 1477 | |
| 1478 | .. _t_void: |
| 1479 | |
| 1480 | Void Type |
| 1481 | ^^^^^^^^^ |
| 1482 | |
| 1483 | Overview: |
| 1484 | """"""""" |
| 1485 | |
| 1486 | The void type does not represent any value and has no size. |
| 1487 | |
| 1488 | Syntax: |
| 1489 | """"""" |
| 1490 | |
| 1491 | :: |
| 1492 | |
| 1493 | void |
| 1494 | |
| 1495 | .. _t_label: |
| 1496 | |
| 1497 | Label Type |
| 1498 | ^^^^^^^^^^ |
| 1499 | |
| 1500 | Overview: |
| 1501 | """"""""" |
| 1502 | |
| 1503 | The label type represents code labels. |
| 1504 | |
| 1505 | Syntax: |
| 1506 | """"""" |
| 1507 | |
| 1508 | :: |
| 1509 | |
| 1510 | label |
| 1511 | |
| 1512 | .. _t_metadata: |
| 1513 | |
| 1514 | Metadata Type |
| 1515 | ^^^^^^^^^^^^^ |
| 1516 | |
| 1517 | Overview: |
| 1518 | """"""""" |
| 1519 | |
| 1520 | The metadata type represents embedded metadata. No derived types may be |
| 1521 | created from metadata except for :ref:`function <t_function>` arguments. |
| 1522 | |
| 1523 | Syntax: |
| 1524 | """"""" |
| 1525 | |
| 1526 | :: |
| 1527 | |
| 1528 | metadata |
| 1529 | |
| 1530 | .. _t_derived: |
| 1531 | |
| 1532 | Derived Types |
| 1533 | ------------- |
| 1534 | |
| 1535 | The real power in LLVM comes from the derived types in the system. This |
| 1536 | is what allows a programmer to represent arrays, functions, pointers, |
| 1537 | and other useful types. Each of these types contains one or more element |
| 1538 | types which may be a primitive type, or another derived type. For |
| 1539 | example, it is possible to have a two dimensional array, using an array |
| 1540 | as the element type of another array. |
| 1541 | |
| 1542 | .. _t_aggregate: |
| 1543 | |
| 1544 | Aggregate Types |
| 1545 | ^^^^^^^^^^^^^^^ |
| 1546 | |
| 1547 | Aggregate Types are a subset of derived types that can contain multiple |
| 1548 | member types. :ref:`Arrays <t_array>` and :ref:`structs <t_struct>` are |
| 1549 | aggregate types. :ref:`Vectors <t_vector>` are not considered to be |
| 1550 | aggregate types. |
| 1551 | |
| 1552 | .. _t_array: |
| 1553 | |
| 1554 | Array Type |
| 1555 | ^^^^^^^^^^ |
| 1556 | |
| 1557 | Overview: |
| 1558 | """"""""" |
| 1559 | |
| 1560 | The array type is a very simple derived type that arranges elements |
| 1561 | sequentially in memory. The array type requires a size (number of |
| 1562 | elements) and an underlying data type. |
| 1563 | |
| 1564 | Syntax: |
| 1565 | """"""" |
| 1566 | |
| 1567 | :: |
| 1568 | |
| 1569 | [<# elements> x <elementtype>] |
| 1570 | |
| 1571 | The number of elements is a constant integer value; ``elementtype`` may |
| 1572 | be any type with a size. |
| 1573 | |
| 1574 | Examples: |
| 1575 | """"""""" |
| 1576 | |
| 1577 | +------------------+--------------------------------------+ |
| 1578 | | ``[40 x i32]`` | Array of 40 32-bit integer values. | |
| 1579 | +------------------+--------------------------------------+ |
| 1580 | | ``[41 x i32]`` | Array of 41 32-bit integer values. | |
| 1581 | +------------------+--------------------------------------+ |
| 1582 | | ``[4 x i8]`` | Array of 4 8-bit integer values. | |
| 1583 | +------------------+--------------------------------------+ |
| 1584 | |
| 1585 | Here are some examples of multidimensional arrays: |
| 1586 | |
| 1587 | +-----------------------------+----------------------------------------------------------+ |
| 1588 | | ``[3 x [4 x i32]]`` | 3x4 array of 32-bit integer values. | |
| 1589 | +-----------------------------+----------------------------------------------------------+ |
| 1590 | | ``[12 x [10 x float]]`` | 12x10 array of single precision floating point values. | |
| 1591 | +-----------------------------+----------------------------------------------------------+ |
| 1592 | | ``[2 x [3 x [4 x i16]]]`` | 2x3x4 array of 16-bit integer values. | |
| 1593 | +-----------------------------+----------------------------------------------------------+ |
| 1594 | |
| 1595 | There is no restriction on indexing beyond the end of the array implied |
| 1596 | by a static type (though there are restrictions on indexing beyond the |
| 1597 | bounds of an allocated object in some cases). This means that |
| 1598 | single-dimension 'variable sized array' addressing can be implemented in |
| 1599 | LLVM with a zero length array type. An implementation of 'pascal style |
| 1600 | arrays' in LLVM could use the type "``{ i32, [0 x float]}``", for |
| 1601 | example. |
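
A sketch of that 'pascal style array' idea follows. The type name ``%pascal_array`` and the function are hypothetical; the ``i32`` field is assumed to hold the element count, and the caller is assumed to have allocated enough storage after the header.

.. code-block:: llvm

    %pascal_array = type { i32, [0 x float] }

    define float @element(%pascal_array* %a, i64 %i) {
      ; Step through the pointer (0), select the array field (1), then
      ; index past the statically declared length of zero.
      %addr = getelementptr %pascal_array* %a, i64 0, i32 1, i64 %i
      %v = load float* %addr
      ret float %v
    }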
| 1602 | |
| 1603 | .. _t_function: |
| 1604 | |
| 1605 | Function Type |
| 1606 | ^^^^^^^^^^^^^ |
| 1607 | |
| 1608 | Overview: |
| 1609 | """"""""" |
| 1610 | |
| 1611 | The function type can be thought of as a function signature. It consists |
| 1612 | of a return type and a list of formal parameter types. The return type |
| 1613 | of a function type is a first class type or a void type. |
| 1614 | |
| 1615 | Syntax: |
| 1616 | """"""" |
| 1617 | |
| 1618 | :: |
| 1619 | |
| 1620 | <returntype> (<parameter list>) |
| 1621 | |
| 1622 | ...where '``<parameter list>``' is a comma-separated list of type |
| 1623 | specifiers. Optionally, the parameter list may include a type ``...``, |
| 1624 | which indicates that the function takes a variable number of arguments. |
| 1625 | Variable argument functions can access their arguments with the |
| 1626 | :ref:`variable argument handling intrinsic <int_varargs>` functions. |
| 1627 | '``<returntype>``' is any type except :ref:`label <t_label>`. |
| 1628 | |
| 1629 | Examples: |
| 1630 | """"""""" |
| 1631 | |
| 1632 | +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+ |
| 1633 | | ``i32 (i32)`` | function taking an ``i32``, returning an ``i32`` | |
| 1634 | +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+ |
Daniel Dunbar | 3389dbc | 2013-01-17 18:57:32 +0000 | [diff] [blame] | 1635 | | ``float (i16, i32 *) *`` | :ref:`Pointer <t_pointer>` to a function that takes an ``i16`` and a :ref:`pointer <t_pointer>` to ``i32``, returning ``float``. | |
Sean Silva | f722b00 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 1636 | +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+ |
| 1637 | | ``i32 (i8*, ...)`` | A vararg function that takes at least one :ref:`pointer <t_pointer>` to ``i8`` (char in C), which returns an integer. This is the signature for ``printf`` in LLVM. | |
| 1638 | +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+ |
| 1639 | | ``{i32, i32} (i32)`` | A function taking an ``i32``, returning a :ref:`structure <t_struct>` containing two ``i32`` values | |
| 1640 | +---------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------+ |
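
To show a function type in use, the sketch below declares ``printf`` with the vararg signature from the table and calls it. It is written in the syntax of this revision of the document, where a vararg call spells out the pointer-to-function type; the ``@fmt`` global and ``@show`` function are made-up names.

.. code-block:: llvm

    declare i32 @printf(i8*, ...)

    ; The four bytes '%', 'd', newline, NUL.
    @fmt = private constant [4 x i8] c"%d\0A\00"

    define i32 @show(i32 %x) {
      %p = getelementptr [4 x i8]* @fmt, i64 0, i64 0
      %r = call i32 (i8*, ...)* @printf(i8* %p, i32 %x)
      ret i32 %r
    }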
| 1641 | |
| 1642 | .. _t_struct: |
| 1643 | |
| 1644 | Structure Type |
| 1645 | ^^^^^^^^^^^^^^ |
| 1646 | |
| 1647 | Overview: |
| 1648 | """"""""" |
| 1649 | |
| 1650 | The structure type is used to represent a collection of data members |
| 1651 | together in memory. The elements of a structure may be any type that has |
| 1652 | a size. |
| 1653 | |
| 1654 | Structures in memory are accessed using '``load``' and '``store``' by |
| 1655 | getting a pointer to a field with the '``getelementptr``' instruction. |
| 1656 | Structures in registers are accessed using the '``extractvalue``' and |
| 1657 | '``insertvalue``' instructions. |
| 1658 | |
| 1659 | Structures may optionally be "packed" structures, which indicate that |
| 1660 | the alignment of the struct is one byte, and that there is no padding |
| 1661 | between the elements. In non-packed structs, padding between field types |
| 1662 | is inserted as defined by the DataLayout string in the module, which is |
| 1663 | required to match what the underlying code generator expects. |
| 1664 | |
| 1665 | Structures can either be "literal" or "identified". A literal structure |
| 1666 | is defined inline with other types (e.g. ``{i32, i32}*``) whereas |
| 1667 | identified types are always defined at the top level with a name. |
| 1668 | Literal types are uniqued by their contents and can never be recursive |
| 1669 | or opaque since there is no way to write one. Identified types can be |
| 1670 | recursive, can be opaque, and are never uniqued. |
| 1671 | |
| 1672 | Syntax: |
| 1673 | """"""" |
| 1674 | |
| 1675 | :: |
| 1676 | |
| 1677 | %T1 = type { <type list> } ; Identified normal struct type |
| 1678 | %T2 = type <{ <type list> }> ; Identified packed struct type |
| 1679 | |
| 1680 | Examples: |
| 1681 | """"""""" |
| 1682 | |
| 1683 | +------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ |
| 1684 | | ``{ i32, i32, i32 }`` | A triple of three ``i32`` values | |
| 1685 | +------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ |
Daniel Dunbar | 3389dbc | 2013-01-17 18:57:32 +0000 | [diff] [blame] | 1686 | | ``{ float, i32 (i32) * }`` | A pair, where the first element is a ``float`` and the second element is a :ref:`pointer <t_pointer>` to a :ref:`function <t_function>` that takes an ``i32``, returning an ``i32``. | |
Sean Silva | f722b00 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 1687 | +------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ |
| 1688 | | ``<{ i8, i32 }>`` | A packed struct known to be 5 bytes in size. | |
| 1689 | +------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ |
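
The two access styles described in the overview can be sketched as follows, using a hypothetical identified type ``%pair``:

.. code-block:: llvm

    %pair = type { i32, float }

    ; Structure in memory: compute the field address, then load.
    define float @second_in_memory(%pair* %p) {
      %addr = getelementptr %pair* %p, i32 0, i32 1
      %v = load float* %addr
      ret float %v
    }

    ; Structure in a register: extract one field and insert another.
    define %pair @bump_first(%pair %agg) {
      %old = extractvalue %pair %agg, 0
      %new = add i32 %old, 1
      %r = insertvalue %pair %agg, i32 %new, 0
      ret %pair %r
    }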
| 1690 | |
| 1691 | .. _t_opaque: |
| 1692 | |
| 1693 | Opaque Structure Types |
| 1694 | ^^^^^^^^^^^^^^^^^^^^^^ |
| 1695 | |
| 1696 | Overview: |
| 1697 | """"""""" |
| 1698 | |
| 1699 | Opaque structure types are used to represent named structure types that |
| 1700 | do not have a body specified. This corresponds (for example) to the C |
| 1701 | notion of a forward declared structure. |
| 1702 | |
| 1703 | Syntax: |
| 1704 | """"""" |
| 1705 | |
| 1706 | :: |
| 1707 | |
| 1708 | %X = type opaque |
| 1709 | %52 = type opaque |
| 1710 | |
| 1711 | Examples: |
| 1712 | """"""""" |
| 1713 | |
| 1714 | +--------------+-------------------+ |
| 1715 | | ``opaque`` | An opaque type. | |
| 1716 | +--------------+-------------------+ |
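
An opaque type has no size, so it is normally handled only through pointers. A small sketch, with made-up type and function names:

.. code-block:: llvm

    %handle = type opaque

    declare %handle* @open_thing()
    declare void @close_thing(%handle*)

    define void @use() {
      ; The body of %handle is never needed to pass the pointer around.
      %h = call %handle* @open_thing()
      call void @close_thing(%handle* %h)
      ret void
    }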
| 1717 | |
| 1718 | .. _t_pointer: |
| 1719 | |
| 1720 | Pointer Type |
| 1721 | ^^^^^^^^^^^^ |
| 1722 | |
| 1723 | Overview: |
| 1724 | """"""""" |
| 1725 | |
| 1726 | The pointer type is used to specify memory locations. Pointers are |
| 1727 | commonly used to reference objects in memory. |
| 1728 | |
| 1729 | Pointer types may have an optional address space attribute defining the |
| 1730 | numbered address space where the pointed-to object resides. The default |
| 1731 | address space is number zero. The semantics of non-zero address spaces |
| 1732 | are target-specific. |
| 1733 | |
| 1734 | Note that LLVM does not permit pointers to void (``void*``) nor does it |
| 1735 | permit pointers to labels (``label*``). Use ``i8*`` instead. |
| 1736 | |
| 1737 | Syntax: |
| 1738 | """"""" |
| 1739 | |
| 1740 | :: |
| 1741 | |
| 1742 | <type> * |
| 1743 | |
| 1744 | Examples: |
| 1745 | """"""""" |
| 1746 | |
| 1747 | +-------------------------+--------------------------------------------------------------------------------------------------------------+ |
| 1748 | | ``[4 x i32]*`` | A :ref:`pointer <t_pointer>` to :ref:`array <t_array>` of four ``i32`` values. | |
| 1749 | +-------------------------+--------------------------------------------------------------------------------------------------------------+ |
| 1750 | | ``i32 (i32*) *`` | A :ref:`pointer <t_pointer>` to a :ref:`function <t_function>` that takes an ``i32*``, returning an ``i32``. | |
| 1751 | +-------------------------+--------------------------------------------------------------------------------------------------------------+ |
| 1752 | | ``i32 addrspace(5)*`` | A :ref:`pointer <t_pointer>` to an ``i32`` value that resides in address space #5. | |
| 1753 | +-------------------------+--------------------------------------------------------------------------------------------------------------+ |
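
The sketch below, with made-up function names, shows a load through a pointer in a non-default address space and the use of ``i8*`` where C would use ``void*``:

.. code-block:: llvm

    define i32 @read_as5(i32 addrspace(5)* %p) {
      ; The address space is part of the pointer type.
      %v = load i32 addrspace(5)* %p
      ret i32 %v
    }

    define i8* @as_untyped(i32* %p) {
      ; There is no void*, so an untyped pointer is spelled i8*.
      %q = bitcast i32* %p to i8*
      ret i8* %q
    }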
| 1754 | |
| 1755 | .. _t_vector: |
| 1756 | |
| 1757 | Vector Type |
| 1758 | ^^^^^^^^^^^ |
| 1759 | |
| 1760 | Overview: |
| 1761 | """"""""" |
| 1762 | |
| 1763 | A vector type is a simple derived type that represents a vector of |
| 1764 | elements. Vector types are used when multiple primitive data are |
| 1765 | operated on in parallel using a single instruction (SIMD). A vector type |
| 1766 | requires a size (number of elements) and an underlying primitive data |
| 1767 | type. Vector types are considered :ref:`first class <t_firstclass>`. |
| 1768 | |
| 1769 | Syntax: |
| 1770 | """"""" |
| 1771 | |
| 1772 | :: |
| 1773 | |
| 1774 | < <# elements> x <elementtype> > |
| 1775 | |
| 1776 | The number of elements is a constant integer value larger than 0; |
| 1777 | elementtype may be any integer or floating point type, or a pointer to |
| 1778 | these types. Vectors of size zero are not allowed. |
| 1779 | |
| 1780 | Examples: |
| 1781 | """"""""" |
| 1782 | |
| 1783 | +-------------------+--------------------------------------------------+ |
| 1784 | | ``<4 x i32>`` | Vector of 4 32-bit integer values. | |
| 1785 | +-------------------+--------------------------------------------------+ |
| 1786 | | ``<8 x float>`` | Vector of 8 32-bit floating-point values. | |
| 1787 | +-------------------+--------------------------------------------------+ |
| 1788 | | ``<2 x i64>`` | Vector of 2 64-bit integer values. | |
| 1789 | +-------------------+--------------------------------------------------+ |
| 1790 | | ``<4 x i64*>`` | Vector of 4 pointers to 64-bit integer values. | |
| 1791 | +-------------------+--------------------------------------------------+ |
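
Because vectors are first class, ordinary arithmetic instructions accept them directly. A minimal sketch with a made-up function name:

.. code-block:: llvm

    define i32 @lane0_of_sum(<4 x i32> %a, <4 x i32> %b) {
      ; One add instruction operates on all four lanes.
      %s = add <4 x i32> %a, %b
      ; Pull a single lane back out as a scalar.
      %x = extractelement <4 x i32> %s, i32 0
      ret i32 %x
    }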
| 1792 | |
| 1793 | Constants |
| 1794 | ========= |
| 1795 | |
| 1796 | LLVM has several different basic types of constants. This section |
| 1797 | describes them all and their syntax. |
| 1798 | |
| 1799 | Simple Constants |
| 1800 | ---------------- |
| 1801 | |
| 1802 | **Boolean constants** |
| 1803 | The two strings '``true``' and '``false``' are both valid constants |
| 1804 | of the ``i1`` type. |
| 1805 | **Integer constants** |
| 1806 | Standard integers (such as '4') are constants of the |
| 1807 | :ref:`integer <t_integer>` type. Negative numbers may be used with |
| 1808 | integer types. |
| 1809 | **Floating point constants** |
| 1810 | Floating point constants use standard decimal notation (e.g. |
| 1811 | 123.421), exponential notation (e.g. 1.23421e+2), or a more precise |
| 1812 | hexadecimal notation (see below). The assembler requires the exact |
| 1813 | decimal value of a floating-point constant. For example, the |
| 1814 | assembler accepts 1.25 but rejects 1.3 because 1.3 is a repeating |
| 1815 | decimal in binary. Floating point constants must have a :ref:`floating |
| 1816 | point <t_floating>` type. |
| 1817 | **Null pointer constants** |
| 1818 | The identifier '``null``' is recognized as a null pointer constant |
| 1819 | and must be of :ref:`pointer type <t_pointer>`. |
| 1820 | |
| 1821 | The one non-intuitive notation for constants is the hexadecimal form of |
| 1822 | floating point constants. For example, the form |
| 1823 | '``double 0x432ff973cafa8000``' is equivalent to (but harder to read |
| 1824 | than) '``double 4.5e+15``'. The only time hexadecimal floating point |
| 1825 | constants are required (and the only time that they are generated by the |
| 1826 | disassembler) is when a floating point constant must be emitted but it |
| 1827 | cannot be represented as a decimal floating point number in a reasonable |
| 1828 | number of digits. For example, NaN's, infinities, and other special |
| 1829 | values are represented in their IEEE hexadecimal format so that assembly |
| 1830 | and disassembly do not cause any bits to change in the constants. |
| 1831 | |
| 1832 | When using the hexadecimal form, constants of types half, float, and |
| 1833 | double are represented using the 16-digit form shown above (which |
| 1834 | matches the IEEE 754 representation for double); half and float values |
| 1835 | must, however, be exactly representable as IEEE 754 half and single |
| 1836 | precision, respectively. Hexadecimal format is always used for long |
| 1837 | double, and there are three forms of long double. The 80-bit format used |
| 1838 | by x86 is represented as ``0xK`` followed by 20 hexadecimal digits. The |
| 1839 | 128-bit format used by PowerPC (two adjacent doubles) is represented by |
| 1840 | ``0xM`` followed by 32 hexadecimal digits. The IEEE 128-bit format is |
| 1841 | represented by ``0xL`` followed by 32 hexadecimal digits; no currently |
| 1842 | supported target uses this format. Long doubles will only work if they |
| 1843 | match the long double format on your target. The IEEE 16-bit format |
| 1844 | (half precision) is represented by ``0xH`` followed by 4 hexadecimal |
| 1845 | digits. All hexadecimal formats are big-endian (sign bit at the left). |
| 1846 | |
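The hexadecimal forms can be illustrated with a few hand-encoded values.
This is only a sketch: the global names are hypothetical, and the bit
patterns were worked out by hand from the IEEE 754 and x86 extended
precision layouts rather than taken from the disassembler.

.. code-block:: llvm

    @d   = global double 0x432FF973CAFA8000          ; same value as double 4.5e+15
    @inf = global double 0x7FF0000000000000          ; +infinity
    @f   = global float 0x3FF8000000000000           ; float 1.5, written in the 16-digit double form
    @h   = global half 0xH3C00                       ; half 1.0, 4 hexadecimal digits
    @x   = global x86_fp80 0xK3FFF8000000000000000   ; 1.0 in the 80-bit format, 20 hexadecimal digits

Note that ``@f`` is accepted only because 1.5 is exactly representable in
single precision.
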
| 1847 | There are no constants of type x86mmx. |
| 1848 | |
| 1849 | Complex Constants |
| 1850 | ----------------- |
| 1851 | |
| 1852 | Complex constants are a (potentially recursive) combination of simple |
| 1853 | constants and smaller complex constants. |
| 1854 | |
| 1855 | **Structure constants** |
| 1856 | Structure constants are represented with notation similar to |
| 1857 | structure type definitions (a comma separated list of elements, |
| 1858 | surrounded by braces (``{}``)). For example: |
| 1859 | "``{ i32 4, float 17.0, i32* @G }``", where "``@G``" is declared as |
| 1860 | "``@G = external global i32``". Structure constants must have |
| 1861 | :ref:`structure type <t_struct>`, and the number and types of elements |
| 1862 | must match those specified by the type. |
| 1863 | **Array constants** |
| 1864 | Array constants are represented with notation similar to array type |
| 1865 | definitions (a comma separated list of elements, surrounded by |
| 1866 | square brackets (``[]``)). For example: |
| 1867 | "``[ i32 42, i32 11, i32 74 ]``". Array constants must have |
| 1868 | :ref:`array type <t_array>`, and the number and types of elements must |
| 1869 | match those specified by the type. |
| 1870 | **Vector constants** |
| 1871 | Vector constants are represented with notation similar to vector |
| 1872 | type definitions (a comma separated list of elements, surrounded by |
| 1873 | less-than and greater-than signs (``<>``)). For example: |
| 1874 | "``< i32 42, i32 11, i32 74, i32 100 >``". Vector constants |
| 1875 | must have :ref:`vector type <t_vector>`, and the number and types of |
| 1876 | elements must match those specified by the type. |
| 1877 | **Zero initialization** |
| 1878 | The string '``zeroinitializer``' can be used to zero initialize a |
| 1879 | value of *any* type, including scalar and |
| 1880 | :ref:`aggregate <t_aggregate>` types. This is often used to avoid |
| 1881 | having to print large zero initializers (e.g. for large arrays) and |
| 1882 | is always exactly equivalent to using explicit zero initializers. |
| 1883 | **Metadata node** |
| 1884 | A metadata node is a structure-like constant with :ref:`metadata |
| 1885 | type <t_metadata>`. For example: |
| 1886 | "``metadata !{ i32 0, metadata !"test" }``". Unlike other |
| 1887 | constants that are meant to be interpreted as part of the |
| 1888 | instruction stream, metadata is a place to attach additional |
| 1889 | information such as debug info. |
| 1890 | |
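Put together, the structure, array, vector, and ``zeroinitializer`` forms
described above might appear in a module as follows. This is a sketch
only; every global name other than ``@G`` is hypothetical.

.. code-block:: llvm

    @G = external global i32

    ; Structure constant: element types match the struct type.
    @S = global { i32, float, i32* } { i32 4, float 17.0, i32* @G }

    ; Array and vector constants: element count matches the type.
    @A = global [3 x i32] [ i32 42, i32 11, i32 74 ]
    @V = global <4 x i32> < i32 42, i32 11, i32 74, i32 100 >

    ; Equivalent to spelling out 1024 explicit "i32 0" elements.
    @Big = global [1024 x i32] zeroinitializer
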
| 1891 | Global Variable and Function Addresses |
| 1892 | -------------------------------------- |
| 1893 | |
| 1894 | The addresses of :ref:`global variables <globalvars>` and |
| 1895 | :ref:`functions <functionstructure>` are always implicitly valid |
| 1896 | (link-time) constants. These constants are explicitly referenced when |
| 1897 | the :ref:`identifier for the global <identifiers>` is used and always have |
| 1898 | :ref:`pointer <t_pointer>` type. For example, the following is a legal LLVM |
| 1899 | file: |
| 1900 | |
| 1901 | .. code-block:: llvm |
| 1902 | |
| 1903 | @X = global i32 17 |
| 1904 | @Y = global i32 42 |
| 1905 | @Z = global [2 x i32*] [ i32* @X, i32* @Y ] |
| 1906 | |
| 1907 | .. _undefvalues: |
| 1908 | |
| 1909 | Undefined Values |
| 1910 | ---------------- |
| 1911 | |
| 1912 | The string '``undef``' can be used anywhere a constant is expected, and |
| 1913 | indicates that the user of the value may receive an unspecified |
| 1914 | bit-pattern. Undefined values may be of any type (other than '``label``' |
| 1915 | or '``void``') and be used anywhere a constant is permitted. |
| 1916 | |
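For instance, ``undef`` can serve as a global initializer or as the
starting value when building up a vector one element at a time. The
following sketch uses hypothetical names:

.. code-block:: llvm

    @scratch = global i32 undef          ; initial contents are unspecified

    define <2 x float> @lane0(float %x) {
      ; Only element 0 is given a value; element 1 stays undefined.
      %v = insertelement <2 x float> undef, float %x, i32 0
      ret <2 x float> %v
    }
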
| 1917 | Undefined values are useful because they indicate to the compiler that |
| 1918 | the program is well defined no matter what value is used. This gives the |
| 1919 | compiler more freedom to optimize. Here are some examples of |
| 1920 | (potentially surprising) transformations that are valid (in pseudo IR): |
| 1921 | |
| 1922 | .. code-block:: llvm |
| 1923 | |
| 1924 | %A = add %X, undef |
| 1925 | %B = sub %X, undef |
| 1926 | %C = xor %X, undef |
| 1927 | Safe: |
| 1928 | %A = undef |
| 1929 | %B = undef |
| 1930 | %C = undef |
| 1931 | |
| 1932 | This is safe because all of the output bits are affected by the undef |
| 1933 | bits. Any output bit can have a zero or one depending on the input bits. |
| 1934 | |
| 1935 | .. code-block:: llvm |
| 1936 | |
| 1937 | %A = or %X, undef |
| 1938 | %B = and %X, undef |
| 1939 | Safe: |
| 1940 | %A = -1 |
| 1941 | %B = 0 |
| 1942 | Unsafe: |
| 1943 | %A = undef |
| 1944 | %B = undef |
| 1945 | |
| 1946 | These logical operations have bits that are not always affected by the |
| 1947 | input. For example, if ``%X`` has a zero bit, then the output of the |
| 1948 | '``and``' operation will always be a zero for that bit, no matter what |
| 1949 | the corresponding bit from the '``undef``' is. As such, it is unsafe to |
| 1950 | optimize or assume that the result of the '``and``' is '``undef``'. |
| 1951 | However, it is safe to assume that all bits of the '``undef``' could be |
| 1952 | 0, and optimize the '``and``' to 0. Likewise, it is safe to assume that |
| 1953 | all the bits of the '``undef``' operand to the '``or``' could be set, |
| 1954 | allowing the '``or``' to be folded to -1. |
| 1955 | |
| 1956 | .. code-block:: llvm |
| 1957 | |
| 1958 | %A = select undef, %X, %Y |
| 1959 | %B = select undef, 42, %Y |
| 1960 | %C = select %X, %Y, undef |
| 1961 | Safe: |
| 1962 | %A = %X (or %Y) |
| 1963 | %B = 42 (or %Y) |
| 1964 | %C = %Y |
| 1965 | Unsafe: |
| 1966 | %A = undef |
| 1967 | %B = undef |
| 1968 | %C = undef |
| 1969 | |
| 1970 | This set of examples shows that undefined '``select``' (and conditional |
| 1971 | branch) conditions can go *either way*, but they have to come from one |
| 1972 | of the two operands. In the ``%A`` example, if ``%X`` and ``%Y`` were |
| 1973 | both known to have a clear low bit, then ``%A`` would have to have a |
| 1974 | cleared low bit. However, in the ``%C`` example, the optimizer is |
| 1975 | allowed to assume that the '``undef``' operand could be the same as |
| 1976 | ``%Y``, allowing the whole '``select``' to be eliminated. |
| 1977 | |
| 1978 | .. code-block:: llvm |
| 1979 | |
| 1980 | %A = xor undef, undef |
| 1981 | |
| 1982 | %B = undef |
| 1983 | %C = xor %B, %B |
| 1984 | |
| 1985 | %D = undef |
| 1986 | %E = icmp slt %D, 4 |
| 1987 | %F = icmp sge %D, 4 |
| 1988 | |
| 1989 | Safe: |
| 1990 | %A = undef |
| 1991 | %B = undef |
| 1992 | %C = undef |
| 1993 | %D = undef |
| 1994 | %E = undef |
| 1995 | %F = undef |
| 1996 | |
| 1997 | This example points out that two '``undef``' operands are not |
| 1998 | necessarily the same. This can be surprising to people who assume that |
| 1999 | "``X^X``" is always zero, even if ``X`` is undefined (the behavior also |
| 2000 | matches C semantics). This isn't true for a number of reasons, but the |
| 2001 | short answer is that an '``undef``' "variable" can arbitrarily change |
| 2002 | its value over its "live range". This is true because the variable |
| 2003 | doesn't actually *have a live range*. Instead, the value is logically |
| 2004 | read from arbitrary registers that happen to be around when needed, so |
| 2005 | the value is not necessarily consistent over time. In fact, ``%A`` and |
| 2006 | ``%C`` need to have the same semantics or the core LLVM "replace all |
| 2007 | uses with" concept would not hold. |
| 2008 | |
| 2009 | .. code-block:: llvm |
| 2010 | |
| 2011 | %A = fdiv undef, %X |
| 2012 | %B = fdiv %X, undef |
| 2013 | Safe: |
| 2014 | %A = undef |
| 2015 | b: unreachable |
| 2016 | |
| 2017 | These examples show the crucial difference between an *undefined value* |
| 2018 | and *undefined behavior*. An undefined value (like '``undef``') is |
| 2019 | allowed to have an arbitrary bit-pattern. This means that the ``%A`` |
| 2020 | operation can be constant folded to '``undef``', because the '``undef``' |
| 2021 | could be an SNaN, and ``fdiv`` is not (currently) defined on SNaN's. |
| 2022 | However, in the second example, we can make a more aggressive |
| 2023 | assumption: because the ``undef`` is allowed to be an arbitrary value, |
| 2024 | we are allowed to assume that it could be zero. Since a divide by zero |
| 2025 | has *undefined behavior*, we are allowed to assume that the operation |
| 2026 | does not execute at all. This allows us to delete the divide and all |
| 2027 | code after it. Because the undefined operation "can't happen", the |
| 2028 | optimizer can assume that it occurs in dead code. |
| 2029 | |
| 2030 | .. code-block:: llvm |
| 2031 | |
| 2032 | a: store undef -> %X |
| 2033 | b: store %X -> undef |
| 2034 | Safe: |
| 2035 | a: <deleted> |
| 2036 | b: unreachable |
| 2037 | |
| 2038 | These examples reiterate the ``fdiv`` example: a store *of* an undefined |
| 2039 | value can be assumed to not have any effect; we can assume that the |
| 2040 | value is overwritten with bits that happen to match what was already |
| 2041 | there. However, a store *to* an undefined location could clobber |
| 2042 | arbitrary memory; therefore, it has undefined behavior. |
| 2043 | |
| 2044 | .. _poisonvalues: |
| 2045 | |
| 2046 | Poison Values |
| 2047 | ------------- |
| 2048 | |
| 2049 | Poison values are similar to :ref:`undef values <undefvalues>`; however, |
| 2050 | they also represent the fact that an instruction or constant expression |
| 2051 | that cannot cause side effects has nevertheless detected a condition |
| 2052 | that results in undefined behavior. |
| 2053 | |
| 2054 | There is currently no way of representing a poison value in the IR; they |
| 2055 | only exist when produced by operations such as :ref:`add <i_add>` with |
| 2056 | the ``nsw`` flag. |
| 2057 | |
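As a small sketch of such an operation (the function name is
hypothetical), the ``add`` below yields a poison value only when the
signed addition overflows; for every other input it behaves like an
ordinary ``add``:

.. code-block:: llvm

    define i32 @inc(i32 %x) {
      ; Poison if %x is 2147483647, because the nsw flag asserts that
      ; no signed overflow occurs.
      %r = add nsw i32 %x, 1
      ret i32 %r
    }
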
| 2058 | Poison value behavior is defined in terms of value *dependence*: |
| 2059 | |
| 2060 | - Values other than :ref:`phi <i_phi>` nodes depend on their operands. |
| 2061 | - :ref:`Phi <i_phi>` nodes depend on the operand corresponding to |
| 2062 | their dynamic predecessor basic block. |
| 2063 | - Function arguments depend on the corresponding actual argument values |
| 2064 | in the dynamic callers of their functions. |
| 2065 | - :ref:`Call <i_call>` instructions depend on the :ref:`ret <i_ret>` |
| 2066 | instructions that dynamically transfer control back to them. |
| 2067 | - :ref:`Invoke <i_invoke>` instructions depend on the |
| 2068 | :ref:`ret <i_ret>`, :ref:`resume <i_resume>`, or exception-throwing |
| 2069 | call instructions that dynamically transfer control back to them. |
| 2070 | - Non-volatile loads and stores depend on the most recent stores to all |
| 2071 | of the referenced memory addresses, following the order in the IR |
| 2072 | (including loads and stores implied by intrinsics such as |
| 2073 | :ref:`@llvm.memcpy <int_memcpy>`). |
| 2074 | - An instruction with externally visible side effects depends on the |
| 2075 | most recent preceding instruction with externally visible side |
| 2076 | effects, following the order in the IR. (This includes :ref:`volatile |
| 2077 | operations <volatile>`.) |
| 2078 | - An instruction *control-depends* on a :ref:`terminator |
| 2079 | instruction <terminators>` if the terminator instruction has |
| 2080 | multiple successors and the instruction is always executed when |
| 2081 | control transfers to one of the successors, and may not be executed |
| 2082 | when control is transferred to another. |
| 2083 | - Additionally, an instruction also *control-depends* on a terminator |
| 2084 | instruction if the set of instructions it otherwise depends on would |
| 2085 | be different if the terminator had transferred control to a different |
| 2086 | successor. |
| 2087 | - Dependence is transitive. |
| 2088 | |
| 2089 | Poison values have the same behavior as :ref:`undef values <undefvalues>`, |
| 2090 | with the additional effect that any instruction which has a *dependence* |
| 2091 | on a poison value has undefined behavior. |
| 2092 | |
| 2093 | Here are some examples: |
| 2094 | |
| 2095 | .. code-block:: llvm |
| 2096 | |
| 2097 | entry: |
| 2098 | %poison = sub nuw i32 0, 1 ; Results in a poison value. |
| 2099 | %still_poison = and i32 %poison, 0 ; 0, but also poison. |
| 2100 | %poison_yet_again = getelementptr i32* @h, i32 %still_poison |
| 2101 | store i32 0, i32* %poison_yet_again ; memory at @h[0] is poisoned |
| 2102 | |
| 2103 | store i32 %poison, i32* @g ; Poison value stored to memory. |
| 2104 | %poison2 = load i32* @g ; Poison value loaded back from memory. |
| 2105 | |
| 2106 | store volatile i32 %poison, i32* @g ; External observation; undefined behavior. |
| 2107 | |
| 2108 | %narrowaddr = bitcast i32* @g to i16* |
| 2109 | %wideaddr = bitcast i32* @g to i64* |
| 2110 | %poison3 = load i16* %narrowaddr ; Returns a poison value. |
| 2111 | %poison4 = load i64* %wideaddr ; Returns a poison value. |
| 2112 | |
| 2113 | %cmp = icmp slt i32 %poison, 0 ; Returns a poison value. |
| 2114 | br i1 %cmp, label %true, label %end ; Branch to either destination. |
| 2115 | |
| 2116 | true: |
| 2117 | store volatile i32 0, i32* @g ; This is control-dependent on %cmp, so |
| 2118 | ; it has undefined behavior. |
| 2119 | br label %end |
| 2120 | |
| 2121 | end: |
| 2122 | %p = phi i32 [ 0, %entry ], [ 1, %true ] |
| 2123 | ; Both edges into this PHI are |
| 2124 | ; control-dependent on %cmp, so this |
| 2125 | ; always results in a poison value. |
| 2126 | |
| 2127 | store volatile i32 0, i32* @g ; This would depend on the store in %true |
| 2128 | ; if %cmp is true, or the store in %entry |
| 2129 | ; otherwise, so this is undefined behavior. |
| 2130 | |
| 2131 | br i1 %cmp, label %second_true, label %second_end |
| 2132 | ; The same branch again, but this time the |
| 2133 | ; true block doesn't have side effects. |
| 2134 | |
| 2135 | second_true: |
| 2136 | ; No side effects! |
| 2137 | ret void |
| 2138 | |
| 2139 | second_end: |
| 2140 | store volatile i32 0, i32* @g ; This time, the instruction always depends |
| 2141 | ; on the store in %end. Also, it is |
| 2142 | ; control-equivalent to %end, so this is |
| 2143 | ; well-defined (ignoring earlier undefined |
| 2144 | ; behavior in this example). |
| 2145 | |
| 2146 | .. _blockaddress: |
| 2147 | |
| 2148 | Addresses of Basic Blocks |
| 2149 | ------------------------- |
| 2150 | |
| 2151 | ``blockaddress(@function, %block)`` |
| 2152 | |
| 2153 | The '``blockaddress``' constant computes the address of the specified |
| 2154 | basic block in the specified function, and always has an ``i8*`` type. |
| 2155 | Taking the address of the entry block is illegal. |
| 2156 | |
| 2157 | This value only has defined behavior when used as an operand to the |
| 2158 | ':ref:`indirectbr <i_indirectbr>`' instruction, or for comparisons |
| 2159 | against null. Pointer equality tests between label addresses result in |
| 2160 | undefined behavior (though, again, comparison against null is ok, and |
| 2161 | no label is equal to the null pointer). This may be passed around as an |
| 2162 | opaque pointer-sized value as long as the bits are not inspected. This |
| 2163 | allows ``ptrtoint`` and arithmetic to be performed on these values so |
| 2164 | long as the original value is reconstituted before the ``indirectbr`` |
| 2165 | instruction. |
| 2166 | |
| 2167 | Finally, some targets may provide defined semantics when using the value |
| 2168 | as the operand to an inline assembly, but that is target specific. |
| 2169 | |
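A minimal sketch of the intended use, with hypothetical function, block,
and global names: the address of a non-entry block is stored in a global
and later consumed by ``indirectbr``.

.. code-block:: llvm

    @target = global i8* blockaddress(@dispatch, %bb1)

    define i32 @dispatch() {
    entry:
      %addr = load i8** @target
      ; Every block whose address may be taken must be listed here.
      indirectbr i8* %addr, [ label %bb1, label %bb2 ]

    bb1:
      ret i32 1

    bb2:
      ret i32 2
    }
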
| 2170 | Constant Expressions |
| 2171 | -------------------- |
| 2172 | |
| 2173 | Constant expressions are used to allow expressions involving other |
| 2174 | constants to be used as constants. Constant expressions may be of any |
| 2175 | :ref:`first class <t_firstclass>` type and may involve any LLVM operation |
| 2176 | that does not have side effects (e.g. load and call are not supported). |
| 2177 | The following is the syntax for constant expressions: |
| 2178 | |
| 2179 | ``trunc (CST to TYPE)`` |
| 2180 | Truncate a constant to another type. The bit size of CST must be |
| 2181 | larger than the bit size of TYPE. Both types must be integers. |
| 2182 | ``zext (CST to TYPE)`` |
| 2183 | Zero extend a constant to another type. The bit size of CST must be |
| 2184 | smaller than the bit size of TYPE. Both types must be integers. |
| 2185 | ``sext (CST to TYPE)`` |
| 2186 | Sign extend a constant to another type. The bit size of CST must be |
| 2187 | smaller than the bit size of TYPE. Both types must be integers. |
| 2188 | ``fptrunc (CST to TYPE)`` |
| 2189 | Truncate a floating point constant to another floating point type. |
| 2190 | The size of CST must be larger than the size of TYPE. Both types |
| 2191 | must be floating point. |
| 2192 | ``fpext (CST to TYPE)`` |
| 2193 | Floating point extend a constant to another type. The size of CST |
| 2194 | must be smaller than or equal to the size of TYPE. Both types must be |
| 2195 | floating point. |
| 2196 | ``fptoui (CST to TYPE)`` |
| 2197 | Convert a floating point constant to the corresponding unsigned |
| 2198 | integer constant. TYPE must be a scalar or vector integer type. CST |
| 2199 | must be of scalar or vector floating point type. Both CST and TYPE |
| 2200 | must be scalars, or vectors of the same number of elements. If the |
| 2201 | value won't fit in the integer type, the results are undefined. |
| 2202 | ``fptosi (CST to TYPE)`` |
| 2203 | Convert a floating point constant to the corresponding signed |
| 2204 | integer constant. TYPE must be a scalar or vector integer type. CST |
| 2205 | must be of scalar or vector floating point type. Both CST and TYPE |
| 2206 | must be scalars, or vectors of the same number of elements. If the |
| 2207 | value won't fit in the integer type, the results are undefined. |
| 2208 | ``uitofp (CST to TYPE)`` |
| 2209 | Convert an unsigned integer constant to the corresponding floating |
| 2210 | point constant. TYPE must be a scalar or vector floating point type. |
| 2211 | CST must be of scalar or vector integer type. Both CST and TYPE must |
| 2212 | be scalars, or vectors of the same number of elements. If the value |
| 2213 | won't fit in the floating point type, the results are undefined. |
| 2214 | ``sitofp (CST to TYPE)`` |
| 2215 | Convert a signed integer constant to the corresponding floating |
| 2216 | point constant. TYPE must be a scalar or vector floating point type. |
| 2217 | CST must be of scalar or vector integer type. Both CST and TYPE must |
| 2218 | be scalars, or vectors of the same number of elements. If the value |
| 2219 | won't fit in the floating point type, the results are undefined. |
| 2220 | ``ptrtoint (CST to TYPE)`` |
| 2221 | Convert a pointer typed constant to the corresponding integer |
| 2222 | constant. ``TYPE`` must be an integer type. ``CST`` must be of |
| 2223 | pointer type. The ``CST`` value is zero extended, truncated, or |
| 2224 | unchanged to make it fit in ``TYPE``. |
| 2225 | ``inttoptr (CST to TYPE)`` |
| 2226 | Convert an integer constant to a pointer constant. TYPE must be a |
| 2227 | pointer type. CST must be of integer type. The CST value is zero |
| 2228 | extended, truncated, or unchanged to make it fit in a pointer size. |
| 2229 | This one is *really* dangerous! |
| 2230 | ``bitcast (CST to TYPE)`` |
| 2231 | Convert a constant, CST, to another TYPE. The constraints of the |
| 2232 | operands are the same as those for the :ref:`bitcast |
| 2233 | instruction <i_bitcast>`. |
| 2234 | ``getelementptr (CSTPTR, IDX0, IDX1, ...)``, ``getelementptr inbounds (CSTPTR, IDX0, IDX1, ...)`` |
| 2235 | Perform the :ref:`getelementptr operation <i_getelementptr>` on |
| 2236 | constants. As with the :ref:`getelementptr <i_getelementptr>` |
| 2237 | instruction, the index list may have zero or more indexes, which are |
| 2238 | required to make sense for the type of "CSTPTR". |
| 2239 | ``select (COND, VAL1, VAL2)`` |
| 2240 | Perform the :ref:`select operation <i_select>` on constants. |
| 2241 | ``icmp COND (VAL1, VAL2)`` |
| 2242 | Perform the :ref:`icmp operation <i_icmp>` on constants. |
| 2243 | ``fcmp COND (VAL1, VAL2)`` |
| 2244 | Perform the :ref:`fcmp operation <i_fcmp>` on constants. |
| 2245 | ``extractelement (VAL, IDX)`` |
| 2246 | Perform the :ref:`extractelement operation <i_extractelement>` on |
| 2247 | constants. |
| 2248 | ``insertelement (VAL, ELT, IDX)`` |
| 2249 | Perform the :ref:`insertelement operation <i_insertelement>` on |
| 2250 | constants. |
| 2251 | ``shufflevector (VEC1, VEC2, IDXMASK)`` |
| 2252 | Perform the :ref:`shufflevector operation <i_shufflevector>` on |
| 2253 | constants. |
| 2254 | ``extractvalue (VAL, IDX0, IDX1, ...)`` |
| 2255 | Perform the :ref:`extractvalue operation <i_extractvalue>` on |
| 2256 | constants. The index list is interpreted in a similar manner as |
| 2257 | indices in a ':ref:`getelementptr <i_getelementptr>`' operation. At |
| 2258 | least one index value must be specified. |
| 2259 | ``insertvalue (VAL, ELT, IDX0, IDX1, ...)`` |
| 2260 | Perform the :ref:`insertvalue operation <i_insertvalue>` on constants. |
| 2261 | The index list is interpreted in a similar manner as indices in a |
| 2262 | ':ref:`getelementptr <i_getelementptr>`' operation. At least one index |
| 2263 | value must be specified. |
| 2264 | ``OPCODE (LHS, RHS)`` |
| 2265 | Perform the specified operation on the LHS and RHS constants. OPCODE |
| 2266 | may be any of the :ref:`binary <binaryops>` or :ref:`bitwise |
| 2267 | binary <bitwiseops>` operations. The constraints on operands are |
| 2268 | the same as those for the corresponding instruction (e.g. no bitwise |
| 2269 | operations on floating point values are allowed). |
| 2270 | |
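A few of these forms are sketched below as global initializers. The
global names are hypothetical, and the comments describe what each
expression denotes rather than output produced by any particular tool.

.. code-block:: llvm

    @G = global [4 x i32] zeroinitializer

    ; Address of element 2 of @G.
    @P = global i32* getelementptr inbounds ([4 x i32]* @G, i32 0, i32 2)

    ; Address of element 1 of @G, converted to an integer.
    @I = global i64 ptrtoint (i32* getelementptr ([4 x i32]* @G, i32 0, i32 1) to i64)

    ; The same address as @G, viewed as an i8*.
    @B = global i8* bitcast ([4 x i32]* @G to i8*)

    ; Nested expressions: 300 + (-1), i.e. 299.
    @N = global i32 add (i32 trunc (i64 300 to i32), i32 sext (i8 -1 to i32))
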
| 2271 | Other Values |
| 2272 | ============ |
| 2273 | |
| 2274 | Inline Assembler Expressions |
| 2275 | ---------------------------- |
| 2276 | |
| 2277 | LLVM supports inline assembler expressions (as opposed to :ref:`Module-Level |
| 2278 | Inline Assembly <moduleasm>`) through the use of a special value. This |
| 2279 | value represents the inline assembler as a string (containing the |
| 2280 | instructions to emit), a list of operand constraints (stored as a |
| 2281 | string), a flag that indicates whether or not the inline asm expression |
| 2282 | has side effects, and a flag indicating whether the function containing |
| 2283 | the asm needs to align its stack conservatively. An example inline |
| 2284 | assembler expression is: |
| 2285 | |
| 2286 | .. code-block:: llvm |
| 2287 | |
| 2288 | i32 (i32) asm "bswap $0", "=r,r" |
| 2289 | |
| 2290 | Inline assembler expressions may **only** be used as the callee operand |
| 2291 | of a :ref:`call <i_call>` or an :ref:`invoke <i_invoke>` instruction. |
| 2292 | Thus, typically we have: |
| 2293 | |
| 2294 | .. code-block:: llvm |
| 2295 | |
| 2296 | %X = call i32 asm "bswap $0", "=r,r"(i32 %Y) |
| 2297 | |
| 2298 | Inline asms with side effects not visible in the constraint list must be |
| 2299 | marked as having side effects. This is done through the use of the |
| 2300 | '``sideeffect``' keyword, like so: |
| 2301 | |
| 2302 | .. code-block:: llvm |
| 2303 | |
| 2304 | call void asm sideeffect "eieio", ""() |
| 2305 | |
| 2306 | In some cases inline asms will contain code that will not work unless |
| 2307 | the stack is aligned in some way, such as calls or SSE instructions on |
| 2308 | x86, yet will not contain code that does that alignment within the asm. |
| 2309 | The compiler should make conservative assumptions about what the asm |
| 2310 | might contain and should generate its usual stack alignment code in the |
| 2311 | prologue if the '``alignstack``' keyword is present: |
| 2312 | |
| 2313 | .. code-block:: llvm |
| 2314 | |
| 2315 | call void asm alignstack "eieio", ""() |
| 2316 | |
| 2317 | Inline asms also support using non-standard assembly dialects. The |
| 2318 | assumed dialect is ATT. When the '``inteldialect``' keyword is present, |
| 2319 | the inline asm is using the Intel dialect. Currently, ATT and Intel are |
| 2320 | the only supported dialects. An example is: |
| 2321 | |
| 2322 | .. code-block:: llvm |
| 2323 | |
| 2324 | call void asm inteldialect "eieio", ""() |
| 2325 | |
| 2326 | If multiple keywords appear, the '``sideeffect``' keyword must come |
| 2327 | first, the '``alignstack``' keyword second and the '``inteldialect``' |
| 2328 | keyword last. |
| 2329 | |
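For example, an inline asm that uses all three keywords would be written
in this order (the ``nop`` string is just a placeholder instruction for
illustration):

.. code-block:: llvm

    call void asm sideeffect alignstack inteldialect "nop", ""()
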
| 2330 | Inline Asm Metadata |
| 2331 | ^^^^^^^^^^^^^^^^^^^ |
| 2332 | |
| 2333 | The call instructions that wrap inline asm nodes may have a |
| 2334 | "``!srcloc``" MDNode attached to them that contains a list of constant |
| 2335 | integers. If present, the code generator will use the integer as the |
| 2336 | location cookie value when reporting errors through the ``LLVMContext`` |
| 2337 | error reporting mechanisms. This allows a front-end to correlate backend |
| 2338 | errors that occur with inline asm back to the source code that produced |
| 2339 | it. For example: |
| 2340 | |
| 2341 | .. code-block:: llvm |
| 2342 | |
| 2343 | call void asm sideeffect "something bad", ""(), !srcloc !42 |
| 2344 | ... |
| 2345 | !42 = metadata !{ i32 1234567 } |
| 2346 | |
| 2347 | It is up to the front-end to make sense of the magic numbers it places |
| 2348 | in the IR. If the MDNode contains multiple constants, the code generator |
| 2349 | will use the one that corresponds to the line of the asm that the error |
| 2350 | occurs on. |
| 2351 | |
| 2352 | .. _metadata: |
| 2353 | |
| 2354 | Metadata Nodes and Metadata Strings |
| 2355 | ----------------------------------- |
| 2356 | |
| 2357 | LLVM IR allows metadata to be attached to instructions in the program |
| 2358 | that can convey extra information about the code to the optimizers and |
| 2359 | code generator. One example application of metadata is source-level |
| 2360 | debug information. There are two metadata primitives: strings and nodes. |
| 2361 | All metadata has the ``metadata`` type and is identified in syntax by a |
| 2362 | preceding exclamation point ('``!``'). |
| 2363 | |
| 2364 | A metadata string is a string surrounded by double quotes. It can |
| 2365 | contain any character by escaping non-printable characters with |
| 2366 | "``\xx``" where "``xx``" is the two digit hex code. For example: |
| 2367 | "``!"test\00"``". |
| 2368 | |
| 2369 | Metadata nodes are represented with notation similar to structure |
| 2370 | constants (a comma separated list of elements, surrounded by braces and |
| 2371 | preceded by an exclamation point). Metadata nodes can have any values as |
| 2372 | their operands. For example: |
| 2373 | |
| 2374 | .. code-block:: llvm |
| 2375 | |
| 2376 | !{ metadata !"test\00", i32 10} |
| 2377 | |
| 2378 | A :ref:`named metadata <namedmetadatastructure>` is a collection of |
| 2379 | metadata nodes, which can be looked up in the module symbol table. For |
| 2380 | example: |
| 2381 | |
| 2382 | .. code-block:: llvm |
| 2383 | |
| 2384 | !foo = !{!4, !3} |
| 2385 | |
| 2386 | Metadata can be used as function arguments. Here the ``llvm.dbg.value`` |
| 2387 | function uses two metadata arguments: |
| 2388 | |
| 2389 | .. code-block:: llvm |
| 2390 | |
| 2391 | call void @llvm.dbg.value(metadata !24, i64 0, metadata !25) |
| 2392 | |
| 2393 | Metadata can be attached to an instruction. Here metadata ``!21`` is |
| 2394 | attached to the ``add`` instruction using the ``!dbg`` identifier: |
| 2395 | |
| 2396 | .. code-block:: llvm |
| 2397 | |
| 2398 | %indvar.next = add i64 %indvar, 1, !dbg !21 |
| 2399 | |
| 2400 | More information about specific metadata nodes recognized by the |
| 2401 | optimizers and code generator is found below. |
| 2402 | |
| 2403 | '``tbaa``' Metadata |
| 2404 | ^^^^^^^^^^^^^^^^^^^ |
| 2405 | |
| 2406 | In LLVM IR, memory does not have types, so LLVM's own type system is not |
| 2407 | suitable for doing type-based alias analysis (TBAA). Instead, metadata is |
| 2408 | added to the IR to describe a type system of a higher level language. This |
| 2409 | can be used to implement typical C/C++ TBAA, but it can also be used to |
| 2410 | implement custom alias analysis behavior for other languages. |
| 2411 | |
| 2412 | The current metadata format is very simple. TBAA metadata nodes have up |
| 2413 | to three fields, e.g.: |
| 2414 | |
| 2415 | .. code-block:: llvm |
| 2416 | |
| 2417 | !0 = metadata !{ metadata !"an example type tree" } |
| 2418 | !1 = metadata !{ metadata !"int", metadata !0 } |
| 2419 | !2 = metadata !{ metadata !"float", metadata !0 } |
| 2420 | !3 = metadata !{ metadata !"const float", metadata !2, i64 1 } |
| 2421 | |
| 2422 | The first field is an identity field. It can be any value, usually a |
| 2423 | metadata string, which uniquely identifies the type. The most important |
| 2424 | name in the tree is the name of the root node. Two trees with different |
| 2425 | root node names are entirely disjoint, even if they have leaves with |
| 2426 | common names. |
| 2427 | |
| 2428 | The second field identifies the type's parent node in the tree, or is |
| 2429 | null or omitted for a root node. A type is considered to alias all of |
| 2430 | its descendants and all of its ancestors in the tree. Also, a type is |
| 2431 | considered to alias all types in other trees, so that bitcode produced |
| 2432 | from multiple front-ends is handled conservatively. |
| 2433 | |
| 2434 | If the third field is present, it's an integer which if equal to 1 |
| 2435 | indicates that the type is "constant" (meaning |
| 2436 | ``pointsToConstantMemory`` should return true; see `other useful |
| 2437 | AliasAnalysis methods <AliasAnalysis.html#OtherItfs>`_). |
| 2438 | |
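To show how such a tree is consumed, the sketch below attaches ``!tbaa``
tags to a store and a load. The function and its arguments are
hypothetical. Because "int" and "float" are siblings under the same root,
and neither is an ancestor of the other, a TBAA-aware alias analysis may
assume that the two accesses do not alias.

.. code-block:: llvm

    define float @f(i32* %ip, float* %fp) {
      store i32 0, i32* %ip, !tbaa !1      ; access tagged as "int"
      %v = load float* %fp, !tbaa !2       ; access tagged as "float"
      ret float %v
    }

    !0 = metadata !{ metadata !"an example type tree" }
    !1 = metadata !{ metadata !"int", metadata !0 }
    !2 = metadata !{ metadata !"float", metadata !0 }
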
| 2439 | '``tbaa.struct``' Metadata |
| 2440 | ^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 2441 | |
| 2442 | The :ref:`llvm.memcpy <int_memcpy>` intrinsic is often used to implement |
| 2443 | aggregate assignment operations in C and similar languages; however, it |
| 2444 | is defined to copy a contiguous region of memory, which is more than |
| 2445 | strictly necessary for aggregate types which contain holes due to |
| 2446 | padding. Also, it doesn't contain any TBAA information about the fields |
| 2447 | of the aggregate. |
| 2448 | |
| 2449 | ``!tbaa.struct`` metadata can describe which memory subregions in a |
| 2450 | memcpy are padding and what the TBAA tags of the struct are. |
| 2451 | |
| 2452 | The current metadata format is very simple. ``!tbaa.struct`` metadata |
| 2453 | nodes are a list of operands which are in conceptual groups of three. |
| 2454 | For each group of three, the first operand gives the byte offset of a
| 2455 | field, the second gives its size in bytes, and the third gives
| 2456 | its TBAA tag. For example:
| 2457 | |
| 2458 | .. code-block:: llvm |
| 2459 | |
| 2460 | !4 = metadata !{ i64 0, i64 4, metadata !1, i64 8, i64 4, metadata !2 } |
| 2461 | |
| 2462 | This describes a struct with two fields. The first is at offset 0 bytes
| 2463 | with size 4 bytes, and has TBAA tag ``!1``. The second is at offset 8 bytes
| 2464 | with size 4 bytes, and has TBAA tag ``!2``.
| 2465 | |
| 2466 | Note that the fields need not be contiguous. In this example, there is a |
| 2467 | 4 byte gap between the two fields. This gap represents padding which |
| 2468 | does not carry useful data and need not be preserved. |
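
A minimal sketch of attaching such a node to a copy is shown below. It
assumes the five-operand form of ``llvm.memcpy`` (destination, source,
length, alignment, volatile flag); the function ``@copy_pair`` is
hypothetical, and the TBAA nodes reuse the example type tree from the
'``tbaa``' section.

.. code-block:: llvm

    declare void @llvm.memcpy.p0i8.p0i8.i64(i8*, i8*, i64, i32, i1)

    define void @copy_pair(i8* %dst, i8* %src) {
      ; The copy covers 12 bytes, but !4 states that only bytes [0,4)
      ; and [8,12) hold fields; bytes [4,8) are padding.
      call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dst, i8* %src, i64 12,
                                           i32 4, i1 false), !tbaa.struct !4
      ret void
    }

    !0 = metadata !{ metadata !"an example type tree" }
    !1 = metadata !{ metadata !"int", metadata !0 }
    !2 = metadata !{ metadata !"float", metadata !0 }
    !4 = metadata !{ i64 0, i64 4, metadata !1, i64 8, i64 4, metadata !2 }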
| 2469 | |
| 2470 | '``fpmath``' Metadata |
| 2471 | ^^^^^^^^^^^^^^^^^^^^^ |
| 2472 | |
| 2473 | ``fpmath`` metadata may be attached to any instruction of floating point |
| 2474 | type. It can be used to express the maximum acceptable error in the |
| 2475 | result of that instruction, in ULPs, thus potentially allowing the |
| 2476 | compiler to use a more efficient but less accurate method of computing |
| 2477 | it. ULP is defined as follows: |
| 2478 | |
| 2479 | If ``x`` is a real number that lies between two finite consecutive |
| 2480 | floating-point numbers ``a`` and ``b``, without being equal to one |
| 2481 | of them, then ``ulp(x) = |b - a|``, otherwise ``ulp(x)`` is the |
| 2482 | distance between the two non-equal finite floating-point numbers |
| 2483 | nearest ``x``. Moreover, ``ulp(NaN)`` is ``NaN``. |
| 2484 | |
| 2485 | The metadata node shall consist of a single positive floating point |
| 2486 | number representing the maximum relative error, for example: |
| 2487 | |
| 2488 | .. code-block:: llvm |
| 2489 | |
| 2490 | !0 = metadata !{ float 2.5 } ; maximum acceptable inaccuracy is 2.5 ULPs |
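
As a sketch of how such a node is attached, the following hypothetical
function ``@approx_div`` marks a single division as tolerating the
error bound given by ``!0``:

.. code-block:: llvm

    define float @approx_div(float %x, float %y) {
      ; The compiler may use a cheaper division here as long as the
      ; result is within 2.5 ULPs.
      %q = fdiv float %x, %y, !fpmath !0
      ret float %q
    }

    !0 = metadata !{ float 2.5 }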
| 2491 | |
| 2492 | '``range``' Metadata |
| 2493 | ^^^^^^^^^^^^^^^^^^^^ |
| 2494 | |
| 2495 | ``range`` metadata may be attached only to loads of integer types. It |
| 2496 | expresses the possible ranges the loaded value is in. The ranges are |
| 2497 | represented with a flattened list of integers. The loaded value is known |
| 2498 | to be in the union of the ranges defined by each consecutive pair. Each |
| 2499 | pair has the following properties: |
| 2500 | |
| 2501 | - The type must match the type loaded by the instruction. |
| 2502 | - The pair ``a,b`` represents the range ``[a,b)``. |
| 2503 | - Both ``a`` and ``b`` are constants. |
| 2504 | - The range is allowed to wrap. |
| 2505 | - The range should not represent the full or empty set. That is, |
| 2506 | ``a!=b``. |
| 2507 | |
| 2508 | In addition, the pairs must be in signed order of the lower bound and |
| 2509 | they must be non-contiguous. |
| 2510 | |
| 2511 | Examples: |
| 2512 | |
| 2513 | .. code-block:: llvm |
| 2514 | |
| 2515 | %a = load i8* %x, align 1, !range !0 ; Can only be 0 or 1 |
| 2516 | %b = load i8* %y, align 1, !range !1 ; Can only be 255 (-1), 0 or 1 |
| 2517 | %c = load i8* %z, align 1, !range !2 ; Can only be 0, 1, 3, 4 or 5 |
| 2518 | %d = load i8* %z, align 1, !range !3 ; Can only be -2, -1, 3, 4 or 5 |
| 2519 | ... |
| 2520 | !0 = metadata !{ i8 0, i8 2 } |
| 2521 | !1 = metadata !{ i8 255, i8 2 } |
| 2522 | !2 = metadata !{ i8 0, i8 2, i8 3, i8 6 } |
| 2523 | !3 = metadata !{ i8 -2, i8 0, i8 3, i8 6 } |
| 2524 | |
| 2525 | Module Flags Metadata |
| 2526 | ===================== |
| 2527 | |
| 2528 | Information about the module as a whole is difficult to convey to LLVM's |
| 2529 | subsystems. The LLVM IR isn't sufficient to transmit this information. |
| 2530 | The ``llvm.module.flags`` named metadata exists in order to facilitate |
| 2531 | this. These flags are in the form of key / value pairs --- much like a
| 2532 | dictionary --- making it easy for any subsystem that cares about a flag to
| 2533 | look it up.
| 2534 | |
| 2535 | The ``llvm.module.flags`` metadata contains a list of metadata triplets. |
| 2536 | Each triplet has the following form: |
| 2537 | |
| 2538 | - The first element is a *behavior* flag, which specifies the behavior
| 2539 | when two (or more) modules are merged together and two (or more)
| 2540 | flags with the same ID are encountered. The supported behaviors are
| 2541 | described below.
| 2542 | - The second element is a metadata string that is a unique ID for the
| 2543 | metadata. Each module may only have one flag entry for each unique ID (not
| 2544 | including entries with the **Require** behavior).
| 2545 | - The third element is the value of the flag.
| 2546 | |
| 2547 | When two (or more) modules are merged together, the resulting |
| 2548 | ``llvm.module.flags`` metadata is the union of the modules' flags. That is, for
| 2549 | each unique metadata ID string, there will be exactly one entry in the merged
| 2550 | module's ``llvm.module.flags`` metadata table, and the value for that entry will
| 2551 | be determined by the merge behavior flag, as described below. The only exception
| 2552 | is that entries with the **Require** behavior are always preserved.
| 2553 |
| 2554 | The following behaviors are supported: |
| 2555 | |
| 2556 | .. list-table:: |
| 2557 | :header-rows: 1 |
| 2558 | :widths: 10 90 |
| 2559 | |
| 2560 | * - Value |
| 2561 | - Behavior |
| 2562 | |
| 2563 | * - 1 |
| 2564 | - **Error** |
| 2565 | Emits an error if two values disagree, otherwise the resulting value
| 2566 | is that of the operands.
| 2567 |
| 2568 | * - 2 |
| 2569 | - **Warning** |
| 2570 | Emits a warning if two values disagree. The result value will be the
| 2571 | operand for the flag from the first module being linked.
| 2572 |
| 2573 | * - 3 |
| 2574 | - **Require** |
| 2575 | Adds a requirement that another module flag be present and have a
| 2576 | specified value after linking is performed. The value must be a |
| 2577 | metadata pair, where the first element of the pair is the ID of the |
| 2578 | module flag to be restricted, and the second element of the pair is |
| 2579 | the value the module flag should be restricted to. This behavior can |
| 2580 | be used to restrict the allowable results (via triggering of an |
| 2581 | error) of linking IDs with the **Override** behavior. |
| 2582 |
| 2583 | * - 4 |
| 2584 | - **Override** |
| 2585 | Uses the specified value, regardless of the behavior or value of the
| 2586 | other module. If both modules specify **Override**, but the values |
| 2587 | differ, an error will be emitted. |
| 2588 | |
| 2589 | * - 5
| 2590 | - **Append** |
| 2591 | Appends the two values, which are required to be metadata nodes. |
| 2592 | |
| 2593 | * - 6 |
| 2594 | - **AppendUnique** |
| 2595 | Appends the two values, which are required to be metadata |
| 2596 | nodes. However, duplicate entries in the second list are dropped |
| 2597 | during the append operation. |
| 2598 | |
| 2599 | It is an error for a particular unique flag ID to have multiple behaviors,
| 2600 | except in the case of **Require** (which adds restrictions on another metadata |
| 2601 | value) or **Override**. |
| 2602 |
| 2603 | An example of module flags: |
| 2604 | |
| 2605 | .. code-block:: llvm |
| 2606 | |
| 2607 | !0 = metadata !{ i32 1, metadata !"foo", i32 1 } |
| 2608 | !1 = metadata !{ i32 4, metadata !"bar", i32 37 } |
| 2609 | !2 = metadata !{ i32 2, metadata !"qux", i32 42 } |
| 2610 | !3 = metadata !{ i32 3, metadata !"qux", |
| 2611 | metadata !{ |
| 2612 | metadata !"foo", i32 1 |
| 2613 | } |
| 2614 | } |
| 2615 | !llvm.module.flags = !{ !0, !1, !2, !3 } |
| 2616 | |
| 2617 | - Metadata ``!0`` has the ID ``!"foo"`` and the value '1'. The behavior |
| 2618 | if two or more ``!"foo"`` flags are seen is to emit an error if their |
| 2619 | values are not equal. |
| 2620 | |
| 2621 | - Metadata ``!1`` has the ID ``!"bar"`` and the value '37'. The |
| 2622 | behavior if two or more ``!"bar"`` flags are seen is to use the value |
| 2623 | '37'.
| 2624 |
| 2625 | - Metadata ``!2`` has the ID ``!"qux"`` and the value '42'. The |
| 2626 | behavior if two or more ``!"qux"`` flags are seen is to emit a |
| 2627 | warning if their values are not equal. |
| 2628 | |
| 2629 | - Metadata ``!3`` has the ID ``!"qux"`` and the value: |
| 2630 | |
| 2631 | :: |
| 2632 | |
| 2633 | metadata !{ metadata !"foo", i32 1 } |
| 2634 | |
| 2635 | The behavior is to emit an error if the ``llvm.module.flags`` does not
| 2636 | contain a flag with the ID ``!"foo"`` that has the value '1' after linking is |
| 2637 | performed. |
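
To illustrate how the merge behaviors play out, consider the following
sketch of two separate modules. The flag IDs and values are
hypothetical; only the behavior values come from the table above.

.. code-block:: llvm

    ; Module A (linked first)
    !0 = metadata !{ i32 2, metadata !"qux", i32 42 }   ; Warning
    !1 = metadata !{ i32 4, metadata !"bar", i32 37 }   ; Override
    !llvm.module.flags = !{ !0, !1 }

.. code-block:: llvm

    ; Module B (linked second)
    !0 = metadata !{ i32 2, metadata !"qux", i32 43 }   ; Warning
    !1 = metadata !{ i32 1, metadata !"bar", i32 12 }   ; Error
    !llvm.module.flags = !{ !0, !1 }

Following the rules above, linking A with B emits a warning for
``!"qux"`` because the values disagree, and the merged flag keeps the
value '42' from the first module. For ``!"bar"``, module A's
**Override** entry wins, so the merged flag has the value '37'.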
| 2638 |
| 2639 | Objective-C Garbage Collection Module Flags Metadata |
| 2640 | ---------------------------------------------------- |
| 2641 | |
| 2642 | On the Mach-O platform, Objective-C stores metadata about garbage |
| 2643 | collection in a special section called "image info". The metadata |
| 2644 | consists of a version number and a bitmask specifying what types of |
| 2645 | garbage collection are supported (if any) by the file. If two or more |
| 2646 | modules are linked together their garbage collection metadata needs to |
| 2647 | be merged rather than appended together. |
| 2648 | |
| 2649 | The Objective-C garbage collection module flags metadata consists of the |
| 2650 | following key-value pairs: |
| 2651 | |
| 2652 | .. list-table:: |
| 2653 | :header-rows: 1 |
| 2654 | :widths: 30 70 |
| 2655 | |
| 2656 | * - Key |
| 2657 | - Value |
| 2658 | |
| 2659 | * - ``Objective-C Version``
| 2660 | - **[Required]** --- The Objective-C ABI version. Valid values are 1 and 2.
| 2661 |
| 2662 | * - ``Objective-C Image Info Version``
| 2663 | - **[Required]** --- The version of the image info section. Currently
| 2664 | always 0.
| 2665 | |
| 2666 | * - ``Objective-C Image Info Section``
| 2667 | - **[Required]** --- The section in which to place the metadata. Valid values are
| 2668 | ``"__OBJC, __image_info, regular"`` for Objective-C ABI version 1, and
| 2669 | ``"__DATA,__objc_imageinfo, regular, no_dead_strip"`` for |
| 2670 | Objective-C ABI version 2. |
| 2671 | |
| 2672 | * - ``Objective-C Garbage Collection``
| 2673 | - **[Required]** --- Specifies whether garbage collection is supported or
| 2674 | not. Valid values are 0, for no garbage collection, and 2, for garbage
| 2675 | collection supported. |
| 2676 | |
| 2677 | * - ``Objective-C GC Only``
| 2678 | - **[Optional]** --- Specifies that only garbage collection is supported.
| 2679 | If present, its value must be 6. This flag requires that the
| 2680 | ``Objective-C Garbage Collection`` flag have the value 2. |
| 2681 | |
| 2682 | Some important flag interactions: |
| 2683 | |
| 2684 | - If a module with ``Objective-C Garbage Collection`` set to 0 is |
| 2685 | merged with a module with ``Objective-C Garbage Collection`` set to |
| 2686 | 2, then the resulting module has the |
| 2687 | ``Objective-C Garbage Collection`` flag set to 0. |
| 2688 | - A module with ``Objective-C Garbage Collection`` set to 0 cannot be |
| 2689 | merged with a module with ``Objective-C GC Only`` set to 6. |
| 2690 | |
| 2691 | Automatic Linker Flags Module Flags Metadata
| 2692 | -------------------------------------------- |
| 2693 | |
| 2694 | Some targets support embedding flags to the linker inside individual object |
| 2695 | files. Typically this is used in conjunction with language extensions which |
| 2696 | allow source files to explicitly declare the libraries they depend on, and have |
| 2697 | these automatically be transmitted to the linker via object files. |
| 2698 | |
| 2699 | These flags are encoded in the IR using metadata in the module flags section, |
| 2700 | using the ``Linker Options`` key. The merge behavior for this flag is required
| 2701 | to be ``AppendUnique``, and the value for the key is expected to be a metadata
| 2702 | node which should be a list of other metadata nodes, each of which should be a |
| 2703 | list of metadata strings defining linker options. |
| 2704 | |
| 2705 | For example, the following metadata section specifies two separate sets of |
| 2706 | linker options, presumably to link against ``libz`` and the ``Cocoa`` |
| 2707 | framework:: |
| 2708 | |
| 2709 | !0 = metadata !{ i32 6, metadata !"Linker Options",
| 2710 | metadata !{
| 2711 | metadata !{ metadata !"-lz" },
| 2712 | metadata !{ metadata !"-framework", metadata !"Cocoa" } } }
| 2713 | !llvm.module.flags = !{ !0 }
| 2714 | |
| 2715 | The metadata encoding as lists of lists of options, as opposed to a collapsed |
| 2716 | list of options, is chosen so that the IR encoding can use multiple option |
| 2717 | strings to specify e.g., a single library, while still having that specifier be |
| 2718 | preserved as an atomic element that can be recognized by a target specific |
| 2719 | assembly writer or object file emitter. |
| 2720 | |
| 2721 | Each individual option is required to be either a valid option for the target's |
| 2722 | linker, or an option that is reserved by the target specific assembly writer or |
| 2723 | object file emitter. No other aspect of these options is defined by the IR. |
| 2724 | |
| 2725 | Intrinsic Global Variables
| 2726 | ========================== |
| 2727 | |
| 2728 | LLVM has a number of "magic" global variables that contain data that |
| 2729 | affect code generation or other IR semantics. These are documented here. |
| 2730 | All globals of this sort should have a section specified as |
| 2731 | "``llvm.metadata``". This section and all globals that start with |
| 2732 | "``llvm.``" are reserved for use by LLVM. |
| 2733 | |
| 2734 | The '``llvm.used``' Global Variable |
| 2735 | ----------------------------------- |
| 2736 | |
| 2737 | The ``@llvm.used`` global is an array with i8\* element type which has |
| 2738 | :ref:`appending linkage <linkage_appending>`. This array contains a list of |
| 2739 | pointers to global variables and functions which may optionally have a |
| 2740 | pointer cast formed of bitcast or getelementptr. For example, a legal |
| 2741 | use of it is: |
| 2742 | |
| 2743 | .. code-block:: llvm |
| 2744 | |
| 2745 | @X = global i8 4 |
| 2746 | @Y = global i32 123 |
| 2747 | |
| 2748 | @llvm.used = appending global [2 x i8*] [ |
| 2749 | i8* @X, |
| 2750 | i8* bitcast (i32* @Y to i8*) |
| 2751 | ], section "llvm.metadata" |
| 2752 | |
| 2753 | If a global variable appears in the ``@llvm.used`` list, then the |
| 2754 | compiler, assembler, and linker are required to treat the symbol as if |
| 2755 | there is a reference to the global that it cannot see. For example, if a |
| 2756 | variable has internal linkage and no references other than that from the |
| 2757 | ``@llvm.used`` list, it cannot be deleted. This is commonly used to |
| 2758 | represent references from inline asms and other things the compiler |
| 2759 | cannot "see", and corresponds to "``__attribute__((used))``" in GNU C.
| 2760 | |
| 2761 | On some targets, the code generator must emit a directive to the
| 2762 | assembler or object file to prevent the assembler and linker from
| 2763 | removing the symbol.
| 2764 | |
| 2765 | The '``llvm.compiler.used``' Global Variable |
| 2766 | -------------------------------------------- |
| 2767 | |
| 2768 | The ``@llvm.compiler.used`` directive is the same as the ``@llvm.used`` |
| 2769 | directive, except that it only prevents the compiler from touching the |
| 2770 | symbol. On targets that support it, this allows an intelligent linker to |
| 2771 | optimize references to the symbol without being impeded as it would be |
| 2772 | by ``@llvm.used``. |
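
A minimal sketch, modeled on the ``@llvm.used`` example above; the
global ``@Z`` is hypothetical:

.. code-block:: llvm

    @Z = internal global i32 7

    ; The compiler must keep @Z even though nothing else refers to it;
    ; unlike @llvm.used, this places no requirement on the linker.
    @llvm.compiler.used = appending global [1 x i8*] [
      i8* bitcast (i32* @Z to i8*)
    ], section "llvm.metadata"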
| 2773 | |
| 2774 | This construct should only be used in rare circumstances,
| 2775 | and should not be exposed to source languages.
| 2776 | |
| 2777 | The '``llvm.global_ctors``' Global Variable |
| 2778 | ------------------------------------------- |
| 2779 | |
| 2780 | .. code-block:: llvm |
| 2781 | |
| 2782 | %0 = type { i32, void ()* } |
| 2783 | @llvm.global_ctors = appending global [1 x %0] [%0 { i32 65535, void ()* @ctor }] |
| 2784 | |
| 2785 | The ``@llvm.global_ctors`` array contains a list of constructor |
| 2786 | functions and associated priorities. The functions referenced by this |
| 2787 | array will be called in ascending order of priority (i.e. lowest first) |
| 2788 | when the module is loaded. The order of functions with the same priority |
| 2789 | is not defined. |
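
To illustrate the ordering rule, the following sketch registers two
constructors with different priorities. The function names
``@init_early`` and ``@init_late`` are hypothetical.

.. code-block:: llvm

    %0 = type { i32, void ()* }

    ; Array order does not matter: @init_early has the lower priority
    ; value (100), so it is called before @init_late (65535).
    @llvm.global_ctors = appending global [2 x %0] [
      %0 { i32 65535, void ()* @init_late },
      %0 { i32 100, void ()* @init_early }
    ]

    define internal void @init_early() {
      ret void
    }

    define internal void @init_late() {
      ret void
    }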
| 2790 | |
| 2791 | The '``llvm.global_dtors``' Global Variable |
| 2792 | ------------------------------------------- |
| 2793 | |
| 2794 | .. code-block:: llvm |
| 2795 | |
| 2796 | %0 = type { i32, void ()* } |
| 2797 | @llvm.global_dtors = appending global [1 x %0] [%0 { i32 65535, void ()* @dtor }] |
| 2798 | |
| 2799 | The ``@llvm.global_dtors`` array contains a list of destructor functions
| 2800 | and associated priorities. The functions referenced by this array will
| 2801 | be called in descending order of priority (i.e. highest first) when the
| 2802 | module is unloaded. The order of functions with the same priority is not
| 2803 | defined.
| 2804 | |
| 2805 | Instruction Reference |
| 2806 | ===================== |
| 2807 | |
| 2808 | The LLVM instruction set consists of several different classifications |
| 2809 | of instructions: :ref:`terminator instructions <terminators>`, :ref:`binary |
| 2810 | instructions <binaryops>`, :ref:`bitwise binary |
| 2811 | instructions <bitwiseops>`, :ref:`memory instructions <memoryops>`, and |
| 2812 | :ref:`other instructions <otherops>`. |
| 2813 | |
| 2814 | .. _terminators: |
| 2815 | |
| 2816 | Terminator Instructions |
| 2817 | ----------------------- |
| 2818 | |
| 2819 | As mentioned :ref:`previously <functionstructure>`, every basic block in a |
| 2820 | program ends with a "Terminator" instruction, which indicates which |
| 2821 | block should be executed after the current block is finished. These |
| 2822 | terminator instructions typically yield a '``void``' value: they produce |
| 2823 | control flow, not values (the one exception being the |
| 2824 | ':ref:`invoke <i_invoke>`' instruction). |
| 2825 | |
| 2826 | The terminator instructions are: ':ref:`ret <i_ret>`', |
| 2827 | ':ref:`br <i_br>`', ':ref:`switch <i_switch>`', |
| 2828 | ':ref:`indirectbr <i_indirectbr>`', ':ref:`invoke <i_invoke>`', |
| 2829 | ':ref:`resume <i_resume>`', and ':ref:`unreachable <i_unreachable>`'. |
| 2830 | |
| 2831 | .. _i_ret: |
| 2832 | |
| 2833 | '``ret``' Instruction |
| 2834 | ^^^^^^^^^^^^^^^^^^^^^ |
| 2835 | |
| 2836 | Syntax: |
| 2837 | """"""" |
| 2838 | |
| 2839 | :: |
| 2840 | |
| 2841 | ret <type> <value> ; Return a value from a non-void function |
| 2842 | ret void ; Return from void function |
| 2843 | |
| 2844 | Overview: |
| 2845 | """"""""" |
| 2846 | |
| 2847 | The '``ret``' instruction is used to return control flow (and optionally |
| 2848 | a value) from a function back to the caller. |
| 2849 | |
| 2850 | There are two forms of the '``ret``' instruction: one that returns a |
| 2851 | value and then causes control flow, and one that just causes control |
| 2852 | flow to occur. |
| 2853 | |
| 2854 | Arguments: |
| 2855 | """""""""" |
| 2856 | |
| 2857 | The '``ret``' instruction optionally accepts a single argument, the |
| 2858 | return value. The type of the return value must be a ':ref:`first |
| 2859 | class <t_firstclass>`' type. |
| 2860 | |
| 2861 | A function is not :ref:`well formed <wellformed>` if it has a non-void
| 2862 | return type and contains a '``ret``' instruction with no return value or |
| 2863 | a return value with a type that does not match its type, or if it has a |
| 2864 | void return type and contains a '``ret``' instruction with a return |
| 2865 | value. |
| 2866 | |
| 2867 | Semantics: |
| 2868 | """""""""" |
| 2869 | |
| 2870 | When the '``ret``' instruction is executed, control flow returns back to |
| 2871 | the calling function's context. If the caller is a |
| 2872 | ":ref:`call <i_call>`" instruction, execution continues at the |
| 2873 | instruction after the call. If the caller is an
| 2874 | ":ref:`invoke <i_invoke>`" instruction, execution continues at the |
| 2875 | beginning of the "normal" destination block. If the instruction returns |
| 2876 | a value, that value shall set the call or invoke instruction's return |
| 2877 | value. |
| 2878 | |
| 2879 | Example: |
| 2880 | """""""" |
| 2881 | |
| 2882 | .. code-block:: llvm |
| 2883 | |
| 2884 | ret i32 5 ; Return an integer value of 5 |
| 2885 | ret void ; Return from a void function |
| 2886 | ret { i32, i8 } { i32 4, i8 2 } ; Return a struct of values 4 and 2 |
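
As a small sketch of the semantics described above, the following
hypothetical pair of functions shows the returned value becoming the
result of the ``call`` instruction in the caller:

.. code-block:: llvm

    define i32 @five() {
      ret i32 5                ; return the constant 5 to the caller
    }

    define i32 @caller() {
      %r = call i32 @five()    ; the value given to 'ret' becomes %r
      ret i32 %r
    }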
| 2887 | |
| 2888 | .. _i_br: |
| 2889 | |
| 2890 | '``br``' Instruction |
| 2891 | ^^^^^^^^^^^^^^^^^^^^ |
| 2892 | |
| 2893 | Syntax: |
| 2894 | """"""" |
| 2895 | |
| 2896 | :: |
| 2897 | |
| 2898 | br i1 <cond>, label <iftrue>, label <iffalse> |
| 2899 | br label <dest> ; Unconditional branch |
| 2900 | |
| 2901 | Overview: |
| 2902 | """"""""" |
| 2903 | |
| 2904 | The '``br``' instruction is used to cause control flow to transfer to a |
| 2905 | different basic block in the current function. There are two forms of |
| 2906 | this instruction, corresponding to a conditional branch and an |
| 2907 | unconditional branch. |
| 2908 | |
| 2909 | Arguments: |
| 2910 | """""""""" |
| 2911 | |
| 2912 | The conditional branch form of the '``br``' instruction takes a single |
| 2913 | '``i1``' value and two '``label``' values. The unconditional form of the |
| 2914 | '``br``' instruction takes a single '``label``' value as a target. |
| 2915 | |
| 2916 | Semantics: |
| 2917 | """""""""" |
| 2918 | |
| 2919 | Upon execution of a conditional '``br``' instruction, the '``i1``' |
| 2920 | argument is evaluated. If the value is ``true``, control flows to the |
| 2921 | '``iftrue``' ``label`` argument. If "cond" is ``false``, control flows |
| 2922 | to the '``iffalse``' ``label`` argument. |
| 2923 | |
| 2924 | Example: |
| 2925 | """""""" |
| 2926 | |
| 2927 | .. code-block:: llvm |
| 2928 | |
| 2929 | Test: |
| 2930 | %cond = icmp eq i32 %a, %b |
| 2931 | br i1 %cond, label %IfEqual, label %IfUnequal |
| 2932 | IfEqual: |
| 2933 | ret i32 1 |
| 2934 | IfUnequal: |
| 2935 | ret i32 0 |
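
The example above only uses the conditional form. The following sketch
uses both forms in one hypothetical function, ``@max``, where each arm
of the conditional branch joins a common block through an unconditional
branch:

.. code-block:: llvm

    define i32 @max(i32 %a, i32 %b) {
    entry:
      %gt = icmp sgt i32 %a, %b
      br i1 %gt, label %take_a, label %take_b   ; conditional form

    take_a:
      br label %done                            ; unconditional form

    take_b:
      br label %done

    done:
      %m = phi i32 [ %a, %take_a ], [ %b, %take_b ]
      ret i32 %m
    }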
| 2936 | |
| 2937 | .. _i_switch: |
| 2938 | |
| 2939 | '``switch``' Instruction |
| 2940 | ^^^^^^^^^^^^^^^^^^^^^^^^ |
| 2941 | |
| 2942 | Syntax: |
| 2943 | """"""" |
| 2944 | |
| 2945 | :: |
| 2946 | |
| 2947 | switch <intty> <value>, label <defaultdest> [ <intty> <val>, label <dest> ... ] |
| 2948 | |
| 2949 | Overview: |
| 2950 | """"""""" |
| 2951 | |
| 2952 | The '``switch``' instruction is used to transfer control flow to one of |
| 2953 | several different places. It is a generalization of the '``br``' |
| 2954 | instruction, allowing a branch to occur to one of many possible |
| 2955 | destinations. |
| 2956 | |
| 2957 | Arguments: |
| 2958 | """""""""" |
| 2959 | |
| 2960 | The '``switch``' instruction uses three parameters: an integer |
| 2961 | comparison value '``value``', a default '``label``' destination, and an |
| 2962 | array of pairs of comparison value constants and '``label``'s. The table |
| 2963 | is not allowed to contain duplicate constant entries. |
| 2964 | |
| 2965 | Semantics: |
| 2966 | """""""""" |
| 2967 | |
| 2968 | The ``switch`` instruction specifies a table of values and destinations. |
| 2969 | When the '``switch``' instruction is executed, this table is searched |
| 2970 | for the given value. If the value is found, control flow is transferred |
| 2971 | to the corresponding destination; otherwise, control flow is transferred |
| 2972 | to the default destination. |
| 2973 | |
| 2974 | Implementation: |
| 2975 | """"""""""""""" |
| 2976 | |
| 2977 | Depending on properties of the target machine and the particular |
| 2978 | ``switch`` instruction, this instruction may be code generated in |
| 2979 | different ways. For example, it could be generated as a series of |
| 2980 | chained conditional branches or with a lookup table. |
| 2981 | |
| 2982 | Example: |
| 2983 | """""""" |
| 2984 | |
| 2985 | .. code-block:: llvm |
| 2986 | |
| 2987 | ; Emulate a conditional br instruction |
| 2988 | %Val = zext i1 %value to i32 |
| 2989 | switch i32 %Val, label %truedest [ i32 0, label %falsedest ] |
| 2990 | |
| 2991 | ; Emulate an unconditional br instruction |
| 2992 | switch i32 0, label %dest [ ] |
| 2993 | |
| 2994 | ; Implement a jump table: |
| 2995 | switch i32 %val, label %otherwise [ i32 0, label %onzero |
| 2996 | i32 1, label %onone |
| 2997 | i32 2, label %ontwo ] |
| 2998 | |
| 2999 | .. _i_indirectbr: |
| 3000 | |
| 3001 | '``indirectbr``' Instruction |
| 3002 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 3003 | |
| 3004 | Syntax: |
| 3005 | """"""" |
| 3006 | |
| 3007 | :: |
| 3008 | |
| 3009 | indirectbr <somety>* <address>, [ label <dest1>, label <dest2>, ... ] |
| 3010 | |
| 3011 | Overview: |
| 3012 | """"""""" |
| 3013 | |
| 3014 | The '``indirectbr``' instruction implements an indirect branch to a |
| 3015 | label within the current function, whose address is specified by |
| 3016 | "``address``". Address must be derived from a |
| 3017 | :ref:`blockaddress <blockaddress>` constant. |
| 3018 | |
| 3019 | Arguments: |
| 3020 | """""""""" |
| 3021 | |
| 3022 | The '``address``' argument is the address of the label to jump to. The |
| 3023 | rest of the arguments indicate the full set of possible destinations |
| 3024 | that the address may point to. Blocks are allowed to occur multiple |
| 3025 | times in the destination list, though this isn't particularly useful. |
| 3026 | |
| 3027 | This destination list is required so that dataflow analysis has an |
| 3028 | accurate understanding of the CFG. |
| 3029 | |
| 3030 | Semantics: |
| 3031 | """""""""" |
| 3032 | |
| 3033 | Control transfers to the block specified in the address argument. All |
| 3034 | possible destination blocks must be listed in the label list, otherwise |
| 3035 | this instruction has undefined behavior. This implies that jumps to |
| 3036 | labels defined in other functions have undefined behavior as well. |
| 3037 | |
| 3038 | Implementation: |
| 3039 | """"""""""""""" |
| 3040 | |
| 3041 | This is typically implemented with a jump through a register. |
| 3042 | |
| 3043 | Example: |
| 3044 | """""""" |
| 3045 | |
| 3046 | .. code-block:: llvm |
| 3047 | |
| 3048 | indirectbr i8* %Addr, [ label %bb1, label %bb2, label %bb3 ] |
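
The fragment above does not show where the address comes from. As a
self-contained sketch, the hypothetical function ``@dispatch`` below
derives it from ``blockaddress`` constants and lists every possible
destination:

.. code-block:: llvm

    define i32 @dispatch(i1 %c) {
    entry:
      ; Pick one of two block addresses within this function.
      %addr = select i1 %c, i8* blockaddress(@dispatch, %bb1),
                            i8* blockaddress(@dispatch, %bb2)
      ; Both possible targets appear in the destination list.
      indirectbr i8* %addr, [ label %bb1, label %bb2 ]

    bb1:
      ret i32 1

    bb2:
      ret i32 2
    }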
| 3049 | |
| 3050 | .. _i_invoke: |
| 3051 | |
| 3052 | '``invoke``' Instruction |
| 3053 | ^^^^^^^^^^^^^^^^^^^^^^^^ |
| 3054 | |
| 3055 | Syntax: |
| 3056 | """"""" |
| 3057 | |
| 3058 | :: |
| 3059 | |
| 3060 | <result> = invoke [cconv] [ret attrs] <ptr to function ty> <function ptr val>(<function args>) [fn attrs] |
| 3061 | to label <normal label> unwind label <exception label> |
| 3062 | |
| 3063 | Overview: |
| 3064 | """"""""" |
| 3065 | |
| 3066 | The '``invoke``' instruction causes control to transfer to a specified |
| 3067 | function, with the possibility of control flow transfer to either the |
| 3068 | '``normal``' label or the '``exception``' label. If the callee function |
| 3069 | returns with the "``ret``" instruction, control flow will return to the |
| 3070 | "normal" label. If the callee (or any indirect callees) returns via the |
| 3071 | ":ref:`resume <i_resume>`" instruction or other exception handling |
| 3072 | mechanism, control is interrupted and continued at the dynamically |
| 3073 | nearest "exception" label. |
| 3074 | |
| 3075 | The '``exception``' label is a `landing |
| 3076 | pad <ExceptionHandling.html#overview>`_ for the exception. As such, the
| 3077 | '``exception``' label is required to have the
| 3078 | ":ref:`landingpad <i_landingpad>`" instruction, which contains the
| 3079 | information about the behavior of the program after unwinding happens,
| 3080 | as its first non-PHI instruction. The restrictions on the
| 3081 | "``landingpad``" instruction tightly couple it to the "``invoke``"
| 3082 | instruction, so that the important information contained within the |
| 3083 | "``landingpad``" instruction can't be lost through normal code motion. |
| 3084 | |
| 3085 | Arguments: |
| 3086 | """""""""" |
| 3087 | |
| 3088 | This instruction requires several arguments: |
| 3089 | |
| 3090 | #. The optional "cconv" marker indicates which :ref:`calling |
| 3091 | convention <callingconv>` the call should use. If none is |
| 3092 | specified, the call defaults to using C calling conventions. |
| 3093 | #. The optional :ref:`Parameter Attributes <paramattrs>` list for return |
| 3094 | values. Only '``zeroext``', '``signext``', and '``inreg``' attributes |
| 3095 | are valid here. |
| 3096 | #. '``ptr to function ty``': shall be the signature of the pointer to |
| 3097 | function value being invoked. In most cases, this is a direct |
| 3098 | function invocation, but indirect ``invoke``\ s are just as possible, |
| 3099 | branching off an arbitrary pointer to function value. |
| 3100 | #. '``function ptr val``': An LLVM value containing a pointer to a |
| 3101 | function to be invoked. |
| 3102 | #. '``function args``': argument list whose types match the function |
| 3103 | signature argument types and parameter attributes. All arguments must |
| 3104 | be of :ref:`first class <t_firstclass>` type. If the function signature |
| 3105 | indicates the function accepts a variable number of arguments, the |
| 3106 | extra arguments can be specified. |
| 3107 | #. '``normal label``': the label reached when the called function |
| 3108 | executes a '``ret``' instruction. |
| 3109 | #. '``exception label``': the label reached when a callee returns via |
| 3110 | the :ref:`resume <i_resume>` instruction or other exception handling |
| 3111 | mechanism. |
| 3112 | #. The optional :ref:`function attributes <fnattrs>` list. Only |
| 3113 | '``noreturn``', '``nounwind``', '``readonly``' and '``readnone``' |
| 3114 | attributes are valid here. |
| 3115 | |
| 3116 | Semantics: |
| 3117 | """""""""" |
| 3118 | |
| 3119 | This instruction is designed to operate as a standard '``call``' |
| 3120 | instruction in most regards. The primary difference is that it |
| 3121 | establishes an association with a label, which is used by the runtime |
| 3122 | library to unwind the stack. |
| 3123 | |
| 3124 | This instruction is used in languages with destructors to ensure that |
| 3125 | proper cleanup is performed in the case of either a ``longjmp`` or a |
| 3126 | thrown exception. Additionally, this is important for implementation of |
| 3127 | '``catch``' clauses in high-level languages that support them. |
| 3128 | |
| 3129 | For the purposes of the SSA form, the definition of the value returned |
| 3130 | by the '``invoke``' instruction is deemed to occur on the edge from the |
| 3131 | current block to the "normal" label. If the callee unwinds then no |
| 3132 | return value is available. |
| 3133 | |
| 3134 | Example: |
| 3135 | """""""" |
| 3136 | |
| 3137 | .. code-block:: llvm |
| 3138 | |
| 3139 | %retval = invoke i32 @Test(i32 15) to label %Continue |
| 3140 | unwind label %TestCleanup ; {i32}:retval set |
| 3141 | %retval = invoke coldcc i32 %Testfnptr(i32 15) to label %Continue |
| 3142 | unwind label %TestCleanup ; {i32}:retval set |
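| | |
| | As a hedged sketch of how the two labels are typically wired together, the |
| | fragment below pairs an ``invoke`` with a cleanup landing pad that re-raises |
| | the exception via ``resume``. The names ``@may_throw``, ``@cleanup`` and |
| | ``@caller`` are hypothetical, and the use of the C++ personality routine |
| | ``@__gxx_personality_v0`` is only an assumption about the source language: |
| | |
| | .. code-block:: llvm |
| | |
| |    declare i32 @may_throw(i32) |
| |    declare void @cleanup() |
| |    declare i32 @__gxx_personality_v0(...) |
| | |
| |    define i32 @caller() { |
| |    entry: |
| |      %retval = invoke i32 @may_throw(i32 15) |
| |                  to label %Continue unwind label %TestCleanup |
| | |
| |    Continue:                      ; %retval is defined only on this edge |
| |      ret i32 %retval |
| | |
| |    TestCleanup:                   ; landing pad reached when the callee unwinds |
| |      %exn = landingpad { i8*, i32 } |
| |               personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) |
| |               cleanup |
| |      call void @cleanup() |
| |      resume { i8*, i32 } %exn     ; continue propagating the exception |
| |    } |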
| 3143 | |
| 3144 | .. _i_resume: |
| 3145 | |
| 3146 | '``resume``' Instruction |
| 3147 | ^^^^^^^^^^^^^^^^^^^^^^^^ |
| 3148 | |
| 3149 | Syntax: |
| 3150 | """"""" |
| 3151 | |
| 3152 | :: |
| 3153 | |
| 3154 | resume <type> <value> |
| 3155 | |
| 3156 | Overview: |
| 3157 | """"""""" |
| 3158 | |
| 3159 | The '``resume``' instruction is a terminator instruction that has no |
| 3160 | successors. |
| 3161 | |
| 3162 | Arguments: |
| 3163 | """""""""" |
| 3164 | |
| 3165 | The '``resume``' instruction requires one argument, which must have the |
| 3166 | same type as the result of any '``landingpad``' instruction in the same |
| 3167 | function. |
| 3168 | |
| 3169 | Semantics: |
| 3170 | """""""""" |
| 3171 | |
| 3172 | The '``resume``' instruction resumes propagation of an existing |
| 3173 | (in-flight) exception whose unwinding was interrupted with a |
| 3174 | :ref:`landingpad <i_landingpad>` instruction. |
| 3175 | |
| 3176 | Example: |
| 3177 | """""""" |
| 3178 | |
| 3179 | .. code-block:: llvm |
| 3180 | |
| 3181 | resume { i8*, i32 } %exn |
| 3182 | |
| 3183 | .. _i_unreachable: |
| 3184 | |
| 3185 | '``unreachable``' Instruction |
| 3186 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 3187 | |
| 3188 | Syntax: |
| 3189 | """"""" |
| 3190 | |
| 3191 | :: |
| 3192 | |
| 3193 | unreachable |
| 3194 | |
| 3195 | Overview: |
| 3196 | """"""""" |
| 3197 | |
| 3198 | The '``unreachable``' instruction has no defined semantics. This |
| 3199 | instruction is used to inform the optimizer that a particular portion of |
| 3200 | the code is not reachable. For example, it can indicate that the code |
| 3201 | after a call to a no-return function cannot be reached. |
| 3202 | |
| 3203 | Semantics: |
| 3204 | """""""""" |
| 3205 | |
| 3206 | The '``unreachable``' instruction has no defined semantics. |
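| | |
| | A minimal sketch of the no-return case described in the overview. Here |
| | ``@abort`` is assumed to be an external function declared ``noreturn``, and |
| | ``@fail`` is a hypothetical name: |
| | |
| | .. code-block:: llvm |
| | |
| |    declare void @abort() noreturn |
| | |
| |    define void @fail() { |
| |    entry: |
| |      call void @abort() noreturn |
| |      unreachable                  ; control never returns from @abort |
| |    } |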
| 3207 | |
| 3208 | .. _binaryops: |
| 3209 | |
| 3210 | Binary Operations |
| 3211 | ----------------- |
| 3212 | |
| 3213 | Binary operators are used to do most of the computation in a program. |
| 3214 | They require two operands of the same type, execute an operation on |
| 3215 | them, and produce a single value. The operands might represent multiple |
| 3216 | data, as is the case with the :ref:`vector <t_vector>` data type. The |
| 3217 | result value has the same type as its operands. |
| 3218 | |
| 3219 | There are several different binary operators: |
| 3220 | |
| 3221 | .. _i_add: |
| 3222 | |
| 3223 | '``add``' Instruction |
| 3224 | ^^^^^^^^^^^^^^^^^^^^^ |
| 3225 | |
| 3226 | Syntax: |
| 3227 | """"""" |
| 3228 | |
| 3229 | :: |
| 3230 | |
| 3231 | <result> = add <ty> <op1>, <op2> ; yields {ty}:result |
| 3232 | <result> = add nuw <ty> <op1>, <op2> ; yields {ty}:result |
| 3233 | <result> = add nsw <ty> <op1>, <op2> ; yields {ty}:result |
| 3234 | <result> = add nuw nsw <ty> <op1>, <op2> ; yields {ty}:result |
| 3235 | |
| 3236 | Overview: |
| 3237 | """"""""" |
| 3238 | |
| 3239 | The '``add``' instruction returns the sum of its two operands. |
| 3240 | |
| 3241 | Arguments: |
| 3242 | """""""""" |
| 3243 | |
| 3244 | The two arguments to the '``add``' instruction must be |
| 3245 | :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both |
| 3246 | arguments must have identical types. |
| 3247 | |
| 3248 | Semantics: |
| 3249 | """""""""" |
| 3250 | |
| 3251 | The value produced is the integer sum of the two operands. |
| 3252 | |
| 3253 | If the sum has unsigned overflow, the result returned is the |
| 3254 | mathematical result modulo 2\ :sup:`n`\ , where n is the bit width of |
| 3255 | the result. |
| 3256 | |
| 3257 | Because LLVM integers use a two's complement representation, this |
| 3258 | instruction is appropriate for both signed and unsigned integers. |
| 3259 | |
| 3260 | ``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap", |
| 3261 | respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the |
| 3262 | result value of the ``add`` is a :ref:`poison value <poisonvalues>` if |
| 3263 | unsigned and/or signed overflow, respectively, occurs. |
| 3264 | |
| 3265 | Example: |
| 3266 | """""""" |
| 3267 | |
| 3268 | .. code-block:: llvm |
| 3269 | |
| 3270 | <result> = add i32 4, %var ; yields {i32}:result = 4 + %var |
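| | |
| | The lines below sketch how the wrap flags change the result for ``i8`` |
| | operands. The value names are illustrative, and the constant ``-1`` has the |
| | same bit pattern as the unsigned value 255: |
| | |
| | .. code-block:: llvm |
| | |
| |    %a = add i8 127, 1        ; yields i8 -128 (wraps modulo 2^8) |
| |    %b = add nsw i8 127, 1    ; poison value: signed overflow |
| |    %c = add i8 -1, 1         ; yields i8 0 (255 + 1 wraps modulo 2^8) |
| |    %d = add nuw i8 -1, 1     ; poison value: unsigned overflow |
| |    %e = add nsw i8 -1, 1     ; yields i8 0: -1 + 1 has no signed overflow |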
| 3271 | |
| 3272 | .. _i_fadd: |
| 3273 | |
| 3274 | '``fadd``' Instruction |
| 3275 | ^^^^^^^^^^^^^^^^^^^^^^ |
| 3276 | |
| 3277 | Syntax: |
| 3278 | """"""" |
| 3279 | |
| 3280 | :: |
| 3281 | |
| 3282 | <result> = fadd [fast-math flags]* <ty> <op1>, <op2> ; yields {ty}:result |
| 3283 | |
| 3284 | Overview: |
| 3285 | """"""""" |
| 3286 | |
| 3287 | The '``fadd``' instruction returns the sum of its two operands. |
| 3288 | |
| 3289 | Arguments: |
| 3290 | """""""""" |
| 3291 | |
| 3292 | The two arguments to the '``fadd``' instruction must be :ref:`floating |
| 3293 | point <t_floating>` or :ref:`vector <t_vector>` of floating point values. |
| 3294 | Both arguments must have identical types. |
| 3295 | |
| 3296 | Semantics: |
| 3297 | """""""""" |
| 3298 | |
| 3299 | The value produced is the floating point sum of the two operands. This |
| 3300 | instruction can also take any number of :ref:`fast-math flags <fastmath>`, |
| 3301 | which are optimization hints to enable otherwise unsafe floating point |
| 3302 | optimizations. |
| 3303 | |
| 3304 | Example: |
| 3305 | """""""" |
| 3306 | |
| 3307 | .. code-block:: llvm |
| 3308 | |
| 3309 | <result> = fadd float 4.0, %var ; yields {float}:result = 4.0 + %var |
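| | |
| | A brief sketch of the flag syntax, assuming the flag names listed in the |
| | :ref:`fast-math flags <fastmath>` section (``nnan``, ``ninf``, ``nsz``, |
| | ``arcp`` and ``fast``). ``%x`` and ``%y`` are placeholder values: |
| | |
| | .. code-block:: llvm |
| | |
| |    %s0 = fadd float %x, %y              ; no flags: strict floating point sum |
| |    %s1 = fadd nnan ninf float %x, %y    ; optimizer may assume no NaN or infinite values |
| |    %s2 = fadd fast float %x, %y         ; 'fast' implies all of the other flags |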
| 3310 | |
| 3311 | '``sub``' Instruction |
| 3312 | ^^^^^^^^^^^^^^^^^^^^^ |
| 3313 | |
| 3314 | Syntax: |
| 3315 | """"""" |
| 3316 | |
| 3317 | :: |
| 3318 | |
| 3319 | <result> = sub <ty> <op1>, <op2> ; yields {ty}:result |
| 3320 | <result> = sub nuw <ty> <op1>, <op2> ; yields {ty}:result |
| 3321 | <result> = sub nsw <ty> <op1>, <op2> ; yields {ty}:result |
| 3322 | <result> = sub nuw nsw <ty> <op1>, <op2> ; yields {ty}:result |
| 3323 | |
| 3324 | Overview: |
| 3325 | """"""""" |
| 3326 | |
| 3327 | The '``sub``' instruction returns the difference of its two operands. |
| 3328 | |
| 3329 | Note that the '``sub``' instruction is used to represent the '``neg``' |
| 3330 | instruction present in most other intermediate representations. |
| 3331 | |
| 3332 | Arguments: |
| 3333 | """""""""" |
| 3334 | |
| 3335 | The two arguments to the '``sub``' instruction must be |
| 3336 | :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both |
| 3337 | arguments must have identical types. |
| 3338 | |
| 3339 | Semantics: |
| 3340 | """""""""" |
| 3341 | |
| 3342 | The value produced is the integer difference of the two operands. |
| 3343 | |
| 3344 | If the difference has unsigned overflow, the result returned is the |
| 3345 | mathematical result modulo 2\ :sup:`n`\ , where n is the bit width of |
| 3346 | the result. |
| 3347 | |
| 3348 | Because LLVM integers use a two's complement representation, this |
| 3349 | instruction is appropriate for both signed and unsigned integers. |
| 3350 | |
| 3351 | ``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap", |
| 3352 | respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the |
| 3353 | result value of the ``sub`` is a :ref:`poison value <poisonvalues>` if |
| 3354 | unsigned and/or signed overflow, respectively, occurs. |
| 3355 | |
| 3356 | Example: |
| 3357 | """""""" |
| 3358 | |
| 3359 | .. code-block:: llvm |
| 3360 | |
| 3361 | <result> = sub i32 4, %var ; yields {i32}:result = 4 - %var |
| 3362 | <result> = sub i32 0, %var ; yields {i32}:result = -%var |
| 3363 | |
| 3364 | .. _i_fsub: |
| 3365 | |
| 3366 | '``fsub``' Instruction |
| 3367 | ^^^^^^^^^^^^^^^^^^^^^^ |
| 3368 | |
| 3369 | Syntax: |
| 3370 | """"""" |
| 3371 | |
| 3372 | :: |
| 3373 | |
| 3374 | <result> = fsub [fast-math flags]* <ty> <op1>, <op2> ; yields {ty}:result |
| 3375 | |
| 3376 | Overview: |
| 3377 | """"""""" |
| 3378 | |
| 3379 | The '``fsub``' instruction returns the difference of its two operands. |
| 3380 | |
| 3381 | Note that the '``fsub``' instruction is used to represent the '``fneg``' |
| 3382 | instruction present in most other intermediate representations. |
| 3383 | |
| 3384 | Arguments: |
| 3385 | """""""""" |
| 3386 | |
| 3387 | The two arguments to the '``fsub``' instruction must be :ref:`floating |
| 3388 | point <t_floating>` or :ref:`vector <t_vector>` of floating point values. |
| 3389 | Both arguments must have identical types. |
| 3390 | |
| 3391 | Semantics: |
| 3392 | """""""""" |
| 3393 | |
| 3394 | The value produced is the floating point difference of the two operands. |
| 3395 | This instruction can also take any number of :ref:`fast-math |
| 3396 | flags <fastmath>`, which are optimization hints to enable otherwise |
| 3397 | unsafe floating point optimizations. |
| 3398 | |
| 3399 | Example: |
| 3400 | """""""" |
| 3401 | |
| 3402 | .. code-block:: llvm |
| 3403 | |
| 3404 | <result> = fsub float 4.0, %var ; yields {float}:result = 4.0 - %var |
| 3405 | <result> = fsub float -0.0, %var ; yields {float}:result = -%var |
| 3406 | |
| 3407 | '``mul``' Instruction |
| 3408 | ^^^^^^^^^^^^^^^^^^^^^ |
| 3409 | |
| 3410 | Syntax: |
| 3411 | """"""" |
| 3412 | |
| 3413 | :: |
| 3414 | |
| 3415 | <result> = mul <ty> <op1>, <op2> ; yields {ty}:result |
| 3416 | <result> = mul nuw <ty> <op1>, <op2> ; yields {ty}:result |
| 3417 | <result> = mul nsw <ty> <op1>, <op2> ; yields {ty}:result |
| 3418 | <result> = mul nuw nsw <ty> <op1>, <op2> ; yields {ty}:result |
| 3419 | |
| 3420 | Overview: |
| 3421 | """"""""" |
| 3422 | |
| 3423 | The '``mul``' instruction returns the product of its two operands. |
| 3424 | |
| 3425 | Arguments: |
| 3426 | """""""""" |
| 3427 | |
| 3428 | The two arguments to the '``mul``' instruction must be |
| 3429 | :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both |
| 3430 | arguments must have identical types. |
| 3431 | |
| 3432 | Semantics: |
| 3433 | """""""""" |
| 3434 | |
| 3435 | The value produced is the integer product of the two operands. |
| 3436 | |
| 3437 | If the result of the multiplication has unsigned overflow, the result |
| 3438 | returned is the mathematical result modulo 2\ :sup:`n`\ , where n is the |
| 3439 | bit width of the result. |
| 3440 | |
| 3441 | Because LLVM integers use a two's complement representation, and the |
| 3442 | result is the same width as the operands, this instruction returns the |
| 3443 | correct result for both signed and unsigned integers. If a full product |
| 3444 | (e.g. ``i32`` * ``i32`` -> ``i64``) is needed, the operands should be |
| 3445 | sign-extended or zero-extended as appropriate to the width of the full |
| 3446 | product. |
| 3447 | |
| 3448 | ``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap", |
| 3449 | respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the |
| 3450 | result value of the ``mul`` is a :ref:`poison value <poisonvalues>` if |
| 3451 | unsigned and/or signed overflow, respectively, occurs. |
| 3452 | |
| 3453 | Example: |
| 3454 | """""""" |
| 3455 | |
| 3456 | .. code-block:: llvm |
| 3457 | |
| 3458 | <result> = mul i32 4, %var ; yields {i32}:result = 4 * %var |
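| | |
| | The full-product case mentioned above can be sketched as follows, assuming |
| | two signed ``i32`` inputs named ``%a`` and ``%b`` (hypothetical names): |
| | |
| | .. code-block:: llvm |
| | |
| |    %a.wide = sext i32 %a to i64          ; use zext instead for unsigned operands |
| |    %b.wide = sext i32 %b to i64 |
| |    %full   = mul i64 %a.wide, %b.wide    ; full signed product, always fits in i64 |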
| 3459 | |
| 3460 | .. _i_fmul: |
| 3461 | |
| 3462 | '``fmul``' Instruction |
| 3463 | ^^^^^^^^^^^^^^^^^^^^^^ |
| 3464 | |
| 3465 | Syntax: |
| 3466 | """"""" |
| 3467 | |
| 3468 | :: |
| 3469 | |
| 3470 | <result> = fmul [fast-math flags]* <ty> <op1>, <op2> ; yields {ty}:result |
| 3471 | |
| 3472 | Overview: |
| 3473 | """"""""" |
| 3474 | |
| 3475 | The '``fmul``' instruction returns the product of its two operands. |
| 3476 | |
| 3477 | Arguments: |
| 3478 | """""""""" |
| 3479 | |
| 3480 | The two arguments to the '``fmul``' instruction must be :ref:`floating |
| 3481 | point <t_floating>` or :ref:`vector <t_vector>` of floating point values. |
| 3482 | Both arguments must have identical types. |
| 3483 | |
| 3484 | Semantics: |
| 3485 | """""""""" |
| 3486 | |
| 3487 | The value produced is the floating point product of the two operands. |
| 3488 | This instruction can also take any number of :ref:`fast-math |
| 3489 | flags <fastmath>`, which are optimization hints to enable otherwise |
| 3490 | unsafe floating point optimizations. |
| 3491 | |
| 3492 | Example: |
| 3493 | """""""" |
| 3494 | |
| 3495 | .. code-block:: llvm |
| 3496 | |
| 3497 | <result> = fmul float 4.0, %var ; yields {float}:result = 4.0 * %var |
| 3498 | |
| 3499 | '``udiv``' Instruction |
| 3500 | ^^^^^^^^^^^^^^^^^^^^^^ |
| 3501 | |
| 3502 | Syntax: |
| 3503 | """"""" |
| 3504 | |
| 3505 | :: |
| 3506 | |
| 3507 | <result> = udiv <ty> <op1>, <op2> ; yields {ty}:result |
| 3508 | <result> = udiv exact <ty> <op1>, <op2> ; yields {ty}:result |
| 3509 | |
| 3510 | Overview: |
| 3511 | """"""""" |
| 3512 | |
| 3513 | The '``udiv``' instruction returns the quotient of its two operands. |
| 3514 | |
| 3515 | Arguments: |
| 3516 | """""""""" |
| 3517 | |
| 3518 | The two arguments to the '``udiv``' instruction must be |
| 3519 | :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both |
| 3520 | arguments must have identical types. |
| 3521 | |
| 3522 | Semantics: |
| 3523 | """""""""" |
| 3524 | |
| 3525 | The value produced is the unsigned integer quotient of the two operands. |
| 3526 | |
| 3527 | Note that unsigned integer division and signed integer division are |
| 3528 | distinct operations; for signed integer division, use '``sdiv``'. |
| 3529 | |
| 3530 | Division by zero leads to undefined behavior. |
| 3531 | |
| 3532 | If the ``exact`` keyword is present, the result value of the ``udiv`` is |
| 3533 | a :ref:`poison value <poisonvalues>` if %op1 is not a multiple of %op2 (as |
| 3534 | such, "((a udiv exact b) mul b) == a"). |
| 3535 | |
| 3536 | Example: |
| 3537 | """""""" |
| 3538 | |
| 3539 | .. code-block:: llvm |
| 3540 | |
| 3541 | <result> = udiv i32 4, %var ; yields {i32}:result = 4 / %var |
| 3542 | |
| 3543 | '``sdiv``' Instruction |
| 3544 | ^^^^^^^^^^^^^^^^^^^^^^ |
| 3545 | |
| 3546 | Syntax: |
| 3547 | """"""" |
| 3548 | |
| 3549 | :: |
| 3550 | |
| 3551 | <result> = sdiv <ty> <op1>, <op2> ; yields {ty}:result |
| 3552 | <result> = sdiv exact <ty> <op1>, <op2> ; yields {ty}:result |
| 3553 | |
| 3554 | Overview: |
| 3555 | """"""""" |
| 3556 | |
| 3557 | The '``sdiv``' instruction returns the quotient of its two operands. |
| 3558 | |
| 3559 | Arguments: |
| 3560 | """""""""" |
| 3561 | |
| 3562 | The two arguments to the '``sdiv``' instruction must be |
| 3563 | :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both |
| 3564 | arguments must have identical types. |
| 3565 | |
| 3566 | Semantics: |
| 3567 | """""""""" |
| 3568 | |
| 3569 | The value produced is the signed integer quotient of the two operands |
| 3570 | rounded towards zero. |
| 3571 | |
| 3572 | Note that signed integer division and unsigned integer division are |
| 3573 | distinct operations; for unsigned integer division, use '``udiv``'. |
| 3574 | |
| 3575 | Division by zero leads to undefined behavior. Overflow also leads to |
| 3576 | undefined behavior; this is a rare case, but can occur, for example, by |
| 3577 | doing a 32-bit division of -2147483648 by -1. |
| 3578 | |
| 3579 | If the ``exact`` keyword is present, the result value of the ``sdiv`` is |
| 3580 | a :ref:`poison value <poisonvalues>` if the result would be rounded. |
| 3581 | |
| 3582 | Example: |
| 3583 | """""""" |
| 3584 | |
| 3585 | .. code-block:: llvm |
| 3586 | |
| 3587 | <result> = sdiv i32 4, %var ; yields {i32}:result = 4 / %var |
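| | |
| | A few constant cases, given as an illustrative sketch, show the rounding |
| | rule, the difference from '``udiv``' on the same bit pattern, and the effect |
| | of ``exact``. The value names are placeholders: |
| | |
| | .. code-block:: llvm |
| | |
| |    %q0 = sdiv i32 -7, 2          ; yields i32 -3 (rounded towards zero) |
| |    %q1 = udiv i32 -8, 2          ; yields i32 2147483644 (-8 is read as 4294967288) |
| |    %q2 = sdiv exact i32 8, 2     ; yields i32 4 |
| |    %q3 = sdiv exact i32 7, 2     ; poison value: the result would be rounded |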
| 3588 | |
| 3589 | .. _i_fdiv: |
| 3590 | |
| 3591 | '``fdiv``' Instruction |
| 3592 | ^^^^^^^^^^^^^^^^^^^^^^ |
| 3593 | |
| 3594 | Syntax: |
| 3595 | """"""" |
| 3596 | |
| 3597 | :: |
| 3598 | |
| 3599 | <result> = fdiv [fast-math flags]* <ty> <op1>, <op2> ; yields {ty}:result |
| 3600 | |
| 3601 | Overview: |
| 3602 | """"""""" |
| 3603 | |
| 3604 | The '``fdiv``' instruction returns the quotient of its two operands. |
| 3605 | |
| 3606 | Arguments: |
| 3607 | """""""""" |
| 3608 | |
| 3609 | The two arguments to the '``fdiv``' instruction must be :ref:`floating |
| 3610 | point <t_floating>` or :ref:`vector <t_vector>` of floating point values. |
| 3611 | Both arguments must have identical types. |
| 3612 | |
| 3613 | Semantics: |
| 3614 | """""""""" |
| 3615 | |
| 3616 | The value produced is the floating point quotient of the two operands. |
| 3617 | This instruction can also take any number of :ref:`fast-math |
| 3618 | flags <fastmath>`, which are optimization hints to enable otherwise |
| 3619 | unsafe floating point optimizations. |
| 3620 | |
| 3621 | Example: |
| 3622 | """""""" |
| 3623 | |
| 3624 | .. code-block:: llvm |
| 3625 | |
| 3626 | <result> = fdiv float 4.0, %var ; yields {float}:result = 4.0 / %var |
| 3627 | |
| 3628 | '``urem``' Instruction |
| 3629 | ^^^^^^^^^^^^^^^^^^^^^^ |
| 3630 | |
| 3631 | Syntax: |
| 3632 | """"""" |
| 3633 | |
| 3634 | :: |
| 3635 | |
| 3636 | <result> = urem <ty> <op1>, <op2> ; yields {ty}:result |
| 3637 | |
| 3638 | Overview: |
| 3639 | """"""""" |
| 3640 | |
| 3641 | The '``urem``' instruction returns the remainder from the unsigned |
| 3642 | division of its two arguments. |
| 3643 | |
| 3644 | Arguments: |
| 3645 | """""""""" |
| 3646 | |
| 3647 | The two arguments to the '``urem``' instruction must be |
| 3648 | :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both |
| 3649 | arguments must have identical types. |
| 3650 | |
| 3651 | Semantics: |
| 3652 | """""""""" |
| 3653 | |
| 3654 | This instruction returns the unsigned integer *remainder* of a division. |
| 3655 | This instruction always performs an unsigned division to get the |
| 3656 | remainder. |
| 3657 | |
| 3658 | Note that unsigned integer remainder and signed integer remainder are |
| 3659 | distinct operations; for signed integer remainder, use '``srem``'. |
| 3660 | |
| 3661 | Taking the remainder of a division by zero leads to undefined behavior. |
| 3662 | |
| 3663 | Example: |
| 3664 | """""""" |
| 3665 | |
| 3666 | .. code-block:: llvm |
| 3667 | |
| 3668 | <result> = urem i32 4, %var ; yields {i32}:result = 4 % %var |
| 3669 | |
| 3670 | '``srem``' Instruction |
| 3671 | ^^^^^^^^^^^^^^^^^^^^^^ |
| 3672 | |
| 3673 | Syntax: |
| 3674 | """"""" |
| 3675 | |
| 3676 | :: |
| 3677 | |
| 3678 | <result> = srem <ty> <op1>, <op2> ; yields {ty}:result |
| 3679 | |
| 3680 | Overview: |
| 3681 | """"""""" |
| 3682 | |
| 3683 | The '``srem``' instruction returns the remainder from the signed |
| 3684 | division of its two operands. This instruction can also take |
| 3685 | :ref:`vector <t_vector>` versions of the values in which case the elements |
| 3686 | must be integers. |
| 3687 | |
| 3688 | Arguments: |
| 3689 | """""""""" |
| 3690 | |
| 3691 | The two arguments to the '``srem``' instruction must be |
| 3692 | :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both |
| 3693 | arguments must have identical types. |
| 3694 | |
| 3695 | Semantics: |
| 3696 | """""""""" |
| 3697 | |
| 3698 | This instruction returns the *remainder* of a division (where the result |
| 3699 | is either zero or has the same sign as the dividend, ``op1``), not the |
| 3700 | *modulo* operator (where the result is either zero or has the same sign |
| 3701 | as the divisor, ``op2``) of a value. For more information about the |
| 3702 | difference, see `The Math |
| 3703 | Forum <http://mathforum.org/dr.math/problems/anne.4.28.99.html>`_. For a |
| 3704 | table of how this is implemented in various languages, please see |
| 3705 | `Wikipedia: modulo |
| 3706 | operation <http://en.wikipedia.org/wiki/Modulo_operation>`_. |
| 3707 | |
| 3708 | Note that signed integer remainder and unsigned integer remainder are |
| 3709 | distinct operations; for unsigned integer remainder, use '``urem``'. |
| 3710 | |
| 3711 | Taking the remainder of a division by zero leads to undefined behavior. |
| 3712 | Overflow also leads to undefined behavior; this is a rare case, but can |
| 3713 | occur, for example, by taking the remainder of a 32-bit division of |
| 3714 | -2147483648 by -1. (The remainder doesn't actually overflow, but this |
| 3715 | rule lets ``srem`` be implemented using instructions that return both the |
| 3716 | result of the division and the remainder.) |
| 3717 | |
| 3718 | Example: |
| 3719 | """""""" |
| 3720 | |
| 3721 | .. code-block:: llvm |
| 3722 | |
| 3723 | <result> = srem i32 4, %var ; yields {i32}:result = 4 % %var |
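| | |
| | The sign rule can be illustrated with a few constant operands. This is a |
| | sketch with placeholder value names, and the last line uses '``urem``' on |
| | the same bit pattern for contrast: |
| | |
| | .. code-block:: llvm |
| | |
| |    %r0 = srem i32 7, 3       ; yields i32 1 |
| |    %r1 = srem i32 -7, 3      ; yields i32 -1 (sign follows the dividend) |
| |    %r2 = srem i32 7, -3      ; yields i32 1  (sign follows the dividend) |
| |    %r3 = urem i32 -7, 3      ; yields i32 0  (-7 is read as 4294967289 = 3 * 1431655763) |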
| 3724 | |
| 3725 | .. _i_frem: |
| 3726 | |
| 3727 | '``frem``' Instruction |
| 3728 | ^^^^^^^^^^^^^^^^^^^^^^ |
| 3729 | |
| 3730 | Syntax: |
| 3731 | """"""" |
| 3732 | |
| 3733 | :: |
| 3734 | |
| 3735 | <result> = frem [fast-math flags]* <ty> <op1>, <op2> ; yields {ty}:result |
| 3736 | |
| 3737 | Overview: |
| 3738 | """"""""" |
| 3739 | |
| 3740 | The '``frem``' instruction returns the remainder from the division of |
| 3741 | its two operands. |
| 3742 | |
| 3743 | Arguments: |
| 3744 | """""""""" |
| 3745 | |
| 3746 | The two arguments to the '``frem``' instruction must be :ref:`floating |
| 3747 | point <t_floating>` or :ref:`vector <t_vector>` of floating point values. |
| 3748 | Both arguments must have identical types. |
| 3749 | |
| 3750 | Semantics: |
| 3751 | """""""""" |
| 3752 | |
| 3753 | This instruction returns the *remainder* of a division. The remainder |
| 3754 | has the same sign as the dividend. This instruction can also take any |
| 3755 | number of :ref:`fast-math flags <fastmath>`, which are optimization hints |
| 3756 | to enable otherwise unsafe floating point optimizations. |
| 3757 | |
| 3758 | Example: |
| 3759 | """""""" |
| 3760 | |
| 3761 | .. code-block:: llvm |
| 3762 | |
| 3763 | <result> = frem float 4.0, %var ; yields {float}:result = 4.0 % %var |
| 3764 | |
| 3765 | .. _bitwiseops: |
| 3766 | |
| 3767 | Bitwise Binary Operations |
| 3768 | ------------------------- |
| 3769 | |
| 3770 | Bitwise binary operators are used to do various forms of bit-twiddling |
| 3771 | in a program. They are generally very efficient instructions and can |
| 3772 | commonly be strength reduced from other instructions. They require two |
| 3773 | operands of the same type, execute an operation on them, and produce a |
| 3774 | single value. The resulting value is the same type as its operands. |
| 3775 | |
| 3776 | '``shl``' Instruction |
| 3777 | ^^^^^^^^^^^^^^^^^^^^^ |
| 3778 | |
| 3779 | Syntax: |
| 3780 | """"""" |
| 3781 | |
| 3782 | :: |
| 3783 | |
| 3784 | <result> = shl <ty> <op1>, <op2> ; yields {ty}:result |
| 3785 | <result> = shl nuw <ty> <op1>, <op2> ; yields {ty}:result |
| 3786 | <result> = shl nsw <ty> <op1>, <op2> ; yields {ty}:result |
| 3787 | <result> = shl nuw nsw <ty> <op1>, <op2> ; yields {ty}:result |
| 3788 | |
| 3789 | Overview: |
| 3790 | """"""""" |
| 3791 | |
| 3792 | The '``shl``' instruction returns the first operand shifted to the left |
| 3793 | a specified number of bits. |
| 3794 | |
| 3795 | Arguments: |
| 3796 | """""""""" |
| 3797 | |
| 3798 | Both arguments to the '``shl``' instruction must be the same |
| 3799 | :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type. |
| 3800 | '``op2``' is treated as an unsigned value. |
| 3801 | |
| 3802 | Semantics: |
| 3803 | """""""""" |
| 3804 | |
| 3805 | The value produced is ``op1`` \* 2\ :sup:`op2` mod 2\ :sup:`n`, |
| 3806 | where ``n`` is the width of the result. If ``op2`` is (statically or |
| 3807 | dynamically) negative or equal to or larger than the number of bits in |
| 3808 | ``op1``, the result is undefined. If the arguments are vectors, each |
| 3809 | vector element of ``op1`` is shifted by the corresponding shift amount |
| 3810 | in ``op2``. |
| 3811 | |
| 3812 | If the ``nuw`` keyword is present, then the shift produces a :ref:`poison |
| 3813 | value <poisonvalues>` if it shifts out any non-zero bits. If the |
| 3814 | ``nsw`` keyword is present, then the shift produces a :ref:`poison |
| 3815 | value <poisonvalues>` if it shifts out any bits that disagree with the |
| 3816 | resultant sign bit. As such, ``nuw``/``nsw`` have the same semantics as |
| 3817 | they would if the shift were expressed as a ``mul`` instruction with the |
| 3818 | same ``nuw``/``nsw`` flags, as in ``(mul %op1, (shl 1, %op2))``. |
| 3819 | |
| 3820 | Example: |
| 3821 | """""""" |
| 3822 | |
| 3823 | .. code-block:: llvm |
| 3824 | |
| 3825 | <result> = shl i32 4, %var ; yields {i32}: 4 << %var |
| 3826 | <result> = shl i32 4, 2 ; yields {i32}: 16 |
| 3827 | <result> = shl i32 1, 10 ; yields {i32}: 1024 |
| 3828 | <result> = shl i32 1, 32 ; undefined |
| 3829 | <result> = shl <2 x i32> < i32 1, i32 1>, < i32 1, i32 2> ; yields: result=<2 x i32> < i32 2, i32 4> |
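| | |
| | As a hedged illustration of the ``nuw`` and ``nsw`` rules on ``i8`` |
| | operands (value names are placeholders), each line below agrees with the |
| | equivalent multiplication by 2: |
| | |
| | .. code-block:: llvm |
| | |
| |    %s0 = shl nuw i8 64, 1     ; yields i8 -128 (bit pattern 0x80): only a zero bit is shifted out |
| |    %s1 = shl nuw i8 -128, 1   ; poison value: a non-zero bit is shifted out |
| |    %s2 = shl nsw i8 -64, 1    ; yields i8 -128: the shifted-out bit matches the resulting sign bit |
| |    %s3 = shl nsw i8 64, 1     ; poison value: the shifted-out bit (0) disagrees with the sign bit (1) |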
| 3830 | |
| 3831 | '``lshr``' Instruction |
| 3832 | ^^^^^^^^^^^^^^^^^^^^^^ |
| 3833 | |
| 3834 | Syntax: |
| 3835 | """"""" |
| 3836 | |
| 3837 | :: |
| 3838 | |
| 3839 | <result> = lshr <ty> <op1>, <op2> ; yields {ty}:result |
| 3840 | <result> = lshr exact <ty> <op1>, <op2> ; yields {ty}:result |
| 3841 | |
| 3842 | Overview: |
| 3843 | """"""""" |
| 3844 | |
| 3845 | The '``lshr``' instruction (logical shift right) returns the first |
| 3846 | operand shifted to the right a specified number of bits with zero fill. |
| 3847 | |
| 3848 | Arguments: |
| 3849 | """""""""" |
| 3850 | |
| 3851 | Both arguments to the '``lshr``' instruction must be the same |
| 3852 | :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type. |
| 3853 | '``op2``' is treated as an unsigned value. |
| 3854 | |
| 3855 | Semantics: |
| 3856 | """""""""" |
| 3857 | |
| 3858 | This instruction always performs a logical shift right operation. The |
| 3859 | most significant bits of the result will be filled with zero bits after |
| 3860 | the shift. If ``op2`` is (statically or dynamically) equal to or larger |
| 3861 | than the number of bits in ``op1``, the result is undefined. If the |
| 3862 | arguments are vectors, each vector element of ``op1`` is shifted by the |
| 3863 | corresponding shift amount in ``op2``. |
| 3864 | |
| 3865 | If the ``exact`` keyword is present, the result value of the ``lshr`` is |
| 3866 | a :ref:`poison value <poisonvalues>` if any of the bits shifted out are |
| 3867 | non-zero. |
| 3868 | |
| 3869 | Example: |
| 3870 | """""""" |
| 3871 | |
| 3872 | .. code-block:: llvm |
| 3873 | |
| 3874 | <result> = lshr i32 4, 1 ; yields {i32}:result = 2 |
| 3875 | <result> = lshr i32 4, 2 ; yields {i32}:result = 1 |
| 3876 | <result> = lshr i8 4, 3 ; yields {i8}:result = 0 |
| 3877 | <result> = lshr i8 -2, 1 ; yields {i8}:result = 0x7F |
| 3878 | <result> = lshr i32 1, 32 ; undefined |
| 3879 | <result> = lshr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 2> ; yields: result=<2 x i32> < i32 0x7FFFFFFF, i32 1> |
| 3880 | |
| 3881 | '``ashr``' Instruction |
| 3882 | ^^^^^^^^^^^^^^^^^^^^^^ |
| 3883 | |
| 3884 | Syntax: |
| 3885 | """"""" |
| 3886 | |
| 3887 | :: |
| 3888 | |
| 3889 | <result> = ashr <ty> <op1>, <op2> ; yields {ty}:result |
| 3890 | <result> = ashr exact <ty> <op1>, <op2> ; yields {ty}:result |
| 3891 | |
| 3892 | Overview: |
| 3893 | """"""""" |
| 3894 | |
| 3895 | The '``ashr``' instruction (arithmetic shift right) returns the first |
| 3896 | operand shifted to the right a specified number of bits with sign |
| 3897 | extension. |
| 3898 | |
| 3899 | Arguments: |
| 3900 | """""""""" |
| 3901 | |
| 3902 | Both arguments to the '``ashr``' instruction must be the same |
| 3903 | :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type. |
| 3904 | '``op2``' is treated as an unsigned value. |
| 3905 | |
| 3906 | Semantics: |
| 3907 | """""""""" |
| 3908 | |
| 3909 | This instruction always performs an arithmetic shift right operation. |
| 3910 | The most significant bits of the result will be filled with the sign bit |
| 3911 | of ``op1``. If ``op2`` is (statically or dynamically) equal to or larger |
| 3912 | than the number of bits in ``op1``, the result is undefined. If the |
| 3913 | arguments are vectors, each vector element of ``op1`` is shifted by the |
| 3914 | corresponding shift amount in ``op2``. |
| 3915 | |
| 3916 | If the ``exact`` keyword is present, the result value of the ``ashr`` is |
| 3917 | a :ref:`poison value <poisonvalues>` if any of the bits shifted out are |
| 3918 | non-zero. |
| 3919 | |
| 3920 | Example: |
| 3921 | """""""" |
| 3922 | |
| 3923 | .. code-block:: llvm |
| 3924 | |
| 3925 | <result> = ashr i32 4, 1 ; yields {i32}:result = 2 |
| 3926 | <result> = ashr i32 4, 2 ; yields {i32}:result = 1 |
| 3927 | <result> = ashr i8 4, 3 ; yields {i8}:result = 0 |
| 3928 | <result> = ashr i8 -2, 1 ; yields {i8}:result = -1 |
| 3929 | <result> = ashr i32 1, 32 ; undefined |
| 3930 | <result> = ashr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 3> ; yields: result=<2 x i32> < i32 -1, i32 0> |
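| | |
| | For contrast with '``lshr``', and to show the ``exact`` rule, the following |
| | sketch shifts the same ``i8`` bit patterns both ways. The value names are |
| | placeholders: |
| | |
| | .. code-block:: llvm |
| | |
| |    %l  = lshr i8 -8, 2          ; yields i8 62 (0xF8 -> 0x3E, zero fill) |
| |    %a  = ashr i8 -8, 2          ; yields i8 -2 (0xF8 -> 0xFE, sign fill) |
| |    %e0 = ashr exact i8 -8, 2    ; yields i8 -2: both shifted-out bits are zero |
| |    %e1 = ashr exact i8 -7, 1    ; poison value: a one bit is shifted out |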
| 3931 | |
| 3932 | '``and``' Instruction |
| 3933 | ^^^^^^^^^^^^^^^^^^^^^ |
| 3934 | |
| 3935 | Syntax: |
| 3936 | """"""" |
| 3937 | |
| 3938 | :: |
| 3939 | |
| 3940 | <result> = and <ty> <op1>, <op2> ; yields {ty}:result |
| 3941 | |
| 3942 | Overview: |
| 3943 | """"""""" |
| 3944 | |
| 3945 | The '``and``' instruction returns the bitwise logical and of its two |
| 3946 | operands. |
| 3947 | |
| 3948 | Arguments: |
| 3949 | """""""""" |
| 3950 | |
| 3951 | The two arguments to the '``and``' instruction must be |
| 3952 | :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both |
| 3953 | arguments must have identical types. |
| 3954 | |
| 3955 | Semantics: |
| 3956 | """""""""" |
| 3957 | |
| 3958 | The truth table used for the '``and``' instruction is: |
| 3959 | |
| 3960 | +-----+-----+-----+ |
| 3961 | | In0 | In1 | Out | |
| 3962 | +-----+-----+-----+ |
| 3963 | | 0 | 0 | 0 | |
| 3964 | +-----+-----+-----+ |
| 3965 | | 0 | 1 | 0 | |
| 3966 | +-----+-----+-----+ |
| 3967 | | 1 | 0 | 0 | |
| 3968 | +-----+-----+-----+ |
| 3969 | | 1 | 1 | 1 | |
| 3970 | +-----+-----+-----+ |
| 3971 | |
| 3972 | Example: |
| 3973 | """""""" |
| 3974 | |
| 3975 | .. code-block:: llvm |
| 3976 | |
| 3977 | <result> = and i32 4, %var ; yields {i32}:result = 4 & %var |
| 3978 | <result> = and i32 15, 40 ; yields {i32}:result = 8 |
| 3979 | <result> = and i32 4, 8 ; yields {i32}:result = 0 |
| 3980 | |
| 3981 | '``or``' Instruction |
| 3982 | ^^^^^^^^^^^^^^^^^^^^ |
| 3983 | |
| 3984 | Syntax: |
| 3985 | """"""" |
| 3986 | |
| 3987 | :: |
| 3988 | |
| 3989 | <result> = or <ty> <op1>, <op2> ; yields {ty}:result |
| 3990 | |
| 3991 | Overview: |
| 3992 | """"""""" |
| 3993 | |
| 3994 | The '``or``' instruction returns the bitwise logical inclusive or of its |
| 3995 | two operands. |
| 3996 | |
| 3997 | Arguments: |
| 3998 | """""""""" |
| 3999 | |
| 4000 | The two arguments to the '``or``' instruction must be |
| 4001 | :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both |
| 4002 | arguments must have identical types. |
| 4003 | |
| 4004 | Semantics: |
| 4005 | """""""""" |
| 4006 | |
| 4007 | The truth table used for the '``or``' instruction is: |
| 4008 | |
| 4009 | +-----+-----+-----+ |
| 4010 | | In0 | In1 | Out | |
| 4011 | +-----+-----+-----+ |
| 4012 | | 0 | 0 | 0 | |
| 4013 | +-----+-----+-----+ |
| 4014 | | 0 | 1 | 1 | |
| 4015 | +-----+-----+-----+ |
| 4016 | | 1 | 0 | 1 | |
| 4017 | +-----+-----+-----+ |
| 4018 | | 1 | 1 | 1 | |
| 4019 | +-----+-----+-----+ |
| 4020 | |
| 4021 | Example: |
| 4022 | """""""" |
| 4023 | |
| 4024 | .. code-block:: llvm |
| 4025 | |
| 4026 | <result> = or i32 4, %var ; yields {i32}:result = 4 | %var |
| 4027 | <result> = or i32 15, 40 ; yields {i32}:result = 47 |
| 4028 | <result> = or i32 4, 8 ; yields {i32}:result = 12 |
| 4029 | |
| 4030 | '``xor``' Instruction |
| 4031 | ^^^^^^^^^^^^^^^^^^^^^ |
| 4032 | |
| 4033 | Syntax: |
| 4034 | """"""" |
| 4035 | |
| 4036 | :: |
| 4037 | |
| 4038 | <result> = xor <ty> <op1>, <op2> ; yields {ty}:result |
| 4039 | |
| 4040 | Overview: |
| 4041 | """"""""" |
| 4042 | |
| 4043 | The '``xor``' instruction returns the bitwise logical exclusive or of |
| 4044 | its two operands. The ``xor`` is used to implement the "one's |
| 4045 | complement" operation, which is the "~" operator in C. |
| 4046 | |
| 4047 | Arguments: |
| 4048 | """""""""" |
| 4049 | |
| 4050 | The two arguments to the '``xor``' instruction must be |
| 4051 | :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both |
| 4052 | arguments must have identical types. |
| 4053 | |
| 4054 | Semantics: |
| 4055 | """""""""" |
| 4056 | |
| 4057 | The truth table used for the '``xor``' instruction is: |
| 4058 | |
| 4059 | +-----+-----+-----+ |
| 4060 | | In0 | In1 | Out | |
| 4061 | +-----+-----+-----+ |
| 4062 | | 0 | 0 | 0 | |
| 4063 | +-----+-----+-----+ |
| 4064 | | 0 | 1 | 1 | |
| 4065 | +-----+-----+-----+ |
| 4066 | | 1 | 0 | 1 | |
| 4067 | +-----+-----+-----+ |
| 4068 | | 1 | 1 | 0 | |
| 4069 | +-----+-----+-----+ |
| 4070 | |
| 4071 | Example: |
| 4072 | """""""" |
| 4073 | |
| 4074 | .. code-block:: llvm |
| 4075 | |
| 4076 | <result> = xor i32 4, %var ; yields {i32}:result = 4 ^ %var |
| 4077 | <result> = xor i32 15, 40 ; yields {i32}:result = 39 |
| 4078 | <result> = xor i32 4, 8 ; yields {i32}:result = 12 |
| 4079 | <result> = xor i32 %V, -1 ; yields {i32}:result = ~%V |
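
The '``and``', '``or``' and '``xor``' instructions also accept vectors of
integers, in which case the truth tables above are applied bit by bit to
each pair of corresponding elements. The following is a small sketch;
the function name ``@vector_bitops`` is hypothetical.

.. code-block:: llvm

   define <2 x i32> @vector_bitops() {
   entry:
     %a = and <2 x i32> <i32 12, i32 10>, <i32 10, i32 6>  ; <i32 8, i32 2>
     %o = or <2 x i32> <i32 12, i32 10>, <i32 10, i32 6>   ; <i32 14, i32 14>
     %x = xor <2 x i32> <i32 12, i32 10>, <i32 10, i32 6>  ; <i32 6, i32 12>
     ret <2 x i32> %x
   }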
| 4080 | |
| 4081 | Vector Operations |
| 4082 | ----------------- |
| 4083 | |
| 4084 | LLVM supports several instructions to represent vector operations in a |
| 4085 | target-independent manner. These instructions cover the element-access |
| 4086 | and vector-specific operations needed to process vectors effectively. |
| 4087 | While LLVM does directly support these vector operations, many |
| 4088 | sophisticated algorithms will want to use target-specific intrinsics to |
| 4089 | take full advantage of a specific target. |
| 4090 | |
| 4091 | .. _i_extractelement: |
| 4092 | |
| 4093 | '``extractelement``' Instruction |
| 4094 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 4095 | |
| 4096 | Syntax: |
| 4097 | """"""" |
| 4098 | |
| 4099 | :: |
| 4100 | |
| 4101 | <result> = extractelement <n x <ty>> <val>, i32 <idx> ; yields <ty> |
| 4102 | |
| 4103 | Overview: |
| 4104 | """"""""" |
| 4105 | |
| 4106 | The '``extractelement``' instruction extracts a single scalar element |
| 4107 | from a vector at a specified index. |
| 4108 | |
| 4109 | Arguments: |
| 4110 | """""""""" |
| 4111 | |
| 4112 | The first operand of an '``extractelement``' instruction is a value of |
| 4113 | :ref:`vector <t_vector>` type. The second operand is an index indicating |
| 4114 | the position from which to extract the element. The index may be a |
| 4115 | variable. |
| 4116 | |
| 4117 | Semantics: |
| 4118 | """""""""" |
| 4119 | |
| 4120 | The result is a scalar of the same type as the element type of ``val``. |
| 4121 | Its value is the value at position ``idx`` of ``val``. If ``idx`` is not |
| 4122 | less than the length of ``val``, the results are undefined. |
| 4123 | |
| 4124 | Example: |
| 4125 | """""""" |
| 4126 | |
| 4127 | .. code-block:: llvm |
| 4128 | |
| 4129 | <result> = extractelement <4 x i32> %vec, i32 0 ; yields i32 |
| 4130 | |
| 4131 | .. _i_insertelement: |
| 4132 | |
| 4133 | '``insertelement``' Instruction |
| 4134 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 4135 | |
| 4136 | Syntax: |
| 4137 | """"""" |
| 4138 | |
| 4139 | :: |
| 4140 | |
| 4141 | <result> = insertelement <n x <ty>> <val>, <ty> <elt>, i32 <idx> ; yields <n x <ty>> |
| 4142 | |
| 4143 | Overview: |
| 4144 | """"""""" |
| 4145 | |
| 4146 | The '``insertelement``' instruction inserts a scalar element into a |
| 4147 | vector at a specified index. |
| 4148 | |
| 4149 | Arguments: |
| 4150 | """""""""" |
| 4151 | |
| 4152 | The first operand of an '``insertelement``' instruction is a value of |
| 4153 | :ref:`vector <t_vector>` type. The second operand is a scalar value whose |
| 4154 | type must equal the element type of the first operand. The third operand |
| 4155 | is an index indicating the position at which to insert the value. The |
| 4156 | index may be a variable. |
| 4157 | |
| 4158 | Semantics: |
| 4159 | """""""""" |
| 4160 | |
| 4161 | The result is a vector of the same type as ``val``. Its element values |
| 4162 | are those of ``val`` except at position ``idx``, where it gets the value |
| 4163 | ``elt``. If ``idx`` is not less than the length of ``val``, the results are |
| 4164 | undefined. |
| 4165 | |
| 4166 | Example: |
| 4167 | """""""" |
| 4168 | |
| 4169 | .. code-block:: llvm |
| 4170 | |
| 4171 | <result> = insertelement <4 x i32> %vec, i32 1, i32 0 ; yields <4 x i32> |
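
A common pattern is to build a vector one element at a time with
'``insertelement``', starting from ``undef``, and to read a lane back
with '``extractelement``'. The sketch below assumes a hypothetical
function ``@second_lane``; it is an illustration, not a required idiom.

.. code-block:: llvm

   define i32 @second_lane(i32 %a, i32 %b) {
   entry:
     ; Start from an undefined vector and fill both lanes.
     %v0 = insertelement <2 x i32> undef, i32 %a, i32 0
     %v1 = insertelement <2 x i32> %v0, i32 %b, i32 1
     ; Lane 1 now holds %b.
     %r = extractelement <2 x i32> %v1, i32 1
     ret i32 %r
   }

Each '``insertelement``' yields a new vector value; ``%v0`` itself is not
modified by the second insertion.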
| 4172 | |
| 4173 | .. _i_shufflevector: |
| 4174 | |
| 4175 | '``shufflevector``' Instruction |
| 4176 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 4177 | |
| 4178 | Syntax: |
| 4179 | """"""" |
| 4180 | |
| 4181 | :: |
| 4182 | |
| 4183 | <result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <m x i32> <mask> ; yields <m x <ty>> |
| 4184 | |
| 4185 | Overview: |
| 4186 | """"""""" |
| 4187 | |
| 4188 | The '``shufflevector``' instruction constructs a permutation of elements |
| 4189 | from two input vectors, returning a vector with the same element type as |
| 4190 | the inputs and the same length as the shuffle mask. |
| 4191 | |
| 4192 | Arguments: |
| 4193 | """""""""" |
| 4194 | |
| 4195 | The first two operands of a '``shufflevector``' instruction are vectors |
| 4196 | with the same type. The third argument is a shuffle mask whose element |
| 4197 | type is always 'i32'. The result of the instruction is a vector whose |
| 4198 | length is the same as the shuffle mask and whose element type is the |
| 4199 | same as the element type of the first two operands. |
| 4200 | |
| 4201 | The shuffle mask operand is required to be a constant vector with either |
| 4202 | constant integer or undef values. |
| 4203 | |
| 4204 | Semantics: |
| 4205 | """""""""" |
| 4206 | |
| 4207 | The elements of the two input vectors are numbered from left to right |
| 4208 | across both of the vectors. The shuffle mask operand specifies, for each |
| 4209 | element of the result vector, which element of the two input vectors the |
| 4210 | result element gets. The element selector may be undef (meaning "don't |
| 4211 | care") and the second operand may be undef if performing a shuffle from |
| 4212 | only one vector. |
| 4213 | |
| 4214 | Example: |
| 4215 | """""""" |
| 4216 | |
| 4217 | .. code-block:: llvm |
| 4218 | |
| 4219 | <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2, |
| 4220 | <4 x i32> <i32 0, i32 4, i32 1, i32 5> ; yields <4 x i32> |
| 4221 | <result> = shufflevector <4 x i32> %v1, <4 x i32> undef, |
| 4222 | <4 x i32> <i32 0, i32 1, i32 2, i32 3> ; yields <4 x i32> - Identity shuffle. |
| 4223 | <result> = shufflevector <8 x i32> %v1, <8 x i32> undef, |
| 4224 | <4 x i32> <i32 0, i32 1, i32 2, i32 3> ; yields <4 x i32> |
| 4225 | <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2, |
| 4226 | <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7 > ; yields <8 x i32> |
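
To make the element numbering concrete: with two ``<4 x i32>`` inputs,
mask values 0 through 3 select elements of the first vector and mask
values 4 through 7 select elements of the second. The following sketch
uses two hypothetical functions, ``@reverse`` and ``@low_halves``.

.. code-block:: llvm

   define <4 x i32> @reverse(<4 x i32> %v) {
   entry:
     ; Only the first input is used, so the second may be undef.
     %r = shufflevector <4 x i32> %v, <4 x i32> undef,
                        <4 x i32> <i32 3, i32 2, i32 1, i32 0>
     ret <4 x i32> %r
   }

   define <4 x i32> @low_halves(<4 x i32> %a, <4 x i32> %b) {
   entry:
     ; Result is < a[0], a[1], b[0], b[1] >.
     %r = shufflevector <4 x i32> %a, <4 x i32> %b,
                        <4 x i32> <i32 0, i32 1, i32 4, i32 5>
     ret <4 x i32> %r
   }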
| 4227 | |
| 4228 | Aggregate Operations |
| 4229 | -------------------- |
| 4230 | |
| 4231 | LLVM supports several instructions for working with |
| 4232 | :ref:`aggregate <t_aggregate>` values. |
| 4233 | |
| 4234 | .. _i_extractvalue: |
| 4235 | |
| 4236 | '``extractvalue``' Instruction |
| 4237 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 4238 | |
| 4239 | Syntax: |
| 4240 | """"""" |
| 4241 | |
| 4242 | :: |
| 4243 | |
| 4244 | <result> = extractvalue <aggregate type> <val>, <idx>{, <idx>}* |
| 4245 | |
| 4246 | Overview: |
| 4247 | """"""""" |
| 4248 | |
| 4249 | The '``extractvalue``' instruction extracts the value of a member field |
| 4250 | from an :ref:`aggregate <t_aggregate>` value. |
| 4251 | |
| 4252 | Arguments: |
| 4253 | """""""""" |
| 4254 | |
| 4255 | The first operand of an '``extractvalue``' instruction is a value of |
| 4256 | :ref:`struct <t_struct>` or :ref:`array <t_array>` type. The other operands are |
| 4257 | constant indices to specify which value to extract in a similar manner |
| 4258 | as indices in a '``getelementptr``' instruction. |
| 4259 | |
| 4260 | The major differences to ``getelementptr`` indexing are: |
| 4261 | |
| 4262 | - Since the value being indexed is not a pointer, the first index is |
| 4263 | omitted and assumed to be zero. |
| 4264 | - At least one index must be specified. |
| 4265 | - Not only struct indices but also array indices must be in bounds. |
| 4266 | |
| 4267 | Semantics: |
| 4268 | """""""""" |
| 4269 | |
| 4270 | The result is the value at the position in the aggregate specified by |
| 4271 | the index operands. |
| 4272 | |
| 4273 | Example: |
| 4274 | """""""" |
| 4275 | |
| 4276 | .. code-block:: llvm |
| 4277 | |
| 4278 | <result> = extractvalue {i32, float} %agg, 0 ; yields i32 |
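
Several indices may be given to reach into nested aggregates. In the
sketch below (the function name ``@first_float`` is hypothetical), the
first index selects the array member of the struct and the second
selects an element of that array.

.. code-block:: llvm

   define float @first_float({i32, [2 x float]} %agg) {
   entry:
     ; Index 1 picks the [2 x float] member, index 0 its first element.
     %f = extractvalue {i32, [2 x float]} %agg, 1, 0
     ret float %f
   }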
| 4279 | |
| 4280 | .. _i_insertvalue: |
| 4281 | |
| 4282 | '``insertvalue``' Instruction |
| 4283 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 4284 | |
| 4285 | Syntax: |
| 4286 | """"""" |
| 4287 | |
| 4288 | :: |
| 4289 | |
| 4290 | <result> = insertvalue <aggregate type> <val>, <ty> <elt>, <idx>{, <idx>}* ; yields <aggregate type> |
| 4291 | |
| 4292 | Overview: |
| 4293 | """"""""" |
| 4294 | |
| 4295 | The '``insertvalue``' instruction inserts a value into a member field in |
| 4296 | an :ref:`aggregate <t_aggregate>` value. |
| 4297 | |
| 4298 | Arguments: |
| 4299 | """""""""" |
| 4300 | |
| 4301 | The first operand of an '``insertvalue``' instruction is a value of |
| 4302 | :ref:`struct <t_struct>` or :ref:`array <t_array>` type. The second operand is |
| 4303 | a first-class value to insert. The following operands are constant |
| 4304 | indices indicating the position at which to insert the value in a |
| 4305 | similar manner as indices in an '``extractvalue``' instruction. The value |
| 4306 | to insert must have the same type as the value identified by the |
| 4307 | indices. |
| 4308 | |
| 4309 | Semantics: |
| 4310 | """""""""" |
| 4311 | |
| 4312 | The result is an aggregate of the same type as ``val``. Its value is |
| 4313 | that of ``val`` except that the value at the position specified by the |
| 4314 | indices is that of ``elt``. |
| 4315 | |
| 4316 | Example: |
| 4317 | """""""" |
| 4318 | |
| 4319 | .. code-block:: llvm |
| 4320 | |
| 4321 | %agg1 = insertvalue {i32, float} undef, i32 1, 0 ; yields {i32 1, float undef} |
| 4322 | %agg2 = insertvalue {i32, float} %agg1, float %val, 1 ; yields {i32 1, float %val} |
| 4323 | %agg3 = insertvalue {i32, {float}} undef, float %val, 1, 0 ; yields {i32 undef, {float %val}} |
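
One typical use, sketched here under the assumption of a hypothetical
function ``@make_pair``, is to assemble a struct that is returned by
value, filling one member per instruction.

.. code-block:: llvm

   define {i32, i1} @make_pair(i32 %x, i1 %flag) {
   entry:
     ; Each insertvalue produces a new aggregate; the operand is unchanged.
     %p0 = insertvalue {i32, i1} undef, i32 %x, 0
     %p1 = insertvalue {i32, i1} %p0, i1 %flag, 1
     ret {i32, i1} %p1
   }

A caller could then recover the members with
``extractvalue {i32, i1} %pair, 0`` and ``extractvalue {i32, i1} %pair, 1``.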
| 4324 | |
| 4325 | .. _memoryops: |
| 4326 | |
| 4327 | Memory Access and Addressing Operations |
| 4328 | --------------------------------------- |
| 4329 | |
| 4330 | A key design point of an SSA-based representation is how it represents |
| 4331 | memory. In LLVM, no memory locations are in SSA form, which makes things |
| 4332 | very simple. This section describes how to read, write, and allocate |
| 4333 | memory in LLVM. |
| 4334 | |
| 4335 | .. _i_alloca: |
| 4336 | |
| 4337 | '``alloca``' Instruction |
| 4338 | ^^^^^^^^^^^^^^^^^^^^^^^^ |
| 4339 | |
| 4340 | Syntax: |
| 4341 | """"""" |
| 4342 | |
| 4343 | :: |
| 4344 | |
| 4345 | <result> = alloca <type>[, <ty> <NumElements>][, align <alignment>] ; yields {type*}:result |
| 4346 | |
| 4347 | Overview: |
| 4348 | """"""""" |
| 4349 | |
| 4350 | The '``alloca``' instruction allocates memory on the stack frame of the |
| 4351 | currently executing function, to be automatically released when this |
| 4352 | function returns to its caller. The object is always allocated in the |
| 4353 | generic address space (address space zero). |
| 4354 | |
| 4355 | Arguments: |
| 4356 | """""""""" |
| 4357 | |
| 4358 | The '``alloca``' instruction allocates ``sizeof(<type>)*NumElements`` |
| 4359 | bytes of memory on the runtime stack, returning a pointer of the |
| 4360 | appropriate type to the program. If ``NumElements`` is specified, it is |
| 4361 | the number of elements allocated; otherwise ``NumElements`` defaults |
| 4362 | to one. If a constant alignment is specified, the result of the |
| 4363 | allocation is guaranteed to be aligned to at least that boundary. If not |
| 4364 | specified, or if zero, the target can choose to align the allocation on |
| 4365 | any convenient boundary compatible with the type. |
| 4366 | |
| 4367 | '``type``' may be any sized type. |
| 4368 | |
| 4369 | Semantics: |
| 4370 | """""""""" |
| 4371 | |
| 4372 | Memory is allocated; a pointer is returned. The operation is undefined |
| 4373 | if there is insufficient stack space for the allocation. The |
| 4374 | '``alloca``' instruction is commonly used to represent automatic |
| 4375 | variables that must have an address available. '``alloca``'d memory is |
| 4376 | automatically released when the function returns (either with the |
| 4377 | ``ret`` or ``resume`` instructions). Allocating zero bytes is legal, |
| 4378 | but the result is undefined. The order in which memory is allocated |
| 4379 | (i.e., which way the stack grows) is not |
| 4380 | specified. |
| 4381 | |
| 4382 | Example: |
| 4383 | """""""" |
| 4384 | |
| 4385 | .. code-block:: llvm |
| 4386 | |
| 4387 | %ptr = alloca i32 ; yields {i32*}:ptr |
| 4388 | %ptr = alloca i32, i32 4 ; yields {i32*}:ptr |
| 4389 | %ptr = alloca i32, i32 4, align 1024 ; yields {i32*}:ptr |
| 4390 | %ptr = alloca i32, align 1024 ; yields {i32*}:ptr |
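
As an illustration of how an '``alloca``' is typically combined with
other memory instructions, the following sketch allocates room for two
``i32`` values, fills them, and reads them back. The function name
``@sum_of_two`` is hypothetical, and the instruction syntax follows the
forms described in this document.

.. code-block:: llvm

   define i32 @sum_of_two(i32 %a, i32 %b) {
   entry:
     ; Room for two i32 values, released automatically on return.
     %buf = alloca i32, i32 2, align 4
     %p0 = getelementptr i32* %buf, i32 0
     %p1 = getelementptr i32* %buf, i32 1
     store i32 %a, i32* %p0
     store i32 %b, i32* %p1
     %x = load i32* %p0
     %y = load i32* %p1
     %s = add i32 %x, %y
     ret i32 %s
   }

The pointer ``%buf`` must not be used after ``@sum_of_two`` returns,
since the memory has been reclaimed by then.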
| 4391 | |
| 4392 | .. _i_load: |
| 4393 | |
| 4394 | '``load``' Instruction |
| 4395 | ^^^^^^^^^^^^^^^^^^^^^^ |
| 4396 | |
| 4397 | Syntax: |
| 4398 | """"""" |
| 4399 | |
| 4400 | :: |
| 4401 | |
| 4402 | <result> = load [volatile] <ty>* <pointer>[, align <alignment>][, !nontemporal !<index>][, !invariant.load !<index>] |
| 4403 | <result> = load atomic [volatile] <ty>* <pointer> [singlethread] <ordering>, align <alignment> |
| 4404 | !<index> = !{ i32 1 } |
| 4405 | |
| 4406 | Overview: |
| 4407 | """"""""" |
| 4408 | |
| 4409 | The '``load``' instruction is used to read from memory. |
| 4410 | |
| 4411 | Arguments: |
| 4412 | """""""""" |
| 4413 | |
| 4414 | The argument to the '``load``' instruction specifies the memory address |
| 4415 | from which to load. The pointer must point to a :ref:`first |
| 4416 | class <t_firstclass>` type. If the ``load`` is marked as ``volatile``, |
| 4417 | then the optimizer is not allowed to modify the number or order of |
| 4418 | execution of this ``load`` with other :ref:`volatile |
| 4419 | operations <volatile>`. |
| 4420 | |
| 4421 | If the ``load`` is marked as ``atomic``, it takes an extra |
| 4422 | :ref:`ordering <ordering>` and optional ``singlethread`` argument. The |
| 4423 | ``release`` and ``acq_rel`` orderings are not valid on ``load`` |
| 4424 | instructions. Atomic loads produce :ref:`defined <memmodel>` results |
| 4425 | when they may see multiple atomic stores. The type of the pointee must |
| 4426 | be an integer type whose bit width is a power of two greater than or |
| 4427 | equal to eight and less than or equal to a target-specific size limit. |
| 4428 | ``align`` must be explicitly specified on atomic loads, and the load has |
| 4429 | undefined behavior if the alignment is not set to a value which is at |
| 4430 | least the size in bytes of the pointee. ``!nontemporal`` does not have |
| 4431 | any defined semantics for atomic loads. |
| 4432 | |
| 4433 | The optional constant ``align`` argument specifies the alignment of the |
| 4434 | operation (that is, the alignment of the memory address). A value of 0 |
| 4435 | or an omitted ``align`` argument means that the operation has the ABI |
| 4436 | alignment for the target. It is the responsibility of the code emitter |
| 4437 | to ensure that the alignment information is correct. Overestimating the |
| 4438 | alignment results in undefined behavior. Underestimating the alignment |
| 4439 | may produce less efficient code. An alignment of 1 is always safe. |
| 4440 | |
| 4441 | The optional ``!nontemporal`` metadata must reference a single |
| 4442 | metadata name ``<index>`` corresponding to a metadata node with one |
| 4443 | ``i32`` entry of value 1. The existence of the ``!nontemporal`` |
| 4444 | metadata on the instruction tells the optimizer and code generator |
| 4445 | that this load is not expected to be reused in the cache. The code |
| 4446 | generator may select special instructions to save cache bandwidth, such |
| 4447 | as the ``MOVNT`` instruction on x86. |
| 4448 | |
| 4449 | The optional ``!invariant.load`` metadata must reference a single |
| 4450 | metadata name ``<index>`` corresponding to a metadata node with no |
| 4451 | entries. The existence of the ``!invariant.load`` metadata on the |
| 4452 | instruction tells the optimizer and code generator that the address |
| 4453 | loaded from points to memory whose value does not change during program |
| 4454 | execution. The optimizer may then move this load around, for example, by |
| 4455 | hoisting it out of loops using loop invariant code motion. |
| 4456 | |
| 4457 | Semantics: |
| 4458 | """""""""" |
| 4459 | |
| 4460 | The location of memory pointed to is loaded. If the value being loaded |
| 4461 | is of scalar type then the number of bytes read does not exceed the |
| 4462 | minimum number of bytes needed to hold all bits of the type. For |
| 4463 | example, loading an ``i24`` reads at most three bytes. When loading a |
| 4464 | value of a type like ``i20`` with a size that is not an integral number |
| 4465 | of bytes, the result is undefined if the value was not originally |
| 4466 | written using a store of the same type. |
| 4467 | |
| 4468 | Examples: |
| 4469 | """"""""" |
| 4470 | |
| 4471 | .. code-block:: llvm |
| 4472 | |
| 4473 | %ptr = alloca i32 ; yields {i32*}:ptr |
| 4474 | store i32 3, i32* %ptr ; yields {void} |
| 4475 | %val = load i32* %ptr ; yields {i32}:val = i32 3 |
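
The optional parts of the syntax can be sketched as follows. The
function name ``@load_forms`` is hypothetical, and the example assumes
that the caller passes a pointer that really is 4-byte aligned, since
overestimating the alignment is undefined behavior.

.. code-block:: llvm

   define i32 @load_forms(i32* %p) {
   entry:
     ; Plain load with an explicit alignment.
     %plain = load i32* %p, align 4
     ; Volatile load: not reordered with or merged into other volatile operations.
     %vol = load volatile i32* %p, align 4
     ; Atomic load: an ordering and an explicit alignment are both required.
     %acq = load atomic i32* %p acquire, align 4
     %t = add i32 %plain, %vol
     %sum = add i32 %t, %acq
     ret i32 %sum
   }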
| 4476 | |
| 4477 | .. _i_store: |
| 4478 | |
| 4479 | '``store``' Instruction |
| 4480 | ^^^^^^^^^^^^^^^^^^^^^^^ |
| 4481 | |
| 4482 | Syntax: |
| 4483 | """"""" |
| 4484 | |
| 4485 | :: |
| 4486 | |
| 4487 | store [volatile] <ty> <value>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<index>] ; yields {void} |
| 4488 | store atomic [volatile] <ty> <value>, <ty>* <pointer> [singlethread] <ordering>, align <alignment> ; yields {void} |
| 4489 | |
| 4490 | Overview: |
| 4491 | """"""""" |
| 4492 | |
| 4493 | The '``store``' instruction is used to write to memory. |
| 4494 | |
| 4495 | Arguments: |
| 4496 | """""""""" |
| 4497 | |
| 4498 | There are two arguments to the '``store``' instruction: a value to store |
| 4499 | and an address at which to store it. The type of the '``<pointer>``' |
| 4500 | operand must be a pointer to the :ref:`first class <t_firstclass>` type of |
| 4501 | the '``<value>``' operand. If the ``store`` is marked as ``volatile``, |
| 4502 | then the optimizer is not allowed to modify the number or order of |
| 4503 | execution of this ``store`` with other :ref:`volatile |
| 4504 | operations <volatile>`. |
| 4505 | |
| 4506 | If the ``store`` is marked as ``atomic``, it takes an extra |
| 4507 | :ref:`ordering <ordering>` and optional ``singlethread`` argument. The |
| 4508 | ``acquire`` and ``acq_rel`` orderings aren't valid on ``store`` |
| 4509 | instructions. Atomic loads produce :ref:`defined <memmodel>` results |
| 4510 | when they may see multiple atomic stores. The type of the pointee must |
| 4511 | be an integer type whose bit width is a power of two greater than or |
| 4512 | equal to eight and less than or equal to a target-specific size limit. |
| 4513 | ``align`` must be explicitly specified on atomic stores, and the store |
| 4514 | has undefined behavior if the alignment is not set to a value which is |
| 4515 | at least the size in bytes of the pointee. ``!nontemporal`` does not |
| 4516 | have any defined semantics for atomic stores. |
| 4517 | |
| 4518 | The optional constant ``align`` argument specifies the alignment of the |
| 4519 | operation (that is, the alignment of the memory address). A value of 0 |
| 4520 | or an omitted ``align`` argument means that the operation has the ABI |
| 4521 | alignment for the target. It is the responsibility of the code emitter |
| 4522 | to ensure that the alignment information is correct. Overestimating the |
| 4523 | alignment results in undefined behavior. Underestimating the |
| 4524 | alignment may produce less efficient code. An alignment of 1 is always |
| 4525 | safe. |
| 4526 | |
| 4527 | The optional ``!nontemporal`` metadata must reference a single metadata |
| 4528 | name ``<index>`` corresponding to a metadata node with one ``i32`` entry of |
| 4529 | value 1. The existence of the ``!nontemporal`` metadata on the instruction |
| 4530 | tells the optimizer and code generator that this store is not expected to |
| 4531 | be reused in the cache. The code generator may select special |
| 4532 | instructions to save cache bandwidth, such as the ``MOVNT`` instruction on |
| 4533 | x86. |
| 4534 | |
| 4535 | Semantics: |
| 4536 | """""""""" |
| 4537 | |
| 4538 | The contents of memory are updated to contain '``<value>``' at the |
| 4539 | location specified by the '``<pointer>``' operand. If '``<value>``' is |
| 4540 | of scalar type then the number of bytes written does not exceed the |
| 4541 | minimum number of bytes needed to hold all bits of the type. For |
| 4542 | example, storing an ``i24`` writes at most three bytes. When writing a |
| 4543 | value of a type like ``i20`` with a size that is not an integral number |
| 4544 | of bytes, it is unspecified what happens to the extra bits that do not |
| 4545 | belong to the type, but they will typically be overwritten. |
| 4546 | |
| 4547 | Example: |
| 4548 | """""""" |
| 4549 | |
| 4550 | .. code-block:: llvm |
| 4551 | |
| 4552 | %ptr = alloca i32 ; yields {i32*}:ptr |
| 4553 | store i32 3, i32* %ptr ; yields {void} |
| 4554 | %val = load i32* %ptr ; yields {i32}:val = i32 3 |
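
The variants of '``store``' can be sketched in the same way as for
'``load``'. The function name ``@store_forms`` is hypothetical, and the
pointer is assumed to be 4-byte aligned.

.. code-block:: llvm

   define void @store_forms(i32* %p, i32 %v) {
   entry:
     ; Plain store with an explicit alignment.
     store i32 %v, i32* %p, align 4
     ; Volatile store: not reordered with or merged into other volatile operations.
     store volatile i32 %v, i32* %p, align 4
     ; Atomic store: acquire and acq_rel are not valid orderings here.
     store atomic i32 %v, i32* %p release, align 4
     ret void
   }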
| 4555 | |
| 4556 | .. _i_fence: |
| 4557 | |
| 4558 | '``fence``' Instruction |
| 4559 | ^^^^^^^^^^^^^^^^^^^^^^^ |
| 4560 | |
| 4561 | Syntax: |
| 4562 | """"""" |
| 4563 | |
| 4564 | :: |
| 4565 | |
| 4566 | fence [singlethread] <ordering> ; yields {void} |
| 4567 | |
| 4568 | Overview: |
| 4569 | """"""""" |
| 4570 | |
| 4571 | The '``fence``' instruction is used to introduce happens-before edges |
| 4572 | between operations. |
| 4573 | |
| 4574 | Arguments: |
| 4575 | """""""""" |
| 4576 | |
| 4577 | '``fence``' instructions take an :ref:`ordering <ordering>` argument which |
| 4578 | defines what *synchronizes-with* edges they add. They can only be given |
| 4579 | ``acquire``, ``release``, ``acq_rel``, and ``seq_cst`` orderings. |
| 4580 | |
| 4581 | Semantics: |
| 4582 | """""""""" |
| 4583 | |
| 4584 | A fence A which has (at least) ``release`` ordering semantics |
| 4585 | *synchronizes with* a fence B with (at least) ``acquire`` ordering |
| 4586 | semantics if and only if there exist atomic operations X and Y, both |
| 4587 | operating on some atomic object M, such that A is sequenced before X, X |
| 4588 | modifies M (either directly or through some side effect of a sequence |
| 4589 | headed by X), Y is sequenced before B, and Y observes M. This provides a |
| 4590 | *happens-before* dependency between A and B. Rather than an explicit |
| 4591 | ``fence``, one (but not both) of the atomic operations X or Y might |
| 4592 | provide a ``release`` or ``acquire`` (resp.) ordering constraint and |
| 4593 | still *synchronize-with* the explicit ``fence`` and establish the |
| 4594 | *happens-before* edge. |
| 4595 | |
| 4596 | A ``fence`` which has ``seq_cst`` ordering, in addition to having both |
| 4597 | ``acquire`` and ``release`` semantics specified above, participates in |
| 4598 | the global program order of other ``seq_cst`` operations and/or fences. |
| 4599 | |
| 4600 | The optional ":ref:`singlethread <singlethread>`" argument specifies |
| 4601 | that the fence only synchronizes with other fences in the same thread. |
| 4602 | (This is useful for interacting with signal handlers.) |
| 4603 | |
| 4604 | Example: |
| 4605 | """""""" |
| 4606 | |
| 4607 | .. code-block:: llvm |
| 4608 | |
| 4609 | fence acquire ; yields {void} |
| 4610 | fence singlethread seq_cst ; yields {void} |
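
The synchronization rule above can be illustrated with a publish/consume
sketch. The function names ``@publish`` and ``@consume`` and the use of
the value 1 as a "ready" flag are assumptions made for this example. The
``release`` fence plays the role of A, the atomic store to ``%flag`` is
X, the atomic load of ``%flag`` is Y, and the ``acquire`` fence is B.

.. code-block:: llvm

   define void @publish(i32* %data, i32* %flag) {
   entry:
     store i32 42, i32* %data                         ; ordinary store
     fence release                                    ; fence A
     store atomic i32 1, i32* %flag monotonic, align 4  ; operation X
     ret void
   }

   define i32 @consume(i32* %data, i32* %flag) {
   entry:
     br label %spin

   spin:
     %f = load atomic i32* %flag monotonic, align 4   ; operation Y
     %ready = icmp eq i32 %f, 1
     br i1 %ready, label %done, label %spin

   done:
     fence acquire                                    ; fence B
     %v = load i32* %data                             ; ordered after the store in @publish
     ret i32 %v
   }

Once Y observes the value written by X, fence A synchronizes with fence
B, so the ordinary store to ``%data`` happens before the ordinary load
from it.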
| 4611 | |
| 4612 | .. _i_cmpxchg: |
| 4613 | |
| 4614 | '``cmpxchg``' Instruction |
| 4615 | ^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 4616 | |
| 4617 | Syntax: |
| 4618 | """"""" |
| 4619 | |
| 4620 | :: |
| 4621 | |
| 4622 | cmpxchg [volatile] <ty>* <pointer>, <ty> <cmp>, <ty> <new> [singlethread] <ordering> ; yields {ty} |
| 4623 | |
| 4624 | Overview: |
| 4625 | """"""""" |
| 4626 | |
| 4627 | The '``cmpxchg``' instruction is used to atomically modify memory. It |
| 4628 | loads a value in memory and compares it to a given value. If they are |
| 4629 | equal, it stores a new value into the memory. |
| 4630 | |
| 4631 | Arguments: |
| 4632 | """""""""" |
| 4633 | |
| 4634 | There are three arguments to the '``cmpxchg``' instruction: an address |
| 4635 | to operate on, a value to compare to the value currently at that |
| 4636 | address, and a new value to place at that address if the compared values |
| 4637 | are equal. The type of '``<cmp>``' must be an integer type whose bit width |
| 4638 | is a power of two greater than or equal to eight and less than or equal |
| 4639 | to a target-specific size limit. '``<cmp>``' and '``<new>``' must have the same |
| 4640 | type, and the type of '``<pointer>``' must be a pointer to that type. If the |
| 4641 | ``cmpxchg`` is marked as ``volatile``, then the optimizer is not allowed |
| 4642 | to modify the number or order of execution of this ``cmpxchg`` with |
| 4643 | other :ref:`volatile operations <volatile>`. |
| 4644 | |
| 4645 | The :ref:`ordering <ordering>` argument specifies how this ``cmpxchg`` |
| 4646 | synchronizes with other atomic operations. |
| 4647 | |
| 4648 | The optional "``singlethread``" argument declares that the ``cmpxchg`` |
| 4649 | is only atomic with respect to code (usually signal handlers) running in |
| 4650 | the same thread as the ``cmpxchg``. Otherwise the ``cmpxchg`` is atomic with |
| 4651 | respect to all other code in the system. |
| 4652 | |
| 4653 | The pointer passed into ``cmpxchg`` must have alignment greater than or |
| 4654 | equal to the size in memory of the operand. |
| 4655 | |
| 4656 | Semantics: |
| 4657 | """""""""" |
| 4658 | |
| 4659 | The contents of memory at the location specified by the '``<pointer>``' |
| 4660 | operand are read and compared to '``<cmp>``'; if the read value is |
| 4661 | equal, '``<new>``' is written. The original value at the location is |
| 4662 | returned. |
| 4663 | |
| 4664 | A successful ``cmpxchg`` is a read-modify-write instruction for the purpose |
| 4665 | of identifying release sequences. A failed ``cmpxchg`` is equivalent to an |
| 4666 | atomic load with an ordering parameter determined by dropping any |
| 4667 | ``release`` part of the ``cmpxchg``'s ordering. |
| 4668 | |
| 4669 | Example: |
| 4670 | """""""" |
| 4671 | |
| 4672 | .. code-block:: llvm |
| 4673 | |
| 4674 | entry: |
| 4675 | %orig = load atomic i32* %ptr unordered, align 4 ; yields {i32} |
| 4676 | br label %loop |
| 4677 | |
| 4678 | loop: |
| 4679 | %cmp = phi i32 [ %orig, %entry ], [%old, %loop] |
| 4680 | %squared = mul i32 %cmp, %cmp |
| 4681 | %old = cmpxchg i32* %ptr, i32 %cmp, i32 %squared acq_rel ; yields {i32} |
| 4682 | %success = icmp eq i32 %cmp, %old |
| 4683 | br i1 %success, label %done, label %loop |
| 4684 | |
| 4685 | done: |
| 4686 | ... |
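
A simpler, single-shot use is an attempt to take a lock. The sketch
below assumes a hypothetical convention in which the lock word is 0 when
free and 1 when held, and a hypothetical function name ``@try_lock``. It
uses the single-ordering form of the syntax given above.

.. code-block:: llvm

   define i1 @try_lock(i32* %lock) {
   entry:
     ; Write 1 only if the lock word currently holds 0.
     %old = cmpxchg i32* %lock, i32 0, i32 1 acquire
     ; The exchange succeeded exactly when the original value was 0.
     %got = icmp eq i32 %old, 0
     ret i1 %got
   }

Because the instruction returns the original value, success is detected
by comparing that value with ``<cmp>``.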
| 4687 | |
| 4688 | .. _i_atomicrmw: |
| 4689 | |
| 4690 | '``atomicrmw``' Instruction |
| 4691 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 4692 | |
| 4693 | Syntax: |
| 4694 | """"""" |
| 4695 | |
| 4696 | :: |
| 4697 | |
| 4698 | atomicrmw [volatile] <operation> <ty>* <pointer>, <ty> <value> [singlethread] <ordering> ; yields {ty} |
| 4699 | |
| 4700 | Overview: |
| 4701 | """"""""" |
| 4702 | |
| 4703 | The '``atomicrmw``' instruction is used to atomically modify memory. |
| 4704 | |
| 4705 | Arguments: |
| 4706 | """""""""" |
| 4707 | |
| 4708 | There are three arguments to the '``atomicrmw``' instruction: an |
| 4709 | operation to apply, an address whose value to modify, and an argument to |
| 4710 | the operation. The operation must be one of the following keywords: |
| 4711 | |
| 4712 | - xchg |
| 4713 | - add |
| 4714 | - sub |
| 4715 | - and |
| 4716 | - nand |
| 4717 | - or |
| 4718 | - xor |
| 4719 | - max |
| 4720 | - min |
| 4721 | - umax |
| 4722 | - umin |
| 4723 | |
| 4724 | The type of '``<value>``' must be an integer type whose bit width is a power |
| 4725 | of two greater than or equal to eight and less than or equal to a |
| 4726 | target-specific size limit. The type of the '``<pointer>``' operand must |
| 4727 | be a pointer to that type. If the ``atomicrmw`` is marked as |
| 4728 | ``volatile``, then the optimizer is not allowed to modify the number or |
| 4729 | order of execution of this ``atomicrmw`` with other :ref:`volatile |
| 4730 | operations <volatile>`. |
| 4731 | |
| 4732 | Semantics: |
| 4733 | """""""""" |
| 4734 | |
| 4735 | The contents of memory at the location specified by the '``<pointer>``' |
| 4736 | operand are atomically read, modified, and written back. The original |
| 4737 | value at the location is returned. The modification is specified by the |
| 4738 | operation argument: |
| 4739 | |
| 4740 | - xchg: ``*ptr = val`` |
| 4741 | - add: ``*ptr = *ptr + val`` |
| 4742 | - sub: ``*ptr = *ptr - val`` |
| 4743 | - and: ``*ptr = *ptr & val`` |
| 4744 | - nand: ``*ptr = ~(*ptr & val)`` |
| 4745 | - or: ``*ptr = *ptr | val`` |
| 4746 | - xor: ``*ptr = *ptr ^ val`` |
| 4747 | - max: ``*ptr = *ptr > val ? *ptr : val`` (using a signed comparison) |
| 4748 | - min: ``*ptr = *ptr < val ? *ptr : val`` (using a signed comparison) |
| 4749 | - umax: ``*ptr = *ptr > val ? *ptr : val`` (using an unsigned |
| 4750 | comparison) |
| 4751 | - umin: ``*ptr = *ptr < val ? *ptr : val`` (using an unsigned |
| 4752 | comparison) |
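The operation table above can be modelled in a few lines of Python. This is only a sketch of the value computation: it does not model atomicity or memory ordering, and the helper names (``to_signed``, ``atomicrmw_new_value``, ``atomicrmw``) are hypothetical. It also assumes that ``add`` and ``sub`` wrap modulo 2^bits, which the list above does not state explicitly.

```python
def to_signed(x, bits):
    """Reinterpret a bit pattern as a two's complement signed value."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >> (bits - 1) else x


def atomicrmw_new_value(op, old, val, bits=32):
    """Return the value written back for one atomicrmw operation.

    `old` and `val` are treated as bit patterns of width `bits`.
    Assumption: add/sub wrap around modulo 2**bits.
    """
    mask = (1 << bits) - 1
    old &= mask
    val &= mask
    if op == "xchg":
        new = val
    elif op == "add":
        new = old + val
    elif op == "sub":
        new = old - val
    elif op == "and":
        new = old & val
    elif op == "nand":
        new = ~(old & val)
    elif op == "or":
        new = old | val
    elif op == "xor":
        new = old ^ val
    elif op == "max":    # signed comparison
        new = old if to_signed(old, bits) > to_signed(val, bits) else val
    elif op == "min":    # signed comparison
        new = old if to_signed(old, bits) < to_signed(val, bits) else val
    elif op == "umax":   # unsigned comparison
        new = old if old > val else val
    elif op == "umin":   # unsigned comparison
        new = old if old < val else val
    else:
        raise ValueError("unknown atomicrmw operation: " + op)
    return new & mask


def atomicrmw(memory, addr, op, val, bits=32):
    """Update memory[addr] with `op` and return the original value."""
    old = memory[addr]
    memory[addr] = atomicrmw_new_value(op, old, val, bits)
    return old
```

The difference between ``max`` and ``umax`` only shows up when the sign bit is set: for the ``i8`` pattern ``0xFF`` and the value ``1``, the model picks ``1`` for ``max`` (since ``0xFF`` is -1 when signed) and ``0xFF`` for ``umax``.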
| 4753 | |
| 4754 | Example: |
| 4755 | """""""" |
| 4756 | |
| 4757 | .. code-block:: llvm |
| 4758 | |
| 4759 | %old = atomicrmw add i32* %ptr, i32 1 acquire ; yields {i32} |
| 4760 | |
| 4761 | .. _i_getelementptr: |
| 4762 | |
| 4763 | '``getelementptr``' Instruction |
| 4764 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 4765 | |
| 4766 | Syntax: |
| 4767 | """"""" |
| 4768 | |
| 4769 | :: |
| 4770 | |
| 4771 | <result> = getelementptr <pty>* <ptrval>{, <ty> <idx>}* |
| 4772 | <result> = getelementptr inbounds <pty>* <ptrval>{, <ty> <idx>}* |
| 4773 | <result> = getelementptr <ptr vector> <ptrval>, <vector index type> <idx> |
| 4774 | |
| 4775 | Overview: |
| 4776 | """"""""" |
| 4777 | |
| 4778 | The '``getelementptr``' instruction is used to get the address of a |
| 4779 | subelement of an :ref:`aggregate <t_aggregate>` data structure. It performs |
| 4780 | address calculation only and does not access memory. |
| 4781 | |
| 4782 | Arguments: |
| 4783 | """""""""" |
| 4784 | |
| 4785 | The first argument is always a pointer or a vector of pointers, and |
| 4786 | forms the basis of the calculation. The remaining arguments are indices |
| 4787 | that indicate which of the elements of the aggregate object are indexed. |
| 4788 | The interpretation of each index is dependent on the type being indexed |
| 4789 | into. The first index always indexes the pointer value given as the |
| 4790 | first argument, the second index indexes a value of the type pointed to |
| 4791 | (not necessarily the value directly pointed to, since the first index |
| 4792 | can be non-zero), etc. The first type indexed into must be a pointer |
| 4793 | value, subsequent types can be arrays, vectors, and structs. Note that |
| 4794 | subsequent types being indexed into can never be pointers, since that |
| 4795 | would require loading the pointer before continuing calculation. |
| 4796 | |
| 4797 | The type of each index argument depends on the type it is indexing into. |
| 4798 | When indexing into an (optionally packed) structure, only ``i32`` integer |
| 4799 | **constants** are allowed (when using a vector of indices they must all |
| 4800 | be the **same** ``i32`` integer constant). When indexing into an array, |
| 4801 | pointer or vector, integers of any width are allowed, and they are not |
| 4802 | required to be constant. These integers are treated as signed values |
| 4803 | where relevant. |
| 4804 | |
| 4805 | For example, let's consider a C code fragment and how it gets compiled |
| 4806 | to LLVM: |
| 4807 | |
| 4808 | .. code-block:: c |
| 4809 | |
| 4810 | struct RT { |
| 4811 | char A; |
| 4812 | int B[10][20]; |
| 4813 | char C; |
| 4814 | }; |
| 4815 | struct ST { |
| 4816 | int X; |
| 4817 | double Y; |
| 4818 | struct RT Z; |
| 4819 | }; |
| 4820 | |
| 4821 | int *foo(struct ST *s) { |
| 4822 | return &s[1].Z.B[5][13]; |
| 4823 | } |
| 4824 | |
| 4825 | The LLVM code generated by Clang is: |
| 4826 | |
| 4827 | .. code-block:: llvm |
| 4828 | |
| 4829 | %struct.RT = type { i8, [10 x [20 x i32]], i8 } |
| 4830 | %struct.ST = type { i32, double, %struct.RT } |
| 4831 | |
| 4832 | define i32* @foo(%struct.ST* %s) nounwind uwtable readnone optsize ssp { |
| 4833 | entry: |
| 4834 | %arrayidx = getelementptr inbounds %struct.ST* %s, i64 1, i32 2, i32 1, i64 5, i64 13 |
| 4835 | ret i32* %arrayidx |
| 4836 | } |
| 4837 | |
| 4838 | Semantics: |
| 4839 | """""""""" |
| 4840 | |
| 4841 | In the example above, the first index is indexing into the |
| 4842 | '``%struct.ST*``' type, which is a pointer, yielding a '``%struct.ST``' |
| 4843 | = '``{ i32, double, %struct.RT }``' type, a structure. The second index |
| 4844 | indexes into the third element of the structure, yielding a |
| 4845 | '``%struct.RT``' = '``{ i8 , [10 x [20 x i32]], i8 }``' type, another |
| 4846 | structure. The third index indexes into the second element of the |
| 4847 | structure, yielding a '``[10 x [20 x i32]]``' type, an array. The two |
| 4848 | dimensions of the array are subscripted into, yielding an '``i32``' |
| 4849 | type. The '``getelementptr``' instruction returns a pointer to this |
| 4850 | element, thus computing a value of '``i32*``' type. |
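The index walk described above amounts to adding up byte offsets. The following Python sketch reproduces that arithmetic for the example. The layout rules are an assumption (each scalar is aligned to its own size, as on a typical 64-bit target; a real target's data layout may differ), only unpacked structs and arrays are handled, and all names in it are hypothetical.

```python
# Assumed layout: size == alignment for every scalar type.
SCALARS = {"i8": 1, "i32": 4, "double": 8}


def align_up(offset, align):
    return (offset + align - 1) // align * align


def size_align(ty):
    """Return (size, alignment) in bytes for a type description."""
    if ty[0] == "scalar":
        size = SCALARS[ty[1]]
        return size, size
    if ty[0] == "array":              # ("array", count, element)
        size, align = size_align(ty[2])
        return ty[1] * size, align
    offset, max_align = 0, 1          # ("struct", [field, ...])
    for field in ty[1]:
        size, align = size_align(field)
        offset = align_up(offset, align) + size
        max_align = max(max_align, align)
    return align_up(offset, max_align), max_align


def field_offset(struct, index):
    """Byte offset of field `index` inside an unpacked struct."""
    offset = 0
    for i, field in enumerate(struct[1]):
        size, align = size_align(field)
        offset = align_up(offset, align)
        if i == index:
            return offset
        offset += size
    raise IndexError(index)


def gep_offset(pointee, indices):
    """Byte offset that getelementptr adds to the base pointer."""
    # The first index steps over whole objects of the pointed-to type.
    offset = indices[0] * size_align(pointee)[0]
    ty = pointee
    for idx in indices[1:]:
        if ty[0] == "struct":
            offset += field_offset(ty, idx)
            ty = ty[1][idx]
        else:                         # array: scale by the element size
            ty = ty[2]
            offset += idx * size_align(ty)[0]
    return offset


I8, I32, DOUBLE = ("scalar", "i8"), ("scalar", "i32"), ("scalar", "double")
RT = ("struct", [I8, ("array", 10, ("array", 20, I32)), I8])
ST = ("struct", [I32, DOUBLE, RT])

# &s[1].Z.B[5][13]
print(gep_offset(ST, [1, 2, 1, 5, 13]))   # 1296 under the assumed layout
```

Under these assumptions ``%struct.ST`` occupies 824 bytes, ``Z`` starts at offset 16 and ``B`` at offset 4, so the result is 824 + 16 + 4 + 5 * 80 + 13 * 4 = 1296 bytes past ``%s``. No memory is read at any point, which matches the statement that the instruction only performs address calculation.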
| 4851 | |
| 4852 | Note that it is perfectly legal to index partially through a structure, |
| 4853 | returning a pointer to an inner element. Because of this, the LLVM code |
| 4854 | for the given testcase is equivalent to: |
| 4855 | |
| 4856 | .. code-block:: llvm |
| 4857 | |
| 4858 | define i32* @foo(%struct.ST* %s) { |
| 4859 | %t1 = getelementptr %struct.ST* %s, i32 1 ; yields %struct.ST*:%t1 |
| 4860 | %t2 = getelementptr %struct.ST* %t1, i32 0, i32 2 ; yields %struct.RT*:%t2 |
| 4861 | %t3 = getelementptr %struct.RT* %t2, i32 0, i32 1 ; yields [10 x [20 x i32]]*:%t3 |
| 4862 | %t4 = getelementptr [10 x [20 x i32]]* %t3, i32 0, i32 5 ; yields [20 x i32]*:%t4 |
| 4863 | %t5 = getelementptr [20 x i32]* %t4, i32 0, i32 13 ; yields i32*:%t5 |
| 4864 | ret i32* %t5 |
| 4865 | } |
| 4866 | |
| 4867 | If the ``inbounds`` keyword is present, the result value of the |
| 4868 | ``getelementptr`` is a :ref:`poison value <poisonvalues>` if the base |
| 4869 | pointer is not an *in bounds* address of an allocated object, or if any |
| 4870 | of the addresses that would be formed by successive addition of the |
| 4871 | offsets implied by the indices to the base address with infinitely |
| 4872 | precise signed arithmetic are not an *in bounds* address of that |
| 4873 | allocated object. The *in bounds* addresses for an allocated object are |
| 4874 | all the addresses that point into the object, plus the address one byte |
| 4875 | past the end. In cases where the base is a vector of pointers the |
| 4876 | ``inbounds`` keyword applies to each of the computations element-wise. |
| 4877 | |
| 4878 | If the ``inbounds`` keyword is not present, the offsets are added to the |
| 4879 | base address with silently-wrapping two's complement arithmetic. If the |
| 4880 | offsets have a different width from the pointer, they are sign-extended |
| 4881 | or truncated to the width of the pointer. The result value of the |
| 4882 | ``getelementptr`` may be outside the object pointed to by the base |
| 4883 | pointer. The result value may not necessarily be used to access memory |
| 4884 | though, even if it happens to point into allocated storage. See the |
| 4885 | :ref:`Pointer Aliasing Rules <pointeraliasing>` section for more |
| 4886 | information. |
| 4887 | |
| 4888 | The ``getelementptr`` instruction is often confusing. For some more insight |
| 4889 | into how it works, see :doc:`the getelementptr FAQ <GetElementPtr>`. |
| 4890 | |
| 4891 | Example: |
| 4892 | """""""" |
| 4893 | |
| 4894 | .. code-block:: llvm |
| 4895 | |
| 4896 | ; yields [12 x i8]*:aptr |
| 4897 | %aptr = getelementptr {i32, [12 x i8]}* %saptr, i64 0, i32 1 |
| 4898 | ; yields i8*:vptr |
| 4899 | %vptr = getelementptr {i32, <2 x i8>}* %svptr, i64 0, i32 1, i32 1 |
| 4900 | ; yields i8*:eptr |
| 4901 | %eptr = getelementptr [12 x i8]* %aptr, i64 0, i32 1 |
| 4902 | ; yields i32*:iptr |
| 4903 | %iptr = getelementptr [10 x i32]* @arr, i16 0, i16 0 |
| 4904 | |
| 4905 | In cases where the pointer argument is a vector of pointers, each index |
| 4906 | must be a vector with the same number of elements. For example: |
| 4907 | |
| 4908 | .. code-block:: llvm |
| 4909 | |
| 4910 | %A = getelementptr <4 x i8*> %ptrs, <4 x i64> %offsets |
| 4911 | |
| 4912 | Conversion Operations |
| 4913 | --------------------- |
| 4914 | |
| 4915 | The instructions in this category are the conversion instructions |
| 4916 | (casting) which all take a single operand and a type. They perform |
| 4917 | various bit conversions on the operand. |
| 4918 | |
| 4919 | '``trunc .. to``' Instruction |
| 4920 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 4921 | |
| 4922 | Syntax: |
| 4923 | """"""" |
| 4924 | |
| 4925 | :: |
| 4926 | |
| 4927 | <result> = trunc <ty> <value> to <ty2> ; yields ty2 |
| 4928 | |
| 4929 | Overview: |
| 4930 | """"""""" |
| 4931 | |
| 4932 | The '``trunc``' instruction truncates its operand to the type ``ty2``. |
| 4933 | |
| 4934 | Arguments: |
| 4935 | """""""""" |
| 4936 | |
| 4937 | The '``trunc``' instruction takes a value to truncate, and a type to truncate |
| 4938 | it to. Both types must be :ref:`integer <t_integer>` types, or vectors |
| 4939 | of the same number of integers. The bit size of the ``value`` must be |
| 4940 | larger than the bit size of the destination type, ``ty2``. Equal sized |
| 4941 | types are not allowed. |
| 4942 | |
| 4943 | Semantics: |
| 4944 | """""""""" |
| 4945 | |
| 4946 | The '``trunc``' instruction truncates the high order bits in ``value`` |
| 4947 | and converts the remaining bits to ``ty2``. Since the source size must |
| 4948 | be larger than the destination size, ``trunc`` cannot be a *no-op cast*. |
| 4949 | It will always truncate bits. |
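As a small illustration, truncation is the same as masking off everything above the destination width. The Python helper below is a hypothetical model, not part of LLVM; it treats values as plain bit patterns.

```python
def trunc(value, to_bits):
    """Keep the low `to_bits` bits of `value`; the high-order bits are dropped."""
    return value & ((1 << to_bits) - 1)


print(trunc(257, 8))   # 1
print(trunc(123, 1))   # 1, i.e. true
print(trunc(122, 1))   # 0, i.e. false

# Vectors are truncated element by element.
print([trunc(v, 8) for v in (8, 7)])   # [8, 7]
```

Truncating to ``i1`` therefore keeps only the lowest bit, which is why odd values become ``true`` and even values become ``false``.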
| 4950 | |
| 4951 | Example: |
| 4952 | """""""" |
| 4953 | |
| 4954 | .. code-block:: llvm |
| 4955 | |
| 4956 | %X = trunc i32 257 to i8 ; yields i8:1 |
| 4957 | %Y = trunc i32 123 to i1 ; yields i1:true |
| 4958 | %Z = trunc i32 122 to i1 ; yields i1:false |
| 4959 | %W = trunc <2 x i16> <i16 8, i16 7> to <2 x i8> ; yields <i8 8, i8 7> |
| 4960 | |
| 4961 | '``zext .. to``' Instruction |
| 4962 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 4963 | |
| 4964 | Syntax: |
| 4965 | """"""" |
| 4966 | |
| 4967 | :: |
| 4968 | |
| 4969 | <result> = zext <ty> <value> to <ty2> ; yields ty2 |
| 4970 | |
| 4971 | Overview: |
| 4972 | """"""""" |
| 4973 | |
| 4974 | The '``zext``' instruction zero extends its operand to type ``ty2``. |
| 4975 | |
| 4976 | Arguments: |
| 4977 | """""""""" |
| 4978 | |
| 4979 | The '``zext``' instruction takes a value to cast, and a type to cast it |
| 4980 | to. Both types must be :ref:`integer <t_integer>` types, or vectors of |
| 4981 | the same number of integers. The bit size of the ``value`` must be |
| 4982 | smaller than the bit size of the destination type, ``ty2``. |
| 4983 | |
| 4984 | Semantics: |
| 4985 | """""""""" |
| 4986 | |
| 4987 | The '``zext``' instruction fills the high order bits of the ``value`` with zero bits |
| 4988 | until it reaches the size of the destination type, ``ty2``. |
| 4989 | |
| 4990 | When zero extending from i1, the result will always be either 0 or 1. |
| 4991 | |
| 4992 | Example: |
| 4993 | """""""" |
| 4994 | |
| 4995 | .. code-block:: llvm |
| 4996 | |
| 4997 | %X = zext i32 257 to i64 ; yields i64:257 |
| 4998 | %Y = zext i1 true to i32 ; yields i32:1 |
| 4999 | %Z = zext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7> |
| 5000 | |
| 5001 | '``sext .. to``' Instruction |
| 5002 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 5003 | |
| 5004 | Syntax: |
| 5005 | """"""" |
| 5006 | |
| 5007 | :: |
| 5008 | |
| 5009 | <result> = sext <ty> <value> to <ty2> ; yields ty2 |
| 5010 | |
| 5011 | Overview: |
| 5012 | """"""""" |
| 5013 | |
| 5014 | The '``sext``' instruction sign extends ``value`` to the type ``ty2``. |
| 5015 | |
| 5016 | Arguments: |
| 5017 | """""""""" |
| 5018 | |
| 5019 | The '``sext``' instruction takes a value to cast, and a type to cast it |
| 5020 | to. Both types must be :ref:`integer <t_integer>` types, or vectors of |
| 5021 | the same number of integers. The bit size of the ``value`` must be |
| 5022 | smaller than the bit size of the destination type, ``ty2``. |
| 5023 | |
| 5024 | Semantics: |
| 5025 | """""""""" |
| 5026 | |
| 5027 | The '``sext``' instruction performs a sign extension by copying the sign |
| 5028 | bit (highest order bit) of the ``value`` until it reaches the bit size |
| 5029 | of the type ``ty2``. |
| 5030 | |
| 5031 | When sign extending from i1, the extension always results in -1 or 0. |
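The difference between sign extension and the zero extension performed by ``zext`` is easiest to see on bit patterns. The following Python sketch models both; the function names are hypothetical and values are represented as unsigned bit patterns.

```python
def zext(value, from_bits, to_bits):
    """Zero extend: the new high-order bits are all zero."""
    assert from_bits < to_bits
    return value & ((1 << from_bits) - 1)


def sext(value, from_bits, to_bits):
    """Sign extend: the new high-order bits are copies of the sign bit."""
    assert from_bits < to_bits
    value &= (1 << from_bits) - 1
    if value >> (from_bits - 1):      # sign bit is set
        value |= ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
    return value


print(sext(0xFF, 8, 16))   # 65535, the i16 pattern for -1
print(zext(0xFF, 8, 16))   # 255
print(sext(1, 1, 32))      # 4294967295, the i32 pattern for -1
print(zext(1, 1, 32))      # 1
```

For non-negative inputs such as ``8`` or ``7`` the two functions agree, which is why the vector example for ``sext`` yields the same values as the one for ``zext``.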
| 5032 | |
| 5033 | Example: |
| 5034 | """""""" |
| 5035 | |
| 5036 | .. code-block:: llvm |
| 5037 | |
| 5038 | %X = sext i8 -1 to i16 ; yields i16:65535 (the bit pattern of -1) |
| 5039 | %Y = sext i1 true to i32 ; yields i32:-1 |
| 5040 | %Z = sext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7> |
| 5041 | |
| 5042 | '``fptrunc .. to``' Instruction |
| 5043 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 5044 | |
| 5045 | Syntax: |
| 5046 | """"""" |
| 5047 | |
| 5048 | :: |
| 5049 | |
| 5050 | <result> = fptrunc <ty> <value> to <ty2> ; yields ty2 |
| 5051 | |
| 5052 | Overview: |
| 5053 | """"""""" |
| 5054 | |
| 5055 | The '``fptrunc``' instruction truncates ``value`` to type ``ty2``. |
| 5056 | |
| 5057 | Arguments: |
| 5058 | """""""""" |
| 5059 | |
| 5060 | The '``fptrunc``' instruction takes a :ref:`floating point <t_floating>` |
| 5061 | value to cast and a :ref:`floating point <t_floating>` type to cast it to. |
| 5062 | The size of ``value`` must be larger than the size of ``ty2``. This |
| 5063 | implies that ``fptrunc`` cannot be used to make a *no-op cast*. |
| 5064 | |
| 5065 | Semantics: |
| 5066 | """""""""" |
| 5067 | |
| 5068 | The '``fptrunc``' instruction truncates a ``value`` from a larger |
| 5069 | :ref:`floating point <t_floating>` type to a smaller :ref:`floating |
| 5070 | point <t_floating>` type. If the value cannot fit within the |
| 5071 | destination type, ``ty2``, then the results are undefined. |
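A rough way to experiment with this behaviour is to round a Python ``float`` (a double) to single precision with the standard ``struct`` module. This is a sketch under assumptions: the helper name is hypothetical, the rounding is whatever ``struct`` performs (the text above does not specify a rounding mode), and ``None`` stands in for the undefined result.

```python
import struct


def fptrunc_double_to_float(value):
    """Round a double to single precision.

    Returns None when the value is too large for `float`, the case in
    which the instruction's result is undefined.
    """
    try:
        return struct.unpack("<f", struct.pack("<f", value))[0]
    except OverflowError:
        return None


print(fptrunc_double_to_float(123.0))     # 123.0
print(fptrunc_double_to_float(1.0e300))   # None
```

Values that are in range but not exactly representable, such as ``0.1``, come back slightly changed, since the smaller type has fewer mantissa bits.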
| 5072 | |
| 5073 | Example: |
| 5074 | """""""" |
| 5075 | |
| 5076 | .. code-block:: llvm |
| 5077 | |
| 5078 | %X = fptrunc double 123.0 to float ; yields float:123.0 |
| 5079 | %Y = fptrunc double 1.0E+300 to float ; yields undefined |
| 5080 | |
| 5081 | '``fpext .. to``' Instruction |
| 5082 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 5083 | |
| 5084 | Syntax: |
| 5085 | """"""" |
| 5086 | |
| 5087 | :: |
| 5088 | |
| 5089 | <result> = fpext <ty> <value> to <ty2> ; yields ty2 |
| 5090 | |
| 5091 | Overview: |
| 5092 | """"""""" |
| 5093 | |
| 5094 | The '``fpext``' instruction extends a floating point ``value`` to a larger floating |
| 5095 | point value. |
| 5096 | |
| 5097 | Arguments: |
| 5098 | """""""""" |
| 5099 | |
| 5100 | The '``fpext``' instruction takes a :ref:`floating point <t_floating>` |
| 5101 | ``value`` to cast, and a :ref:`floating point <t_floating>` type to cast it |
| 5102 | to. The source type must be smaller than the destination type. |
| 5103 | |
| 5104 | Semantics: |
| 5105 | """""""""" |
| 5106 | |
| 5107 | The '``fpext``' instruction extends the ``value`` from a smaller |
| 5108 | :ref:`floating point <t_floating>` type to a larger :ref:`floating |
| 5109 | point <t_floating>` type. The ``fpext`` cannot be used to make a |
| 5110 | *no-op cast* because it always changes bits. Use ``bitcast`` to make a |
| 5111 | *no-op cast* for a floating point cast. |
| 5112 | |
| 5113 | Example: |
| 5114 | """""""" |
| 5115 | |
| 5116 | .. code-block:: llvm |
| 5117 | |
| 5118 | %X = fpext float 3.125 to double ; yields double:3.125000e+00 |
| 5119 | %Y = fpext double %X to fp128 ; yields fp128:0xL00000000000000004000900000000000 |
| 5120 | |
| 5121 | '``fptoui .. to``' Instruction |
| 5122 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 5123 | |
| 5124 | Syntax: |
| 5125 | """"""" |
| 5126 | |
| 5127 | :: |
| 5128 | |
| 5129 | <result> = fptoui <ty> <value> to <ty2> ; yields ty2 |
| 5130 | |
| 5131 | Overview: |
| 5132 | """"""""" |
| 5133 | |
| 5134 | The '``fptoui``' instruction converts a floating point ``value`` to its unsigned |
| 5135 | integer equivalent of type ``ty2``. |
| 5136 | |
| 5137 | Arguments: |
| 5138 | """""""""" |
| 5139 | |
| 5140 | The '``fptoui``' instruction takes a value to cast, which must be a |
| 5141 | scalar or vector :ref:`floating point <t_floating>` value, and a type to |
| 5142 | cast it to, ``ty2``, which must be an :ref:`integer <t_integer>` type. If |
| 5143 | ``ty`` is a vector floating point type, ``ty2`` must be a vector integer |
| 5144 | type with the same number of elements as ``ty``. |
| 5145 | |
| 5146 | Semantics: |
| 5147 | """""""""" |
| 5148 | |
| 5149 | The '``fptoui``' instruction converts its :ref:`floating |
| 5150 | point <t_floating>` operand into the nearest (rounding towards zero) |
| 5151 | unsigned integer value. If the value cannot fit in ``ty2``, the results |
| 5152 | are undefined. |
| 5153 | |
| 5154 | Example: |
| 5155 | """""""" |
| 5156 | |
| 5157 | .. code-block:: llvm |
| 5158 | |
| 5159 | %X = fptoui double 123.0 to i32 ; yields i32:123 |
| 5160 | %Y = fptoui float 1.0E+300 to i1 ; yields undefined:1 |
| 5161 | %Z = fptoui float 1.04E+17 to i8 ; yields undefined:1 |
| 5162 | |
| 5163 | '``fptosi .. to``' Instruction |
| 5164 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 5165 | |
| 5166 | Syntax: |
| 5167 | """"""" |
| 5168 | |
| 5169 | :: |
| 5170 | |
| 5171 | <result> = fptosi <ty> <value> to <ty2> ; yields ty2 |
| 5172 | |
| 5173 | Overview: |
| 5174 | """"""""" |
| 5175 | |
| 5176 | The '``fptosi``' instruction converts :ref:`floating point <t_floating>` |
| 5177 | ``value`` to type ``ty2``. |
| 5178 | |
| 5179 | Arguments: |
| 5180 | """""""""" |
| 5181 | |
| 5182 | The '``fptosi``' instruction takes a value to cast, which must be a |
| 5183 | scalar or vector :ref:`floating point <t_floating>` value, and a type to |
| 5184 | cast it to, ``ty2``, which must be an :ref:`integer <t_integer>` type. If |
| 5185 | ``ty`` is a vector floating point type, ``ty2`` must be a vector integer |
| 5186 | type with the same number of elements as ``ty``. |
| 5187 | |
| 5188 | Semantics: |
| 5189 | """""""""" |
| 5190 | |
| 5191 | The '``fptosi``' instruction converts its :ref:`floating |
| 5192 | point <t_floating>` operand into the nearest (rounding towards zero) |
| 5193 | signed integer value. If the value cannot fit in ``ty2``, the results |
| 5194 | are undefined. |
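Both ``fptoui`` and ``fptosi`` round toward zero and differ only in the range of results that fit. The Python sketch below models this; the function names are hypothetical, ``None`` stands in for an undefined result, and treating NaN and infinity as "cannot fit" is an assumption of this sketch.

```python
import math


def fptoui(value, bits):
    """Round toward zero; None marks a result that does not fit (undefined)."""
    if math.isnan(value) or math.isinf(value):
        return None                       # assumption: treated as not fitting
    result = math.trunc(value)
    return result if 0 <= result < (1 << bits) else None


def fptosi(value, bits):
    """Round toward zero; None marks a result that does not fit (undefined)."""
    if math.isnan(value) or math.isinf(value):
        return None                       # assumption: treated as not fitting
    result = math.trunc(value)
    limit = 1 << (bits - 1)
    return result if -limit <= result < limit else None


print(fptoui(123.0, 32))     # 123
print(fptosi(-123.9, 32))    # -123, rounded toward zero
print(fptoui(1.04e17, 8))    # None
```

Rounding toward zero means the fractional part is simply discarded, so ``-123.9`` becomes ``-123`` rather than ``-124``.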
| 5195 | |
| 5196 | Example: |
| 5197 | """""""" |
| 5198 | |
| 5199 | .. code-block:: llvm |
| 5200 | |
| 5201 | %X = fptosi double -123.0 to i32 ; yields i32:-123 |
| 5202 | %Y = fptosi float 1.0E-247 to i1 ; yields undefined:1 |
| 5203 | %Z = fptosi float 1.04E+17 to i8 ; yields undefined:1 |
| 5204 | |
| 5205 | '``uitofp .. to``' Instruction |
| 5206 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 5207 | |
| 5208 | Syntax: |
| 5209 | """"""" |
| 5210 | |
| 5211 | :: |
| 5212 | |
| 5213 | <result> = uitofp <ty> <value> to <ty2> ; yields ty2 |
| 5214 | |
| 5215 | Overview: |
| 5216 | """"""""" |
| 5217 | |
| 5218 | The '``uitofp``' instruction regards ``value`` as an unsigned integer |
| 5219 | and converts that value to the ``ty2`` type. |
| 5220 | |
| 5221 | Arguments: |
| 5222 | """""""""" |
| 5223 | |
| 5224 | The '``uitofp``' instruction takes a value to cast, which must be a |
| 5225 | scalar or vector :ref:`integer <t_integer>` value, and a type to cast it to, |
| 5226 | ``ty2``, which must be a :ref:`floating point <t_floating>` type. If |
| 5227 | ``ty`` is a vector integer type, ``ty2`` must be a vector floating point |
| 5228 | type with the same number of elements as ``ty``. |
| 5229 | |
| 5230 | Semantics: |
| 5231 | """""""""" |
| 5232 | |
| 5233 | The '``uitofp``' instruction interprets its operand as an unsigned |
| 5234 | integer quantity and converts it to the corresponding floating point |
| 5235 | value. If the value cannot fit in the floating point value, the results |
| 5236 | are undefined. |
| 5237 | |
| 5238 | Example: |
| 5239 | """""""" |
| 5240 | |
| 5241 | .. code-block:: llvm |
| 5242 | |
| 5243 | %X = uitofp i32 257 to float ; yields float:257.0 |
| 5244 | %Y = uitofp i8 -1 to double ; yields double:255.0 |
| 5245 | |
| 5246 | '``sitofp .. to``' Instruction |
| 5247 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 5248 | |
| 5249 | Syntax: |
| 5250 | """"""" |
| 5251 | |
| 5252 | :: |
| 5253 | |
| 5254 | <result> = sitofp <ty> <value> to <ty2> ; yields ty2 |
| 5255 | |
| 5256 | Overview: |
| 5257 | """"""""" |
| 5258 | |
| 5259 | The '``sitofp``' instruction regards ``value`` as a signed integer and |
| 5260 | converts that value to the ``ty2`` type. |
| 5261 | |
| 5262 | Arguments: |
| 5263 | """""""""" |
| 5264 | |
| 5265 | The '``sitofp``' instruction takes a value to cast, which must be a |
| 5266 | scalar or vector :ref:`integer <t_integer>` value, and a type to cast it to, |
| 5267 | ``ty2``, which must be a :ref:`floating point <t_floating>` type. If |
| 5268 | ``ty`` is a vector integer type, ``ty2`` must be a vector floating point |
| 5269 | type with the same number of elements as ``ty``. |
| 5270 | |
| 5271 | Semantics: |
| 5272 | """""""""" |
| 5273 | |
| 5274 | The '``sitofp``' instruction interprets its operand as a signed integer |
| 5275 | quantity and converts it to the corresponding floating point value. If |
| 5276 | the value cannot fit in the floating point value, the results are |
| 5277 | undefined. |
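``uitofp`` and ``sitofp`` take the same bits and differ only in how they interpret them. A minimal Python model, with hypothetical function names and with Python's ``float`` standing in for ``double``:

```python
def uitofp(pattern, bits):
    """Treat the `bits`-wide pattern as unsigned, then convert."""
    return float(pattern & ((1 << bits) - 1))


def sitofp(pattern, bits):
    """Treat the `bits`-wide pattern as two's complement, then convert."""
    value = pattern & ((1 << bits) - 1)
    if value >> (bits - 1):       # sign bit set: the value is negative
        value -= 1 << bits
    return float(value)


print(uitofp(-1, 8))    # 255.0
print(sitofp(-1, 8))    # -1.0
print(sitofp(257, 32))  # 257.0
```

The ``i8`` constant ``-1`` is the pattern ``0xFF``, which is 255 when read as unsigned and -1 when read as signed, matching the two examples in the text.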
| 5278 | |
| 5279 | Example: |
| 5280 | """""""" |
| 5281 | |
| 5282 | .. code-block:: llvm |
| 5283 | |
| 5284 | %X = sitofp i32 257 to float ; yields float:257.0 |
| 5285 | %Y = sitofp i8 -1 to double ; yields double:-1.0 |
| 5286 | |
| 5287 | .. _i_ptrtoint: |
| 5288 | |
| 5289 | '``ptrtoint .. to``' Instruction |
| 5290 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 5291 | |
| 5292 | Syntax: |
| 5293 | """"""" |
| 5294 | |
| 5295 | :: |
| 5296 | |
| 5297 | <result> = ptrtoint <ty> <value> to <ty2> ; yields ty2 |
| 5298 | |
| 5299 | Overview: |
| 5300 | """"""""" |
| 5301 | |
| 5302 | The '``ptrtoint``' instruction converts the pointer or a vector of |
| 5303 | pointers ``value`` to the integer (or vector of integers) type ``ty2``. |
| 5304 | |
| 5305 | Arguments: |
| 5306 | """""""""" |
| 5307 | |
| 5308 | The '``ptrtoint``' instruction takes a ``value`` to cast, which must be |
| 5309 | a value of type :ref:`pointer <t_pointer>` or a vector of pointers, and a |
| 5310 | type to cast it to, ``ty2``, which must be an :ref:`integer <t_integer>` or |
| 5311 | a vector of integers type. |
| 5312 | |
| 5313 | Semantics: |
| 5314 | """""""""" |
| 5315 | |
| 5316 | The '``ptrtoint``' instruction converts ``value`` to integer type |
| 5317 | ``ty2`` by interpreting the pointer value as an integer and either |
| 5318 | truncating or zero extending that value to the size of the integer type. |
| 5319 | If ``value`` is smaller than ``ty2`` then a zero extension is done. If |
| 5320 | ``value`` is larger than ``ty2`` then a truncation is done. If they are |
| 5321 | the same size, then nothing is done (*no-op cast*) other than a type |
| 5322 | change. |
| 5323 | |
| 5324 | Example: |
| 5325 | """""""" |
| 5326 | |
| 5327 | .. code-block:: llvm |
| 5328 | |
| 5329 | %X = ptrtoint i32* %P to i8 ; yields truncation on 32-bit architecture |
| 5330 | %Y = ptrtoint i32* %P to i64 ; yields zero extension on 32-bit architecture |
| 5331 | %Z = ptrtoint <4 x i32*> %P to <4 x i64> ; yields vector zero extension for a vector of addresses on 32-bit architecture |
| 5332 | |
| 5333 | .. _i_inttoptr: |
| 5334 | |
| 5335 | '``inttoptr .. to``' Instruction |
| 5336 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 5337 | |
| 5338 | Syntax: |
| 5339 | """"""" |
| 5340 | |
| 5341 | :: |
| 5342 | |
| 5343 | <result> = inttoptr <ty> <value> to <ty2> ; yields ty2 |
| 5344 | |
| 5345 | Overview: |
| 5346 | """"""""" |
| 5347 | |
| 5348 | The '``inttoptr``' instruction converts an integer ``value`` to a |
| 5349 | pointer type, ``ty2``. |
| 5350 | |
| 5351 | Arguments: |
| 5352 | """""""""" |
| 5353 | |
| 5354 | The '``inttoptr``' instruction takes an :ref:`integer <t_integer>` value to |
| 5355 | cast, and a type to cast it to, which must be a :ref:`pointer <t_pointer>` |
| 5356 | type. |
| 5357 | |
| 5358 | Semantics: |
| 5359 | """""""""" |
| 5360 | |
| 5361 | The '``inttoptr``' instruction converts ``value`` to type ``ty2`` by |
| 5362 | applying either a zero extension or a truncation depending on the size |
| 5363 | of the integer ``value``. If ``value`` is larger than the size of a |
| 5364 | pointer then a truncation is done. If ``value`` is smaller than the size |
| 5365 | of a pointer then a zero extension is done. If they are the same size, |
| 5366 | nothing is done (*no-op cast*). |
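``inttoptr`` and ``ptrtoint`` apply the same size rule in opposite directions. The following Python sketch models addresses as plain integers and passes the pointer width explicitly. All names are hypothetical, and real pointers carry more semantics than this models.

```python
def resize(value, from_bits, to_bits):
    """Zero extend, truncate, or leave unchanged, depending on the widths."""
    value &= (1 << from_bits) - 1          # the source bit pattern
    return value & ((1 << to_bits) - 1)    # truncation drops the high bits


def ptrtoint(address, pointer_bits, int_bits):
    return resize(address, pointer_bits, int_bits)


def inttoptr(value, int_bits, pointer_bits):
    return resize(value, int_bits, pointer_bits)


print(hex(ptrtoint(0x12345678, 32, 8)))    # 0x78, truncation
print(hex(ptrtoint(0xFFFFFFFF, 32, 64)))   # 0xffffffff, zero extension
print(hex(inttoptr(0x100000001, 64, 32)))  # 0x1, truncation
```

A round trip through a narrower type loses the discarded high bits, so ``inttoptr`` applied to the result of a truncating ``ptrtoint`` does not in general give back the original address.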
| 5367 | |
| 5368 | Example: |
| 5369 | """""""" |
| 5370 | |
| 5371 | .. code-block:: llvm |
| 5372 | |
| 5373 | %X = inttoptr i32 255 to i32* ; yields zero extension on 64-bit architecture |
| 5374 | %Y = inttoptr i32 255 to i32* ; yields no-op on 32-bit architecture |
| 5375 | %Z = inttoptr i64 0 to i32* ; yields truncation on 32-bit architecture |
| 5376 | %W = inttoptr <4 x i32> %G to <4 x i8*> ; yields truncation of vector G to four pointers |
| 5377 | |
| 5378 | .. _i_bitcast: |
| 5379 | |
| 5380 | '``bitcast .. to``' Instruction |
| 5381 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 5382 | |
| 5383 | Syntax: |
| 5384 | """"""" |
| 5385 | |
| 5386 | :: |
| 5387 | |
| 5388 | <result> = bitcast <ty> <value> to <ty2> ; yields ty2 |
| 5389 | |
| 5390 | Overview: |
| 5391 | """"""""" |
| 5392 | |
| 5393 | The '``bitcast``' instruction converts ``value`` to type ``ty2`` without |
| 5394 | changing any bits. |
| 5395 | |
| 5396 | Arguments: |
| 5397 | """""""""" |
| 5398 | |
| 5399 | The '``bitcast``' instruction takes a value to cast, which must be a |
| 5400 | non-aggregate first class value, and a type to cast it to, which must |
| 5401 | also be a non-aggregate :ref:`first class <t_firstclass>` type. The bit |
| 5402 | sizes of ``value`` and the destination type, ``ty2``, must be identical. |
| 5403 | If the source type is a pointer, the destination type must also be a |
| 5404 | pointer. This instruction supports bitwise conversion of vectors to |
| 5405 | integers and to vectors of other types (as long as they have the same |
| 5406 | size). |
| 5407 | |
| 5408 | Semantics: |
| 5409 | """""""""" |
| 5410 | |
| 5411 | The '``bitcast``' instruction converts ``value`` to type ``ty2``. It is |
| 5412 | always a *no-op cast* because no bits change with this conversion. The |
| 5413 | conversion is done as if the ``value`` had been stored to memory and |
| 5414 | read back as type ``ty2``. Pointer (or vector of pointers) types may |
| 5415 | only be converted to other pointer (or vector of pointers) types with |
| 5416 | this instruction. To convert pointers to other types, use the |
| 5417 | :ref:`inttoptr <i_inttoptr>` or :ref:`ptrtoint <i_ptrtoint>` instructions |
| 5418 | first. |
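The "stored to memory and read back" description can be imitated with Python's ``struct`` module, which packs a value into bytes and unpacks the same bytes as another type. This is a sketch with hypothetical function names, and the vector case assumes a little-endian byte order, which the text above does not specify.

```python
import struct


def bitcast_float_to_i32(value):
    """Reinterpret the 32 bits of a float as an i32."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def bitcast_v2i32_to_i64(elements):
    """Reinterpret <2 x i32> as i64 (assumes little-endian memory)."""
    return struct.unpack("<Q", struct.pack("<2I", *elements))[0]


def bitcast_i8_as_signed(value):
    """Read the i8 pattern `value` back as a signed quantity."""
    return struct.unpack("<b", struct.pack("<B", value))[0]


print(hex(bitcast_float_to_i32(1.0)))      # 0x3f800000
print(hex(bitcast_v2i32_to_i64((1, 2))))   # 0x200000001
print(bitcast_i8_as_signed(255))           # -1
```

In every case the bytes are untouched and only their interpretation changes, which is why the source and destination types must have identical bit sizes.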
| 5419 | |
| 5420 | Example: |
| 5421 | """""""" |
| 5422 | |
| 5423 | .. code-block:: llvm |
| 5424 | |
| 5425 | %X = bitcast i8 255 to i8 ; yields i8:-1 |
| 5426 | %Y = bitcast i32* %x to i16* ; yields i16*:%x |
| 5427 | %Z = bitcast <2 x i32> %V to i64 ; yields i64:%V |
| 5428 | %W = bitcast <2 x i32*> %PV to <2 x i64*> ; yields <2 x i64*> |
| 5429 | |
| 5430 | .. _otherops: |
| 5431 | |
| 5432 | Other Operations |
| 5433 | ---------------- |
| 5434 | |
| 5435 | The instructions in this category are the "miscellaneous" instructions, |
| 5436 | which defy better classification. |
| 5437 | |
| 5438 | .. _i_icmp: |
| 5439 | |
| 5440 | '``icmp``' Instruction |
| 5441 | ^^^^^^^^^^^^^^^^^^^^^^ |
| 5442 | |
| 5443 | Syntax: |
| 5444 | """"""" |
| 5445 | |
| 5446 | :: |
| 5447 | |
| 5448 | <result> = icmp <cond> <ty> <op1>, <op2> ; yields {i1} or {<N x i1>}:result |
| 5449 | |
| 5450 | Overview: |
| 5451 | """"""""" |
| 5452 | |
| 5453 | The '``icmp``' instruction returns a boolean value or a vector of |
| 5454 | boolean values based on comparison of its two integer, integer vector, |
| 5455 | pointer, or pointer vector operands. |
| 5456 | |
| 5457 | Arguments: |
| 5458 | """""""""" |
| 5459 | |
| 5460 | The '``icmp``' instruction takes three operands. The first operand is |
| 5461 | the condition code indicating the kind of comparison to perform. It is |
| 5462 | not a value, just a keyword. The possible condition codes are: |
| 5463 | |
| 5464 | #. ``eq``: equal |
| 5465 | #. ``ne``: not equal |
| 5466 | #. ``ugt``: unsigned greater than |
| 5467 | #. ``uge``: unsigned greater or equal |
| 5468 | #. ``ult``: unsigned less than |
| 5469 | #. ``ule``: unsigned less or equal |
| 5470 | #. ``sgt``: signed greater than |
| 5471 | #. ``sge``: signed greater or equal |
| 5472 | #. ``slt``: signed less than |
| 5473 | #. ``sle``: signed less or equal |
| 5474 | |
| 5475 | The remaining two arguments must be :ref:`integer <t_integer>`, |
| 5476 | :ref:`pointer <t_pointer>`, or integer :ref:`vector <t_vector>` typed. They |
| 5477 | must also be identical types. |
| 5478 | |
| 5479 | Semantics: |
| 5480 | """""""""" |
| 5481 | |
| 5482 | The '``icmp``' instruction compares ``op1`` and ``op2`` according to the condition |
| 5483 | code given as ``cond``. The comparison performed always yields either an |
| 5484 | :ref:`i1 <t_integer>` or vector of ``i1`` result, as follows: |
| 5485 | |
| 5486 | #. ``eq``: yields ``true`` if the operands are equal, ``false`` |
| 5487 | otherwise. No sign interpretation is necessary or performed. |
| 5488 | #. ``ne``: yields ``true`` if the operands are unequal, ``false`` |
| 5489 | otherwise. No sign interpretation is necessary or performed. |
| 5490 | #. ``ugt``: interprets the operands as unsigned values and yields |
| 5491 | ``true`` if ``op1`` is greater than ``op2``. |
| 5492 | #. ``uge``: interprets the operands as unsigned values and yields |
| 5493 | ``true`` if ``op1`` is greater than or equal to ``op2``. |
| 5494 | #. ``ult``: interprets the operands as unsigned values and yields |
| 5495 | ``true`` if ``op1`` is less than ``op2``. |
| 5496 | #. ``ule``: interprets the operands as unsigned values and yields |
| 5497 | ``true`` if ``op1`` is less than or equal to ``op2``. |
| 5498 | #. ``sgt``: interprets the operands as signed values and yields ``true`` |
| 5499 | if ``op1`` is greater than ``op2``. |
| 5500 | #. ``sge``: interprets the operands as signed values and yields ``true`` |
| 5501 | if ``op1`` is greater than or equal to ``op2``. |
| 5502 | #. ``slt``: interprets the operands as signed values and yields ``true`` |
| 5503 | if ``op1`` is less than ``op2``. |
| 5504 | #. ``sle``: interprets the operands as signed values and yields ``true`` |
| 5505 | if ``op1`` is less than or equal to ``op2``. |
| 5506 | |
| 5507 | If the operands are :ref:`pointer <t_pointer>` typed, the pointer values |
| 5508 | are compared as if they were integers. |
| 5509 | |
| 5510 | If the operands are integer vectors, then they are compared element by |
| 5511 | element. The result is an ``i1`` vector with the same number of elements |
| 5512 | as the values being compared. Otherwise, the result is an ``i1``. |
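The condition codes above differ only in whether the operands are read as signed or unsigned before comparing. A compact Python model, with hypothetical helper names and operands given as integers of a stated bit width:

```python
CONDITIONS = ("eq", "ne", "ugt", "uge", "ult", "ule",
              "sgt", "sge", "slt", "sle")


def to_signed(x, bits):
    """Reinterpret a bit pattern as a two's complement signed value."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >> (bits - 1) else x


def icmp(cond, op1, op2, bits):
    """Return the i1 result of comparing two `bits`-wide integers."""
    if cond not in CONDITIONS:
        raise ValueError("unknown condition code: " + cond)
    mask = (1 << bits) - 1
    a, b = op1 & mask, op2 & mask            # unsigned view of the bits
    if cond.startswith("s"):
        a, b = to_signed(a, bits), to_signed(b, bits)
    relation = cond if cond in ("eq", "ne") else cond[1:]
    return {
        "eq": a == b, "ne": a != b,
        "gt": a > b, "ge": a >= b,
        "lt": a < b, "le": a <= b,
    }[relation]


def icmp_vector(cond, v1, v2, bits):
    """Element-by-element comparison, yielding a list of i1 results."""
    return [icmp(cond, a, b, bits) for a, b in zip(v1, v2)]


print(icmp("ule", -4, 5, 16))   # False: -4 is 65532 when read as unsigned
print(icmp("sle", -4, 5, 16))   # True
```

The pair of calls at the end shows why the same operands can give different answers: the ``i16`` pattern for ``-4`` is a large number under an unsigned reading.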
| 5513 | |
| 5514 | Example: |
| 5515 | """""""" |
| 5516 | |
| 5517 | .. code-block:: llvm |
| 5518 | |
| 5519 | <result> = icmp eq i32 4, 5 ; yields: result=false |
| 5520 | <result> = icmp ne float* %X, %X ; yields: result=false |
| 5521 | <result> = icmp ult i16 4, 5 ; yields: result=true |
| 5522 | <result> = icmp sgt i16 4, 5 ; yields: result=false |
| 5523 | <result> = icmp ule i16 -4, 5 ; yields: result=false |
| 5524 | <result> = icmp sge i16 4, 5 ; yields: result=false |
| 5525 | |
| 5526 | Note that the code generator does not yet support vector types with the |
| 5527 | ``icmp`` instruction. |
| 5528 | |
| 5529 | .. _i_fcmp: |
| 5530 | |
| 5531 | '``fcmp``' Instruction |
| 5532 | ^^^^^^^^^^^^^^^^^^^^^^ |
| 5533 | |
| 5534 | Syntax: |
| 5535 | """"""" |
| 5536 | |
| 5537 | :: |
| 5538 | |
| 5539 | <result> = fcmp <cond> <ty> <op1>, <op2> ; yields {i1} or {<N x i1>}:result |
| 5540 | |
| 5541 | Overview: |
| 5542 | """"""""" |
| 5543 | |
| 5544 | The '``fcmp``' instruction returns a boolean value or vector of boolean |
| 5545 | values based on comparison of its operands. |
| 5546 | |
| 5547 | If the operands are floating point scalars, then the result type is a |
| 5548 | boolean (:ref:`i1 <t_integer>`). |
| 5549 | |
| 5550 | If the operands are floating point vectors, then the result type is a |
| 5551 | vector of boolean with the same number of elements as the operands being |
| 5552 | compared. |
| 5553 | |
| 5554 | Arguments: |
| 5555 | """""""""" |
| 5556 | |
| 5557 | The '``fcmp``' instruction takes three operands. The first operand is |
| 5558 | the condition code indicating the kind of comparison to perform. It is |
| 5559 | not a value, just a keyword. The possible condition codes are: |
| 5560 | |
| 5561 | #. ``false``: no comparison, always returns false |
| 5562 | #. ``oeq``: ordered and equal |
| 5563 | #. ``ogt``: ordered and greater than |
| 5564 | #. ``oge``: ordered and greater than or equal |
| 5565 | #. ``olt``: ordered and less than |
| 5566 | #. ``ole``: ordered and less than or equal |
| 5567 | #. ``one``: ordered and not equal |
| 5568 | #. ``ord``: ordered (no nans) |
| 5569 | #. ``ueq``: unordered or equal |
| 5570 | #. ``ugt``: unordered or greater than |
| 5571 | #. ``uge``: unordered or greater than or equal |
| 5572 | #. ``ult``: unordered or less than |
| 5573 | #. ``ule``: unordered or less than or equal |
| 5574 | #. ``une``: unordered or not equal |
| 5575 | #. ``uno``: unordered (either nans) |
| 5576 | #. ``true``: no comparison, always returns true |
| 5577 | |
| 5578 | *Ordered* means that neither operand is a QNAN while *unordered* means |
| 5579 | that either operand may be a QNAN. |
| 5580 | |
| 5581 | Both ``op1`` and ``op2`` must be of either a :ref:`floating |
| 5582 | point <t_floating>` type or a :ref:`vector <t_vector>` of floating point |
| 5583 | type. They must have identical types. |
| 5584 | |
| 5585 | Semantics: |
| 5586 | """""""""" |
| 5587 | |
| 5588 | The '``fcmp``' instruction compares ``op1`` and ``op2`` according to the |
| 5589 | condition code given as ``cond``. If the operands are vectors, then the |
| 5590 | vectors are compared element by element. Each comparison performed |
| 5591 | always yields an :ref:`i1 <t_integer>` result, as follows: |
| 5592 | |
| 5593 | #. ``false``: always yields ``false``, regardless of operands. |
| 5594 | #. ``oeq``: yields ``true`` if both operands are not a QNAN and ``op1`` |
| 5595 | is equal to ``op2``. |
| 5596 | #. ``ogt``: yields ``true`` if both operands are not a QNAN and ``op1`` |
| 5597 | is greater than ``op2``. |
| 5598 | #. ``oge``: yields ``true`` if both operands are not a QNAN and ``op1`` |
| 5599 | is greater than or equal to ``op2``. |
| 5600 | #. ``olt``: yields ``true`` if both operands are not a QNAN and ``op1`` |
| 5601 | is less than ``op2``. |
| 5602 | #. ``ole``: yields ``true`` if both operands are not a QNAN and ``op1`` |
| 5603 | is less than or equal to ``op2``. |
| 5604 | #. ``one``: yields ``true`` if both operands are not a QNAN and ``op1`` |
| 5605 | is not equal to ``op2``. |
| 5606 | #. ``ord``: yields ``true`` if both operands are not a QNAN. |
| 5607 | #. ``ueq``: yields ``true`` if either operand is a QNAN or ``op1`` is |
| 5608 | equal to ``op2``. |
| 5609 | #. ``ugt``: yields ``true`` if either operand is a QNAN or ``op1`` is |
| 5610 | greater than ``op2``. |
| 5611 | #. ``uge``: yields ``true`` if either operand is a QNAN or ``op1`` is |
| 5612 | greater than or equal to ``op2``. |
| 5613 | #. ``ult``: yields ``true`` if either operand is a QNAN or ``op1`` is |
| 5614 | less than ``op2``. |
| 5615 | #. ``ule``: yields ``true`` if either operand is a QNAN or ``op1`` is |
| 5616 | less than or equal to ``op2``. |
| 5617 | #. ``une``: yields ``true`` if either operand is a QNAN or ``op1`` is |
| 5618 | not equal to ``op2``. |
| 5619 | #. ``uno``: yields ``true`` if either operand is a QNAN. |
| 5620 | #. ``true``: always yields ``true``, regardless of operands. |
| 5621 | |
| 5622 | Example: |
| 5623 | """""""" |
| 5624 | |
| 5625 | .. code-block:: llvm |
| 5626 | |
| 5627 | <result> = fcmp oeq float 4.0, 5.0 ; yields: result=false |
| 5628 | <result> = fcmp one float 4.0, 5.0 ; yields: result=true |
| 5629 | <result> = fcmp olt float 4.0, 5.0 ; yields: result=true |
| 5630 | <result> = fcmp ueq double 1.0, 2.0 ; yields: result=false |
| 5631 | |
| 5632 | Note that the code generator does not yet support vector types with the |
| 5633 | ``fcmp`` instruction. |
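
The ordered and unordered predicates differ only when at least one operand
is a NaN. The following is a minimal sketch of that difference; the function
names ``@is_nan``, ``@ordered_lt`` and ``@unordered_lt`` are hypothetical and
chosen only for illustration.

.. code-block:: llvm

    ; Hypothetical helper: a value is unordered with itself only if it is a NaN.
    define i1 @is_nan(double %x) {
    entry:
      %r = fcmp uno double %x, %x
      ret i1 %r
    }

    ; Hypothetical helper: yields false whenever %a or %b is a NaN.
    define i1 @ordered_lt(double %a, double %b) {
    entry:
      %r = fcmp olt double %a, %b
      ret i1 %r
    }

    ; Hypothetical helper: yields true whenever %a or %b is a NaN.
    define i1 @unordered_lt(double %a, double %b) {
    entry:
      %r = fcmp ult double %a, %b
      ret i1 %r
    }

For operands that are not NaNs, ``@ordered_lt`` and ``@unordered_lt`` yield
the same result.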
| 5634 | |
| 5635 | .. _i_phi: |
| 5636 | |
| 5637 | '``phi``' Instruction |
| 5638 | ^^^^^^^^^^^^^^^^^^^^^ |
| 5639 | |
| 5640 | Syntax: |
| 5641 | """"""" |
| 5642 | |
| 5643 | :: |
| 5644 | |
| 5645 | <result> = phi <ty> [ <val0>, <label0>], ... |
| 5646 | |
| 5647 | Overview: |
| 5648 | """"""""" |
| 5649 | |
| 5650 | The '``phi``' instruction is used to implement the φ node in the SSA |
| 5651 | graph representing the function. |
| 5652 | |
| 5653 | Arguments: |
| 5654 | """""""""" |
| 5655 | |
| 5656 | The type of the incoming values is specified with the first type field. |
| 5657 | After this, the '``phi``' instruction takes a list of pairs as |
| 5658 | arguments, with one pair for each predecessor basic block of the current |
| 5659 | block. Only values of :ref:`first class <t_firstclass>` type may be used as |
| 5660 | the value arguments to the PHI node. Only labels may be used as the |
| 5661 | label arguments. |
| 5662 | |
| 5663 | There must be no non-phi instructions between the start of a basic block |
| 5664 | and the PHI instructions: i.e. PHI instructions must be first in a basic |
| 5665 | block. |
| 5666 | |
| 5667 | For the purposes of the SSA form, the use of each incoming value is |
| 5668 | deemed to occur on the edge from the corresponding predecessor block to |
| 5669 | the current block (but after any definition of an '``invoke``' |
| 5670 | instruction's return value on the same edge). |
| 5671 | |
| 5672 | Semantics: |
| 5673 | """""""""" |
| 5674 | |
| 5675 | At runtime, the '``phi``' instruction logically takes on the value |
| 5676 | specified by the pair corresponding to the predecessor basic block that |
| 5677 | executed just prior to the current block. |
| 5678 | |
| 5679 | Example: |
| 5680 | """""""" |
| 5681 | |
| 5682 | .. code-block:: llvm |
| 5683 | |
| 5684 | Loop: ; Infinite loop that counts from 0 on up... |
| 5685 | %indvar = phi i32 [ 0, %LoopHeader ], [ %nextindvar, %Loop ] |
| 5686 | %nextindvar = add i32 %indvar, 1 |
| 5687 | br label %Loop |
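
Besides loops, a common use of '``phi``' is merging values at the join point
of a conditional. The sketch below assumes a hypothetical function ``@max``
with two predecessor blocks, ``%then`` and ``%else``, and shows one pair per
predecessor.

.. code-block:: llvm

    ; Hypothetical function: returns the larger of two signed integers.
    define i32 @max(i32 %a, i32 %b) {
    entry:
      %cmp = icmp sgt i32 %a, %b
      br i1 %cmp, label %then, label %else

    then:
      br label %merge

    else:
      br label %merge

    merge:
      ; One [ value, label ] pair for each predecessor of %merge.
      %result = phi i32 [ %a, %then ], [ %b, %else ]
      ret i32 %result
    }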
| 5688 | |
| 5689 | .. _i_select: |
| 5690 | |
| 5691 | '``select``' Instruction |
| 5692 | ^^^^^^^^^^^^^^^^^^^^^^^^ |
| 5693 | |
| 5694 | Syntax: |
| 5695 | """"""" |
| 5696 | |
| 5697 | :: |
| 5698 | |
| 5699 | <result> = select selty <cond>, <ty> <val1>, <ty> <val2> ; yields ty |
| 5700 | |
| 5701 | selty is either i1 or {<N x i1>} |
| 5702 | |
| 5703 | Overview: |
| 5704 | """"""""" |
| 5705 | |
| 5706 | The '``select``' instruction is used to choose one value based on a |
| 5707 | condition, without branching. |
| 5708 | |
| 5709 | Arguments: |
| 5710 | """""""""" |
| 5711 | |
| 5712 | The '``select``' instruction requires an 'i1' value or a vector of 'i1' |
| 5713 | values indicating the condition, and two values of the same :ref:`first |
| 5714 | class <t_firstclass>` type. If the val1/val2 are vectors and the |
| 5715 | condition is a scalar, then entire vectors are selected, not individual |
| 5716 | elements. |
| 5717 | |
| 5718 | Semantics: |
| 5719 | """""""""" |
| 5720 | |
| 5721 | If the condition is an i1 and it evaluates to 1, the instruction returns |
| 5722 | the first value argument; otherwise, it returns the second value |
| 5723 | argument. |
| 5724 | |
| 5725 | If the condition is a vector of i1, then the value arguments must be |
| 5726 | vectors of the same size, and the selection is done element by element. |
| 5727 | |
| 5728 | Example: |
| 5729 | """""""" |
| 5730 | |
| 5731 | .. code-block:: llvm |
| 5732 | |
| 5733 | %X = select i1 true, i8 17, i8 42 ; yields i8:17 |
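
The two forms of vector selection described above can be sketched as
follows. The function names ``@blend`` and ``@pick`` are hypothetical; the
first selects element by element, the second selects an entire vector.

.. code-block:: llvm

    ; Vector condition: each result element comes from %a or %b depending
    ; on the corresponding element of %mask.
    define <4 x i32> @blend(<4 x i1> %mask, <4 x i32> %a, <4 x i32> %b) {
    entry:
      %r = select <4 x i1> %mask, <4 x i32> %a, <4 x i32> %b
      ret <4 x i32> %r
    }

    ; Scalar condition: the whole of %a or the whole of %b is selected.
    define <4 x i32> @pick(i1 %c, <4 x i32> %a, <4 x i32> %b) {
    entry:
      %r = select i1 %c, <4 x i32> %a, <4 x i32> %b
      ret <4 x i32> %r
    }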
| 5734 | |
| 5735 | .. _i_call: |
| 5736 | |
| 5737 | '``call``' Instruction |
| 5738 | ^^^^^^^^^^^^^^^^^^^^^^ |
| 5739 | |
| 5740 | Syntax: |
| 5741 | """"""" |
| 5742 | |
| 5743 | :: |
| 5744 | |
| 5745 | <result> = [tail] call [cconv] [ret attrs] <ty> [<fnty>*] <fnptrval>(<function args>) [fn attrs] |
| 5746 | |
| 5747 | Overview: |
| 5748 | """"""""" |
| 5749 | |
| 5750 | The '``call``' instruction represents a simple function call. |
| 5751 | |
| 5752 | Arguments: |
| 5753 | """""""""" |
| 5754 | |
| 5755 | This instruction requires several arguments: |
| 5756 | |
| 5757 | #. The optional "tail" marker indicates that the callee function does |
| 5758 | not access any allocas or varargs in the caller. Note that calls may |
| 5759 | be marked "tail" even if they do not occur before a |
| 5760 | :ref:`ret <i_ret>` instruction. If the "tail" marker is present, the |
| 5761 | function call is eligible for tail call optimization, but `might not |
| 5762 | in fact be optimized into a jump <CodeGenerator.html#tailcallopt>`_. |
| 5763 | The code generator may optimize calls marked "tail" with either 1) |
| 5764 | automatic `sibling call |
| 5765 | optimization <CodeGenerator.html#sibcallopt>`_ when the caller and |
| 5766 | callee have matching signatures, or 2) forced tail call optimization |
| 5767 | when the following extra requirements are met: |
| 5768 | |
| 5769 | - Caller and callee both have the calling convention ``fastcc``. |
| 5770 | - The call is in tail position (ret immediately follows call and ret |
| 5771 | uses value of call or is void). |
| 5772 | - Option ``-tailcallopt`` is enabled, or |
| 5773 | ``llvm::GuaranteedTailCallOpt`` is ``true``. |
| 5774 | - `Platform specific constraints are |
| 5775 | met. <CodeGenerator.html#tailcallopt>`_ |
| 5776 | |
| 5777 | #. The optional "cconv" marker indicates which :ref:`calling |
| 5778 | convention <callingconv>` the call should use. If none is |
| 5779 | specified, the call defaults to using C calling conventions. The |
| 5780 | calling convention of the call must match the calling convention of |
| 5781 | the target function, or else the behavior is undefined. |
| 5782 | #. The optional :ref:`Parameter Attributes <paramattrs>` list for return |
| 5783 | values. Only '``zeroext``', '``signext``', and '``inreg``' attributes |
| 5784 | are valid here. |
| 5785 | #. '``ty``': the type of the call instruction itself which is also the |
| 5786 | type of the return value. Functions that return no value are marked |
| 5787 | ``void``. |
| 5788 | #. '``fnty``': shall be the signature of the pointer to function value |
| 5789 | being invoked. The argument types must match the types implied by |
| 5790 | this signature. This type can be omitted if the function is not |
| 5791 | varargs and if the function type does not return a pointer to a |
| 5792 | function. |
| 5793 | #. '``fnptrval``': An LLVM value containing a pointer to a function to |
| 5794 | be invoked. In most cases, this is a direct function invocation, but |
| 5795 | indirect ``call``'s are just as possible, calling an arbitrary pointer |
| 5796 | to function value. |
| 5797 | #. '``function args``': argument list whose types match the function |
| 5798 | signature argument types and parameter attributes. All arguments must |
| 5799 | be of :ref:`first class <t_firstclass>` type. If the function signature |
| 5800 | indicates the function accepts a variable number of arguments, the |
| 5801 | extra arguments can be specified. |
| 5802 | #. The optional :ref:`function attributes <fnattrs>` list. Only |
| 5803 | '``noreturn``', '``nounwind``', '``readonly``' and '``readnone``' |
| 5804 | attributes are valid here. |
| 5805 | |
| 5806 | Semantics: |
| 5807 | """""""""" |
| 5808 | |
| 5809 | The '``call``' instruction is used to cause control flow to transfer to |
| 5810 | a specified function, with its incoming arguments bound to the specified |
| 5811 | values. Upon a '``ret``' instruction in the called function, control |
| 5812 | flow continues with the instruction after the function call, and the |
| 5813 | return value of the function is bound to the result argument. |
| 5814 | |
| 5815 | Example: |
| 5816 | """""""" |
| 5817 | |
| 5818 | .. code-block:: llvm |
| 5819 | |
| 5820 | %retval = call i32 @test(i32 %argc) |
| 5821 | call i32 (i8*, ...)* @printf(i8* %msg, i32 12, i8 42) ; yields i32 |
| 5822 | %X = tail call i32 @foo() ; yields i32 |
| 5823 | %Y = tail call fastcc i32 @foo() ; yields i32 |
| 5824 | call void %foo(i8 97 signext) |
| 5825 | |
| 5826 | %struct.A = type { i32, i8 } |
| 5827 | %r = call %struct.A @foo() ; yields { i32, i8 } |
| 5828 | %gr = extractvalue %struct.A %r, 0 ; yields i32 |
| 5829 | %gr1 = extractvalue %struct.A %r, 1 ; yields i8 |
| 5830 | call void @foo() noreturn ; indicates that @foo never returns normally |
| 5831 | %ZZ = call zeroext i32 @bar() ; return value is zero-extended |
| 5832 | |
| 5833 | LLVM treats calls to some functions with names and arguments that match |
| 5834 | the standard C99 library as being the C99 library functions, and may |
| 5835 | perform optimizations or generate code for them under that assumption. |
| 5836 | This is something we'd like to change in the future to provide better |
| 5837 | support for freestanding environments and non-C-based languages. |
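
As a sketch of two cases from the argument list above, the following shows
an indirect call through a function pointer and a ``fastcc`` call in tail
position. The names ``@apply``, ``@caller`` and ``@callee`` are hypothetical,
and whether the tail call is actually turned into a jump still depends on
the code generator options and platform constraints listed above.

.. code-block:: llvm

    ; Indirect call: %fn is an arbitrary pointer-to-function value.
    define i32 @apply(i32 (i32)* %fn, i32 %x) {
    entry:
      %r = call i32 %fn(i32 %x)
      ret i32 %r
    }

    declare fastcc i32 @callee(i32)

    ; Caller and callee both use fastcc, and the ret immediately follows
    ; the call and returns its value, so the call is in tail position.
    define fastcc i32 @caller(i32 %x) {
    entry:
      %r = tail call fastcc i32 @callee(i32 %x)
      ret i32 %r
    }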
| 5838 | |
| 5839 | .. _i_va_arg: |
| 5840 | |
| 5841 | '``va_arg``' Instruction |
| 5842 | ^^^^^^^^^^^^^^^^^^^^^^^^ |
| 5843 | |
| 5844 | Syntax: |
| 5845 | """"""" |
| 5846 | |
| 5847 | :: |
| 5848 | |
| 5849 | <resultval> = va_arg <va_list*> <arglist>, <argty> |
| 5850 | |
| 5851 | Overview: |
| 5852 | """"""""" |
| 5853 | |
| 5854 | The '``va_arg``' instruction is used to access arguments passed through |
| 5855 | the "variable argument" area of a function call. It is used to implement |
| 5856 | the ``va_arg`` macro in C. |
| 5857 | |
| 5858 | Arguments: |
| 5859 | """""""""" |
| 5860 | |
| 5861 | This instruction takes a ``va_list*`` value and the type of the |
| 5862 | argument. It returns a value of the specified argument type and |
| 5863 | increments the ``va_list`` to point to the next argument. The actual |
| 5864 | type of ``va_list`` is target specific. |
| 5865 | |
| 5866 | Semantics: |
| 5867 | """""""""" |
| 5868 | |
| 5869 | The '``va_arg``' instruction loads an argument of the specified type |
| 5870 | from the specified ``va_list`` and causes the ``va_list`` to point to |
| 5871 | the next argument. For more information, see the variable argument |
| 5872 | handling :ref:`Intrinsic Functions <int_varargs>`. |
| 5873 | |
| 5874 | It is legal for this instruction to be called in a function which does |
| 5875 | not take a variable number of arguments, for example, the ``vfprintf`` |
| 5876 | function. |
| 5877 | |
| 5878 | ``va_arg`` is an LLVM instruction instead of an :ref:`intrinsic |
| 5879 | function <intrinsics>` because it takes a type as an argument. |
| 5880 | |
| 5881 | Example: |
| 5882 | """""""" |
| 5883 | |
| 5884 | See the :ref:`variable argument processing <int_varargs>` section. |
| 5885 | |
| 5886 | Note that the code generator does not yet fully support ``va_arg`` on many |
| 5887 | targets. Also, it does not currently support va\_arg with aggregate |
| 5888 | types on any target. |
| 5889 | |
| 5890 | .. _i_landingpad: |
| 5891 | |
| 5892 | '``landingpad``' Instruction |
| 5893 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 5894 | |
| 5895 | Syntax: |
| 5896 | """"""" |
| 5897 | |
| 5898 | :: |
| 5899 | |
| 5900 | <resultval> = landingpad <resultty> personality <type> <pers_fn> <clause>+ |
| 5901 | <resultval> = landingpad <resultty> personality <type> <pers_fn> cleanup <clause>* |
| 5902 | |
| 5903 | <clause> := catch <type> <value> |
| 5904 | <clause> := filter <array constant type> <array constant> |
| 5905 | |
| 5906 | Overview: |
| 5907 | """"""""" |
| 5908 | |
| 5909 | The '``landingpad``' instruction is used by `LLVM's exception handling |
| 5910 | system <ExceptionHandling.html#overview>`_ to specify that a basic block |
| 5911 | is a landing pad --- one where the exception lands, and corresponds to the |
| 5912 | code found in the ``catch`` portion of a ``try``/``catch`` sequence. It |
| 5913 | defines values supplied by the personality function (``pers_fn``) upon |
| 5914 | re-entry to the function. The ``resultval`` has the type ``resultty``. |
| 5915 | |
| 5916 | Arguments: |
| 5917 | """""""""" |
| 5918 | |
| 5919 | This instruction takes a ``pers_fn`` value. This is the personality |
| 5920 | function associated with the unwinding mechanism. The optional |
| 5921 | ``cleanup`` flag indicates that the landing pad block is a cleanup. |
| 5922 | |
| 5923 | A ``clause`` begins with the clause type --- ``catch`` or ``filter`` --- and |
| 5924 | contains the global variable representing the "type" that may be caught |
| 5925 | or filtered respectively. Unlike the ``catch`` clause, the ``filter`` |
| 5926 | clause takes an array constant as its argument. Use |
| 5927 | "``[0 x i8**] undef``" for a filter which cannot throw. The |
| 5928 | '``landingpad``' instruction must contain *at least* one ``clause`` or |
| 5929 | the ``cleanup`` flag. |
| 5930 | |
| 5931 | Semantics: |
| 5932 | """""""""" |
| 5933 | |
| 5934 | The '``landingpad``' instruction defines the values which are set by the |
| 5935 | personality function (``pers_fn``) upon re-entry to the function, and |
| 5936 | therefore the "result type" of the ``landingpad`` instruction. As with |
| 5937 | calling conventions, how the personality function results are |
| 5938 | represented in LLVM IR is target specific. |
| 5939 | |
| 5940 | The clauses are applied in order from top to bottom. If two |
| 5941 | ``landingpad`` instructions are merged together through inlining, the |
| 5942 | clauses from the calling function are appended to the list of clauses. |
| 5943 | When the call stack is being unwound due to an exception being thrown, |
| 5944 | the exception is compared against each ``clause`` in turn. If it doesn't |
| 5945 | match any of the clauses, and the ``cleanup`` flag is not set, then |
| 5946 | unwinding continues further up the call stack. |
| 5947 | |
| 5948 | The ``landingpad`` instruction has several restrictions: |
| 5949 | |
| 5950 | - A landing pad block is a basic block which is the unwind destination |
| 5951 | of an '``invoke``' instruction. |
| 5952 | - A landing pad block must have a '``landingpad``' instruction as its |
| 5953 | first non-PHI instruction. |
| 5954 | - There can be only one '``landingpad``' instruction within the landing |
| 5955 | pad block. |
| 5956 | - A basic block that is not a landing pad block may not include a |
| 5957 | '``landingpad``' instruction. |
| 5958 | - All '``landingpad``' instructions in a function must have the same |
| 5959 | personality function. |
| 5960 | |
| 5961 | Example: |
| 5962 | """""""" |
| 5963 | |
| 5964 | .. code-block:: llvm |
| 5965 | |
| 5966 | ;; A landing pad which can catch an integer. |
| 5967 | %res = landingpad { i8*, i32 } personality i32 (...)* @__gxx_personality_v0 |
| 5968 | catch i8** @_ZTIi |
| 5969 | ;; A landing pad that is a cleanup. |
| 5970 | %res = landingpad { i8*, i32 } personality i32 (...)* @__gxx_personality_v0 |
| 5971 | cleanup |
| 5972 | ;; A landing pad which can catch an integer and can only throw a double. |
| 5973 | %res = landingpad { i8*, i32 } personality i32 (...)* @__gxx_personality_v0 |
| 5974 | catch i8** @_ZTIi |
| 5975 | filter [1 x i8**] [@_ZTId] |
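
To show where a landing pad block sits in a function, the sketch below
places a cleanup landing pad at the unwind destination of an '``invoke``'.
The names ``@may_throw`` and ``@run`` are hypothetical; the personality
function is the one used in the examples above.

.. code-block:: llvm

    declare void @may_throw()
    declare i32 @__gxx_personality_v0(...)

    define void @run() {
    entry:
      invoke void @may_throw()
              to label %cont unwind label %lpad

    cont:
      ret void

    lpad:
      ; The landingpad is the first non-PHI instruction of the unwind
      ; destination block.
      %exn = landingpad { i8*, i32 } personality i32 (...)* @__gxx_personality_v0
               cleanup
      ; Cleanup code would run here before unwinding continues.
      resume { i8*, i32 } %exn
    }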
| 5976 | |
| 5977 | .. _intrinsics: |
| 5978 | |
| 5979 | Intrinsic Functions |
| 5980 | =================== |
| 5981 | |
| 5982 | LLVM supports the notion of an "intrinsic function". These functions |
| 5983 | have well known names and semantics and are required to follow certain |
| 5984 | restrictions. Overall, these intrinsics represent an extension mechanism |
| 5985 | for the LLVM language that does not require changing all of the |
| 5986 | transformations in LLVM when adding to the language (or the bitcode |
| 5987 | reader/writer, the parser, etc...). |
| 5988 | |
| 5989 | Intrinsic function names must all start with an "``llvm.``" prefix. This |
| 5990 | prefix is reserved in LLVM for intrinsic names; thus, function names may |
| 5991 | not begin with this prefix. Intrinsic functions must always be external |
| 5992 | functions: you cannot define the body of intrinsic functions. Intrinsic |
| 5993 | functions may only be used in call or invoke instructions: it is illegal |
| 5994 | to take the address of an intrinsic function. Additionally, because |
| 5995 | intrinsic functions are part of the LLVM language, any intrinsic that is |
| 5996 | added must be documented here. |
| 5997 | |
| 5998 | Some intrinsic functions can be overloaded, i.e., the intrinsic |
| 5999 | represents a family of functions that perform the same operation but on |
| 6000 | different data types. Because LLVM can represent over 8 million |
| 6001 | different integer types, overloading is used commonly to allow an |
| 6002 | intrinsic function to operate on any integer type. One or more of the |
| 6003 | argument types or the result type can be overloaded to accept any |
| 6004 | integer type. Argument types may also be defined as exactly matching a |
| 6005 | previous argument's type or the result type. This allows an intrinsic |
| 6006 | function which accepts multiple arguments, but needs all of them to be |
| 6007 | of the same type, to only be overloaded with respect to a single |
| 6008 | argument or the result. |
| 6009 | |
| 6010 | Overloaded intrinsics have the names of their overloaded argument |
| 6011 | types encoded into their function names, each preceded by a period. Only |
| 6012 | those types which are overloaded result in a name suffix. Arguments |
| 6013 | whose type is matched against another type do not. For example, the |
| 6014 | ``llvm.ctpop`` function can take an integer of any width and returns an |
| 6015 | integer of exactly the same integer width. This leads to a family of |
| 6016 | functions such as ``i8 @llvm.ctpop.i8(i8 %val)`` and |
| 6017 | ``i29 @llvm.ctpop.i29(i29 %val)``. Only one type, the return type, is |
| 6018 | overloaded, and only one type suffix is required. Because the argument's |
| 6019 | type is matched against the return type, it does not require its own |
| 6020 | name suffix. |
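
As a short illustration of this naming scheme, the two members of the
``llvm.ctpop`` family mentioned above can be declared and called as follows.
The wrapper function ``@popcount8`` is hypothetical.

.. code-block:: llvm

    ; Each declaration carries exactly one type suffix, for the overloaded type.
    declare i8 @llvm.ctpop.i8(i8)
    declare i29 @llvm.ctpop.i29(i29)

    ; Hypothetical wrapper that counts the set bits of an 8-bit value.
    define i8 @popcount8(i8 %val) {
    entry:
      %n = call i8 @llvm.ctpop.i8(i8 %val)
      ret i8 %n
    }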
| 6021 | |
| 6022 | To learn how to add an intrinsic function, please see the `Extending |
| 6023 | LLVM Guide <ExtendingLLVM.html>`_. |
| 6024 | |
| 6025 | .. _int_varargs: |
| 6026 | |
| 6027 | Variable Argument Handling Intrinsics |
| 6028 | ------------------------------------- |
| 6029 | |
| 6030 | Variable argument support is defined in LLVM with the |
| 6031 | :ref:`va_arg <i_va_arg>` instruction and these three intrinsic |
| 6032 | functions. These functions are related to the similarly named macros |
| 6033 | defined in the ``<stdarg.h>`` header file. |
| 6034 | |
| 6035 | All of these functions operate on arguments that use a target-specific |
| 6036 | value type "``va_list``". The LLVM assembly language reference manual |
| 6037 | does not define what this type is, so all transformations should be |
| 6038 | prepared to handle these functions regardless of the type used. |
| 6039 | |
| 6040 | This example shows how the :ref:`va_arg <i_va_arg>` instruction and the |
| 6041 | variable argument handling intrinsic functions are used. |
| 6042 | |
| 6043 | .. code-block:: llvm |
| 6044 | |
| 6045 | define i32 @test(i32 %X, ...) { |
| 6046 | ; Initialize variable argument processing |
| 6047 | %ap = alloca i8* |
| 6048 | %ap2 = bitcast i8** %ap to i8* |
| 6049 | call void @llvm.va_start(i8* %ap2) |
| 6050 | |
| 6051 | ; Read a single integer argument |
| 6052 | %tmp = va_arg i8** %ap, i32 |
| 6053 | |
| 6054 | ; Demonstrate usage of llvm.va_copy and llvm.va_end |
| 6055 | %aq = alloca i8* |
| 6056 | %aq2 = bitcast i8** %aq to i8* |
| 6057 | call void @llvm.va_copy(i8* %aq2, i8* %ap2) |
| 6058 | call void @llvm.va_end(i8* %aq2) |
| 6059 | |
| 6060 | ; Stop processing of arguments. |
| 6061 | call void @llvm.va_end(i8* %ap2) |
| 6062 | ret i32 %tmp |
| 6063 | } |
| 6064 | |
| 6065 | declare void @llvm.va_start(i8*) |
| 6066 | declare void @llvm.va_copy(i8*, i8*) |
| 6067 | declare void @llvm.va_end(i8*) |
| 6068 | |
| 6069 | .. _int_va_start: |
| 6070 | |
| 6071 | '``llvm.va_start``' Intrinsic |
| 6072 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 6073 | |
| 6074 | Syntax: |
| 6075 | """"""" |
| 6076 | |
| 6077 | :: |
| 6078 | |
| 6079 | declare void @llvm.va_start(i8* <arglist>) |
| 6080 | |
| 6081 | Overview: |
| 6082 | """"""""" |
| 6083 | |
| 6084 | The '``llvm.va_start``' intrinsic initializes ``*<arglist>`` for |
| 6085 | subsequent use by ``va_arg``. |
| 6086 | |
| 6087 | Arguments: |
| 6088 | """""""""" |
| 6089 | |
| 6090 | The argument is a pointer to a ``va_list`` element to initialize. |
| 6091 | |
| 6092 | Semantics: |
| 6093 | """""""""" |
| 6094 | |
| 6095 | The '``llvm.va_start``' intrinsic works just like the ``va_start`` macro |
| 6096 | available in C. In a target-dependent way, it initializes the |
| 6097 | ``va_list`` element to which the argument points, so that the next call |
| 6098 | to ``va_arg`` will produce the first variable argument passed to the |
| 6099 | function. Unlike the C ``va_start`` macro, this intrinsic does not need |
| 6100 | to know the last argument of the function as the compiler can figure |
| 6101 | that out. |
| 6102 | |
| 6103 | '``llvm.va_end``' Intrinsic |
| 6104 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 6105 | |
| 6106 | Syntax: |
| 6107 | """"""" |
| 6108 | |
| 6109 | :: |
| 6110 | |
| 6111 | declare void @llvm.va_end(i8* <arglist>) |
| 6112 | |
| 6113 | Overview: |
| 6114 | """"""""" |
| 6115 | |
| 6116 | The '``llvm.va_end``' intrinsic destroys ``*<arglist>``, which has been |
| 6117 | initialized previously with ``llvm.va_start`` or ``llvm.va_copy``. |
| 6118 | |
| 6119 | Arguments: |
| 6120 | """""""""" |
| 6121 | |
| 6122 | The argument is a pointer to a ``va_list`` to destroy. |
| 6123 | |
| 6124 | Semantics: |
| 6125 | """""""""" |
| 6126 | |
| 6127 | The '``llvm.va_end``' intrinsic works just like the ``va_end`` macro |
| 6128 | available in C. In a target-dependent way, it destroys the ``va_list`` |
| 6129 | element to which the argument points. Calls to |
| 6130 | :ref:`llvm.va_start <int_va_start>` and |
| 6131 | :ref:`llvm.va_copy <int_va_copy>` must be matched exactly with calls to |
| 6132 | ``llvm.va_end``. |
| 6133 | |
| 6134 | .. _int_va_copy: |
| 6135 | |
| 6136 | '``llvm.va_copy``' Intrinsic |
| 6137 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 6138 | |
| 6139 | Syntax: |
| 6140 | """"""" |
| 6141 | |
| 6142 | :: |
| 6143 | |
| 6144 | declare void @llvm.va_copy(i8* <destarglist>, i8* <srcarglist>) |
| 6145 | |
| 6146 | Overview: |
| 6147 | """"""""" |
| 6148 | |
| 6149 | The '``llvm.va_copy``' intrinsic copies the current argument position |
| 6150 | from the source argument list to the destination argument list. |
| 6151 | |
| 6152 | Arguments: |
| 6153 | """""""""" |
| 6154 | |
| 6155 | The first argument is a pointer to a ``va_list`` element to initialize. |
| 6156 | The second argument is a pointer to a ``va_list`` element to copy from. |
| 6157 | |
| 6158 | Semantics: |
| 6159 | """""""""" |
| 6160 | |
| 6161 | The '``llvm.va_copy``' intrinsic works just like the ``va_copy`` macro |
| 6162 | available in C. In a target-dependent way, it copies the source |
| 6163 | ``va_list`` element into the destination ``va_list`` element. This |
| 6164 | intrinsic is necessary because the ``llvm.va_start`` intrinsic may be |
| 6165 | arbitrarily complex and require, for example, memory allocation. |
| 6166 | |
| 6167 | Accurate Garbage Collection Intrinsics |
| 6168 | -------------------------------------- |
| 6169 | |
| 6170 | LLVM support for `Accurate Garbage Collection <GarbageCollection.html>`_ |
| 6171 | (GC) requires the implementation and generation of these intrinsics. |
| 6172 | These intrinsics allow identification of :ref:`GC roots on the |
| 6173 | stack <int_gcroot>`, as well as garbage collector implementations that |
| 6174 | require :ref:`read <int_gcread>` and :ref:`write <int_gcwrite>` barriers. |
| 6175 | Front-ends for type-safe garbage collected languages should generate |
| 6176 | these intrinsics to make use of the LLVM garbage collectors. For more |
| 6177 | details, see `Accurate Garbage Collection with |
| 6178 | LLVM <GarbageCollection.html>`_. |
| 6179 | |
| 6180 | The garbage collection intrinsics only operate on objects in the generic |
| 6181 | address space (address space zero). |
| 6182 | |
| 6183 | .. _int_gcroot: |
| 6184 | |
| 6185 | '``llvm.gcroot``' Intrinsic |
| 6186 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 6187 | |
| 6188 | Syntax: |
| 6189 | """"""" |
| 6190 | |
| 6191 | :: |
| 6192 | |
| 6193 | declare void @llvm.gcroot(i8** %ptrloc, i8* %metadata) |
| 6194 | |
| 6195 | Overview: |
| 6196 | """"""""" |
| 6197 | |
| 6198 | The '``llvm.gcroot``' intrinsic declares the existence of a GC root to |
| 6199 | the code generator, and allows some metadata to be associated with it. |
| 6200 | |
| 6201 | Arguments: |
| 6202 | """""""""" |
| 6203 | |
| 6204 | The first argument specifies the address of a stack object that contains |
| 6205 | the root pointer. The second pointer (which must be either a constant or |
| 6206 | a global value address) contains the meta-data to be associated with the |
| 6207 | root. |
| 6208 | |
| 6209 | Semantics: |
| 6210 | """""""""" |
| 6211 | |
| 6212 | At runtime, a call to this intrinsic stores a null pointer into the |
| 6213 | "ptrloc" location. At compile-time, the code generator generates |
| 6214 | information to allow the runtime to find the pointer at GC safe points. |
| 6215 | The '``llvm.gcroot``' intrinsic may only be used in a function which |
| 6216 | :ref:`specifies a GC algorithm <gc>`. |
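
A minimal sketch of declaring a root is shown below. It assumes the
built-in ``shadow-stack`` collector as the GC algorithm and passes null
metadata; the function name ``@uses_root`` is hypothetical.

.. code-block:: llvm

    declare void @llvm.gcroot(i8** %ptrloc, i8* %metadata)

    define void @uses_root() gc "shadow-stack" {
    entry:
      ; The stack object that holds the root pointer.
      %root = alloca i8*
      ; Register the stack slot as a GC root, with no associated metadata.
      call void @llvm.gcroot(i8** %root, i8* null)
      ret void
    }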
| 6217 | |
| 6218 | .. _int_gcread: |
| 6219 | |
| 6220 | '``llvm.gcread``' Intrinsic |
| 6221 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 6222 | |
| 6223 | Syntax: |
| 6224 | """"""" |
| 6225 | |
| 6226 | :: |
| 6227 | |
| 6228 | declare i8* @llvm.gcread(i8* %ObjPtr, i8** %Ptr) |
| 6229 | |
| 6230 | Overview: |
| 6231 | """"""""" |
| 6232 | |
| 6233 | The '``llvm.gcread``' intrinsic identifies reads of references from heap |
| 6234 | locations, allowing garbage collector implementations that require read |
| 6235 | barriers. |
| 6236 | |
| 6237 | Arguments: |
| 6238 | """""""""" |
| 6239 | |
| 6240 | The second argument is the address to read from, which should be an |
| 6241 | address allocated from the garbage collector. The first argument is a |
| 6242 | pointer to the start of the referenced object, if needed by the language |
| 6243 | runtime (otherwise null). |
| 6244 | |
| 6245 | Semantics: |
| 6246 | """""""""" |
| 6247 | |
| 6248 | The '``llvm.gcread``' intrinsic has the same semantics as a load |
| 6249 | instruction, but may be replaced with substantially more complex code by |
| 6250 | the garbage collector runtime, as needed. The '``llvm.gcread``' |
| 6251 | intrinsic may only be used in a function which :ref:`specifies a GC |
| 6252 | algorithm <gc>`. |
| 6253 | |
| 6254 | .. _int_gcwrite: |
| 6255 | |
| 6256 | '``llvm.gcwrite``' Intrinsic |
| 6257 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 6258 | |
| 6259 | Syntax: |
| 6260 | """"""" |
| 6261 | |
| 6262 | :: |
| 6263 | |
| 6264 | declare void @llvm.gcwrite(i8* %P1, i8* %Obj, i8** %P2) |
| 6265 | |
| 6266 | Overview: |
| 6267 | """"""""" |
| 6268 | |
| 6269 | The '``llvm.gcwrite``' intrinsic identifies writes of references to heap |
| 6270 | locations, allowing garbage collector implementations that require write |
| 6271 | barriers (such as generational or reference counting collectors). |
| 6272 | |
| 6273 | Arguments: |
| 6274 | """""""""" |
| 6275 | |
| 6276 | The first argument is the reference to store, the second is the start of |
| 6277 | the object to store it to, and the third is the address of the field of |
| 6278 | Obj to store to. If the runtime does not require a pointer to the |
| 6279 | object, Obj may be null. |
| 6280 | |
| 6281 | Semantics: |
| 6282 | """""""""" |
| 6283 | |
| 6284 | The '``llvm.gcwrite``' intrinsic has the same semantics as a store |
| 6285 | instruction, but may be replaced with substantially more complex code by |
| 6286 | the garbage collector runtime, as needed. The '``llvm.gcwrite``' |
| 6287 | intrinsic may only be used in a function which :ref:`specifies a GC |
| 6288 | algorithm <gc>`. |
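
A minimal sketch of a write barrier call follows. The function name
``@store_field`` and the ``shadow-stack`` collector are assumptions chosen for
illustration.

.. code-block:: llvm

      declare void @llvm.gcwrite(i8* %P1, i8* %Obj, i8** %P2)

      ; Hypothetical helper: stores the reference %value into a field of %obj.
      define void @store_field(i8* %value, i8* %obj, i8** %field) gc "shadow-stack" {
      entry:
        ; Behaves like "store i8* %value, i8** %field" unless the collector
        ; replaces it with barrier code.
        call void @llvm.gcwrite(i8* %value, i8* %obj, i8** %field)
        ret void
      }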
| 6289 | |
| 6290 | Code Generator Intrinsics |
| 6291 | ------------------------- |
| 6292 | |
| 6293 | These intrinsics are provided by LLVM to expose special features that |
| 6294 | may only be implemented with code generator support. |
| 6295 | |
| 6296 | '``llvm.returnaddress``' Intrinsic |
| 6297 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 6298 | |
| 6299 | Syntax: |
| 6300 | """"""" |
| 6301 | |
| 6302 | :: |
| 6303 | |
      declare i8* @llvm.returnaddress(i32 <level>)
| 6305 | |
| 6306 | Overview: |
| 6307 | """"""""" |
| 6308 | |
| 6309 | The '``llvm.returnaddress``' intrinsic attempts to compute a |
| 6310 | target-specific value indicating the return address of the current |
| 6311 | function or one of its callers. |
| 6312 | |
| 6313 | Arguments: |
| 6314 | """""""""" |
| 6315 | |
| 6316 | The argument to this intrinsic indicates which function to return the |
| 6317 | address for. Zero indicates the calling function, one indicates its |
| 6318 | caller, etc. The argument is **required** to be a constant integer |
| 6319 | value. |
| 6320 | |
| 6321 | Semantics: |
| 6322 | """""""""" |
| 6323 | |
| 6324 | The '``llvm.returnaddress``' intrinsic either returns a pointer |
| 6325 | indicating the return address of the specified call frame, or zero if it |
| 6326 | cannot be identified. The value returned by this intrinsic is likely to |
| 6327 | be incorrect or 0 for arguments other than zero, so it should only be |
| 6328 | used for debugging purposes. |
| 6329 | |
| 6330 | Note that calling this intrinsic does not prevent function inlining or |
| 6331 | other aggressive transformations, so the value returned may not be that |
| 6332 | of the obvious source-language caller. |
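
A minimal sketch of a level-zero query follows; the function name
``@my_return_address`` is hypothetical. The level must be a constant, so it is
written as a literal.

.. code-block:: llvm

      declare i8* @llvm.returnaddress(i32)

      ; Hypothetical helper: returns the address this function will return to.
      define i8* @my_return_address() {
      entry:
        ; Level 0 asks for the return address of the current function.
        %ra = call i8* @llvm.returnaddress(i32 0)
        ret i8* %ra
      }

If this helper is inlined, the value describes the function it was inlined
into, which is the caveat described above.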
| 6333 | |
| 6334 | '``llvm.frameaddress``' Intrinsic |
| 6335 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 6336 | |
| 6337 | Syntax: |
| 6338 | """"""" |
| 6339 | |
| 6340 | :: |
| 6341 | |
| 6342 | declare i8* @llvm.frameaddress(i32 <level>) |
| 6343 | |
| 6344 | Overview: |
| 6345 | """"""""" |
| 6346 | |
| 6347 | The '``llvm.frameaddress``' intrinsic attempts to return the |
| 6348 | target-specific frame pointer value for the specified stack frame. |
| 6349 | |
| 6350 | Arguments: |
| 6351 | """""""""" |
| 6352 | |
| 6353 | The argument to this intrinsic indicates which function to return the |
| 6354 | frame pointer for. Zero indicates the calling function, one indicates |
| 6355 | its caller, etc. The argument is **required** to be a constant integer |
| 6356 | value. |
| 6357 | |
| 6358 | Semantics: |
| 6359 | """""""""" |
| 6360 | |
| 6361 | The '``llvm.frameaddress``' intrinsic either returns a pointer |
| 6362 | indicating the frame address of the specified call frame, or zero if it |
| 6363 | cannot be identified. The value returned by this intrinsic is likely to |
| 6364 | be incorrect or 0 for arguments other than zero, so it should only be |
| 6365 | used for debugging purposes. |
| 6366 | |
| 6367 | Note that calling this intrinsic does not prevent function inlining or |
| 6368 | other aggressive transformations, so the value returned may not be that |
| 6369 | of the obvious source-language caller. |
| 6370 | |
| 6371 | .. _int_stacksave: |
| 6372 | |
| 6373 | '``llvm.stacksave``' Intrinsic |
| 6374 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 6375 | |
| 6376 | Syntax: |
| 6377 | """"""" |
| 6378 | |
| 6379 | :: |
| 6380 | |
| 6381 | declare i8* @llvm.stacksave() |
| 6382 | |
| 6383 | Overview: |
| 6384 | """"""""" |
| 6385 | |
| 6386 | The '``llvm.stacksave``' intrinsic is used to remember the current state |
| 6387 | of the function stack, for use with |
| 6388 | :ref:`llvm.stackrestore <int_stackrestore>`. This is useful for |
| 6389 | implementing language features like scoped automatic variable sized |
| 6390 | arrays in C99. |
| 6391 | |
| 6392 | Semantics: |
| 6393 | """""""""" |
| 6394 | |
This intrinsic returns an opaque pointer value that can be passed to
| 6396 | :ref:`llvm.stackrestore <int_stackrestore>`. When an |
| 6397 | ``llvm.stackrestore`` intrinsic is executed with a value saved from |
| 6398 | ``llvm.stacksave``, it effectively restores the state of the stack to |
| 6399 | the state it was in when the ``llvm.stacksave`` intrinsic executed. In |
| 6400 | practice, this pops any :ref:`alloca <i_alloca>` blocks from the stack that |
| 6401 | were allocated after the ``llvm.stacksave`` was executed. |
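
A minimal sketch of the save/restore pairing follows, roughly as a front end
might lower a scoped variable-length array. The function name
``@scoped_buffer`` and the parameter ``%n`` are hypothetical.

.. code-block:: llvm

      declare i8* @llvm.stacksave()
      declare void @llvm.stackrestore(i8* %ptr)

      ; Hypothetical function: allocates %n bytes for the duration of a scope.
      define void @scoped_buffer(i32 %n) {
      entry:
        ; Remember the stack state on entry to the scope.
        %saved = call i8* @llvm.stacksave()
        ; Dynamically sized allocation made inside the scope.
        %buf = alloca i8, i32 %n
        store i8 0, i8* %buf
        ; Leaving the scope: pop every alloca made since the save.
        call void @llvm.stackrestore(i8* %saved)
        ret void
      }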
| 6402 | |
| 6403 | .. _int_stackrestore: |
| 6404 | |
| 6405 | '``llvm.stackrestore``' Intrinsic |
| 6406 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 6407 | |
| 6408 | Syntax: |
| 6409 | """"""" |
| 6410 | |
| 6411 | :: |
| 6412 | |
| 6413 | declare void @llvm.stackrestore(i8* %ptr) |
| 6414 | |
| 6415 | Overview: |
| 6416 | """"""""" |
| 6417 | |
| 6418 | The '``llvm.stackrestore``' intrinsic is used to restore the state of |
| 6419 | the function stack to the state it was in when the corresponding |
| 6420 | :ref:`llvm.stacksave <int_stacksave>` intrinsic executed. This is |
| 6421 | useful for implementing language features like scoped automatic variable |
| 6422 | sized arrays in C99. |
| 6423 | |
| 6424 | Semantics: |
| 6425 | """""""""" |
| 6426 | |
| 6427 | See the description for :ref:`llvm.stacksave <int_stacksave>`. |
| 6428 | |
| 6429 | '``llvm.prefetch``' Intrinsic |
| 6430 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 6431 | |
| 6432 | Syntax: |
| 6433 | """"""" |
| 6434 | |
| 6435 | :: |
| 6436 | |
| 6437 | declare void @llvm.prefetch(i8* <address>, i32 <rw>, i32 <locality>, i32 <cache type>) |
| 6438 | |
| 6439 | Overview: |
| 6440 | """"""""" |
| 6441 | |
| 6442 | The '``llvm.prefetch``' intrinsic is a hint to the code generator to |
insert a prefetch instruction if supported; otherwise, it is a no-op.
| 6444 | Prefetches have no effect on the behavior of the program but can change |
| 6445 | its performance characteristics. |
| 6446 | |
| 6447 | Arguments: |
| 6448 | """""""""" |
| 6449 | |
``address`` is the address to be prefetched. ``rw`` specifies whether the
fetch is for a read (0) or a write (1). ``locality`` is a temporal
locality specifier ranging from 0 (no locality) to 3 (extremely local;
keep in cache). ``cache type`` specifies whether the prefetch is
performed on the data (1) or instruction (0) cache. The ``rw``,
``locality`` and ``cache type`` arguments must be constant integers.
| 6457 | |
| 6458 | Semantics: |
| 6459 | """""""""" |
| 6460 | |
| 6461 | This intrinsic does not modify the behavior of the program. In |
| 6462 | particular, prefetches cannot trap and do not produce a value. On |
| 6463 | targets that support this intrinsic, the prefetch can provide hints to |
| 6464 | the processor cache for better performance. |
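
A minimal sketch of a prefetch call follows; the function name ``@warm`` is
hypothetical. The three integer arguments are literals because they must be
constants.

.. code-block:: llvm

      declare void @llvm.prefetch(i8*, i32, i32, i32)

      ; Hypothetical helper: hints that %p will be read soon and reused often.
      define void @warm(i8* %p) {
      entry:
        ; rw = 0 (read), locality = 3 (keep in cache), cache type = 1 (data).
        call void @llvm.prefetch(i8* %p, i32 0, i32 3, i32 1)
        ret void
      }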
| 6465 | |
| 6466 | '``llvm.pcmarker``' Intrinsic |
| 6467 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 6468 | |
| 6469 | Syntax: |
| 6470 | """"""" |
| 6471 | |
| 6472 | :: |
| 6473 | |
| 6474 | declare void @llvm.pcmarker(i32 <id>) |
| 6475 | |
| 6476 | Overview: |
| 6477 | """"""""" |
| 6478 | |
| 6479 | The '``llvm.pcmarker``' intrinsic is a method to export a Program |
| 6480 | Counter (PC) in a region of code to simulators and other tools. The |
| 6481 | method is target specific, but it is expected that the marker will use |
| 6482 | exported symbols to transmit the PC of the marker. The marker makes no |
| 6483 | guarantees that it will remain with any specific instruction after |
| 6484 | optimizations. It is possible that the presence of a marker will inhibit |
| 6485 | optimizations. The intended use is to be inserted after optimizations to |
| 6486 | allow correlations of simulation runs. |
| 6487 | |
| 6488 | Arguments: |
| 6489 | """""""""" |
| 6490 | |
| 6491 | ``id`` is a numerical id identifying the marker. |
| 6492 | |
| 6493 | Semantics: |
| 6494 | """""""""" |
| 6495 | |
| 6496 | This intrinsic does not modify the behavior of the program. Backends |
| 6497 | that do not support this intrinsic may ignore it. |
| 6498 | |
| 6499 | '``llvm.readcyclecounter``' Intrinsic |
| 6500 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 6501 | |
| 6502 | Syntax: |
| 6503 | """"""" |
| 6504 | |
| 6505 | :: |
| 6506 | |
| 6507 | declare i64 @llvm.readcyclecounter() |
| 6508 | |
| 6509 | Overview: |
| 6510 | """"""""" |
| 6511 | |
| 6512 | The '``llvm.readcyclecounter``' intrinsic provides access to the cycle |
| 6513 | counter register (or similar low latency, high accuracy clocks) on those |
| 6514 | targets that support it. On X86, it should map to RDTSC. On Alpha, it |
should map to RPCC. As the backing counters overflow quickly (on the
order of 9 seconds on Alpha), this should only be used for small
timings.
| 6518 | |
| 6519 | Semantics: |
| 6520 | """""""""" |
| 6521 | |
When directly supported, reading the cycle counter should not modify any
memory. Implementations are allowed to return either an
application-specific value or a system-wide value. On backends without
support, this is lowered to a constant 0.
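
A minimal sketch of timing a short region follows. The names ``@time_work``
and ``@work`` are hypothetical; ``@work`` stands in for whatever code is being
measured.

.. code-block:: llvm

      declare i64 @llvm.readcyclecounter()
      declare void @work()   ; hypothetical function being timed

      define i64 @time_work() {
      entry:
        %start = call i64 @llvm.readcyclecounter()
        call void @work()
        %end = call i64 @llvm.readcyclecounter()
        ; Elapsed count; this is 0 on backends that lower the intrinsic to 0.
        %elapsed = sub i64 %end, %start
        ret i64 %elapsed
      }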
| 6526 | |
| 6527 | Standard C Library Intrinsics |
| 6528 | ----------------------------- |
| 6529 | |
| 6530 | LLVM provides intrinsics for a few important standard C library |
| 6531 | functions. These intrinsics allow source-language front-ends to pass |
| 6532 | information about the alignment of the pointer arguments to the code |
| 6533 | generator, providing opportunity for more efficient code generation. |
| 6534 | |
| 6535 | .. _int_memcpy: |
| 6536 | |
| 6537 | '``llvm.memcpy``' Intrinsic |
| 6538 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 6539 | |
| 6540 | Syntax: |
| 6541 | """"""" |
| 6542 | |
| 6543 | This is an overloaded intrinsic. You can use ``llvm.memcpy`` on any |
| 6544 | integer bit width and for different address spaces. Not all targets |
| 6545 | support all bit widths however. |
| 6546 | |
| 6547 | :: |
| 6548 | |
| 6549 | declare void @llvm.memcpy.p0i8.p0i8.i32(i8* <dest>, i8* <src>, |
| 6550 | i32 <len>, i32 <align>, i1 <isvolatile>) |
| 6551 | declare void @llvm.memcpy.p0i8.p0i8.i64(i8* <dest>, i8* <src>, |
| 6552 | i64 <len>, i32 <align>, i1 <isvolatile>) |
| 6553 | |
| 6554 | Overview: |
| 6555 | """"""""" |
| 6556 | |
| 6557 | The '``llvm.memcpy.*``' intrinsics copy a block of memory from the |
| 6558 | source location to the destination location. |
| 6559 | |
Note that, unlike the standard libc function, the ``llvm.memcpy.*``
intrinsics do not return a value and take extra alignment/isvolatile
arguments, and the pointers can be in specified address spaces.
| 6563 | |
| 6564 | Arguments: |
| 6565 | """""""""" |
| 6566 | |
| 6567 | The first argument is a pointer to the destination, the second is a |
| 6568 | pointer to the source. The third argument is an integer argument |
| 6569 | specifying the number of bytes to copy, the fourth argument is the |
| 6570 | alignment of the source and destination locations, and the fifth is a |
| 6571 | boolean indicating a volatile access. |
| 6572 | |
| 6573 | If the call to this intrinsic has an alignment value that is not 0 or 1, |
| 6574 | then the caller guarantees that both the source and destination pointers |
| 6575 | are aligned to that boundary. |
| 6576 | |
| 6577 | If the ``isvolatile`` parameter is ``true``, the ``llvm.memcpy`` call is |
| 6578 | a :ref:`volatile operation <volatile>`. The detailed access behavior is not |
| 6579 | very cleanly specified and it is unwise to depend on it. |
| 6580 | |
| 6581 | Semantics: |
| 6582 | """""""""" |
| 6583 | |
The '``llvm.memcpy.*``' intrinsics copy a block of memory from the
source location to the destination location, which are not allowed to
overlap. They copy "len" bytes of memory. If the pointers are known
to be aligned to some boundary, this can be specified as the fourth
argument; otherwise it should be set to 0 or 1.
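
A minimal sketch of a fixed-size copy follows; the function name ``@copy16``
is hypothetical, and the 4-byte alignment is an assumption that the caller of
this helper would have to guarantee.

.. code-block:: llvm

      declare void @llvm.memcpy.p0i8.p0i8.i32(i8*, i8*, i32, i32, i1)

      ; Hypothetical helper: copies 16 bytes between non-overlapping buffers.
      define void @copy16(i8* %dst, i8* %src) {
      entry:
        ; len = 16, align = 4 (both pointers), isvolatile = false.
        call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dst, i8* %src,
                                             i32 16, i32 4, i1 false)
        ret void
      }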
| 6589 | |
| 6590 | '``llvm.memmove``' Intrinsic |
| 6591 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 6592 | |
| 6593 | Syntax: |
| 6594 | """"""" |
| 6595 | |
This is an overloaded intrinsic. You can use ``llvm.memmove`` on any
integer bit width and for different address spaces. Not all targets
support all bit widths however.
| 6599 | |
| 6600 | :: |
| 6601 | |
| 6602 | declare void @llvm.memmove.p0i8.p0i8.i32(i8* <dest>, i8* <src>, |
| 6603 | i32 <len>, i32 <align>, i1 <isvolatile>) |
| 6604 | declare void @llvm.memmove.p0i8.p0i8.i64(i8* <dest>, i8* <src>, |
| 6605 | i64 <len>, i32 <align>, i1 <isvolatile>) |
| 6606 | |
| 6607 | Overview: |
| 6608 | """"""""" |
| 6609 | |
| 6610 | The '``llvm.memmove.*``' intrinsics move a block of memory from the |
| 6611 | source location to the destination location. It is similar to the |
| 6612 | '``llvm.memcpy``' intrinsic but allows the two memory locations to |
| 6613 | overlap. |
| 6614 | |
Note that, unlike the standard libc function, the ``llvm.memmove.*``
intrinsics do not return a value and take extra alignment/isvolatile
arguments, and the pointers can be in specified address spaces.
| 6618 | |
| 6619 | Arguments: |
| 6620 | """""""""" |
| 6621 | |
| 6622 | The first argument is a pointer to the destination, the second is a |
| 6623 | pointer to the source. The third argument is an integer argument |
| 6624 | specifying the number of bytes to copy, the fourth argument is the |
| 6625 | alignment of the source and destination locations, and the fifth is a |
| 6626 | boolean indicating a volatile access. |
| 6627 | |
| 6628 | If the call to this intrinsic has an alignment value that is not 0 or 1, |
| 6629 | then the caller guarantees that the source and destination pointers are |
| 6630 | aligned to that boundary. |
| 6631 | |
| 6632 | If the ``isvolatile`` parameter is ``true``, the ``llvm.memmove`` call |
| 6633 | is a :ref:`volatile operation <volatile>`. The detailed access behavior is |
| 6634 | not very cleanly specified and it is unwise to depend on it. |
| 6635 | |
| 6636 | Semantics: |
| 6637 | """""""""" |
| 6638 | |
The '``llvm.memmove.*``' intrinsics copy a block of memory from the
source location to the destination location, which may overlap. They
copy "len" bytes of memory. If the pointers are known to be
aligned to some boundary, this can be specified as the fourth argument;
otherwise it should be set to 0 or 1.
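
A minimal sketch of an overlapping move follows: it shifts the contents of a
16-byte buffer down by one byte, which ``llvm.memcpy`` would not permit. The
function name ``@shift_down`` and the buffer size are hypothetical.

.. code-block:: llvm

      declare void @llvm.memmove.p0i8.p0i8.i32(i8*, i8*, i32, i32, i1)

      ; Hypothetical helper: moves bytes 1..15 of %buf to positions 0..14.
      define void @shift_down(i8* %buf) {
      entry:
        ; Source starts one byte past the destination, so the ranges overlap.
        %src = getelementptr i8* %buf, i32 1
        ; len = 15, align = 1 (no alignment claimed), isvolatile = false.
        call void @llvm.memmove.p0i8.p0i8.i32(i8* %buf, i8* %src,
                                              i32 15, i32 1, i1 false)
        ret void
      }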
| 6644 | |
| 6645 | '``llvm.memset.*``' Intrinsics |
| 6646 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 6647 | |
| 6648 | Syntax: |
| 6649 | """"""" |
| 6650 | |
This is an overloaded intrinsic. You can use ``llvm.memset`` on any
integer bit width and for different address spaces. However, not all
targets support all bit widths.
| 6654 | |
| 6655 | :: |
| 6656 | |
| 6657 | declare void @llvm.memset.p0i8.i32(i8* <dest>, i8 <val>, |
| 6658 | i32 <len>, i32 <align>, i1 <isvolatile>) |
| 6659 | declare void @llvm.memset.p0i8.i64(i8* <dest>, i8 <val>, |
| 6660 | i64 <len>, i32 <align>, i1 <isvolatile>) |
| 6661 | |
| 6662 | Overview: |
| 6663 | """"""""" |
| 6664 | |
| 6665 | The '``llvm.memset.*``' intrinsics fill a block of memory with a |
| 6666 | particular byte value. |
| 6667 | |
| 6668 | Note that, unlike the standard libc function, the ``llvm.memset`` |
| 6669 | intrinsic does not return a value and takes extra alignment/volatile |
| 6670 | arguments. Also, the destination can be in an arbitrary address space. |
| 6671 | |
| 6672 | Arguments: |
| 6673 | """""""""" |
| 6674 | |
The first argument is a pointer to the destination to fill, the second
is the byte value with which to fill it, the third argument is an
integer argument specifying the number of bytes to fill, the fourth
argument is the known alignment of the destination location, and the
fifth is a boolean indicating a volatile access.
| 6679 | |
| 6680 | If the call to this intrinsic has an alignment value that is not 0 or 1, |
| 6681 | then the caller guarantees that the destination pointer is aligned to |
| 6682 | that boundary. |
| 6683 | |
| 6684 | If the ``isvolatile`` parameter is ``true``, the ``llvm.memset`` call is |
| 6685 | a :ref:`volatile operation <volatile>`. The detailed access behavior is not |
| 6686 | very cleanly specified and it is unwise to depend on it. |
| 6687 | |
| 6688 | Semantics: |
| 6689 | """""""""" |
| 6690 | |
The '``llvm.memset.*``' intrinsics fill "len" bytes of memory starting
at the destination location. If the destination is known to be aligned
to some boundary, this can be specified as the fourth argument;
otherwise it should be set to 0 or 1.
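
A minimal sketch of zeroing a buffer follows; the function name ``@zero32``
is hypothetical, and the 8-byte alignment is an assumption that the caller of
this helper would have to guarantee.

.. code-block:: llvm

      declare void @llvm.memset.p0i8.i64(i8*, i8, i64, i32, i1)

      ; Hypothetical helper: fills 32 bytes at %p with zero.
      define void @zero32(i8* %p) {
      entry:
        ; val = 0, len = 32, align = 8, isvolatile = false.
        call void @llvm.memset.p0i8.i64(i8* %p, i8 0, i64 32, i32 8, i1 false)
        ret void
      }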
| 6695 | |
| 6696 | '``llvm.sqrt.*``' Intrinsic |
| 6697 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 6698 | |
| 6699 | Syntax: |
| 6700 | """"""" |
| 6701 | |
| 6702 | This is an overloaded intrinsic. You can use ``llvm.sqrt`` on any |
| 6703 | floating point or vector of floating point type. Not all targets support |
| 6704 | all types however. |
| 6705 | |
| 6706 | :: |
| 6707 | |
| 6708 | declare float @llvm.sqrt.f32(float %Val) |
| 6709 | declare double @llvm.sqrt.f64(double %Val) |
| 6710 | declare x86_fp80 @llvm.sqrt.f80(x86_fp80 %Val) |
| 6711 | declare fp128 @llvm.sqrt.f128(fp128 %Val) |
| 6712 | declare ppc_fp128 @llvm.sqrt.ppcf128(ppc_fp128 %Val) |
| 6713 | |
| 6714 | Overview: |
| 6715 | """"""""" |
| 6716 | |
| 6717 | The '``llvm.sqrt``' intrinsics return the sqrt of the specified operand, |
| 6718 | returning the same value as the libm '``sqrt``' functions would. Unlike |
| 6719 | ``sqrt`` in libm, however, ``llvm.sqrt`` has undefined behavior for |
| 6720 | negative numbers other than -0.0 (which allows for better optimization, |
| 6721 | because there is no need to worry about errno being set). |
| 6722 | ``llvm.sqrt(-0.0)`` is defined to return -0.0 like IEEE sqrt. |
| 6723 | |
| 6724 | Arguments: |
| 6725 | """""""""" |
| 6726 | |
| 6727 | The argument and return value are floating point numbers of the same |
| 6728 | type. |
| 6729 | |
| 6730 | Semantics: |
| 6731 | """""""""" |
| 6732 | |
This function returns the square root of the specified operand if it is
a nonnegative floating point number.
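
A minimal sketch follows that takes the square root of a sum of squares, an
operand that is never negative. The function name ``@hypot_sketch`` is
hypothetical.

.. code-block:: llvm

      declare float @llvm.sqrt.f32(float %Val)

      ; Hypothetical helper: sqrt(a*a + b*b) without overflow protection.
      define float @hypot_sketch(float %a, float %b) {
      entry:
        %a2 = fmul float %a, %a
        %b2 = fmul float %b, %b
        %sum = fadd float %a2, %b2
        %r = call float @llvm.sqrt.f32(float %sum)
        ret float %r
      }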
| 6735 | |
| 6736 | '``llvm.powi.*``' Intrinsic |
| 6737 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 6738 | |
| 6739 | Syntax: |
| 6740 | """"""" |
| 6741 | |
| 6742 | This is an overloaded intrinsic. You can use ``llvm.powi`` on any |
| 6743 | floating point or vector of floating point type. Not all targets support |
| 6744 | all types however. |
| 6745 | |
| 6746 | :: |
| 6747 | |
| 6748 | declare float @llvm.powi.f32(float %Val, i32 %power) |
| 6749 | declare double @llvm.powi.f64(double %Val, i32 %power) |
| 6750 | declare x86_fp80 @llvm.powi.f80(x86_fp80 %Val, i32 %power) |
| 6751 | declare fp128 @llvm.powi.f128(fp128 %Val, i32 %power) |
| 6752 | declare ppc_fp128 @llvm.powi.ppcf128(ppc_fp128 %Val, i32 %power) |
| 6753 | |
| 6754 | Overview: |
| 6755 | """"""""" |
| 6756 | |
| 6757 | The '``llvm.powi.*``' intrinsics return the first operand raised to the |
| 6758 | specified (positive or negative) power. The order of evaluation of |
| 6759 | multiplications is not defined. When a vector of floating point type is |
| 6760 | used, the second argument remains a scalar integer value. |
| 6761 | |
| 6762 | Arguments: |
| 6763 | """""""""" |
| 6764 | |
| 6765 | The second argument is an integer power, and the first is a value to |
| 6766 | raise to that power. |
| 6767 | |
| 6768 | Semantics: |
| 6769 | """""""""" |
| 6770 | |
| 6771 | This function returns the first value raised to the second power with an |
| 6772 | unspecified sequence of rounding operations. |
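
A minimal sketch of raising a value to a constant integer power follows; the
function name ``@cube`` is hypothetical. Because the sequence of rounding
operations is unspecified, the result is not guaranteed to be bit-identical
to two explicit ``fmul`` instructions.

.. code-block:: llvm

      declare double @llvm.powi.f64(double %Val, i32 %power)

      ; Hypothetical helper: computes %x raised to the third power.
      define double @cube(double %x) {
      entry:
        %r = call double @llvm.powi.f64(double %x, i32 3)
        ret double %r
      }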
| 6773 | |
| 6774 | '``llvm.sin.*``' Intrinsic |
| 6775 | ^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 6776 | |
| 6777 | Syntax: |
| 6778 | """"""" |
| 6779 | |
| 6780 | This is an overloaded intrinsic. You can use ``llvm.sin`` on any |
| 6781 | floating point or vector of floating point type. Not all targets support |
| 6782 | all types however. |
| 6783 | |
| 6784 | :: |
| 6785 | |
| 6786 | declare float @llvm.sin.f32(float %Val) |
| 6787 | declare double @llvm.sin.f64(double %Val) |
| 6788 | declare x86_fp80 @llvm.sin.f80(x86_fp80 %Val) |
| 6789 | declare fp128 @llvm.sin.f128(fp128 %Val) |
| 6790 | declare ppc_fp128 @llvm.sin.ppcf128(ppc_fp128 %Val) |
| 6791 | |
| 6792 | Overview: |
| 6793 | """"""""" |
| 6794 | |
| 6795 | The '``llvm.sin.*``' intrinsics return the sine of the operand. |
| 6796 | |
| 6797 | Arguments: |
| 6798 | """""""""" |
| 6799 | |
| 6800 | The argument and return value are floating point numbers of the same |
| 6801 | type. |
| 6802 | |
| 6803 | Semantics: |
| 6804 | """""""""" |
| 6805 | |
| 6806 | This function returns the sine of the specified operand, returning the |
| 6807 | same values as the libm ``sin`` functions would, and handles error |
| 6808 | conditions in the same way. |
| 6809 | |
| 6810 | '``llvm.cos.*``' Intrinsic |
| 6811 | ^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 6812 | |
| 6813 | Syntax: |
| 6814 | """"""" |
| 6815 | |
| 6816 | This is an overloaded intrinsic. You can use ``llvm.cos`` on any |
| 6817 | floating point or vector of floating point type. Not all targets support |
| 6818 | all types however. |
| 6819 | |
| 6820 | :: |
| 6821 | |
| 6822 | declare float @llvm.cos.f32(float %Val) |
| 6823 | declare double @llvm.cos.f64(double %Val) |
| 6824 | declare x86_fp80 @llvm.cos.f80(x86_fp80 %Val) |
| 6825 | declare fp128 @llvm.cos.f128(fp128 %Val) |
| 6826 | declare ppc_fp128 @llvm.cos.ppcf128(ppc_fp128 %Val) |
| 6827 | |
| 6828 | Overview: |
| 6829 | """"""""" |
| 6830 | |
| 6831 | The '``llvm.cos.*``' intrinsics return the cosine of the operand. |
| 6832 | |
| 6833 | Arguments: |
| 6834 | """""""""" |
| 6835 | |
| 6836 | The argument and return value are floating point numbers of the same |
| 6837 | type. |
| 6838 | |
| 6839 | Semantics: |
| 6840 | """""""""" |
| 6841 | |
| 6842 | This function returns the cosine of the specified operand, returning the |
| 6843 | same values as the libm ``cos`` functions would, and handles error |
| 6844 | conditions in the same way. |
| 6845 | |
| 6846 | '``llvm.pow.*``' Intrinsic |
| 6847 | ^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 6848 | |
| 6849 | Syntax: |
| 6850 | """"""" |
| 6851 | |
| 6852 | This is an overloaded intrinsic. You can use ``llvm.pow`` on any |
| 6853 | floating point or vector of floating point type. Not all targets support |
| 6854 | all types however. |
| 6855 | |
| 6856 | :: |
| 6857 | |
| 6858 | declare float @llvm.pow.f32(float %Val, float %Power) |
| 6859 | declare double @llvm.pow.f64(double %Val, double %Power) |
| 6860 | declare x86_fp80 @llvm.pow.f80(x86_fp80 %Val, x86_fp80 %Power) |
| 6861 | declare fp128 @llvm.pow.f128(fp128 %Val, fp128 %Power) |
      declare ppc_fp128 @llvm.pow.ppcf128(ppc_fp128 %Val, ppc_fp128 %Power)
| 6863 | |
| 6864 | Overview: |
| 6865 | """"""""" |
| 6866 | |
| 6867 | The '``llvm.pow.*``' intrinsics return the first operand raised to the |
| 6868 | specified (positive or negative) power. |
| 6869 | |
| 6870 | Arguments: |
| 6871 | """""""""" |
| 6872 | |
| 6873 | The second argument is a floating point power, and the first is a value |
| 6874 | to raise to that power. |
| 6875 | |
| 6876 | Semantics: |
| 6877 | """""""""" |
| 6878 | |
| 6879 | This function returns the first value raised to the second power, |
| 6880 | returning the same values as the libm ``pow`` functions would, and |
| 6881 | handles error conditions in the same way. |
| 6882 | |
| 6883 | '``llvm.exp.*``' Intrinsic |
| 6884 | ^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 6885 | |
| 6886 | Syntax: |
| 6887 | """"""" |
| 6888 | |
| 6889 | This is an overloaded intrinsic. You can use ``llvm.exp`` on any |
| 6890 | floating point or vector of floating point type. Not all targets support |
| 6891 | all types however. |
| 6892 | |
| 6893 | :: |
| 6894 | |
| 6895 | declare float @llvm.exp.f32(float %Val) |
| 6896 | declare double @llvm.exp.f64(double %Val) |
| 6897 | declare x86_fp80 @llvm.exp.f80(x86_fp80 %Val) |
| 6898 | declare fp128 @llvm.exp.f128(fp128 %Val) |
| 6899 | declare ppc_fp128 @llvm.exp.ppcf128(ppc_fp128 %Val) |
| 6900 | |
| 6901 | Overview: |
| 6902 | """"""""" |
| 6903 | |
The '``llvm.exp.*``' intrinsics return the base-e exponential of the
operand.
| 6905 | |
| 6906 | Arguments: |
| 6907 | """""""""" |
| 6908 | |
| 6909 | The argument and return value are floating point numbers of the same |
| 6910 | type. |
| 6911 | |
| 6912 | Semantics: |
| 6913 | """""""""" |
| 6914 | |
| 6915 | This function returns the same values as the libm ``exp`` functions |
| 6916 | would, and handles error conditions in the same way. |
| 6917 | |
| 6918 | '``llvm.exp2.*``' Intrinsic |
| 6919 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 6920 | |
| 6921 | Syntax: |
| 6922 | """"""" |
| 6923 | |
| 6924 | This is an overloaded intrinsic. You can use ``llvm.exp2`` on any |
| 6925 | floating point or vector of floating point type. Not all targets support |
| 6926 | all types however. |
| 6927 | |
| 6928 | :: |
| 6929 | |
| 6930 | declare float @llvm.exp2.f32(float %Val) |
| 6931 | declare double @llvm.exp2.f64(double %Val) |
| 6932 | declare x86_fp80 @llvm.exp2.f80(x86_fp80 %Val) |
| 6933 | declare fp128 @llvm.exp2.f128(fp128 %Val) |
| 6934 | declare ppc_fp128 @llvm.exp2.ppcf128(ppc_fp128 %Val) |
| 6935 | |
| 6936 | Overview: |
| 6937 | """"""""" |
| 6938 | |
The '``llvm.exp2.*``' intrinsics return the base-2 exponential of the
operand.
| 6940 | |
| 6941 | Arguments: |
| 6942 | """""""""" |
| 6943 | |
| 6944 | The argument and return value are floating point numbers of the same |
| 6945 | type. |
| 6946 | |
| 6947 | Semantics: |
| 6948 | """""""""" |
| 6949 | |
| 6950 | This function returns the same values as the libm ``exp2`` functions |
| 6951 | would, and handles error conditions in the same way. |
| 6952 | |
| 6953 | '``llvm.log.*``' Intrinsic |
| 6954 | ^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 6955 | |
| 6956 | Syntax: |
| 6957 | """"""" |
| 6958 | |
| 6959 | This is an overloaded intrinsic. You can use ``llvm.log`` on any |
| 6960 | floating point or vector of floating point type. Not all targets support |
| 6961 | all types however. |
| 6962 | |
| 6963 | :: |
| 6964 | |
| 6965 | declare float @llvm.log.f32(float %Val) |
| 6966 | declare double @llvm.log.f64(double %Val) |
| 6967 | declare x86_fp80 @llvm.log.f80(x86_fp80 %Val) |
| 6968 | declare fp128 @llvm.log.f128(fp128 %Val) |
| 6969 | declare ppc_fp128 @llvm.log.ppcf128(ppc_fp128 %Val) |
| 6970 | |
| 6971 | Overview: |
| 6972 | """"""""" |
| 6973 | |
The '``llvm.log.*``' intrinsics return the natural (base-e) logarithm of
the operand.
| 6975 | |
| 6976 | Arguments: |
| 6977 | """""""""" |
| 6978 | |
| 6979 | The argument and return value are floating point numbers of the same |
| 6980 | type. |
| 6981 | |
| 6982 | Semantics: |
| 6983 | """""""""" |
| 6984 | |
| 6985 | This function returns the same values as the libm ``log`` functions |
| 6986 | would, and handles error conditions in the same way. |
| 6987 | |
| 6988 | '``llvm.log10.*``' Intrinsic |
| 6989 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 6990 | |
| 6991 | Syntax: |
| 6992 | """"""" |
| 6993 | |
| 6994 | This is an overloaded intrinsic. You can use ``llvm.log10`` on any |
| 6995 | floating point or vector of floating point type. Not all targets support |
| 6996 | all types however. |
| 6997 | |
| 6998 | :: |
| 6999 | |
| 7000 | declare float @llvm.log10.f32(float %Val) |
| 7001 | declare double @llvm.log10.f64(double %Val) |
| 7002 | declare x86_fp80 @llvm.log10.f80(x86_fp80 %Val) |
| 7003 | declare fp128 @llvm.log10.f128(fp128 %Val) |
| 7004 | declare ppc_fp128 @llvm.log10.ppcf128(ppc_fp128 %Val) |
| 7005 | |
| 7006 | Overview: |
| 7007 | """"""""" |
| 7008 | |
The '``llvm.log10.*``' intrinsics return the base-10 logarithm of the
operand.
| 7010 | |
| 7011 | Arguments: |
| 7012 | """""""""" |
| 7013 | |
| 7014 | The argument and return value are floating point numbers of the same |
| 7015 | type. |
| 7016 | |
| 7017 | Semantics: |
| 7018 | """""""""" |
| 7019 | |
| 7020 | This function returns the same values as the libm ``log10`` functions |
| 7021 | would, and handles error conditions in the same way. |
| 7022 | |
| 7023 | '``llvm.log2.*``' Intrinsic |
| 7024 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 7025 | |
| 7026 | Syntax: |
| 7027 | """"""" |
| 7028 | |
| 7029 | This is an overloaded intrinsic. You can use ``llvm.log2`` on any |
| 7030 | floating point or vector of floating point type. Not all targets support |
| 7031 | all types however. |
| 7032 | |
| 7033 | :: |
| 7034 | |
| 7035 | declare float @llvm.log2.f32(float %Val) |
| 7036 | declare double @llvm.log2.f64(double %Val) |
| 7037 | declare x86_fp80 @llvm.log2.f80(x86_fp80 %Val) |
| 7038 | declare fp128 @llvm.log2.f128(fp128 %Val) |
| 7039 | declare ppc_fp128 @llvm.log2.ppcf128(ppc_fp128 %Val) |
| 7040 | |
| 7041 | Overview: |
| 7042 | """"""""" |
| 7043 | |
The '``llvm.log2.*``' intrinsics return the base-2 logarithm of the
operand.
| 7045 | |
| 7046 | Arguments: |
| 7047 | """""""""" |
| 7048 | |
| 7049 | The argument and return value are floating point numbers of the same |
| 7050 | type. |
| 7051 | |
| 7052 | Semantics: |
| 7053 | """""""""" |
| 7054 | |
| 7055 | This function returns the same values as the libm ``log2`` functions |
| 7056 | would, and handles error conditions in the same way. |
| 7057 | |
| 7058 | '``llvm.fma.*``' Intrinsic |
| 7059 | ^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 7060 | |
| 7061 | Syntax: |
| 7062 | """"""" |
| 7063 | |
| 7064 | This is an overloaded intrinsic. You can use ``llvm.fma`` on any |
| 7065 | floating point or vector of floating point type. Not all targets support |
| 7066 | all types however. |
| 7067 | |
| 7068 | :: |
| 7069 | |
| 7070 | declare float @llvm.fma.f32(float %a, float %b, float %c) |
| 7071 | declare double @llvm.fma.f64(double %a, double %b, double %c) |
| 7072 | declare x86_fp80 @llvm.fma.f80(x86_fp80 %a, x86_fp80 %b, x86_fp80 %c) |
| 7073 | declare fp128 @llvm.fma.f128(fp128 %a, fp128 %b, fp128 %c) |
| 7074 | declare ppc_fp128 @llvm.fma.ppcf128(ppc_fp128 %a, ppc_fp128 %b, ppc_fp128 %c) |
| 7075 | |
| 7076 | Overview: |
| 7077 | """"""""" |
| 7078 | |
| 7079 | The '``llvm.fma.*``' intrinsics perform the fused multiply-add |
| 7080 | operation. |
| 7081 | |
| 7082 | Arguments: |
| 7083 | """""""""" |
| 7084 | |
The arguments and return value are floating point numbers of the same
type.
| 7087 | |
| 7088 | Semantics: |
| 7089 | """""""""" |
| 7090 | |
| 7091 | This function returns the same values as the libm ``fma`` functions |
| 7092 | would. |
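
A minimal sketch of a fused multiply-add call follows; the function name
``@muladd`` is hypothetical. As with the libm ``fma`` function, the result
corresponds to ``a * b + c`` computed with a single rounding step.

.. code-block:: llvm

      declare double @llvm.fma.f64(double %a, double %b, double %c)

      ; Hypothetical helper: returns %a * %b + %c as one fused operation.
      define double @muladd(double %a, double %b, double %c) {
      entry:
        %r = call double @llvm.fma.f64(double %a, double %b, double %c)
        ret double %r
      }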
| 7093 | |
| 7094 | '``llvm.fabs.*``' Intrinsic |
| 7095 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 7096 | |
| 7097 | Syntax: |
| 7098 | """"""" |
| 7099 | |
| 7100 | This is an overloaded intrinsic. You can use ``llvm.fabs`` on any |
| 7101 | floating point or vector of floating point type. Not all targets support |
| 7102 | all types however. |
| 7103 | |
| 7104 | :: |
| 7105 | |
| 7106 | declare float @llvm.fabs.f32(float %Val) |
| 7107 | declare double @llvm.fabs.f64(double %Val) |
| 7108 | declare x86_fp80 @llvm.fabs.f80(x86_fp80 %Val) |
| 7109 | declare fp128 @llvm.fabs.f128(fp128 %Val) |
| 7110 | declare ppc_fp128 @llvm.fabs.ppcf128(ppc_fp128 %Val) |
| 7111 | |
| 7112 | Overview: |
| 7113 | """"""""" |
| 7114 | |
| 7115 | The '``llvm.fabs.*``' intrinsics return the absolute value of the |
| 7116 | operand. |
| 7117 | |
| 7118 | Arguments: |
| 7119 | """""""""" |
| 7120 | |
| 7121 | The argument and return value are floating point numbers of the same |
| 7122 | type. |
| 7123 | |
| 7124 | Semantics: |
| 7125 | """""""""" |
| 7126 | |
| 7127 | This function returns the same values as the libm ``fabs`` functions |
| 7128 | would, and handles error conditions in the same way. |
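
Examples:
"""""""""

A minimal sketch, with illustrative value names:

.. code-block:: llvm

      %abs = call double @llvm.fabs.f64(double %x)   ; magnitude of %x
      %c   = call float @llvm.fabs.f32(float -2.5)   ; yields 2.5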
| 7129 | |
| 7130 | '``llvm.floor.*``' Intrinsic |
| 7131 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 7132 | |
| 7133 | Syntax: |
| 7134 | """"""" |
| 7135 | |
| 7136 | This is an overloaded intrinsic. You can use ``llvm.floor`` on any |
| 7137 | floating point or vector of floating point type. Not all targets support |
| 7138 | all types however. |
| 7139 | |
| 7140 | :: |
| 7141 | |
| 7142 | declare float @llvm.floor.f32(float %Val) |
| 7143 | declare double @llvm.floor.f64(double %Val) |
| 7144 | declare x86_fp80 @llvm.floor.f80(x86_fp80 %Val) |
| 7145 | declare fp128 @llvm.floor.f128(fp128 %Val) |
| 7146 | declare ppc_fp128 @llvm.floor.ppcf128(ppc_fp128 %Val) |
| 7147 | |
| 7148 | Overview: |
| 7149 | """"""""" |
| 7150 | |
| 7151 | The '``llvm.floor.*``' intrinsics return the floor of the operand. |
| 7152 | |
| 7153 | Arguments: |
| 7154 | """""""""" |
| 7155 | |
| 7156 | The argument and return value are floating point numbers of the same |
| 7157 | type. |
| 7158 | |
| 7159 | Semantics: |
| 7160 | """""""""" |
| 7161 | |
| 7162 | This function returns the same values as the libm ``floor`` functions |
| 7163 | would, and handles error conditions in the same way. |
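
Examples:
"""""""""

A minimal sketch showing that the result rounds toward negative infinity, as
libm ``floor`` does; the value names are illustrative:

.. code-block:: llvm

      %p = call float @llvm.floor.f32(float 2.5)    ; yields 2.0
      %n = call float @llvm.floor.f32(float -2.5)   ; yields -3.0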
| 7164 | |
| 7165 | '``llvm.ceil.*``' Intrinsic |
| 7166 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 7167 | |
| 7168 | Syntax: |
| 7169 | """"""" |
| 7170 | |
| 7171 | This is an overloaded intrinsic. You can use ``llvm.ceil`` on any |
| 7172 | floating point or vector of floating point type. Not all targets support |
| 7173 | all types however. |
| 7174 | |
| 7175 | :: |
| 7176 | |
| 7177 | declare float @llvm.ceil.f32(float %Val) |
| 7178 | declare double @llvm.ceil.f64(double %Val) |
| 7179 | declare x86_fp80 @llvm.ceil.f80(x86_fp80 %Val) |
| 7180 | declare fp128 @llvm.ceil.f128(fp128 %Val) |
| 7181 | declare ppc_fp128 @llvm.ceil.ppcf128(ppc_fp128 %Val) |
| 7182 | |
| 7183 | Overview: |
| 7184 | """"""""" |
| 7185 | |
| 7186 | The '``llvm.ceil.*``' intrinsics return the ceiling of the operand. |
| 7187 | |
| 7188 | Arguments: |
| 7189 | """""""""" |
| 7190 | |
| 7191 | The argument and return value are floating point numbers of the same |
| 7192 | type. |
| 7193 | |
| 7194 | Semantics: |
| 7195 | """""""""" |
| 7196 | |
| 7197 | This function returns the same values as the libm ``ceil`` functions |
| 7198 | would, and handles error conditions in the same way. |
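
Examples:
"""""""""

A minimal sketch of the vector form, where each element is rounded toward
positive infinity. The ``v2f64`` suffix assumes the usual overload naming for
vector types:

.. code-block:: llvm

      ; <1.25, -1.25> becomes <2.0, -1.0>
      %r = call <2 x double> @llvm.ceil.v2f64(<2 x double> <double 1.25, double -1.25>)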
| 7199 | |
| 7200 | '``llvm.trunc.*``' Intrinsic |
| 7201 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 7202 | |
| 7203 | Syntax: |
| 7204 | """"""" |
| 7205 | |
| 7206 | This is an overloaded intrinsic. You can use ``llvm.trunc`` on any |
| 7207 | floating point or vector of floating point type. Not all targets support |
| 7208 | all types however. |
| 7209 | |
| 7210 | :: |
| 7211 | |
| 7212 | declare float @llvm.trunc.f32(float %Val) |
| 7213 | declare double @llvm.trunc.f64(double %Val) |
| 7214 | declare x86_fp80 @llvm.trunc.f80(x86_fp80 %Val) |
| 7215 | declare fp128 @llvm.trunc.f128(fp128 %Val) |
| 7216 | declare ppc_fp128 @llvm.trunc.ppcf128(ppc_fp128 %Val) |
| 7217 | |
| 7218 | Overview: |
| 7219 | """"""""" |
| 7220 | |
The '``llvm.trunc.*``' intrinsics return the operand rounded to the
nearest integer not larger in magnitude than the operand.
| 7223 | |
| 7224 | Arguments: |
| 7225 | """""""""" |
| 7226 | |
| 7227 | The argument and return value are floating point numbers of the same |
| 7228 | type. |
| 7229 | |
| 7230 | Semantics: |
| 7231 | """""""""" |
| 7232 | |
| 7233 | This function returns the same values as the libm ``trunc`` functions |
| 7234 | would, and handles error conditions in the same way. |
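
Examples:
"""""""""

A minimal sketch contrasting ``llvm.trunc`` with ``llvm.floor`` on a negative
input. Truncation rounds toward zero, so the two differ only for negative
non-integers:

.. code-block:: llvm

      %t = call double @llvm.trunc.f64(double -2.75)   ; yields -2.0
      %f = call double @llvm.floor.f64(double -2.75)   ; yields -3.0
      %u = call double @llvm.trunc.f64(double 2.75)    ; yields 2.0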
| 7235 | |
| 7236 | '``llvm.rint.*``' Intrinsic |
| 7237 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 7238 | |
| 7239 | Syntax: |
| 7240 | """"""" |
| 7241 | |
| 7242 | This is an overloaded intrinsic. You can use ``llvm.rint`` on any |
| 7243 | floating point or vector of floating point type. Not all targets support |
| 7244 | all types however. |
| 7245 | |
| 7246 | :: |
| 7247 | |
| 7248 | declare float @llvm.rint.f32(float %Val) |
| 7249 | declare double @llvm.rint.f64(double %Val) |
| 7250 | declare x86_fp80 @llvm.rint.f80(x86_fp80 %Val) |
| 7251 | declare fp128 @llvm.rint.f128(fp128 %Val) |
| 7252 | declare ppc_fp128 @llvm.rint.ppcf128(ppc_fp128 %Val) |
| 7253 | |
| 7254 | Overview: |
| 7255 | """"""""" |
| 7256 | |
The '``llvm.rint.*``' intrinsics return the operand rounded to the
nearest integer. They may raise an inexact floating-point exception if the
operand isn't an integer.
| 7260 | |
| 7261 | Arguments: |
| 7262 | """""""""" |
| 7263 | |
| 7264 | The argument and return value are floating point numbers of the same |
| 7265 | type. |
| 7266 | |
| 7267 | Semantics: |
| 7268 | """""""""" |
| 7269 | |
| 7270 | This function returns the same values as the libm ``rint`` functions |
| 7271 | would, and handles error conditions in the same way. |
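
Examples:
"""""""""

A minimal sketch, assuming the default round-to-nearest-even rounding mode,
under which libm ``rint`` rounds halfway cases to the even neighbour:

.. code-block:: llvm

      %a = call double @llvm.rint.f64(double 2.5)   ; yields 2.0 (tie goes to even)
      %b = call double @llvm.rint.f64(double 3.5)   ; yields 4.0 (tie goes to even)
      %c = call double @llvm.rint.f64(double 2.25)  ; yields 2.0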
| 7272 | |
| 7273 | '``llvm.nearbyint.*``' Intrinsic |
| 7274 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 7275 | |
| 7276 | Syntax: |
| 7277 | """"""" |
| 7278 | |
| 7279 | This is an overloaded intrinsic. You can use ``llvm.nearbyint`` on any |
| 7280 | floating point or vector of floating point type. Not all targets support |
| 7281 | all types however. |
| 7282 | |
| 7283 | :: |
| 7284 | |
| 7285 | declare float @llvm.nearbyint.f32(float %Val) |
| 7286 | declare double @llvm.nearbyint.f64(double %Val) |
| 7287 | declare x86_fp80 @llvm.nearbyint.f80(x86_fp80 %Val) |
| 7288 | declare fp128 @llvm.nearbyint.f128(fp128 %Val) |
| 7289 | declare ppc_fp128 @llvm.nearbyint.ppcf128(ppc_fp128 %Val) |
| 7290 | |
| 7291 | Overview: |
| 7292 | """"""""" |
| 7293 | |
The '``llvm.nearbyint.*``' intrinsics return the operand rounded to the
nearest integer.
| 7296 | |
| 7297 | Arguments: |
| 7298 | """""""""" |
| 7299 | |
| 7300 | The argument and return value are floating point numbers of the same |
| 7301 | type. |
| 7302 | |
| 7303 | Semantics: |
| 7304 | """""""""" |
| 7305 | |
| 7306 | This function returns the same values as the libm ``nearbyint`` |
| 7307 | functions would, and handles error conditions in the same way. |
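
Examples:
"""""""""

A minimal sketch, again assuming the default round-to-nearest-even rounding
mode. The libm ``nearbyint`` function rounds like ``rint`` but does not raise
the inexact exception:

.. code-block:: llvm

      %a = call float @llvm.nearbyint.f32(float 1.5)   ; yields 2.0
      %b = call float @llvm.nearbyint.f32(float 2.5)   ; yields 2.0 (tie goes to even)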
| 7308 | |
| 7309 | Bit Manipulation Intrinsics |
| 7310 | --------------------------- |
| 7311 | |
| 7312 | LLVM provides intrinsics for a few important bit manipulation |
| 7313 | operations. These allow efficient code generation for some algorithms. |
| 7314 | |
| 7315 | '``llvm.bswap.*``' Intrinsics |
| 7316 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 7317 | |
| 7318 | Syntax: |
| 7319 | """"""" |
| 7320 | |
This is an overloaded intrinsic function. You can use ``llvm.bswap`` on any
integer type that is an even number of bytes (i.e. BitWidth % 16 == 0).
| 7323 | |
| 7324 | :: |
| 7325 | |
| 7326 | declare i16 @llvm.bswap.i16(i16 <id>) |
| 7327 | declare i32 @llvm.bswap.i32(i32 <id>) |
| 7328 | declare i64 @llvm.bswap.i64(i64 <id>) |
| 7329 | |
| 7330 | Overview: |
| 7331 | """"""""" |
| 7332 | |
| 7333 | The '``llvm.bswap``' family of intrinsics is used to byte swap integer |
| 7334 | values with an even number of bytes (positive multiple of 16 bits). |
| 7335 | These are useful for performing operations on data that is not in the |
| 7336 | target's native byte order. |
| 7337 | |
| 7338 | Semantics: |
| 7339 | """""""""" |
| 7340 | |
| 7341 | The ``llvm.bswap.i16`` intrinsic returns an i16 value that has the high |
| 7342 | and low byte of the input i16 swapped. Similarly, the ``llvm.bswap.i32`` |
| 7343 | intrinsic returns an i32 value that has the four bytes of the input i32 |
| 7344 | swapped, so that if the input bytes are numbered 0, 1, 2, 3 then the |
| 7345 | returned i32 will have its bytes in 3, 2, 1, 0 order. The |
| 7346 | ``llvm.bswap.i48``, ``llvm.bswap.i64`` and other intrinsics extend this |
| 7347 | concept to additional even-byte lengths (6 bytes, 8 bytes and more, |
| 7348 | respectively). |
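
Examples:
"""""""""

A minimal sketch with constant inputs, followed by a typical use: converting a
value stored in the opposite byte order after loading it. The pointer ``%p``
and the value names are illustrative assumptions:

.. code-block:: llvm

      %h = call i16 @llvm.bswap.i16(i16 4660)        ; 0x1234 -> 0x3412 (13330)
      %w = call i32 @llvm.bswap.i32(i32 305419896)   ; 0x12345678 -> 0x78563412

      ; Read a 32-bit value whose byte order is the reverse of the target's.
      %raw  = load i32* %p, align 4
      %host = call i32 @llvm.bswap.i32(i32 %raw)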
| 7349 | |
| 7350 | '``llvm.ctpop.*``' Intrinsic |
| 7351 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 7352 | |
| 7353 | Syntax: |
| 7354 | """"""" |
| 7355 | |
This is an overloaded intrinsic. You can use ``llvm.ctpop`` on any integer
bit width, or on any vector with integer elements. Not all targets
support all bit widths or vector types, however.
| 7359 | |
| 7360 | :: |
| 7361 | |
| 7362 | declare i8 @llvm.ctpop.i8(i8 <src>) |
| 7363 | declare i16 @llvm.ctpop.i16(i16 <src>) |
| 7364 | declare i32 @llvm.ctpop.i32(i32 <src>) |
| 7365 | declare i64 @llvm.ctpop.i64(i64 <src>) |
| 7366 | declare i256 @llvm.ctpop.i256(i256 <src>) |
| 7367 | declare <2 x i32> @llvm.ctpop.v2i32(<2 x i32> <src>) |
| 7368 | |
| 7369 | Overview: |
| 7370 | """"""""" |
| 7371 | |
| 7372 | The '``llvm.ctpop``' family of intrinsics counts the number of bits set |
| 7373 | in a value. |
| 7374 | |
| 7375 | Arguments: |
| 7376 | """""""""" |
| 7377 | |
| 7378 | The only argument is the value to be counted. The argument may be of any |
| 7379 | integer type, or a vector with integer elements. The return type must |
| 7380 | match the argument type. |
| 7381 | |
| 7382 | Semantics: |
| 7383 | """""""""" |
| 7384 | |
| 7385 | The '``llvm.ctpop``' intrinsic counts the 1's in a variable, or within |
| 7386 | each element of a vector. |
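
Examples:
"""""""""

A minimal sketch with constant inputs; the value names are illustrative:

.. code-block:: llvm

      %a = call i32 @llvm.ctpop.i32(i32 255)   ; 0x000000FF has 8 bits set, yields 8
      %b = call i8 @llvm.ctpop.i8(i8 -1)       ; all 8 bits set, yields 8

      ; Vector form: counts are produced per element, <1, 7> becomes <1, 3>.
      %v = call <2 x i32> @llvm.ctpop.v2i32(<2 x i32> <i32 1, i32 7>)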
| 7387 | |
| 7388 | '``llvm.ctlz.*``' Intrinsic |
| 7389 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 7390 | |
| 7391 | Syntax: |
| 7392 | """"""" |
| 7393 | |
| 7394 | This is an overloaded intrinsic. You can use ``llvm.ctlz`` on any |
| 7395 | integer bit width, or any vector whose elements are integers. Not all |
| 7396 | targets support all bit widths or vector types, however. |
| 7397 | |
| 7398 | :: |
| 7399 | |
      declare i8   @llvm.ctlz.i8  (i8   <src>, i1 <is_zero_undef>)
      declare i16  @llvm.ctlz.i16 (i16  <src>, i1 <is_zero_undef>)
      declare i32  @llvm.ctlz.i32 (i32  <src>, i1 <is_zero_undef>)
      declare i64  @llvm.ctlz.i64 (i64  <src>, i1 <is_zero_undef>)
      declare i256 @llvm.ctlz.i256(i256 <src>, i1 <is_zero_undef>)
      declare <2 x i32> @llvm.ctlz.v2i32(<2 x i32> <src>, i1 <is_zero_undef>)
| 7406 | |
| 7407 | Overview: |
| 7408 | """"""""" |
| 7409 | |
| 7410 | The '``llvm.ctlz``' family of intrinsic functions counts the number of |
| 7411 | leading zeros in a variable. |
| 7412 | |
| 7413 | Arguments: |
| 7414 | """""""""" |
| 7415 | |
The first argument is the value to be counted. This argument may be of
any integer type, or a vector with integer element type. The return
type must match the first argument type.
| 7419 | |
| 7420 | The second argument must be a constant and is a flag to indicate whether |
| 7421 | the intrinsic should ensure that a zero as the first argument produces a |
| 7422 | defined result. Historically some architectures did not provide a |
| 7423 | defined result for zero values as efficiently, and many algorithms are |
| 7424 | now predicated on avoiding zero-value inputs. |
| 7425 | |
| 7426 | Semantics: |
| 7427 | """""""""" |
| 7428 | |
| 7429 | The '``llvm.ctlz``' intrinsic counts the leading (most significant) |
| 7430 | zeros in a variable, or within each element of the vector. If |
| 7431 | ``src == 0`` then the result is the size in bits of the type of ``src`` |
| 7432 | if ``is_zero_undef == 0`` and ``undef`` otherwise. For example, |
| 7433 | ``llvm.ctlz(i32 2) = 30``. |
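
Examples:
"""""""""

A minimal sketch showing the effect of the ``is_zero_undef`` flag; the value
names are illustrative:

.. code-block:: llvm

      %a = call i32 @llvm.ctlz.i32(i32 1, i1 false)   ; yields 31
      %b = call i8 @llvm.ctlz.i8(i8 16, i1 false)     ; 0b00010000, yields 3
      %c = call i32 @llvm.ctlz.i32(i32 0, i1 false)   ; zero input, yields 32
      %d = call i32 @llvm.ctlz.i32(i32 0, i1 true)    ; zero input, result is undef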
| 7434 | |
| 7435 | '``llvm.cttz.*``' Intrinsic |
| 7436 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 7437 | |
| 7438 | Syntax: |
| 7439 | """"""" |
| 7440 | |
| 7441 | This is an overloaded intrinsic. You can use ``llvm.cttz`` on any |
| 7442 | integer bit width, or any vector of integer elements. Not all targets |
| 7443 | support all bit widths or vector types, however. |
| 7444 | |
| 7445 | :: |
| 7446 | |
      declare i8   @llvm.cttz.i8  (i8   <src>, i1 <is_zero_undef>)
      declare i16  @llvm.cttz.i16 (i16  <src>, i1 <is_zero_undef>)
      declare i32  @llvm.cttz.i32 (i32  <src>, i1 <is_zero_undef>)
      declare i64  @llvm.cttz.i64 (i64  <src>, i1 <is_zero_undef>)
      declare i256 @llvm.cttz.i256(i256 <src>, i1 <is_zero_undef>)
      declare <2 x i32> @llvm.cttz.v2i32(<2 x i32> <src>, i1 <is_zero_undef>)
| 7453 | |
| 7454 | Overview: |
| 7455 | """"""""" |
| 7456 | |
| 7457 | The '``llvm.cttz``' family of intrinsic functions counts the number of |
| 7458 | trailing zeros. |
| 7459 | |
| 7460 | Arguments: |
| 7461 | """""""""" |
| 7462 | |
The first argument is the value to be counted. This argument may be of
any integer type, or a vector with integer element type. The return
type must match the first argument type.
| 7466 | |
| 7467 | The second argument must be a constant and is a flag to indicate whether |
| 7468 | the intrinsic should ensure that a zero as the first argument produces a |
| 7469 | defined result. Historically some architectures did not provide a |
| 7470 | defined result for zero values as efficiently, and many algorithms are |
| 7471 | now predicated on avoiding zero-value inputs. |
| 7472 | |
| 7473 | Semantics: |
| 7474 | """""""""" |
| 7475 | |
| 7476 | The '``llvm.cttz``' intrinsic counts the trailing (least significant) |
| 7477 | zeros in a variable, or within each element of a vector. If ``src == 0`` |
| 7478 | then the result is the size in bits of the type of ``src`` if |
| 7479 | ``is_zero_undef == 0`` and ``undef`` otherwise. For example, |
| 7480 | ``llvm.cttz(2) = 1``. |
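
Examples:
"""""""""

A minimal sketch showing the effect of the ``is_zero_undef`` flag; the value
names are illustrative:

.. code-block:: llvm

      %a = call i32 @llvm.cttz.i32(i32 8, i1 false)   ; 0b1000, yields 3
      %b = call i8 @llvm.cttz.i8(i8 -128, i1 false)   ; 0b10000000, yields 7
      %c = call i32 @llvm.cttz.i32(i32 0, i1 false)   ; zero input, yields 32
      %d = call i32 @llvm.cttz.i32(i32 0, i1 true)    ; zero input, result is undef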
| 7481 | |
| 7482 | Arithmetic with Overflow Intrinsics |
| 7483 | ----------------------------------- |
| 7484 | |
| 7485 | LLVM provides intrinsics for some arithmetic with overflow operations. |
| 7486 | |
| 7487 | '``llvm.sadd.with.overflow.*``' Intrinsics |
| 7488 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 7489 | |
| 7490 | Syntax: |
| 7491 | """"""" |
| 7492 | |
| 7493 | This is an overloaded intrinsic. You can use ``llvm.sadd.with.overflow`` |
| 7494 | on any integer bit width. |
| 7495 | |
| 7496 | :: |
| 7497 | |
| 7498 | declare {i16, i1} @llvm.sadd.with.overflow.i16(i16 %a, i16 %b) |
| 7499 | declare {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b) |
| 7500 | declare {i64, i1} @llvm.sadd.with.overflow.i64(i64 %a, i64 %b) |
| 7501 | |
| 7502 | Overview: |
| 7503 | """"""""" |
| 7504 | |
| 7505 | The '``llvm.sadd.with.overflow``' family of intrinsic functions perform |
| 7506 | a signed addition of the two arguments, and indicate whether an overflow |
| 7507 | occurred during the signed summation. |
| 7508 | |
| 7509 | Arguments: |
| 7510 | """""""""" |
| 7511 | |
| 7512 | The arguments (%a and %b) and the first element of the result structure |
| 7513 | may be of integer types of any bit width, but they must have the same |
| 7514 | bit width. The second element of the result structure must be of type |
| 7515 | ``i1``. ``%a`` and ``%b`` are the two values that will undergo signed |
| 7516 | addition. |
| 7517 | |
| 7518 | Semantics: |
| 7519 | """""""""" |
| 7520 | |
The '``llvm.sadd.with.overflow``' family of intrinsic functions perform
a signed addition of the two arguments. They return a structure whose
first element is the signed sum and whose second element is a bit
specifying whether the signed addition resulted in an overflow.
| 7526 | |
| 7527 | Examples: |
| 7528 | """"""""" |
| 7529 | |
| 7530 | .. code-block:: llvm |
| 7531 | |
| 7532 | %res = call {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b) |
| 7533 | %sum = extractvalue {i32, i1} %res, 0 |
| 7534 | %obit = extractvalue {i32, i1} %res, 1 |
| 7535 | br i1 %obit, label %overflow, label %normal |
| 7536 | |
| 7537 | '``llvm.uadd.with.overflow.*``' Intrinsics |
| 7538 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 7539 | |
| 7540 | Syntax: |
| 7541 | """"""" |
| 7542 | |
| 7543 | This is an overloaded intrinsic. You can use ``llvm.uadd.with.overflow`` |
| 7544 | on any integer bit width. |
| 7545 | |
| 7546 | :: |
| 7547 | |
| 7548 | declare {i16, i1} @llvm.uadd.with.overflow.i16(i16 %a, i16 %b) |
| 7549 | declare {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b) |
| 7550 | declare {i64, i1} @llvm.uadd.with.overflow.i64(i64 %a, i64 %b) |
| 7551 | |
| 7552 | Overview: |
| 7553 | """"""""" |
| 7554 | |
| 7555 | The '``llvm.uadd.with.overflow``' family of intrinsic functions perform |
| 7556 | an unsigned addition of the two arguments, and indicate whether a carry |
| 7557 | occurred during the unsigned summation. |
| 7558 | |
| 7559 | Arguments: |
| 7560 | """""""""" |
| 7561 | |
| 7562 | The arguments (%a and %b) and the first element of the result structure |
| 7563 | may be of integer types of any bit width, but they must have the same |
| 7564 | bit width. The second element of the result structure must be of type |
| 7565 | ``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned |
| 7566 | addition. |
| 7567 | |
| 7568 | Semantics: |
| 7569 | """""""""" |
| 7570 | |
The '``llvm.uadd.with.overflow``' family of intrinsic functions perform
an unsigned addition of the two arguments. They return a structure whose
first element is the sum and whose second element is a bit specifying
whether the unsigned addition resulted in a carry.
| 7575 | |
| 7576 | Examples: |
| 7577 | """"""""" |
| 7578 | |
| 7579 | .. code-block:: llvm |
| 7580 | |
| 7581 | %res = call {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b) |
| 7582 | %sum = extractvalue {i32, i1} %res, 0 |
| 7583 | %obit = extractvalue {i32, i1} %res, 1 |
| 7584 | br i1 %obit, label %carry, label %normal |
| 7585 | |
| 7586 | '``llvm.ssub.with.overflow.*``' Intrinsics |
| 7587 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 7588 | |
| 7589 | Syntax: |
| 7590 | """"""" |
| 7591 | |
| 7592 | This is an overloaded intrinsic. You can use ``llvm.ssub.with.overflow`` |
| 7593 | on any integer bit width. |
| 7594 | |
| 7595 | :: |
| 7596 | |
| 7597 | declare {i16, i1} @llvm.ssub.with.overflow.i16(i16 %a, i16 %b) |
| 7598 | declare {i32, i1} @llvm.ssub.with.overflow.i32(i32 %a, i32 %b) |
| 7599 | declare {i64, i1} @llvm.ssub.with.overflow.i64(i64 %a, i64 %b) |
| 7600 | |
| 7601 | Overview: |
| 7602 | """"""""" |
| 7603 | |
| 7604 | The '``llvm.ssub.with.overflow``' family of intrinsic functions perform |
| 7605 | a signed subtraction of the two arguments, and indicate whether an |
| 7606 | overflow occurred during the signed subtraction. |
| 7607 | |
| 7608 | Arguments: |
| 7609 | """""""""" |
| 7610 | |
| 7611 | The arguments (%a and %b) and the first element of the result structure |
| 7612 | may be of integer types of any bit width, but they must have the same |
| 7613 | bit width. The second element of the result structure must be of type |
| 7614 | ``i1``. ``%a`` and ``%b`` are the two values that will undergo signed |
| 7615 | subtraction. |
| 7616 | |
| 7617 | Semantics: |
| 7618 | """""""""" |
| 7619 | |
The '``llvm.ssub.with.overflow``' family of intrinsic functions perform
a signed subtraction of the two arguments. They return a structure whose
first element is the difference and whose second element is a bit
specifying whether the signed subtraction resulted in an overflow.
| 7625 | |
| 7626 | Examples: |
| 7627 | """"""""" |
| 7628 | |
| 7629 | .. code-block:: llvm |
| 7630 | |
| 7631 | %res = call {i32, i1} @llvm.ssub.with.overflow.i32(i32 %a, i32 %b) |
| 7632 | %sum = extractvalue {i32, i1} %res, 0 |
| 7633 | %obit = extractvalue {i32, i1} %res, 1 |
| 7634 | br i1 %obit, label %overflow, label %normal |
| 7635 | |
| 7636 | '``llvm.usub.with.overflow.*``' Intrinsics |
| 7637 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 7638 | |
| 7639 | Syntax: |
| 7640 | """"""" |
| 7641 | |
| 7642 | This is an overloaded intrinsic. You can use ``llvm.usub.with.overflow`` |
| 7643 | on any integer bit width. |
| 7644 | |
| 7645 | :: |
| 7646 | |
| 7647 | declare {i16, i1} @llvm.usub.with.overflow.i16(i16 %a, i16 %b) |
| 7648 | declare {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b) |
| 7649 | declare {i64, i1} @llvm.usub.with.overflow.i64(i64 %a, i64 %b) |
| 7650 | |
| 7651 | Overview: |
| 7652 | """"""""" |
| 7653 | |
| 7654 | The '``llvm.usub.with.overflow``' family of intrinsic functions perform |
| 7655 | an unsigned subtraction of the two arguments, and indicate whether an |
| 7656 | overflow occurred during the unsigned subtraction. |
| 7657 | |
| 7658 | Arguments: |
| 7659 | """""""""" |
| 7660 | |
| 7661 | The arguments (%a and %b) and the first element of the result structure |
| 7662 | may be of integer types of any bit width, but they must have the same |
| 7663 | bit width. The second element of the result structure must be of type |
| 7664 | ``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned |
| 7665 | subtraction. |
| 7666 | |
| 7667 | Semantics: |
| 7668 | """""""""" |
| 7669 | |
The '``llvm.usub.with.overflow``' family of intrinsic functions perform
an unsigned subtraction of the two arguments. They return a structure whose
first element is the difference and whose second element is a bit
specifying whether the unsigned subtraction resulted in an overflow.
| 7675 | |
| 7676 | Examples: |
| 7677 | """"""""" |
| 7678 | |
| 7679 | .. code-block:: llvm |
| 7680 | |
| 7681 | %res = call {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b) |
| 7682 | %sum = extractvalue {i32, i1} %res, 0 |
| 7683 | %obit = extractvalue {i32, i1} %res, 1 |
| 7684 | br i1 %obit, label %overflow, label %normal |
| 7685 | |
| 7686 | '``llvm.smul.with.overflow.*``' Intrinsics |
| 7687 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 7688 | |
| 7689 | Syntax: |
| 7690 | """"""" |
| 7691 | |
| 7692 | This is an overloaded intrinsic. You can use ``llvm.smul.with.overflow`` |
| 7693 | on any integer bit width. |
| 7694 | |
| 7695 | :: |
| 7696 | |
| 7697 | declare {i16, i1} @llvm.smul.with.overflow.i16(i16 %a, i16 %b) |
| 7698 | declare {i32, i1} @llvm.smul.with.overflow.i32(i32 %a, i32 %b) |
| 7699 | declare {i64, i1} @llvm.smul.with.overflow.i64(i64 %a, i64 %b) |
| 7700 | |
| 7701 | Overview: |
| 7702 | """"""""" |
| 7703 | |
| 7704 | The '``llvm.smul.with.overflow``' family of intrinsic functions perform |
| 7705 | a signed multiplication of the two arguments, and indicate whether an |
| 7706 | overflow occurred during the signed multiplication. |
| 7707 | |
| 7708 | Arguments: |
| 7709 | """""""""" |
| 7710 | |
| 7711 | The arguments (%a and %b) and the first element of the result structure |
| 7712 | may be of integer types of any bit width, but they must have the same |
| 7713 | bit width. The second element of the result structure must be of type |
| 7714 | ``i1``. ``%a`` and ``%b`` are the two values that will undergo signed |
| 7715 | multiplication. |
| 7716 | |
| 7717 | Semantics: |
| 7718 | """""""""" |
| 7719 | |
The '``llvm.smul.with.overflow``' family of intrinsic functions perform
a signed multiplication of the two arguments. They return a structure whose
first element is the product and whose second element is a bit
specifying whether the signed multiplication resulted in an overflow.
| 7725 | |
| 7726 | Examples: |
| 7727 | """"""""" |
| 7728 | |
| 7729 | .. code-block:: llvm |
| 7730 | |
| 7731 | %res = call {i32, i1} @llvm.smul.with.overflow.i32(i32 %a, i32 %b) |
| 7732 | %sum = extractvalue {i32, i1} %res, 0 |
| 7733 | %obit = extractvalue {i32, i1} %res, 1 |
| 7734 | br i1 %obit, label %overflow, label %normal |
| 7735 | |
| 7736 | '``llvm.umul.with.overflow.*``' Intrinsics |
| 7737 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 7738 | |
| 7739 | Syntax: |
| 7740 | """"""" |
| 7741 | |
| 7742 | This is an overloaded intrinsic. You can use ``llvm.umul.with.overflow`` |
| 7743 | on any integer bit width. |
| 7744 | |
| 7745 | :: |
| 7746 | |
| 7747 | declare {i16, i1} @llvm.umul.with.overflow.i16(i16 %a, i16 %b) |
| 7748 | declare {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b) |
| 7749 | declare {i64, i1} @llvm.umul.with.overflow.i64(i64 %a, i64 %b) |
| 7750 | |
| 7751 | Overview: |
| 7752 | """"""""" |
| 7753 | |
The '``llvm.umul.with.overflow``' family of intrinsic functions perform
an unsigned multiplication of the two arguments, and indicate whether an
overflow occurred during the unsigned multiplication.
| 7757 | |
| 7758 | Arguments: |
| 7759 | """""""""" |
| 7760 | |
| 7761 | The arguments (%a and %b) and the first element of the result structure |
| 7762 | may be of integer types of any bit width, but they must have the same |
| 7763 | bit width. The second element of the result structure must be of type |
| 7764 | ``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned |
| 7765 | multiplication. |
| 7766 | |
| 7767 | Semantics: |
| 7768 | """""""""" |
| 7769 | |
The '``llvm.umul.with.overflow``' family of intrinsic functions perform
an unsigned multiplication of the two arguments. They return a structure whose
first element is the product and whose second element is a bit
specifying whether the unsigned multiplication resulted in an overflow.
| 7775 | |
| 7776 | Examples: |
| 7777 | """"""""" |
| 7778 | |
| 7779 | .. code-block:: llvm |
| 7780 | |
| 7781 | %res = call {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b) |
| 7782 | %sum = extractvalue {i32, i1} %res, 0 |
| 7783 | %obit = extractvalue {i32, i1} %res, 1 |
| 7784 | br i1 %obit, label %overflow, label %normal |
| 7785 | |
| 7786 | Specialised Arithmetic Intrinsics |
| 7787 | --------------------------------- |
| 7788 | |
| 7789 | '``llvm.fmuladd.*``' Intrinsic |
| 7790 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 7791 | |
| 7792 | Syntax: |
| 7793 | """"""" |
| 7794 | |
| 7795 | :: |
| 7796 | |
| 7797 | declare float @llvm.fmuladd.f32(float %a, float %b, float %c) |
| 7798 | declare double @llvm.fmuladd.f64(double %a, double %b, double %c) |
| 7799 | |
| 7800 | Overview: |
| 7801 | """"""""" |
| 7802 | |
The '``llvm.fmuladd.*``' intrinsic functions represent multiply-add
expressions that can be fused if the code generator determines that (a) the
target instruction set has support for a fused operation, and (b) the
fused operation is more efficient than the equivalent, separate pair of mul
and add instructions.

| 7809 | Arguments: |
| 7810 | """""""""" |
| 7811 | |
| 7812 | The '``llvm.fmuladd.*``' intrinsics each take three arguments: two |
| 7813 | multiplicands, a and b, and an addend c. |
| 7814 | |
| 7815 | Semantics: |
| 7816 | """""""""" |
| 7817 | |
| 7818 | The expression: |
| 7819 | |
| 7820 | :: |
| 7821 | |
      %0 = call float @llvm.fmuladd.f32(float %a, float %b, float %c)
| 7823 | |
is equivalent to the expression ``a * b + c``, except that rounding will
not be performed between the multiplication and addition steps if the
code generator fuses the operations. Fusion is not guaranteed, even if
the target platform supports it. If a fused multiply-add is required, the
corresponding ``llvm.fma.*`` intrinsic function should be used instead.
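
As a rough sketch, the unfused form that the code generator may fall back to
corresponds to a separate multiply and add, each rounding its own result (the
value names ``%mul`` and ``%res`` are illustrative):

.. code-block:: llvm

      %mul = fmul float %a, %b      ; product is rounded here when not fused
      %res = fadd float %mul, %c    ; sum is rounded again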
| 7829 | |
| 7830 | Examples: |
| 7831 | """"""""" |
| 7832 | |
| 7833 | .. code-block:: llvm |
| 7834 | |
| 7835 | %r2 = call float @llvm.fmuladd.f32(float %a, float %b, float %c) ; yields {float}:r2 = (a * b) + c |
| 7836 | |
| 7837 | Half Precision Floating Point Intrinsics |
| 7838 | ---------------------------------------- |
| 7839 | |
| 7840 | For most target platforms, half precision floating point is a |
| 7841 | storage-only format. This means that it is a dense encoding (in memory) |
| 7842 | but does not support computation in the format. |
| 7843 | |
This means that code must first load the half-precision floating point
value as an i16, then convert it to float with
:ref:`llvm.convert.from.fp16 <int_convert_from_fp16>`. Computation can
then be performed on the float value (including extending to double
etc). To store the value back to memory, it is first converted to float
if needed, then converted to i16 with
:ref:`llvm.convert.to.fp16 <int_convert_to_fp16>`, and then stored as an
i16 value.
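
The round trip described above can be sketched as follows, assuming a global
``@x`` that holds a half-precision value as an ``i16`` (the global and the
value names are illustrative):

.. code-block:: llvm

      %h  = load i16* @x, align 2                         ; load the raw half bits
      %f  = call float @llvm.convert.from.fp16(i16 %h)    ; widen to float
      %f2 = fmul float %f, 2.0                            ; compute in float
      %h2 = call i16 @llvm.convert.to.fp16(float %f2)     ; narrow back to half
      store i16 %h2, i16* @x, align 2                     ; store the raw half bits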
| 7852 | |
| 7853 | .. _int_convert_to_fp16: |
| 7854 | |
| 7855 | '``llvm.convert.to.fp16``' Intrinsic |
| 7856 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 7857 | |
| 7858 | Syntax: |
| 7859 | """"""" |
| 7860 | |
| 7861 | :: |
| 7862 | |
      declare i16 @llvm.convert.to.fp16(float %a)
| 7864 | |
| 7865 | Overview: |
| 7866 | """"""""" |
| 7867 | |
| 7868 | The '``llvm.convert.to.fp16``' intrinsic function performs a conversion |
| 7869 | from single precision floating point format to half precision floating |
| 7870 | point format. |
| 7871 | |
| 7872 | Arguments: |
| 7873 | """""""""" |
| 7874 | |
The intrinsic function takes a single argument: the value to be
converted.
| 7877 | |
| 7878 | Semantics: |
| 7879 | """""""""" |
| 7880 | |
| 7881 | The '``llvm.convert.to.fp16``' intrinsic function performs a conversion |
| 7882 | from single precision floating point format to half precision floating |
| 7883 | point format. The return value is an ``i16`` which contains the |
| 7884 | converted number. |
| 7885 | |
| 7886 | Examples: |
| 7887 | """"""""" |
| 7888 | |
| 7889 | .. code-block:: llvm |
| 7890 | |
      %res = call i16 @llvm.convert.to.fp16(float %a)
      store i16 %res, i16* @x, align 2
| 7893 | |
| 7894 | .. _int_convert_from_fp16: |
| 7895 | |
| 7896 | '``llvm.convert.from.fp16``' Intrinsic |
| 7897 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 7898 | |
| 7899 | Syntax: |
| 7900 | """"""" |
| 7901 | |
| 7902 | :: |
| 7903 | |
| 7904 | declare f32 @llvm.convert.from.fp16(i16 %a) |
| 7905 | |
| 7906 | Overview: |
| 7907 | """"""""" |
| 7908 | |
| 7909 | The '``llvm.convert.from.fp16``' intrinsic function performs a |
| 7910 | conversion from half precision floating point format to single precision |
| 7911 | floating point format. |
| 7912 | |
| 7913 | Arguments: |
| 7914 | """""""""" |
| 7915 | |
| 7916 | The intrinsic function contains single argument - the value to be |
| 7917 | converted. |
| 7918 | |
| 7919 | Semantics: |
| 7920 | """""""""" |
| 7921 | |
| 7922 | The '``llvm.convert.from.fp16``' intrinsic function performs a |
| 7923 | conversion from half single precision floating point format to single |
| 7924 | precision floating point format. The input half-float value is |
| 7925 | represented by an ``i16`` value. |
| 7926 | |
| 7927 | Examples: |
| 7928 | """"""""" |
| 7929 | |
| 7930 | .. code-block:: llvm |
| 7931 | |
| 7932 | %a = load i16* @x, align 2 |
| 7933 | %res = call f32 @llvm.convert.from.fp16(i16 %a) |
| 7934 | |
Debugger Intrinsics
-------------------

The LLVM debugger intrinsics (which all start with the ``llvm.dbg.``
prefix) are described in the `LLVM Source Level
Debugging <SourceLevelDebugging.html#format_common_intrinsics>`_
document.

Exception Handling Intrinsics
-----------------------------

The LLVM exception handling intrinsics (which all start with the
``llvm.eh.`` prefix) are described in the `LLVM Exception
Handling <ExceptionHandling.html#format_common_intrinsics>`_ document.

.. _int_trampoline:

Trampoline Intrinsics
---------------------

These intrinsics make it possible to excise one parameter, marked with
the :ref:`nest <nest>` attribute, from a function. The result is a
callable function pointer lacking the nest parameter: the caller does
not need to provide a value for it. Instead, the value to use is stored
in advance in a "trampoline", a block of memory usually allocated on the
stack, which also contains code to splice the nest value into the
argument list. This is used to implement the GCC nested function address
extension.

For example, if the function is ``i32 f(i8* nest %c, i32 %x, i32 %y)``
then the resulting function pointer has signature ``i32 (i32, i32)*``.
It can be created as follows:

.. code-block:: llvm

      %tramp = alloca [10 x i8], align 4 ; size and alignment only correct for X86
      %tramp1 = getelementptr [10 x i8]* %tramp, i32 0, i32 0
      call void @llvm.init.trampoline(i8* %tramp1, i8* bitcast (i32 (i8*, i32, i32)* @f to i8*), i8* %nval)
      %p = call i8* @llvm.adjust.trampoline(i8* %tramp1)
      %fp = bitcast i8* %p to i32 (i32, i32)*

The call ``%val = call i32 %fp(i32 %x, i32 %y)`` is then equivalent to
``%val = call i32 @f(i8* %nval, i32 %x, i32 %y)``.

.. _int_it:

'``llvm.init.trampoline``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.init.trampoline(i8* <tramp>, i8* <func>, i8* <nval>)

Overview:
"""""""""

This fills the memory pointed to by ``tramp`` with executable code,
turning it into a trampoline.

Arguments:
""""""""""

The ``llvm.init.trampoline`` intrinsic takes three arguments, all
pointers. The ``tramp`` argument must point to a sufficiently large and
sufficiently aligned block of memory; this memory is written to by the
intrinsic. Note that the size and the alignment are target-specific;
LLVM currently provides no portable way of determining them, so a
front-end that generates this intrinsic needs to have some
target-specific knowledge. The ``func`` argument must hold a function
bitcast to an ``i8*``. The ``nval`` argument is the value to supply for
the ``nest`` parameter of ``func``.

Semantics:
""""""""""

The block of memory pointed to by ``tramp`` is filled with target
dependent code, turning it into a function. Then ``tramp`` needs to be
passed to :ref:`llvm.adjust.trampoline <int_at>` to get a pointer which can
be :ref:`bitcast (to a new function) and called <int_trampoline>`. The new
function's signature is the same as that of ``func`` with any arguments
marked with the ``nest`` attribute removed. At most one such ``nest``
argument is allowed, and it must be of pointer type. Calling the new
function is equivalent to calling ``func`` with the same argument list,
but with ``nval`` used for the missing ``nest`` argument. If, after
calling ``llvm.init.trampoline``, the memory pointed to by ``tramp`` is
modified, then the effect of any later call through the trampoline is
undefined.

.. _int_at:

'``llvm.adjust.trampoline``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.adjust.trampoline(i8* <tramp>)

Overview:
"""""""""

This performs any required machine-specific adjustment to the address of
a trampoline (passed as ``tramp``).

Arguments:
""""""""""

``tramp`` must point to a block of memory which already has trampoline
code filled in by a previous call to
:ref:`llvm.init.trampoline <int_it>`.

Semantics:
""""""""""

On some architectures the address of the code to be executed needs to be
different from the address where the trampoline is actually stored. This
intrinsic returns the executable address corresponding to ``tramp``
after performing the required machine-specific adjustments. The pointer
returned can then be :ref:`bitcast and executed <int_trampoline>`.

Memory Use Markers
------------------

This class of intrinsics exists to provide information about the
lifetime of memory objects and ranges where variables are immutable.

'``llvm.lifetime.start``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.lifetime.start(i64 <size>, i8* nocapture <ptr>)

Overview:
"""""""""

The '``llvm.lifetime.start``' intrinsic specifies the start of a memory
object's lifetime.

Arguments:
""""""""""

The first argument is a constant integer representing the size of the
object, or -1 if it is variable sized. The second argument is a pointer
to the object.

Semantics:
""""""""""

This intrinsic indicates that before this point in the code, the value
of the memory pointed to by ``ptr`` is dead. This means that it is known
to never be used and has an undefined value. A load from the pointer
that precedes this intrinsic can be replaced with ``'undef'``.

'``llvm.lifetime.end``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.lifetime.end(i64 <size>, i8* nocapture <ptr>)

Overview:
"""""""""

The '``llvm.lifetime.end``' intrinsic specifies the end of a memory
object's lifetime.

Arguments:
""""""""""

The first argument is a constant integer representing the size of the
object, or -1 if it is variable sized. The second argument is a pointer
to the object.

Semantics:
""""""""""

This intrinsic indicates that after this point in the code, the value of
the memory pointed to by ``ptr`` is dead. This means that it is known to
never be used and has an undefined value. Any stores into the memory
object following this intrinsic may be removed as dead.
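
As a minimal sketch, the two lifetime markers are typically paired
around the region in which a stack object is live. The function
``@scoped_buffer`` and the external function ``@use`` are hypothetical
names chosen for illustration.

.. code-block:: llvm

      declare void @llvm.lifetime.start(i64, i8* nocapture)
      declare void @llvm.lifetime.end(i64, i8* nocapture)
      declare void @use(i8*)          ; hypothetical consumer of the buffer

      define void @scoped_buffer() {
      entry:
        %buf = alloca [16 x i8], align 1
        %p = getelementptr [16 x i8]* %buf, i32 0, i32 0
        call void @llvm.lifetime.start(i64 16, i8* %p)   ; contents are dead before here
        call void @use(i8* %p)
        call void @llvm.lifetime.end(i64 16, i8* %p)     ; contents are dead after here
        ret void
      }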

'``llvm.invariant.start``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare {}* @llvm.invariant.start(i64 <size>, i8* nocapture <ptr>)

Overview:
"""""""""

The '``llvm.invariant.start``' intrinsic specifies that the contents of
a memory object will not change.

Arguments:
""""""""""

The first argument is a constant integer representing the size of the
object, or -1 if it is variable sized. The second argument is a pointer
to the object.

Semantics:
""""""""""

This intrinsic indicates that until an ``llvm.invariant.end`` that uses
the return value, the referenced memory location is constant and
unchanging.

'``llvm.invariant.end``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.invariant.end({}* <start>, i64 <size>, i8* nocapture <ptr>)

Overview:
"""""""""

The '``llvm.invariant.end``' intrinsic specifies that the contents of a
memory object are mutable.

Arguments:
""""""""""

The first argument is the value returned by the matching
``llvm.invariant.start`` intrinsic. The second argument is a constant
integer representing the size of the object, or -1 if it is variable
sized, and the third argument is a pointer to the object.

Semantics:
""""""""""

This intrinsic indicates that the memory is mutable again.
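
A small sketch of how the two invariant markers pair up: the token
returned by ``llvm.invariant.start`` is passed to the matching
``llvm.invariant.end``. The function name ``@read_twice`` is
hypothetical.

.. code-block:: llvm

      declare {}* @llvm.invariant.start(i64, i8* nocapture)
      declare void @llvm.invariant.end({}*, i64, i8* nocapture)

      define i8 @read_twice(i8* %p) {
      entry:
        %inv = call {}* @llvm.invariant.start(i64 1, i8* %p)
        %a = load i8* %p
        %b = load i8* %p          ; the byte is declared unchanging in this region
        call void @llvm.invariant.end({}* %inv, i64 1, i8* %p)
        %r = add i8 %a, %b
        ret i8 %r
      }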

General Intrinsics
------------------

This class of intrinsics is designed to be generic and has no specific
purpose.

'``llvm.var.annotation``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.var.annotation(i8* <val>, i8* <str>, i8* <str>, i32 <int>)

Overview:
"""""""""

The '``llvm.var.annotation``' intrinsic attaches an annotation string to
a local variable.

Arguments:
""""""""""

The first argument is a pointer to a value, the second is a pointer to a
global string, the third is a pointer to a global string which is the
source file name, and the last argument is the line number.

Semantics:
""""""""""

This intrinsic allows annotation of local variables with arbitrary
strings. This can be useful for special purpose optimizations that want
to look for these annotations. These have no other defined use; they are
ignored by code generation and optimization.
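
For illustration, a call annotating a local variable might look like the
sketch below. The globals ``@.str`` and ``@.file``, their contents, the
line number, and the function ``@f`` are all made up for this example.

.. code-block:: llvm

      @.str = private unnamed_addr constant [8 x i8] c"my_note\00"
      @.file = private unnamed_addr constant [7 x i8] c"test.c\00"

      declare void @llvm.var.annotation(i8*, i8*, i8*, i32)

      define void @f() {
      entry:
        %x = alloca i32, align 4
        %x8 = bitcast i32* %x to i8*      ; the intrinsic takes an i8*
        call void @llvm.var.annotation(i8* %x8, i8* getelementptr inbounds ([8 x i8]* @.str, i32 0, i32 0), i8* getelementptr inbounds ([7 x i8]* @.file, i32 0, i32 0), i32 3)
        ret void
      }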

'``llvm.annotation.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use '``llvm.annotation``' on
any integer bit width.

::

      declare i8 @llvm.annotation.i8(i8 <val>, i8* <str>, i8* <str>, i32 <int>)
      declare i16 @llvm.annotation.i16(i16 <val>, i8* <str>, i8* <str>, i32 <int>)
      declare i32 @llvm.annotation.i32(i32 <val>, i8* <str>, i8* <str>, i32 <int>)
      declare i64 @llvm.annotation.i64(i64 <val>, i8* <str>, i8* <str>, i32 <int>)
      declare i256 @llvm.annotation.i256(i256 <val>, i8* <str>, i8* <str>, i32 <int>)

Overview:
"""""""""

The '``llvm.annotation``' intrinsic attaches an annotation string to the
value of an expression.

Arguments:
""""""""""

The first argument is an integer value (result of some expression), the
second is a pointer to a global string, the third is a pointer to a
global string which is the source file name, and the last argument is
the line number. It returns the value of the first argument.

Semantics:
""""""""""

This intrinsic allows annotations to be put on arbitrary expressions
with arbitrary strings. This can be useful for special purpose
optimizations that want to look for these annotations. These have no
other defined use; they are ignored by code generation and optimization.
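
Because the intrinsic returns its first argument, an annotated value can
be used in place of the original. The sketch below assumes hypothetical
globals ``@.note`` and ``@.file`` and a hypothetical function
``@tagged``.

.. code-block:: llvm

      @.note = private unnamed_addr constant [4 x i8] c"hot\00"
      @.file = private unnamed_addr constant [4 x i8] c"a.c\00"

      declare i32 @llvm.annotation.i32(i32, i8*, i8*, i32)

      define i32 @tagged(i32 %x) {
      entry:
        ; %v carries the same value as %x, with the annotation attached
        %v = call i32 @llvm.annotation.i32(i32 %x, i8* getelementptr inbounds ([4 x i8]* @.note, i32 0, i32 0), i8* getelementptr inbounds ([4 x i8]* @.file, i32 0, i32 0), i32 7)
        ret i32 %v
      }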

'``llvm.trap``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.trap() noreturn nounwind

Overview:
"""""""""

The '``llvm.trap``' intrinsic causes the program to trap.

Arguments:
""""""""""

None.

Semantics:
""""""""""

This intrinsic is lowered to the target dependent trap instruction. If
the target does not have a trap instruction, this intrinsic will be
lowered to a call of the ``abort()`` function.
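
One plausible use, shown here only as a sketch, is to guard an operation
whose precondition has failed. The function ``@checked_div`` is a
hypothetical name. Since the intrinsic is declared ``noreturn``, the
call is followed by ``unreachable``.

.. code-block:: llvm

      declare void @llvm.trap() noreturn nounwind

      define i32 @checked_div(i32 %a, i32 %b) {
      entry:
        %iszero = icmp eq i32 %b, 0
        br i1 %iszero, label %fail, label %ok

      fail:                             ; division by zero: stop here
        call void @llvm.trap()
        unreachable

      ok:
        %q = udiv i32 %a, %b
        ret i32 %q
      }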

'``llvm.debugtrap``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.debugtrap() nounwind

Overview:
"""""""""

The '``llvm.debugtrap``' intrinsic requests the attention of a debugger.

Arguments:
""""""""""

None.

Semantics:
""""""""""

This intrinsic is lowered to code which is intended to cause an
execution trap with the intention of requesting the attention of a
debugger.

'``llvm.stackprotector``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.stackprotector(i8* <guard>, i8** <slot>)

Overview:
"""""""""

The ``llvm.stackprotector`` intrinsic takes the ``guard`` and stores it
onto the stack at ``slot``. The stack slot is adjusted to ensure that it
is placed on the stack before local variables.

Arguments:
""""""""""

The ``llvm.stackprotector`` intrinsic requires two pointer arguments.
The first argument is the value loaded from the stack guard
``@__stack_chk_guard``. The second argument is an ``alloca`` that has
enough space to hold the value of the guard.

Semantics:
""""""""""

This intrinsic causes the prologue/epilogue inserter to force the
position of the ``AllocaInst`` stack slot to be before local variables
on the stack. This is to ensure that if a local variable on the stack is
overwritten, it will destroy the value of the guard. When the function
exits, the guard on the stack is checked against the original guard. If
they are different, then the program aborts by calling the
``__stack_chk_fail()`` function.

'``llvm.objectsize``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.objectsize.i32(i8* <object>, i1 <min>)
      declare i64 @llvm.objectsize.i64(i8* <object>, i1 <min>)

Overview:
"""""""""

The ``llvm.objectsize`` intrinsic is designed to provide information to
the optimizers so that they can determine at compile time either a)
whether an operation (like memcpy) will overflow a buffer that
corresponds to an object, or b) that a runtime check for overflow isn't
necessary. An object in this context means an allocation of a specific
class, structure, array, or other object.

Arguments:
""""""""""

The ``llvm.objectsize`` intrinsic takes two arguments. The first
argument is a pointer to or into the ``object``. The second argument is
a boolean and determines whether ``llvm.objectsize`` returns 0 (if true)
or -1 (if false) when the object size is unknown. The second argument
only accepts constants.

Semantics:
""""""""""

The ``llvm.objectsize`` intrinsic is lowered to a constant representing
the size of the object concerned. If the size cannot be determined at
compile time, ``llvm.objectsize`` returns ``-1`` or ``0`` of the result
type (depending on the ``min`` argument).
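
As a minimal sketch, querying the size of a fixed-size stack buffer
looks like this. The function ``@size_of_local`` is a hypothetical name.
Since the allocation is visible in the same function, the call would be
expected to be lowered to the constant 32.

.. code-block:: llvm

      declare i32 @llvm.objectsize.i32(i8*, i1)

      define i32 @size_of_local() {
      entry:
        %buf = alloca [32 x i8]
        %p = getelementptr [32 x i8]* %buf, i32 0, i32 0
        ; i1 false: an unknown size would be reported as -1
        %n = call i32 @llvm.objectsize.i32(i8* %p, i1 false)
        ret i32 %n
      }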

'``llvm.expect``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.expect.i32(i32 <val>, i32 <expected_val>)
      declare i64 @llvm.expect.i64(i64 <val>, i64 <expected_val>)

Overview:
"""""""""

The ``llvm.expect`` intrinsic provides information about the expected
(most probable) value of ``val``, which can be used by optimizers.

Arguments:
""""""""""

The ``llvm.expect`` intrinsic takes two arguments. The first argument is
a value. The second argument is the expected value; it must be a
constant, and variables are not allowed.

Semantics:
""""""""""

This intrinsic is lowered to ``val``.
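
A typical pattern, sketched here with the hypothetical function
``@likely_zero``, feeds the result of ``llvm.expect`` into a comparison
that controls a branch, which tells the optimizers which successor is
the likely one.

.. code-block:: llvm

      declare i32 @llvm.expect.i32(i32, i32)

      define i32 @likely_zero(i32 %x) {
      entry:
        %e = call i32 @llvm.expect.i32(i32 %x, i32 0)   ; %x is expected to be 0
        %c = icmp eq i32 %e, 0
        br i1 %c, label %common, label %rare

      common:                           ; expected path
        ret i32 1

      rare:                             ; unexpected path
        ret i32 0
      }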

'``llvm.donothing``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.donothing() nounwind readnone

Overview:
"""""""""

The ``llvm.donothing`` intrinsic doesn't perform any operation. It's the
only intrinsic that can be called with an invoke instruction.

Arguments:
""""""""""

None.

Semantics:
""""""""""

This intrinsic does nothing, and it's removed by optimizers and ignored
by codegen.
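
A sketch of calling the intrinsic with ``invoke`` is shown below. The
function ``@h`` is a hypothetical name, and the use of
``@__gxx_personality_v0`` as the personality function is an assumption
made only so that the landing pad is well formed.

.. code-block:: llvm

      declare void @llvm.donothing() nounwind readnone
      declare i32 @__gxx_personality_v0(...)

      define void @h() {
      entry:
        invoke void @llvm.donothing()
                to label %cont unwind label %lpad

      cont:
        ret void

      lpad:
        %lp = landingpad { i8*, i32 } personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*)
                cleanup
        ret void
      }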