==============================
LLVM Language Reference Manual
==============================

.. contents::
   :local:
   :depth: 4

Abstract
========

This document is a reference manual for the LLVM assembly language. LLVM
is a Static Single Assignment (SSA) based representation that provides
type safety, low-level operations, flexibility, and the capability of
representing 'all' high-level languages cleanly. It is the common code
representation used throughout all phases of the LLVM compilation
strategy.

Introduction
============

The LLVM code representation is designed to be used in three different
forms: as an in-memory compiler IR, as an on-disk bitcode representation
(suitable for fast loading by a Just-In-Time compiler), and as a human
readable assembly language representation. This allows LLVM to provide a
powerful intermediate representation for efficient compiler
transformations and analysis, while providing a natural means to debug
and visualize the transformations. The three different forms of LLVM are
all equivalent. This document describes the human readable
representation and notation.

The LLVM representation aims to be light-weight and low-level while
being expressive, typed, and extensible at the same time. It aims to be
a "universal IR" of sorts, by being at a low enough level that
high-level ideas may be cleanly mapped to it (similar to how
microprocessors are "universal IR's", allowing many source languages to
be mapped to them). By providing type information, LLVM can be used as
the target of optimizations: for example, through pointer analysis, it
can be proven that a C automatic variable is never accessed outside of
the current function, allowing it to be promoted to a simple SSA value
instead of a memory location.

.. _wellformed:

Well-Formedness
---------------

It is important to note that this document describes 'well formed' LLVM
assembly language. There is a difference between what the parser accepts
and what is considered 'well formed'. For example, the following
instruction is syntactically okay, but not well formed:

.. code-block:: llvm

    %x = add i32 1, %x

because the definition of ``%x`` does not dominate all of its uses. The
LLVM infrastructure provides a verification pass that may be used to
verify that an LLVM module is well formed. This pass is automatically
run by the parser after parsing input assembly and by the optimizer
before it outputs bitcode. The violations pointed out by the verifier
pass indicate bugs in transformation passes or input to the parser.
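
For contrast, here is a sketch of a well formed way to express a value that
depends on its own previous value. The function name ``@count`` and the block
labels are made up for this illustration. The recurrence goes through a
``phi`` node, whose use of each incoming value is deemed to occur on the edge
from the corresponding predecessor block, so every definition dominates its
uses:

.. code-block:: llvm

    define i32 @count(i32 %n) {
    entry:
      br label %loop

    loop:
      ; %x is 0 on the first iteration and %x.next on later ones.
      %x = phi i32 [ 0, %entry ], [ %x.next, %loop ]
      %x.next = add i32 1, %x
      %again = icmp slt i32 %x.next, %n
      br i1 %again, label %loop, label %exit

    exit:
      ret i32 %x.next
    }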

.. _identifiers:

Identifiers
===========

LLVM identifiers come in two basic types: global and local. Global
identifiers (functions, global variables) begin with the ``'@'``
character. Local identifiers (register names, types) begin with the
``'%'`` character. Additionally, there are three different formats for
identifiers, for different purposes:

#. Named values are represented as a string of characters with their
   prefix. For example, ``%foo``, ``@DivisionByZero``,
   ``%a.really.long.identifier``. The actual regular expression used is
   '``[%@][a-zA-Z$._][a-zA-Z$._0-9]*``'. Identifiers that require other
   characters in their names can be surrounded with quotes. Special
   characters may be escaped using ``"\xx"`` where ``xx`` is the ASCII
   code for the character in hexadecimal. In this way, any character can
   be used in a name value, even quotes themselves. The ``"\01"`` prefix
   can be used on global variables to suppress mangling.
#. Unnamed values are represented as an unsigned numeric value with
   their prefix. For example, ``%12``, ``@2``, ``%44``.
#. Constants, which are described in the section Constants_ below.
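
As a small illustration of the named and unnamed forms, consider the following
sketch. All names in it are invented for the example, and the instruction
syntax is the one used elsewhere in this document:

.. code-block:: llvm

    @"a global with spaces" = global i32 0   ; quoted named value
    @"quote\22inside" = global i32 1         ; \22 is the hex escape for a quote
    @"\01raw_symbol" = global i32 2          ; \01 prefix suppresses mangling
    @0 = global i32 3                        ; unnamed global

    define i32 @example() {
      %"my local" = load i32* @"a global with spaces"  ; quoted local name
      %1 = load i32* @0            ; unnamed value (the entry block took %0)
      %sum = add i32 %"my local", %1
      ret i32 %sum
    }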

LLVM requires that values start with a prefix for two reasons: Compilers
don't need to worry about name clashes with reserved words, and the set
of reserved words may be expanded in the future without penalty.
Additionally, unnamed identifiers allow a compiler to quickly come up
with a temporary variable without having to avoid symbol table
conflicts.

Reserved words in LLVM are very similar to reserved words in other
languages. There are keywords for different opcodes ('``add``',
'``bitcast``', '``ret``', etc...), for primitive type names ('``void``',
'``i32``', etc...), and others. These reserved words cannot conflict
with variable names, because none of them start with a prefix character
(``'%'`` or ``'@'``).

Here is an example of LLVM code to multiply the integer variable
'``%X``' by 8:

The easy way:

.. code-block:: llvm

    %result = mul i32 %X, 8

After strength reduction:

.. code-block:: llvm

    %result = shl i32 %X, 3

And the hard way:

.. code-block:: llvm

    %0 = add i32 %X, %X           ; yields i32:%0
    %1 = add i32 %0, %0           ; yields i32:%1
    %result = add i32 %1, %1

This last way of multiplying ``%X`` by 8 illustrates several important
lexical features of LLVM:

#. Comments are delimited with a '``;``' and go until the end of line.
#. Unnamed temporaries are created when the result of a computation is
   not assigned to a named value.
#. Unnamed temporaries are numbered sequentially (using a per-function
   incrementing counter, starting with 0). Note that basic blocks and unnamed
   function parameters are included in this numbering. For example, if the
   entry basic block is not given a label name and all function parameters are
   named, then it will get number 0.
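
The numbering rule can be seen in a small sketch. The function ``@sum3`` is
hypothetical: its first two parameters are unnamed, so they take numbers 0 and
1, the unlabeled entry block takes number 2, and the first unnamed instruction
result is therefore ``%3``:

.. code-block:: llvm

    define i32 @sum3(i32, i32, i32 %c) {
      ; %0 and %1 are the unnamed parameters; the unlabeled entry block is %2.
      %3 = add i32 %0, %1
      %4 = add i32 %3, %c
      ret i32 %4
    }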

It also shows a convention that we follow in this document. When
demonstrating instructions, we will follow an instruction with a comment
that defines the type and name of value produced.

High Level Structure
====================

Module Structure
----------------

LLVM programs are composed of ``Module``'s, each of which is a
translation unit of the input programs. Each module consists of
functions, global variables, and symbol table entries. Modules may be
combined together with the LLVM linker, which merges function (and
global variable) definitions, resolves forward declarations, and merges
symbol table entries. Here is an example of the "hello world" module:

.. code-block:: llvm

    ; Declare the string constant as a global constant.
    @.str = private unnamed_addr constant [13 x i8] c"hello world\0A\00"

    ; External declaration of the puts function
    declare i32 @puts(i8* nocapture) nounwind

    ; Definition of main function
    define i32 @main() {   ; i32()*
      ; Convert [13 x i8]* to i8*...
      %cast210 = getelementptr [13 x i8]* @.str, i64 0, i64 0

      ; Call puts function to write out the string to stdout.
      call i32 @puts(i8* %cast210)
      ret i32 0
    }

    ; Named metadata
    !0 = metadata !{i32 42, null, metadata !"string"}
    !foo = !{!0}

This example is made up of a :ref:`global variable <globalvars>` named
"``.str``", an external declaration of the "``puts``" function, a
:ref:`function definition <functionstructure>` for "``main``" and
:ref:`named metadata <namedmetadatastructure>` "``foo``".

In general, a module is made up of a list of global values (where both
functions and global variables are global values). Global values are
represented by a pointer to a memory location (in this case, a pointer
to an array of char, and a pointer to a function), and have one of the
following :ref:`linkage types <linkage>`.
| 187 | .. _linkage: |
| 188 | |
| 189 | Linkage Types |
| 190 | ------------- |
| 191 | |
| 192 | All Global Variables and Functions have one of the following types of |
| 193 | linkage: |
| 194 | |
| 195 | ``private`` |
| 196 | Global values with "``private``" linkage are only directly |
| 197 | accessible by objects in the current module. In particular, linking |
| 198 | code into a module with an private global value may cause the |
| 199 | private to be renamed as necessary to avoid collisions. Because the |
| 200 | symbol is private to the module, all references can be updated. This |
| 201 | doesn't show up in any symbol table in the object file. |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 202 | ``internal`` |
| 203 | Similar to private, but the value shows as a local symbol |
| 204 | (``STB_LOCAL`` in the case of ELF) in the object file. This |
| 205 | corresponds to the notion of the '``static``' keyword in C. |
| 206 | ``available_externally`` |
| 207 | Globals with "``available_externally``" linkage are never emitted |
| 208 | into the object file corresponding to the LLVM module. They exist to |
| 209 | allow inlining and other optimizations to take place given knowledge |
| 210 | of the definition of the global, which is known to be somewhere |
| 211 | outside the module. Globals with ``available_externally`` linkage |
| 212 | are allowed to be discarded at will, and are otherwise the same as |
| 213 | ``linkonce_odr``. This linkage type is only allowed on definitions, |
| 214 | not declarations. |
| 215 | ``linkonce`` |
| 216 | Globals with "``linkonce``" linkage are merged with other globals of |
| 217 | the same name when linkage occurs. This can be used to implement |
| 218 | some forms of inline functions, templates, or other code which must |
| 219 | be generated in each translation unit that uses it, but where the |
| 220 | body may be overridden with a more definitive definition later. |
| 221 | Unreferenced ``linkonce`` globals are allowed to be discarded. Note |
| 222 | that ``linkonce`` linkage does not actually allow the optimizer to |
| 223 | inline the body of this function into callers because it doesn't |
| 224 | know if this definition of the function is the definitive definition |
| 225 | within the program or whether it will be overridden by a stronger |
| 226 | definition. To enable inlining and other optimizations, use |
| 227 | "``linkonce_odr``" linkage. |
| 228 | ``weak`` |
| 229 | "``weak``" linkage has the same merging semantics as ``linkonce`` |
| 230 | linkage, except that unreferenced globals with ``weak`` linkage may |
| 231 | not be discarded. This is used for globals that are declared "weak" |
| 232 | in C source code. |
| 233 | ``common`` |
| 234 | "``common``" linkage is most similar to "``weak``" linkage, but they |
| 235 | are used for tentative definitions in C, such as "``int X;``" at |
| 236 | global scope. Symbols with "``common``" linkage are merged in the |
| 237 | same way as ``weak symbols``, and they may not be deleted if |
| 238 | unreferenced. ``common`` symbols may not have an explicit section, |
| 239 | must have a zero initializer, and may not be marked |
| 240 | ':ref:`constant <globalvars>`'. Functions and aliases may not have |
| 241 | common linkage. |
| 242 | |
| 243 | .. _linkage_appending: |
| 244 | |
| 245 | ``appending`` |
| 246 | "``appending``" linkage may only be applied to global variables of |
| 247 | pointer to array type. When two global variables with appending |
| 248 | linkage are linked together, the two global arrays are appended |
| 249 | together. This is the LLVM, typesafe, equivalent of having the |
| 250 | system linker append together "sections" with identical names when |
| 251 | .o files are linked. |
| 252 | ``extern_weak`` |
| 253 | The semantics of this linkage follow the ELF object file model: the |
| 254 | symbol is weak until linked, if not linked, the symbol becomes null |
| 255 | instead of being an undefined reference. |
| 256 | ``linkonce_odr``, ``weak_odr`` |
| 257 | Some languages allow differing globals to be merged, such as two |
| 258 | functions with different semantics. Other languages, such as |
| 259 | ``C++``, ensure that only equivalent globals are ever merged (the |
Dmitri Gribenko | e813112 | 2013-01-19 20:34:20 +0000 | [diff] [blame] | 260 | "one definition rule" --- "ODR"). Such languages can use the |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 261 | ``linkonce_odr`` and ``weak_odr`` linkage types to indicate that the |
| 262 | global will only be merged with equivalent globals. These linkage |
| 263 | types are otherwise the same as their non-``odr`` versions. |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 264 | ``external`` |
| 265 | If none of the above identifiers are used, the global is externally |
| 266 | visible, meaning that it participates in linkage and can be used to |
| 267 | resolve external symbol references. |
| 268 | |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 269 | It is illegal for a function *declaration* to have any linkage type |
Nico Rieck | 7157bb7 | 2014-01-14 15:22:47 +0000 | [diff] [blame] | 270 | other than ``external`` or ``extern_weak``. |
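
The following sketch shows where the linkage keyword appears on global
variables and functions. The symbol names are invented for the example, and
the comments only restate the descriptions given in this section:

.. code-block:: llvm

    @cache = private global i32 0          ; no symbol table entry
    @counter = internal global i32 0       ; like a C 'static' variable
    @tentative = common global i32 0       ; C tentative definition 'int X;'
    @optional = extern_weak global i32     ; declaration; null if never linked

    define linkonce_odr i32 @inline_helper() {  ; merged only with equivalents
      ret i32 0
    }

    define weak void @hook() {             ; kept even if unreferenced
      ret void
    }

    declare void @defined_elsewhere()      ; no keyword: external linkage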

.. _callingconv:

Calling Conventions
-------------------

LLVM :ref:`functions <functionstructure>`, :ref:`calls <i_call>` and
:ref:`invokes <i_invoke>` can all have an optional calling convention
specified for the call. The calling convention of any pair of dynamic
caller/callee must match, or the behavior of the program is undefined.
The following calling conventions are supported by LLVM, and more may be
added in the future:

"``ccc``" - The C calling convention
    This calling convention (the default if no other calling convention
    is specified) matches the target C calling conventions. This calling
    convention supports varargs function calls and tolerates some
    mismatch in the declared prototype and implemented declaration of
    the function (as does normal C).
"``fastcc``" - The fast calling convention
    This calling convention attempts to make calls as fast as possible
    (e.g. by passing things in registers). This calling convention
    allows the target to use whatever tricks it wants to produce fast
    code for the target, without having to conform to an externally
    specified ABI (Application Binary Interface). `Tail calls can only
    be optimized when this, the GHC or the HiPE convention is
    used. <CodeGenerator.html#id80>`_ This calling convention does not
    support varargs and requires the prototype of all callees to exactly
    match the prototype of the function definition.
"``coldcc``" - The cold calling convention
    This calling convention attempts to make code in the caller as
    efficient as possible under the assumption that the call is not
    commonly executed. As such, these calls often preserve all registers
    so that the call does not break any live ranges in the caller side.
    This calling convention does not support varargs and requires the
    prototype of all callees to exactly match the prototype of the
    function definition. Furthermore the inliner doesn't consider such function
    calls for inlining.
"``cc 10``" - GHC convention
    This calling convention has been implemented specifically for use by
    the `Glasgow Haskell Compiler (GHC) <http://www.haskell.org/ghc>`_.
    It passes everything in registers, going to extremes to achieve this
    by disabling callee save registers. This calling convention should
    not be used lightly but only for specific situations such as an
    alternative to the *register pinning* performance technique often
    used when implementing functional programming languages. At the
    moment only X86 supports this convention and it has the following
    limitations:

    -  On *X86-32* only supports up to 4 bit type parameters. No
       floating point types are supported.
    -  On *X86-64* only supports up to 10 bit type parameters and 6
       floating point parameters.

    This calling convention supports `tail call
    optimization <CodeGenerator.html#id80>`_ but requires that both the
    caller and the callee use it.
"``cc 11``" - The HiPE calling convention
    This calling convention has been implemented specifically for use by
    the `High-Performance Erlang
    (HiPE) <http://www.it.uu.se/research/group/hipe/>`_ compiler, *the*
    native code compiler of the `Ericsson's Open Source Erlang/OTP
    system <http://www.erlang.org/download.shtml>`_. It uses more
    registers for argument passing than the ordinary C calling
    convention and defines no callee-saved registers. The calling
    convention properly supports `tail call
    optimization <CodeGenerator.html#id80>`_ but requires that both the
    caller and the callee use it. It uses a *register pinning*
    mechanism, similar to GHC's convention, for keeping frequently
    accessed runtime components pinned to specific hardware registers.
    At the moment only X86 supports this convention (both 32 and 64
    bit).
"``webkit_jscc``" - WebKit's JavaScript calling convention
    This calling convention has been implemented for `WebKit FTL JIT
    <https://trac.webkit.org/wiki/FTLJIT>`_. It passes arguments on the
    stack right to left (as cdecl does), and returns a value in the
    platform's customary return register.
"``anyregcc``" - Dynamic calling convention for code patching
    This is a special convention that supports patching an arbitrary code
    sequence in place of a call site. This convention forces the call
    arguments into registers but allows them to be dynamically
    allocated. This can currently only be used with calls to
    ``llvm.experimental.patchpoint`` because only this intrinsic records
    the location of its arguments in a side table. See :doc:`StackMaps`.
"``preserve_mostcc``" - The `PreserveMost` calling convention
    This calling convention attempts to make the code in the caller as little
    intrusive as possible. This calling convention behaves identically to the
    `C` calling convention on how arguments and return values are passed, but
    it uses a different set of caller/callee-saved registers. This alleviates
    the burden of saving and recovering a large register set before and after
    the call in the caller. If the arguments are passed in callee-saved
    registers, then they will be preserved by the callee across the call. This
    doesn't apply for values returned in callee-saved registers.

    - On X86-64 the callee preserves all general purpose registers, except for
      R11. R11 can be used as a scratch register. Floating-point registers
      (XMMs/YMMs) are not preserved and need to be saved by the caller.

    The idea behind this convention is to support calls to runtime functions
    that have a hot path and a cold path. The hot path is usually a small piece
    of code that doesn't use many registers. The cold path might need to call
    out to another function and therefore only needs to preserve the
    caller-saved registers, which haven't already been saved by the caller. The
    `PreserveMost` calling convention is very similar to the `cold` calling
    convention in terms of caller/callee-saved registers, but they are used for
    different types of function calls. `coldcc` is for function calls that are
    rarely executed, whereas `preserve_mostcc` function calls are intended to be
    on the hot path and definitely executed a lot. Furthermore `preserve_mostcc`
    doesn't prevent the inliner from inlining the function call.

    This calling convention will be used by a future version of the Objective-C
    runtime and should therefore still be considered experimental at this time.
    Although this convention was created to optimize certain runtime calls to
    the Objective-C runtime, it is not limited to this runtime and might be used
    by other runtimes in the future too. The current implementation only
    supports X86-64, but the intention is to support more architectures in the
    future.
"``preserve_allcc``" - The `PreserveAll` calling convention
    This calling convention attempts to make the code in the caller even less
    intrusive than the `PreserveMost` calling convention. This calling
    convention also behaves identically to the `C` calling convention on how
    arguments and return values are passed, but it uses a different set of
    caller/callee-saved registers. This removes the burden of saving and
    recovering a large register set before and after the call in the caller. If
    the arguments are passed in callee-saved registers, then they will be
    preserved by the callee across the call. This doesn't apply for values
    returned in callee-saved registers.

    - On X86-64 the callee preserves all general purpose registers, except for
      R11. R11 can be used as a scratch register. Furthermore it also preserves
      all floating-point registers (XMMs/YMMs).

    The idea behind this convention is to support calls to runtime functions
    that don't need to call out to any other functions.

    This calling convention, like the `PreserveMost` calling convention, will be
    used by a future version of the Objective-C runtime and should be considered
    experimental at this time.
"``cc <n>``" - Numbered convention
    Any calling convention may be specified by number, allowing
    target-specific calling conventions to be used. Target specific
    calling conventions start at 64.

More calling conventions can be added/defined on an as-needed basis, to
support Pascal conventions or any other well-known target-independent
convention.
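
As a sketch of how a calling convention is written down, the keyword appears
both on the function and on each call to it, and the two must agree. The
function names here are hypothetical:

.. code-block:: llvm

    ; A function using the fast calling convention.
    define fastcc i32 @add1(i32 %x) {
      %r = add i32 %x, 1
      ret i32 %r
    }

    ; A rarely executed function, declared with the cold convention.
    declare coldcc void @report_error(i32)

    ; No keyword on @caller itself, so it uses the default "ccc".
    define i32 @caller() {
      %v = call fastcc i32 @add1(i32 41)       ; matches the definition of @add1
      call coldcc void @report_error(i32 %v)   ; matches the declaration
      ret i32 %v
    }

Calling ``@add1`` without the ``fastcc`` keyword would still parse, but the
caller and callee conventions would then differ, which this section defines as
undefined behavior.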

.. _visibilitystyles:

Visibility Styles
-----------------

All Global Variables and Functions have one of the following visibility
styles:

"``default``" - Default style
    On targets that use the ELF object file format, default visibility
    means that the declaration is visible to other modules and, in
    shared libraries, means that the declared entity may be overridden.
    On Darwin, default visibility means that the declaration is visible
    to other modules. Default visibility corresponds to "external
    linkage" in the language.
"``hidden``" - Hidden style
    Two declarations of an object with hidden visibility refer to the
    same object if they are in the same shared object. Usually, hidden
    visibility indicates that the symbol will not be placed into the
    dynamic symbol table, so no other module (executable or shared
    library) can reference it directly.
"``protected``" - Protected style
    On ELF, protected visibility indicates that the symbol will be
    placed in the dynamic symbol table, but that references within the
    defining module will bind to the local symbol. That is, the symbol
    cannot be overridden by another module.

A symbol with ``internal`` or ``private`` linkage must have ``default``
visibility.
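
A brief sketch of the syntax, with made-up symbol names: the visibility
keyword follows the (here implicit, external) linkage.

.. code-block:: llvm

    @impl_detail = hidden global i32 0     ; kept out of the dynamic symbol table

    define protected void @api() {         ; exported, but not overridable
      ret void
    }

    define void @plain() {                 ; no keyword: default visibility
      ret void
    }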

.. _dllstorageclass:

DLL Storage Classes
-------------------

All Global Variables, Functions and Aliases can have one of the following
DLL storage classes:

``dllimport``
    "``dllimport``" causes the compiler to reference a function or variable via
    a global pointer to a pointer that is set up by the DLL exporting the
    symbol. On Microsoft Windows targets, the pointer name is formed by
    combining ``__imp_`` and the function or variable name.
``dllexport``
    "``dllexport``" causes the compiler to provide a global pointer to a pointer
    in a DLL, so that it can be referenced with the ``dllimport`` attribute. On
    Microsoft Windows targets, the pointer name is formed by combining
    ``__imp_`` and the function or variable name. Since this storage class
    exists for defining a dll interface, the compiler, assembler and linker know
    it is externally referenced and must refrain from deleting the symbol.
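
For illustration, a sketch of both storage classes on hypothetical symbols:

.. code-block:: llvm

    ; Provided by some other DLL.
    declare dllimport void @imported_fn()
    @imported_var = external dllimport global i32

    ; Part of this module's DLL interface, so it must not be deleted.
    define dllexport void @exported_fn() {
      call void @imported_fn()
      ret void
    }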

.. _tls_model:

Thread Local Storage Models
---------------------------

A variable may be defined as ``thread_local``, which means that it will
not be shared by threads (each thread will have a separate copy of the
variable). Not all targets support thread-local variables. Optionally, a
TLS model may be specified:

``localdynamic``
    For variables that are only used within the current shared library.
``initialexec``
    For variables in modules that will not be loaded dynamically.
``localexec``
    For variables defined in the executable and only used within it.

If no explicit model is given, the "general dynamic" model is used.

The models correspond to the ELF TLS models; see `ELF Handling For
Thread-Local Storage <http://people.redhat.com/drepper/tls.pdf>`_ for
more information on under which circumstances the different models may
be used. The target may choose a different TLS model if the specified
model is not supported, or if a better choice of model can be made.

A model can also be specified in an alias, but then it only governs how
the alias is accessed. It will not have any effect on the aliasee.
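
The model, when given, is written in parentheses after ``thread_local``. A
short sketch with invented variable names:

.. code-block:: llvm

    @tls_default = thread_local global i32 0               ; general dynamic
    @tls_library = thread_local(localdynamic) global i32 0
    @tls_startup = thread_local(initialexec) global i32 0
    @tls_private = internal thread_local(localexec) global i32 0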
| 496 | |
.. _namedtypes:

Structure Types
---------------

LLVM IR allows you to specify both "identified" and "literal" :ref:`structure
types <t_struct>`. Literal types are uniqued structurally, but identified types
are never uniqued. An :ref:`opaque structural type <t_opaque>` can also be used
to forward declare a type that is not yet available.

An example of an identified structure specification is:

.. code-block:: llvm

    %mytype = type { %mytype*, i32 }

Prior to the LLVM 3.0 release, identified types were structurally uniqued. Only
literal types are uniqued in recent versions of LLVM.
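
The three kinds of structure type can be contrasted in a short sketch. All
type and symbol names here (``%handle``, ``%point``, ``%size``, ``@origin``,
``@close``) are made up for illustration:

.. code-block:: llvm

    ; Identified opaque type: forward declares a type with no body yet.
    %handle = type opaque

    ; Two identified types with the same body remain distinct types,
    ; because identified types are never uniqued.
    %point = type { i32, i32 }
    %size = type { i32, i32 }

    ; A literal structure type is written inline and is uniqued by its
    ; structure.
    @origin = global { i32, i32 } { i32 0, i32 0 }

    ; A pointer to an opaque type can be used before the type has a body.
    declare void @close(%handle*)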
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 515 | |
.. _globalvars:

Global Variables
----------------

Global variables define regions of memory allocated at compilation time
instead of run-time.

Global variable definitions must be initialized.

Global variables in other translation units can also be declared, in which
case they don't have an initializer.

Either global variable definitions or declarations may have an explicit section
to be placed in and may have an optional explicit alignment specified.

A variable may be defined as a global ``constant``, which indicates that
the contents of the variable will **never** be modified (enabling better
optimization, allowing the global data to be placed in the read-only
section of an executable, etc.). Note that variables that need runtime
initialization cannot be marked ``constant`` as there is a store to the
variable.

LLVM explicitly allows *declarations* of global variables to be marked
constant, even if the final definition of the global is not. This
capability can be used to enable slightly better optimization of the
program, but requires the language definition to guarantee that
optimizations based on the 'constantness' are valid for the translation
units that do not include the definition.
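
A small sketch of the two cases described above, using the hypothetical
names ``@limits`` and ``@config``:

.. code-block:: llvm

    ; Definition: has an initializer, and its contents are never modified.
    @limits = constant [2 x i32] [i32 0, i32 100]

    ; Declaration of a global defined in another translation unit. It is
    ; marked constant here even though the defining module may use "global".
    @config = external constant i32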
| 545 | |
As SSA values, global variables define pointer values that are in scope in
(i.e. they dominate) all basic blocks in the program. Global variables
always define a pointer to their "content" type because they describe a
region of memory, and all memory objects in LLVM are accessed through
pointers.

Global variables can be marked with ``unnamed_addr`` which indicates
that the address is not significant, only the content. Constants marked
like this can be merged with other constants if they have the same
initializer. Note that a constant with significant address *can* be
merged with an ``unnamed_addr`` constant, the result being a constant
whose address is significant.
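
The merging rule can be illustrated with three constants that share an
initializer. The names are hypothetical, and whether any merging actually
happens is up to the optimizer:

.. code-block:: llvm

    ; Both are unnamed_addr with the same initializer, so they may be
    ; merged into a single constant.
    @.str.a = private unnamed_addr constant [3 x i8] c"ok\00"
    @.str.b = private unnamed_addr constant [3 x i8] c"ok\00"

    ; This constant's address is significant. It may still be merged with
    ; the ones above, but the result then has a significant address.
    @status = constant [3 x i8] c"ok\00"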
| 558 | |
A global variable may be declared to reside in a target-specific
numbered address space. For targets that support them, address spaces
may affect how optimizations are performed and/or what target
instructions are used to access the variable. The default address space
is zero. The address space qualifier must precede any other attributes.

LLVM allows an explicit section to be specified for globals. If the
target supports it, it will emit globals to the section specified.
Additionally, the global can be placed in a comdat if the target has the
necessary support.

By default, global initializers are optimized by assuming that global
variables defined within the module are not modified from their
initial values before the start of the global initializer. This is
true even for variables potentially accessible from outside the
module, including those with external linkage, those appearing in
``@llvm.used``, and dllexported variables. This assumption may be suppressed
by marking the variable with ``externally_initialized``.
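
For instance, a variable that some external agent may write before the
module's own initializers run could be marked as follows (a sketch; the name
``@runtime_flags`` is hypothetical):

.. code-block:: llvm

    ; The optimizer must not assume this still holds 0 when the global
    ; initializers start running.
    @runtime_flags = externally_initialized global i32 0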
Michael Gottesman | ef2bc77 | 2013-02-03 09:57:15 +0000 | [diff] [blame] | 577 | |
An explicit alignment may be specified for a global, which must be a
power of 2. If not present, or if the alignment is set to zero, the
alignment of the global is set by the target to whatever it feels
convenient. If an explicit alignment is specified, the global is forced
to have exactly that alignment. Targets and optimizers are not allowed
to over-align the global if the global has an assigned section. In this
case, the extra alignment could be observable: for example, code could
assume that the globals are densely packed in their section and try to
iterate over them as an array, and alignment padding would break this
iteration. The maximum alignment is ``1 << 29``.

Globals can also have a :ref:`DLL storage class <dllstorageclass>`.

Variables and aliases can have a
:ref:`Thread Local Storage Model <tls_model>`.

Syntax::

    [@<GlobalVarName> =] [Linkage] [Visibility] [DLLStorageClass] [ThreadLocal]
                         [unnamed_addr] [AddrSpace] [ExternallyInitialized]
                         <global | constant> <Type> [<InitializerConstant>]
                         [, section "name"] [, align <Alignment>]

For example, the following defines a global in a numbered address space
with an initializer, section, and alignment:

.. code-block:: llvm

    @G = addrspace(5) constant float 1.0, section "foo", align 4

The following example just declares a global variable:

.. code-block:: llvm

    @G = external global i32

The following example defines a thread-local global with the
``initialexec`` TLS model:

.. code-block:: llvm

    @G = thread_local(initialexec) global i32 0, align 4
| 620 | |
.. _functionstructure:

Functions
---------

LLVM function definitions consist of the "``define``" keyword, an
optional :ref:`linkage type <linkage>`, an optional :ref:`visibility
style <visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`,
an optional :ref:`calling convention <callingconv>`,
an optional ``unnamed_addr`` attribute, a return type, an optional
:ref:`parameter attribute <paramattrs>` for the return type, a function
name, a (possibly empty) argument list (each with optional :ref:`parameter
attributes <paramattrs>`), optional :ref:`function attributes <fnattrs>`,
an optional section, an optional alignment,
an optional :ref:`comdat <langref_comdats>`,
an optional :ref:`garbage collector name <gc>`, an optional
:ref:`prefix <prefixdata>`, an opening curly brace, a list of basic blocks,
and a closing curly brace.

LLVM function declarations consist of the "``declare``" keyword, an
optional :ref:`linkage type <linkage>`, an optional :ref:`visibility
style <visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`,
an optional :ref:`calling convention <callingconv>`,
an optional ``unnamed_addr`` attribute, a return type, an optional
:ref:`parameter attribute <paramattrs>` for the return type, a function
name, a possibly empty list of arguments, an optional alignment, an optional
:ref:`garbage collector name <gc>` and an optional :ref:`prefix <prefixdata>`.

A function definition contains a list of basic blocks, forming the CFG (Control
Flow Graph) for the function. Each basic block may optionally start with a label
(giving the basic block a symbol table entry), contains a list of instructions,
and ends with a :ref:`terminator <terminators>` instruction (such as a branch or
function return). If an explicit label is not provided, a block is assigned an
implicit numbered label, using the next value from the same counter as used for
unnamed temporaries (:ref:`see above<identifiers>`). For example, if a function
entry block does not have an explicit label, it will be assigned label "%0",
then the first unnamed temporary in that block will be "%1", etc.
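
The implicit numbering can be seen in a small function whose arguments are
named, so the counter starts at the entry block. The function ``@sum`` is a
hypothetical example:

.. code-block:: llvm

    define i32 @sum(i32 %a, i32 %b) {
      ; No explicit label is given, so the entry block is implicitly %0.
      %1 = add i32 %a, %b   ; the first unnamed temporary is %1
      %2 = mul i32 %1, 2    ; the next unnamed temporary is %2
      ret i32 %2
    }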
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 657 | |
The first basic block in a function is special in two ways: it is
immediately executed on entrance to the function, and it is not allowed
to have predecessor basic blocks (i.e. there cannot be any branches to
the entry block of a function). Because the block can have no
predecessors, it also cannot have any :ref:`PHI nodes <i_phi>`.
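
A sketch of a loop shows why the restriction matters in practice: the loop
header needs a separate block, because the entry block cannot be a branch
target. The function and block names are illustrative only:

.. code-block:: llvm

    define i32 @count_to(i32 %n) {
    entry:
      ; The entry block has no predecessors, so it contains no phi nodes.
      br label %loop

    loop:
      ; %loop has two predecessors (%entry and %loop itself), so a phi
      ; node is allowed here.
      %i = phi i32 [ 0, %entry ], [ %next, %loop ]
      %next = add i32 %i, 1
      %done = icmp sge i32 %next, %n
      br i1 %done, label %exit, label %loop

    exit:
      ret i32 %next
    }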
| 663 | |
LLVM allows an explicit section to be specified for functions. If the
target supports it, it will emit functions to the section specified.
Additionally, the function can be placed in a COMDAT.

An explicit alignment may be specified for a function. If not present,
or if the alignment is set to zero, the alignment of the function is set
by the target to whatever it feels convenient. If an explicit alignment
is specified, the function is forced to have at least that much
alignment. All alignments must be a power of 2.

If the ``unnamed_addr`` attribute is given, the address is known to not
be significant and two identical functions can be merged.
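
These three properties can be combined on one definition. The following is a
sketch only; the function name and the section name ``.text.hot`` are
assumptions, and whether the section is honored depends on the target:

.. code-block:: llvm

    ; Placed in a custom section with at least 16-byte alignment.
    ; unnamed_addr permits merging with an identical function.
    define void @handler() unnamed_addr section ".text.hot" align 16 {
      ret void
    }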
| 676 | |
Syntax::

    define [linkage] [visibility] [DLLStorageClass]
           [cconv] [ret attrs]
           <ResultType> @<FunctionName> ([argument list])
           [unnamed_addr] [fn Attrs] [section "name"] [comdat $<ComdatName>]
           [align N] [gc] [prefix Constant] { ... }

The argument list is a comma-separated sequence of arguments where each
argument is of the following form:

Syntax::

    <type> [parameter Attrs] [name]
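
As an illustration of the argument form, the hypothetical functions below
each take three arguments, one of which carries two parameter attributes:

.. code-block:: llvm

    ; Each argument is a type, optional parameter attributes, and a name.
    define float @scale(i32 %count, float* noalias nocapture %data, float %factor) {
      ret float %factor
    }

    ; The names are optional, as this declaration shows.
    declare float @scale_all(i32, float* noalias nocapture, float)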
| 691 | |
| 692 | |
.. _langref_aliases:

Aliases
-------

Aliases, unlike functions or variables, don't create any new data. They
are just a new symbol and metadata for an existing position.

Aliases have a name and an aliasee that is either a global value or a
constant expression.

Aliases may have an optional :ref:`linkage type <linkage>`, an optional
:ref:`visibility style <visibility>`, an optional :ref:`DLL storage class
<dllstorageclass>` and an optional :ref:`tls model <tls_model>`.

Syntax::

    @<Name> = [Linkage] [Visibility] [DLLStorageClass] [ThreadLocal] [unnamed_addr] alias <AliaseeTy> @<Aliasee>

The linkage must be one of ``private``, ``internal``, ``linkonce``, ``weak``,
``linkonce_odr``, ``weak_odr``, ``external``. Note that some system linkers
might not correctly handle dropping a weak symbol that is aliased.

Aliases that are not ``unnamed_addr`` are guaranteed to have the same address as
the aliasee expression. ``unnamed_addr`` ones are only guaranteed to point
to the same content.

Since aliases are only a second name, some restrictions apply, of which
some can only be checked when producing an object file:

* The expression defining the aliasee must be computable at assembly
  time. Since it is just a name, no relocations can be used.

* No alias in the expression can be weak as the possibility of the
  intermediate alias being overridden cannot be represented in an
  object file.

* No global value in the expression can be a declaration, since that
  would require a relocation, which is not possible.
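
The following sketch shows aliases that satisfy these restrictions. All
names are hypothetical, and the aliasee type is assumed to be spelled as the
pointer type of the aliasee, following the syntax line above:

.. code-block:: llvm

    @counter = global i32 0

    define void @impl() {
      ret void
    }

    ; Guaranteed to have the same address as @counter.
    @counter_alias = alias i32* @counter

    ; A weak alias to a function that is defined in this module.
    @entry_point = weak alias void ()* @impl

    ; Only guaranteed to point to the same content as @impl.
    @impl_copy = unnamed_addr alias void ()* @impl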
Rafael Espindola | 24a669d | 2014-03-27 15:26:56 +0000 | [diff] [blame] | 732 | |
.. _langref_comdats:

Comdats
-------

Comdat IR provides access to COFF and ELF object file COMDAT functionality.

Comdats have a name which represents the COMDAT key. All global objects that
specify this key will only end up in the final object file if the linker chooses
that key over some other key. Aliases are placed in the same COMDAT that their
aliasee computes to, if any.

Comdats have a selection kind to provide input on how the linker should
choose between keys in two different object files.

Syntax::

    $<Name> = comdat SelectionKind

The selection kind must be one of the following:

``any``
    The linker may choose any COMDAT key; the choice is arbitrary.
``exactmatch``
    The linker may choose any COMDAT key but the sections must contain the
    same data.
``largest``
    The linker will choose the section containing the largest COMDAT key.
``noduplicates``
    The linker requires that only one section with this COMDAT key exist.
``samesize``
    The linker may choose any COMDAT key but the sections must contain the
    same amount of data.

Note that the Mach-O platform doesn't support COMDATs and ELF only supports
``any`` as a selection kind.
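
Since ``any`` is the only selection kind available on ELF, a typical use is
a function that several translation units may each emit. This is a sketch
with a hypothetical function name, written with the ``comdat $<ComdatName>``
form from the function syntax:

.. code-block:: llvm

    ; With the "any" selection kind the linker keeps one copy of the
    ; COMDAT and discards the others.
    $get_answer = comdat any

    define linkonce_odr i32 @get_answer() comdat $get_answer {
      ret i32 42
    }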
| 769 | |
Here is an example of a COMDAT group where a function will only be selected if
the COMDAT key's section is the largest:

.. code-block:: llvm

    $foo = comdat largest
    @foo = global i32 2, comdat $foo

    define void @bar() comdat $foo {
      ret void
    }

In a COFF object file, this will create a COMDAT section with selection kind
``IMAGE_COMDAT_SELECT_LARGEST`` containing the contents of the ``@foo`` symbol
and another COMDAT section with selection kind
``IMAGE_COMDAT_SELECT_ASSOCIATIVE`` which is associated with the first COMDAT
section and contains the contents of the ``@bar`` symbol.

There are some restrictions on the properties of the global object.
It, or an alias to it, must have the same name as the COMDAT group when
targeting COFF.
The contents and size of this object may be used during link-time to determine
which COMDAT groups get selected depending on the selection kind.
Because the name of the object must match the name of the COMDAT group, the
linkage of the global object must not be local; local symbols can get renamed
if a collision occurs in the symbol table.

The combined use of COMDATs and section attributes may yield surprising results.
For example:

.. code-block:: llvm

    $foo = comdat any
    $bar = comdat any
    @g1 = global i32 42, section "sec", comdat $foo
    @g2 = global i32 42, section "sec", comdat $bar

From the object file perspective, this requires the creation of two sections
with the same name. This is necessary because both globals belong to different
COMDAT groups and COMDATs, at the object file level, are represented by
sections.

Note that certain IR constructs like global variables and functions may create
COMDATs in the object file in addition to any which are specified using COMDAT
IR. This arises, for example, when a global variable has ``linkonce_odr``
linkage.
| 815 | |
.. _namedmetadatastructure:

Named Metadata
--------------

Named metadata is a collection of metadata. :ref:`Metadata
nodes <metadata>` (but not metadata strings) are the only valid
operands for a named metadata.

Syntax::

    ; Some unnamed metadata nodes, which are referenced by the named metadata.
    !0 = metadata !{metadata !"zero"}
    !1 = metadata !{metadata !"one"}
    !2 = metadata !{metadata !"two"}
    ; A named metadata.
    !name = !{!0, !1, !2}
.. _paramattrs:

Parameter Attributes
--------------------

The return type and each parameter of a function type may have a set of
*parameter attributes* associated with them. Parameter attributes are
used to communicate additional information about the result or
parameters of a function. Parameter attributes are considered to be part
of the function, not of the function type, so functions with different
parameter attributes can have the same function type.

Parameter attributes are simple keywords that follow the type specified.
If multiple parameter attributes are needed, they are space separated.
For example:

.. code-block:: llvm

    declare i32 @printf(i8* noalias nocapture, ...)
    declare i32 @atoi(i8 zeroext)
    declare signext i8 @returns_signed_char()

Note that any attributes for the function result (``nounwind``,
``readonly``) come immediately after the argument list.

Currently, only the following parameter attributes are defined:

``zeroext``
    This indicates to the code generator that the parameter or return
    value should be zero-extended to the extent required by the target's
    ABI (which is usually 32 bits, but is 8 bits for an i1 on x86-64) by
    the caller (for a parameter) or the callee (for a return value).
``signext``
    This indicates to the code generator that the parameter or return
    value should be sign-extended to the extent required by the target's
    ABI (which is usually 32 bits) by the caller (for a parameter) or
    the callee (for a return value).
``inreg``
    This indicates that this parameter or return value should be treated
    in a special target-dependent fashion while emitting code for
    a function call or return (usually, by putting it in a register as
    opposed to memory, though some targets use it to distinguish between
    two different kinds of registers). Use of this attribute is
    target-specific.
``byval``
    This indicates that the pointer parameter should really be passed by
    value to the function. The attribute implies that a hidden copy of
    the pointee is made between the caller and the callee, so the callee
    is unable to modify the value in the caller. This attribute is only
    valid on LLVM pointer arguments. It is generally used to pass
    structs and arrays by value, but is also valid on pointers to
    scalars. The copy is considered to belong to the caller, not the
    callee (for example, ``readonly`` functions should not write to
    ``byval`` parameters). This is not a valid attribute for return
    values.

    The ``byval`` attribute also supports specifying an alignment with the
    ``align`` attribute. It indicates the alignment of the stack slot to
    form and the known alignment of the pointer specified to the call
    site. If the alignment is not specified, then the code generator
    makes a target-specific assumption.
| 895 | |
.. _attr_inalloca:

``inalloca``

    The ``inalloca`` argument attribute allows the caller to take the
    address of outgoing stack arguments. An ``inalloca`` argument must
    be a pointer to stack memory produced by an ``alloca`` instruction.
    The alloca, or argument allocation, must also be tagged with the
    ``inalloca`` keyword. Only the last argument may have the ``inalloca``
    attribute, and that argument is guaranteed to be passed in memory.

    An argument allocation may be used by a call at most once because
    the call may deallocate it. The ``inalloca`` attribute cannot be
    used in conjunction with other attributes that affect argument
    storage, like ``inreg``, ``nest``, ``sret``, or ``byval``. The
    ``inalloca`` attribute also disables LLVM's implicit lowering of
    large aggregate return values, which means that frontend authors
    must lower them with ``sret`` pointers.

    When the call site is reached, the argument allocation must have
    been the most recent stack allocation that is still live, or the
    results are undefined. It is possible to allocate additional stack
    space after an argument allocation and before its call site, but it
    must be cleared off with :ref:`llvm.stackrestore
    <int_stackrestore>`.

    See :doc:`InAlloca` for more information on how to use this
    attribute.

``sret``
    This indicates that the pointer parameter specifies the address of a
    structure that is the return value of the function in the source
    program. This pointer must be guaranteed by the caller to be valid:
    loads and stores to the structure may be assumed by the callee
    not to trap and to be properly aligned. This may only be applied to
    the first parameter. This is not a valid attribute for return
    values.

``align <n>``
    This indicates that the pointer value may be assumed by the optimizer to
    have the specified alignment.

    Note that this attribute has additional semantics when combined with the
    ``byval`` attribute.
.. _noalias:

``noalias``
    This indicates that pointer values :ref:`based <pointeraliasing>` on
    the argument or return value do not alias pointer values that are
    not *based* on it, ignoring certain "irrelevant" dependencies. For a
    call to the parent function, dependencies between memory references
    from before or after the call and from those during the call are
    "irrelevant" to the ``noalias`` keyword for the arguments and return
    value used in that call. The caller shares the responsibility with
    the callee for ensuring that these requirements are met. For further
    details, please see the discussion of the NoAlias response in :ref:`alias
    analysis <Must, May, or No>`.

    Note that this definition of ``noalias`` is intentionally similar
    to the definition of ``restrict`` in C99 for function arguments,
    though it is slightly weaker.

    For function return values, C99's ``restrict`` is not meaningful,
    while LLVM's ``noalias`` is.
``nocapture``
    This indicates that the callee does not make any copies of the
    pointer that outlive the callee itself. This is not a valid
    attribute for return values.

.. _nest:

``nest``
    This indicates that the pointer parameter can be excised using the
    :ref:`trampoline intrinsics <int_trampoline>`. This is not a valid
    attribute for return values and can only be applied to one parameter.

``returned``
    This indicates that the function always returns the argument as its return
    value. This is an optimization hint to the code generator when generating
    the caller, allowing tail call optimization and omission of register saves
    and restores in some cases; it is not checked or enforced when generating
    the callee. The parameter and the function return type must be valid
    operands for the :ref:`bitcast instruction <i_bitcast>`. This is not a
    valid attribute for return values and can only be applied to one parameter.

``nonnull``
    This indicates that the parameter or return pointer is not null. This
    attribute may only be applied to pointer typed parameters. This is not
    checked or enforced by LLVM; the caller must ensure that the pointer
    passed in is non-null, or the callee must ensure that the returned pointer
    is non-null.

``dereferenceable(<n>)``
    This indicates that the parameter or return pointer is dereferenceable. This
    attribute may only be applied to pointer typed parameters. A pointer that
    is dereferenceable can be loaded from speculatively without a risk of
    trapping. The number of bytes known to be dereferenceable must be provided
    in parentheses. It is legal for the number of bytes to be less than the
    size of the pointee type. The ``nonnull`` attribute does not imply
    dereferenceability (consider a pointer to one element past the end of an
    array); however, ``dereferenceable(<n>)`` does imply ``nonnull`` in
    ``addrspace(0)`` (which is the default address space).
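
To show how several of the attributes above are spelled on declarations,
here is a short sketch. Every function name and the ``%struct.Pair`` type
are hypothetical and stand in for whatever a frontend would emit:

.. code-block:: llvm

    %struct.Pair = type { i32, i32 }

    ; sret: the first parameter is the address where the result is written.
    declare void @make_pair(%struct.Pair* sret, i32, i32)

    ; byval: the callee receives a hidden copy of the pointee. The optional
    ; align gives the alignment of the stack slot.
    declare i32 @sum_pair(%struct.Pair* byval align 4)

    ; zeroext on the return value, signext on the parameter.
    declare zeroext i8 @next_byte(i8 signext)

    ; noalias and nocapture on both pointer parameters.
    declare void @copy_bytes(i8* noalias nocapture, i8* noalias nocapture, i64)

    ; returned: the function always returns its first argument.
    declare i8* @fill(i8* returned, i8, i64)

    ; nonnull on a return value, dereferenceable and align on parameters.
    declare nonnull i8* @get_buffer()
    declare i32 @first_element(i32* dereferenceable(4))
    declare void @use_aligned(i8* align 16)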
| 999 | |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 1000 | .. _gc: |
| 1001 | |
| 1002 | Garbage Collector Names |
| 1003 | ----------------------- |
| 1004 | |
| 1005 | Each function may specify a garbage collector name, which is simply a |
| 1006 | string: |
| 1007 | |
| 1008 | .. code-block:: llvm |
| 1009 | |
| 1010 | define void @f() gc "name" { ... } |
| 1011 | |
| 1012 | The compiler declares the supported values of *name*. Specifying a |
Richard Smith | 32dbdf6 | 2014-07-31 04:25:36 +0000 | [diff] [blame] | 1013 | collector will cause the compiler to alter its output in order to |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 1014 | support the named garbage collection algorithm. |
| 1015 | |
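For a concrete case, the sketch below names the ``shadow-stack`` collector
that is distributed with LLVM. It assumes the compiler in use has that
collector available; the function name ``@uses_shadow_stack`` is hypothetical.

.. code-block:: llvm

   ; Code generation for this function follows the shadow-stack strategy.
   define void @uses_shadow_stack() gc "shadow-stack" {
     ret void
   }
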
Peter Collingbourne | 3fa50f9 | 2013-09-16 01:08:15 +0000 | [diff] [blame] | 1016 | .. _prefixdata: |
| 1017 | |
| 1018 | Prefix Data |
| 1019 | ----------- |
| 1020 | |
| 1021 | Prefix data is data associated with a function which the code generator |
| 1022 | will emit immediately before the function body. The purpose of this feature |
| 1023 | is to allow frontends to associate language-specific runtime metadata with |
| 1024 | specific functions and make it available through the function pointer while |
| 1025 | still allowing the function pointer to be called. To access the data for a |
| 1026 | given function, a program may bitcast the function pointer to a pointer to |
| 1027 | the type of the prefix data constant. This implies that the IR symbol
| 1028 | points to the start of the prefix data.
| 1029 | |
| 1030 | To maintain the semantics of ordinary function calls, the prefix data must |
| 1031 | have a particular format. Specifically, it must begin with a sequence of |
| 1032 | bytes which decode to a sequence of machine instructions, valid for the |
| 1033 | module's target, which transfer control to the point immediately succeeding |
| 1034 | the prefix data, without performing any other visible action. This allows |
| 1035 | the inliner and other passes to reason about the semantics of the function |
| 1036 | definition without needing to reason about the prefix data. Obviously this |
| 1037 | makes the format of the prefix data highly target dependent. |
| 1038 | |
Peter Collingbourne | 213358a | 2013-09-23 20:14:21 +0000 | [diff] [blame] | 1039 | Prefix data is laid out as if it were an initializer for a global variable |
| 1040 | of the prefix data's type. No padding is automatically placed between the |
| 1041 | prefix data and the function body. If padding is required, it must be part |
| 1042 | of the prefix data. |
| 1043 | |
Peter Collingbourne | 3fa50f9 | 2013-09-16 01:08:15 +0000 | [diff] [blame] | 1044 | A trivial example of valid prefix data for the x86 architecture is ``i8 144``, |
| 1045 | which encodes the ``nop`` instruction: |
| 1046 | |
| 1047 | .. code-block:: llvm |
| 1048 | |
| 1049 | define void @f() prefix i8 144 { ... } |
| 1050 | |
| 1051 | Generally prefix data can be formed by encoding a relative branch instruction |
| 1052 | which skips the metadata, as in this example of valid prefix data for the |
| 1053 | x86_64 architecture, where the first two bytes encode ``jmp .+10``: |
| 1054 | |
| 1055 | .. code-block:: llvm |
| 1056 | |
| 1057 | %0 = type <{ i8, i8, i8* }> |
| 1058 | |
| 1059 | define void @f() prefix %0 <{ i8 235, i8 8, i8* @md}> { ... } |
| 1060 | |
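The following sketch shows how a program might read the metadata pointer back
out of the prefix data in the example above, under the rule that the IR symbol
points to the start of the prefix data. The global ``@md``, its initializer and
the helper ``@get_md`` are hypothetical and used only for illustration.

.. code-block:: llvm

   %0 = type <{ i8, i8, i8* }>

   ; Hypothetical metadata object referenced from the prefix data.
   @md = constant i8 42

   define void @f() prefix %0 <{ i8 235, i8 8, i8* @md }> {
     ret void
   }

   ; Hypothetical helper: returns the pointer stored in @f's prefix data.
   define i8* @get_md() {
     ; View the function pointer as a pointer to the prefix data's type.
     %prefix = bitcast void ()* @f to %0*
     ; Field 2 of the packed struct holds the metadata pointer.
     %slot = getelementptr inbounds %0* %prefix, i32 0, i32 2
     ; The field sits at byte offset 2, so no natural alignment is assumed.
     %ptr = load i8** %slot, align 1
     ret i8* %ptr
   }
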
| 1061 | A function may have prefix data but no body. This has similar semantics |
| 1062 | to the ``available_externally`` linkage in that the data may be used by the |
| 1063 | optimizers but will not be emitted in the object file. |
| 1064 | |
Bill Wendling | 63b8819 | 2013-02-06 06:52:58 +0000 | [diff] [blame] | 1065 | .. _attrgrp: |
| 1066 | |
| 1067 | Attribute Groups |
| 1068 | ---------------- |
| 1069 | |
| 1070 | Attribute groups are groups of attributes that are referenced by objects within |
| 1071 | the IR. They are important for keeping ``.ll`` files readable, because a lot of |
| 1072 | functions will use the same set of attributes. In the degenerate case of a
| 1073 | ``.ll`` file that corresponds to a single ``.c`` file, the single attribute |
| 1074 | group will capture the important command line flags used to build that file. |
| 1075 | |
| 1076 | An attribute group is a module-level object. To use an attribute group, an |
| 1077 | object references the attribute group's ID (e.g. ``#37``). An object may refer |
| 1078 | to more than one attribute group. In that situation, the attributes from the |
| 1079 | different groups are merged. |
| 1080 | |
| 1081 | Here is an example of attribute groups for a function that should always be |
| 1082 | inlined, has a stack alignment of 4, and which shouldn't use SSE instructions: |
| 1083 | |
| 1084 | .. code-block:: llvm |
| 1085 | |
| 1086 | ; Target-independent attributes: |
Eli Bendersky | 97ad924 | 2013-04-18 16:11:44 +0000 | [diff] [blame] | 1087 | attributes #0 = { alwaysinline alignstack=4 } |
Bill Wendling | 63b8819 | 2013-02-06 06:52:58 +0000 | [diff] [blame] | 1088 | |
| 1089 | ; Target-dependent attributes: |
Eli Bendersky | 97ad924 | 2013-04-18 16:11:44 +0000 | [diff] [blame] | 1090 | attributes #1 = { "no-sse" } |
Bill Wendling | 63b8819 | 2013-02-06 06:52:58 +0000 | [diff] [blame] | 1091 | |
| 1092 | ; Function @f has attributes: alwaysinline, alignstack=4, and "no-sse". |
| 1093 | define void @f() #0 #1 { ... } |
| 1094 | |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 1095 | .. _fnattrs: |
| 1096 | |
| 1097 | Function Attributes |
| 1098 | ------------------- |
| 1099 | |
| 1100 | Function attributes are set to communicate additional information about |
| 1101 | a function. Function attributes are considered to be part of the |
| 1102 | function, not of the function type, so functions with different function |
| 1103 | attributes can have the same function type. |
| 1104 | |
| 1105 | Function attributes are simple keywords that follow the parameter list.
| 1106 | If multiple attributes are needed, they are space separated. For |
| 1107 | example: |
| 1108 | |
| 1109 | .. code-block:: llvm |
| 1110 | |
| 1111 | define void @f() noinline { ... } |
| 1112 | define void @f() alwaysinline { ... } |
| 1113 | define void @f() alwaysinline optsize { ... } |
| 1114 | define void @f() optsize { ... } |
| 1115 | |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 1116 | ``alignstack(<n>)`` |
| 1117 | This attribute indicates that, when emitting the prologue and |
| 1118 | epilogue, the backend should forcibly align the stack pointer. |
| 1119 | Specify the desired alignment, which must be a power of two, in |
| 1120 | parentheses. |
| 1121 | ``alwaysinline`` |
| 1122 | This attribute indicates that the inliner should attempt to inline |
| 1123 | this function into callers whenever possible, ignoring any active |
| 1124 | inlining size threshold for this caller. |
Michael Gottesman | 41748d7 | 2013-06-27 00:25:01 +0000 | [diff] [blame] | 1125 | ``builtin`` |
| 1126 | This indicates that the callee function at a call site should be |
| 1127 | recognized as a built-in function, even though the function's declaration |
Michael Gottesman | 3a6a967 | 2013-07-02 21:32:56 +0000 | [diff] [blame] | 1128 | uses the ``nobuiltin`` attribute. This is only valid at call sites for |
Richard Smith | 32dbdf6 | 2014-07-31 04:25:36 +0000 | [diff] [blame] | 1129 | direct calls to functions that are declared with the ``nobuiltin`` |
Michael Gottesman | 41748d7 | 2013-06-27 00:25:01 +0000 | [diff] [blame] | 1130 | attribute. |
Michael Gottesman | 296adb8 | 2013-06-27 22:48:08 +0000 | [diff] [blame] | 1131 | ``cold`` |
| 1132 | This attribute indicates that this function is rarely called. When |
| 1133 | computing edge weights, basic blocks post-dominated by a cold
| 1134 | function call are also considered to be cold and are therefore given
| 1135 | a low weight.
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 1136 | ``inlinehint`` |
| 1137 | This attribute indicates that the source code contained a hint that |
| 1138 | inlining this function is desirable (such as the "inline" keyword in |
| 1139 | C/C++). It is just a hint; it imposes no requirements on the |
| 1140 | inliner. |
Tom Roeder | 44cb65f | 2014-06-05 19:29:43 +0000 | [diff] [blame] | 1141 | ``jumptable`` |
| 1142 | This attribute indicates that the function should be added to a |
| 1143 | jump-instruction table at code-generation time, and that all address-taken |
| 1144 | references to this function should be replaced with a reference to the |
| 1145 | appropriate jump-instruction-table function pointer. Note that this creates |
| 1146 | a new pointer for the original function, which means that code that depends |
| 1147 | on function-pointer identity can break. So, any function annotated with |
| 1148 | ``jumptable`` must also be ``unnamed_addr``. |
Andrea Di Biagio | 9b5d23b | 2013-08-09 18:42:18 +0000 | [diff] [blame] | 1149 | ``minsize`` |
| 1150 | This attribute suggests that optimization passes and code generator |
| 1151 | passes make choices that keep the code size of this function as small |
Andrew Trick | d4d1d9c | 2013-10-31 17:18:07 +0000 | [diff] [blame] | 1152 | as possible and perform optimizations that may sacrifice runtime |
Andrea Di Biagio | 9b5d23b | 2013-08-09 18:42:18 +0000 | [diff] [blame] | 1153 | performance in order to minimize the size of the generated code. |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 1154 | ``naked`` |
| 1155 | This attribute disables prologue / epilogue emission for the |
| 1156 | function. This can have very system-specific consequences. |
Eli Bendersky | 97ad924 | 2013-04-18 16:11:44 +0000 | [diff] [blame] | 1157 | ``nobuiltin`` |
Michael Gottesman | 41748d7 | 2013-06-27 00:25:01 +0000 | [diff] [blame] | 1158 | This indicates that the callee function at a call site is not recognized as |
| 1159 | a built-in function. LLVM will retain the original call and not replace it |
| 1160 | with equivalent code based on the semantics of the built-in function, unless |
| 1161 | the call site uses the ``builtin`` attribute. This is valid at call sites |
| 1162 | and on function declarations and definitions. |
Bill Wendling | bf902f1 | 2013-02-06 06:22:58 +0000 | [diff] [blame] | 1163 | ``noduplicate`` |
| 1164 | This attribute indicates that calls to the function cannot be |
| 1165 | duplicated. A call to a ``noduplicate`` function may be moved |
| 1166 | within its parent function, but may not be duplicated within |
| 1167 | its parent function. |
| 1168 | |
| 1169 | A function containing a ``noduplicate`` call may still |
| 1170 | be an inlining candidate, provided that the call is not |
| 1171 | duplicated by inlining. That implies that the function has |
| 1172 | internal linkage and only has one call site, so the original |
| 1173 | call is dead after inlining. |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 1174 | ``noimplicitfloat`` |
| 1175 | This attribute disables implicit floating point instructions.
| 1176 | ``noinline`` |
| 1177 | This attribute indicates that the inliner should never inline this |
| 1178 | function in any situation. This attribute may not be used together |
| 1179 | with the ``alwaysinline`` attribute. |
Sean Silva | 1cbbcf1 | 2013-08-06 19:34:37 +0000 | [diff] [blame] | 1180 | ``nonlazybind`` |
| 1181 | This attribute suppresses lazy symbol binding for the function. This |
| 1182 | may make calls to the function faster, at the cost of extra program |
| 1183 | startup time if the function is not called during program startup. |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 1184 | ``noredzone`` |
| 1185 | This attribute indicates that the code generator should not use a |
| 1186 | red zone, even if the target-specific ABI normally permits it. |
| 1187 | ``noreturn`` |
| 1188 | This function attribute indicates that the function never returns |
| 1189 | normally. This produces undefined behavior at runtime if the |
| 1190 | function ever does dynamically return. |
| 1191 | ``nounwind`` |
| 1192 | This function attribute indicates that the function never returns |
| 1193 | with an unwind or exceptional control flow. If the function does |
| 1194 | unwind, its runtime behavior is undefined. |
Andrea Di Biagio | 377496b | 2013-08-23 11:53:55 +0000 | [diff] [blame] | 1195 | ``optnone`` |
| 1196 | This function attribute indicates that the function is not optimized |
Andrew Trick | d4d1d9c | 2013-10-31 17:18:07 +0000 | [diff] [blame] | 1197 | by any optimization or code generator passes with the |
Andrea Di Biagio | 377496b | 2013-08-23 11:53:55 +0000 | [diff] [blame] | 1198 | exception of interprocedural optimization passes. |
| 1199 | This attribute cannot be used together with the ``alwaysinline`` |
| 1200 | attribute; this attribute is also incompatible |
| 1201 | with the ``minsize`` attribute and the ``optsize`` attribute. |
Andrew Trick | d4d1d9c | 2013-10-31 17:18:07 +0000 | [diff] [blame] | 1202 | |
Paul Robinson | dcbe35b | 2013-11-18 21:44:03 +0000 | [diff] [blame] | 1203 | This attribute requires the ``noinline`` attribute to be specified on |
| 1204 | the function as well, so the function is never inlined into any caller. |
Andrea Di Biagio | 377496b | 2013-08-23 11:53:55 +0000 | [diff] [blame] | 1205 | Only functions with the ``alwaysinline`` attribute are valid |
Paul Robinson | dcbe35b | 2013-11-18 21:44:03 +0000 | [diff] [blame] | 1206 | candidates for inlining into the body of this function. |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 1207 | ``optsize`` |
| 1208 | This attribute suggests that optimization passes and code generator |
| 1209 | passes make choices that keep the code size of this function low, |
Andrea Di Biagio | 9b5d23b | 2013-08-09 18:42:18 +0000 | [diff] [blame] | 1210 | and otherwise do optimizations specifically to reduce code size as |
| 1211 | long as they do not significantly impact runtime performance. |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 1212 | ``readnone`` |
Nick Lewycky | c2ec072 | 2013-07-06 00:29:58 +0000 | [diff] [blame] | 1213 | On a function, this attribute indicates that the function computes its |
| 1214 | result (or decides to unwind an exception) based strictly on its arguments, |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 1215 | without dereferencing any pointer arguments or otherwise accessing |
| 1216 | any mutable state (e.g. memory, control registers, etc) visible to |
| 1217 | caller functions. It does not write through any pointer arguments |
| 1218 | (including ``byval`` arguments) and never changes any state visible |
| 1219 | to callers. This means that it cannot unwind exceptions by calling |
| 1220 | the ``C++`` exception throwing methods. |
Andrew Trick | d4d1d9c | 2013-10-31 17:18:07 +0000 | [diff] [blame] | 1221 | |
Nick Lewycky | c2ec072 | 2013-07-06 00:29:58 +0000 | [diff] [blame] | 1222 | On an argument, this attribute indicates that the function does not |
| 1223 | dereference that pointer argument, even though it may read or write the |
Nick Lewycky | efe31f2 | 2013-07-06 01:04:47 +0000 | [diff] [blame] | 1224 | memory that the pointer points to if accessed through other pointers. |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 1225 | ``readonly`` |
Nick Lewycky | c2ec072 | 2013-07-06 00:29:58 +0000 | [diff] [blame] | 1226 | On a function, this attribute indicates that the function does not write |
| 1227 | through any pointer arguments (including ``byval`` arguments) or otherwise |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 1228 | modify any state (e.g. memory, control registers, etc) visible to |
| 1229 | caller functions. It may dereference pointer arguments and read |
| 1230 | state that may be set in the caller. A readonly function always |
| 1231 | returns the same value (or unwinds an exception identically) when |
| 1232 | called with the same set of arguments and global state. It cannot |
| 1233 | unwind an exception by calling the ``C++`` exception throwing |
| 1234 | methods. |
Andrew Trick | d4d1d9c | 2013-10-31 17:18:07 +0000 | [diff] [blame] | 1235 | |
Nick Lewycky | c2ec072 | 2013-07-06 00:29:58 +0000 | [diff] [blame] | 1236 | On an argument, this attribute indicates that the function does not write |
| 1237 | through this pointer argument, even though it may write to the memory that |
| 1238 | the pointer points to. |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 1239 | ``returns_twice`` |
| 1240 | This attribute indicates that this function can return twice. The C |
| 1241 | ``setjmp`` is an example of such a function. The compiler disables |
| 1242 | some optimizations (like tail calls) in the caller of these |
| 1243 | functions. |
Kostya Serebryany | cf880b9 | 2013-02-26 06:58:09 +0000 | [diff] [blame] | 1244 | ``sanitize_address`` |
| 1245 | This attribute indicates that AddressSanitizer checks |
| 1246 | (dynamic address safety analysis) are enabled for this function. |
| 1247 | ``sanitize_memory`` |
| 1248 | This attribute indicates that MemorySanitizer checks (dynamic detection |
| 1249 | of accesses to uninitialized memory) are enabled for this function. |
| 1250 | ``sanitize_thread`` |
| 1251 | This attribute indicates that ThreadSanitizer checks |
| 1252 | (dynamic thread safety analysis) are enabled for this function. |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 1253 | ``ssp`` |
| 1254 | This attribute indicates that the function should emit a stack |
Dmitri Gribenko | e813112 | 2013-01-19 20:34:20 +0000 | [diff] [blame] | 1255 | smashing protector. It is in the form of a "canary" --- a random value |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 1256 | placed on the stack before the local variables that's checked upon |
| 1257 | return from the function to see if it has been overwritten. A |
| 1258 | heuristic is used to determine if a function needs stack protectors |
Bill Wendling | 7c8f96a | 2013-01-23 06:43:53 +0000 | [diff] [blame] | 1259 | or not. The heuristic used will enable protectors for functions with: |
Dmitri Gribenko | 69b5647 | 2013-01-29 23:14:41 +0000 | [diff] [blame] | 1260 | |
Bill Wendling | 7c8f96a | 2013-01-23 06:43:53 +0000 | [diff] [blame] | 1261 | - Character arrays larger than ``ssp-buffer-size`` (default 8). |
| 1262 | - Aggregates containing character arrays larger than ``ssp-buffer-size``. |
| 1263 | - Calls to alloca() with variable sizes or constant sizes greater than |
| 1264 | ``ssp-buffer-size``. |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 1265 | |
Josh Magee | 24c7f06 | 2014-02-01 01:36:16 +0000 | [diff] [blame] | 1266 | Variables that are identified as requiring a protector will be arranged |
| 1267 | on the stack such that they are adjacent to the stack protector guard. |
| 1268 | |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 1269 | If a function that has an ``ssp`` attribute is inlined into a |
| 1270 | function that doesn't have an ``ssp`` attribute, then the resulting |
| 1271 | function will have an ``ssp`` attribute. |
| 1272 | ``sspreq`` |
| 1273 | This attribute indicates that the function should *always* emit a |
| 1274 | stack smashing protector. This overrides the ``ssp`` function |
| 1275 | attribute. |
| 1276 | |
Josh Magee | 24c7f06 | 2014-02-01 01:36:16 +0000 | [diff] [blame] | 1277 | Variables that are identified as requiring a protector will be arranged |
| 1278 | on the stack such that they are adjacent to the stack protector guard. |
| 1279 | The specific layout rules are: |
| 1280 | |
| 1281 | #. Large arrays and structures containing large arrays |
| 1282 | (``>= ssp-buffer-size``) are closest to the stack protector. |
| 1283 | #. Small arrays and structures containing small arrays |
| 1284 | (``< ssp-buffer-size``) are 2nd closest to the protector. |
| 1285 | #. Variables that have had their address taken are 3rd closest to the |
| 1286 | protector. |
| 1287 | |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 1288 | If a function that has an ``sspreq`` attribute is inlined into a |
| 1289 | function that doesn't have an ``sspreq`` attribute or which has an |
Bill Wendling | d154e283 | 2013-01-23 06:41:41 +0000 | [diff] [blame] | 1290 | ``ssp`` or ``sspstrong`` attribute, then the resulting function will have |
| 1291 | an ``sspreq`` attribute. |
| 1292 | ``sspstrong`` |
| 1293 | This attribute indicates that the function should emit a stack smashing |
Bill Wendling | 7c8f96a | 2013-01-23 06:43:53 +0000 | [diff] [blame] | 1294 | protector. This attribute causes a strong heuristic to be used when |
| 1295 | determining if a function needs stack protectors. The strong heuristic |
| 1296 | will enable protectors for functions with: |
Dmitri Gribenko | 69b5647 | 2013-01-29 23:14:41 +0000 | [diff] [blame] | 1297 | |
Bill Wendling | 7c8f96a | 2013-01-23 06:43:53 +0000 | [diff] [blame] | 1298 | - Arrays of any size and type.
| 1299 | - Aggregates containing an array of any size and type. |
| 1300 | - Calls to alloca(). |
| 1301 | - Local variables that have had their address taken. |
| 1302 | |
Josh Magee | 24c7f06 | 2014-02-01 01:36:16 +0000 | [diff] [blame] | 1303 | Variables that are identified as requiring a protector will be arranged |
| 1304 | on the stack such that they are adjacent to the stack protector guard. |
| 1305 | The specific layout rules are: |
| 1306 | |
| 1307 | #. Large arrays and structures containing large arrays |
| 1308 | (``>= ssp-buffer-size``) are closest to the stack protector. |
| 1309 | #. Small arrays and structures containing small arrays |
| 1310 | (``< ssp-buffer-size``) are 2nd closest to the protector. |
| 1311 | #. Variables that have had their address taken are 3rd closest to the |
| 1312 | protector. |
| 1313 | |
Bill Wendling | 7c8f96a | 2013-01-23 06:43:53 +0000 | [diff] [blame] | 1314 | This overrides the ``ssp`` function attribute. |
Bill Wendling | d154e283 | 2013-01-23 06:41:41 +0000 | [diff] [blame] | 1315 | |
| 1316 | If a function that has an ``sspstrong`` attribute is inlined into a |
| 1317 | function that doesn't have an ``sspstrong`` attribute, then the |
| 1318 | resulting function will have an ``sspstrong`` attribute. |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 1319 | ``uwtable`` |
| 1320 | This attribute indicates that the ABI being targeted requires that |
| 1321 | an unwind table entry be produced for this function even if we can
| 1322 | show that no exceptions pass through it. This is normally the case for
| 1323 | the ELF x86-64 ABI, but it can be disabled for some compilation
| 1324 | units. |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 1325 | |
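As a brief sketch of how several of the attributes above combine, the module
fragment below uses only the rules stated in this section. All function names
except ``@malloc`` are hypothetical.

.. code-block:: llvm

   ; Never returns normally and never unwinds.
   declare void @fatal_error() noreturn nounwind

   ; The result depends only on the arguments; no memory is accessed.
   declare i32 @add_saturating(i32, i32) readnone nounwind

   ; May read through its pointer argument but writes nothing visible to callers.
   declare i64 @count_bytes(i8* readonly) readonly nounwind

   ; optnone requires noinline on the same function.
   define void @keep_simple() noinline optnone {
     ret void
   }

   ; The declaration opts out of built-in treatment ...
   declare i8* @malloc(i64) nobuiltin

   define i8* @alloc16() {
     ; ... and this direct call site opts back in.
     %p = call i8* @malloc(i64 16) builtin
     ret i8* %p
   }
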
| 1326 | .. _moduleasm: |
| 1327 | |
| 1328 | Module-Level Inline Assembly |
| 1329 | ---------------------------- |
| 1330 | |
| 1331 | Modules may contain "module-level inline asm" blocks, which correspond
| 1332 | to the GCC "file scope inline asm" blocks. These blocks are internally |
| 1333 | concatenated by LLVM and treated as a single unit, but may be separated |
| 1334 | in the ``.ll`` file if desired. The syntax is very simple: |
| 1335 | |
| 1336 | .. code-block:: llvm |
| 1337 | |
| 1338 | module asm "inline asm code goes here" |
| 1339 | module asm "more can go here" |
| 1340 | |
| 1341 | The strings can contain any character by escaping non-printable |
| 1342 | characters. The escape sequence used is simply "\\xx" where "xx" is the |
| 1343 | two digit hex code for the number. |
| 1344 | |
| 1345 | The inline asm code is simply printed to the machine code .s file when |
| 1346 | assembly code is generated. |
| 1347 | |
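The escape rule can be illustrated with a short sketch. The directives below
assume an ELF target with a GNU-style assembler, and the section name
``.note.demo`` is hypothetical; only the escape sequences themselves are the
point of the example.

.. code-block:: llvm

   ; \22 is a double quote, \09 is a tab and \0A is a newline.
   module asm ".section .note.demo,\22a\22"
   module asm "\09.ascii \22hello\22\0A"
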
Eli Bendersky | fdc529a | 2013-06-07 19:40:08 +0000 | [diff] [blame] | 1348 | .. _langref_datalayout: |
| 1349 | |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 1350 | Data Layout |
| 1351 | ----------- |
| 1352 | |
| 1353 | A module may specify a target specific data layout string that specifies |
| 1354 | how data is to be laid out in memory. The syntax for the data layout is |
| 1355 | simply: |
| 1356 | |
| 1357 | .. code-block:: llvm |
| 1358 | |
| 1359 | target datalayout = "layout specification" |
| 1360 | |
| 1361 | The *layout specification* consists of a list of specifications |
| 1362 | separated by the minus sign character ('-'). Each specification starts |
| 1363 | with a letter and may include other information after the letter to |
| 1364 | define some aspect of the data layout. The specifications accepted are |
| 1365 | as follows: |
| 1366 | |
| 1367 | ``E`` |
| 1368 | Specifies that the target lays out data in big-endian form. That is, |
| 1369 | the bits with the most significance have the lowest address |
| 1370 | location. |
| 1371 | ``e`` |
| 1372 | Specifies that the target lays out data in little-endian form. That |
| 1373 | is, the bits with the least significance have the lowest address |
| 1374 | location. |
| 1375 | ``S<size>`` |
| 1376 | Specifies the natural alignment of the stack in bits. Alignment |
| 1377 | promotion of stack variables is limited to the natural stack |
| 1378 | alignment to avoid dynamic stack realignment. The stack alignment |
| 1379 | must be a multiple of 8 bits. If omitted, the natural stack
| 1380 | alignment defaults to "unspecified", which does not prevent any |
| 1381 | alignment promotions. |
| 1382 | ``p[n]:<size>:<abi>:<pref>`` |
| 1383 | This specifies the *size* of a pointer and its ``<abi>`` and |
| 1384 | ``<pref>``\erred alignments for address space ``n``. All sizes are in |
Rafael Espindola | abdd726 | 2014-01-06 21:40:24 +0000 | [diff] [blame] | 1385 | bits. The address space ``n`` is optional; if not specified, it
| 1386 | denotes the default address space 0. The value of ``n`` must be
| 1387 | in the range [1,2^23).
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 1388 | ``i<size>:<abi>:<pref>`` |
| 1389 | This specifies the alignment for an integer type of a given bit |
| 1390 | ``<size>``. The value of ``<size>`` must be in the range [1,2^23). |
| 1391 | ``v<size>:<abi>:<pref>`` |
| 1392 | This specifies the alignment for a vector type of a given bit |
| 1393 | ``<size>``. |
| 1394 | ``f<size>:<abi>:<pref>`` |
| 1395 | This specifies the alignment for a floating point type of a given bit |
| 1396 | ``<size>``. Only values of ``<size>`` that are supported by the target |
| 1397 | will work. 32 (float) and 64 (double) are supported on all targets; 80 |
| 1398 | or 128 (different flavors of long double) are also supported on some |
| 1399 | targets. |
Rafael Espindola | abdd726 | 2014-01-06 21:40:24 +0000 | [diff] [blame] | 1400 | ``a:<abi>:<pref>`` |
| 1401 | This specifies the alignment for an object of aggregate type. |
Rafael Espindola | 5887356 | 2014-01-03 19:21:54 +0000 | [diff] [blame] | 1402 | ``m:<mangling>`` |
Hans Wennborg | d4245ac | 2014-01-15 02:49:17 +0000 | [diff] [blame] | 1403 | If present, specifies that LLVM names are mangled in the output. The
| 1404 | options are:
| 1405 | |
| 1406 | * ``e``: ELF mangling: Private symbols get a ``.L`` prefix. |
| 1407 | * ``m``: Mips mangling: Private symbols get a ``$`` prefix. |
| 1408 | * ``o``: Mach-O mangling: Private symbols get an ``L`` prefix. Other
| 1409 | symbols get a ``_`` prefix.
| 1410 | * ``w``: Windows COFF mangling: Similar to Mach-O, but stdcall and fastcall
| 1411 | functions also get a suffix based on the frame size. |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 1412 | ``n<size1>:<size2>:<size3>...`` |
| 1413 | This specifies a set of native integer widths for the target CPU in |
| 1414 | bits. For example, it might contain ``n32`` for 32-bit PowerPC, |
| 1415 | ``n32:64`` for PowerPC 64, or ``n8:16:32:64`` for X86-64. Elements of |
| 1416 | this set are considered to support most general arithmetic operations |
| 1417 | efficiently. |
| 1418 | |
Rafael Espindola | abdd726 | 2014-01-06 21:40:24 +0000 | [diff] [blame] | 1419 | On every specification that takes an ``<abi>:<pref>``, specifying the
| 1420 | ``<pref>`` alignment is optional. If omitted, the preceding ``:`` |
| 1421 | should be omitted too and ``<pref>`` will be equal to ``<abi>``. |
| 1422 | |
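To show how the individual specifications combine, here is an illustrative
layout string of the kind an x86-64 ELF target might use. Treat it as a sketch
rather than the definitive string for any target; the value a given code
generator expects should be checked against that target.

.. code-block:: llvm

   ; e            little-endian
   ; m:e          ELF mangling
   ; i64:64       i64 has 64-bit ABI alignment; <pref> is omitted, so it equals <abi>
   ; f80:128      80-bit floating point values are 128-bit aligned
   ; n8:16:32:64  native integer widths
   ; S128         natural stack alignment is 128 bits
   target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
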
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 1423 | When constructing the data layout for a given target, LLVM starts with a |
| 1424 | default set of specifications which are then (possibly) overridden by |
| 1425 | the specifications in the ``datalayout`` keyword. The default |
| 1426 | specifications are given in this list: |
| 1427 | |
| 1428 | - ``E`` - big endian |
Matt Arsenault | 24b49c4 | 2013-07-31 17:49:08 +0000 | [diff] [blame] | 1429 | - ``p:64:64:64`` - 64-bit pointers with 64-bit alignment. |
| 1430 | - ``p[n]:64:64:64`` - Other address spaces are assumed to be the |
| 1431 | same as the default address space. |
Patrik Hagglund | a832ab1 | 2013-01-30 09:02:06 +0000 | [diff] [blame] | 1432 | - ``S0`` - natural stack alignment is unspecified |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 1433 | - ``i1:8:8`` - i1 is 8-bit (byte) aligned |
| 1434 | - ``i8:8:8`` - i8 is 8-bit (byte) aligned |
| 1435 | - ``i16:16:16`` - i16 is 16-bit aligned |
| 1436 | - ``i32:32:32`` - i32 is 32-bit aligned |
| 1437 | - ``i64:32:64`` - i64 has ABI alignment of 32-bits but preferred |
| 1438 | alignment of 64-bits |
Patrik Hagglund | a832ab1 | 2013-01-30 09:02:06 +0000 | [diff] [blame] | 1439 | - ``f16:16:16`` - half is 16-bit aligned |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 1440 | - ``f32:32:32`` - float is 32-bit aligned |
| 1441 | - ``f64:64:64`` - double is 64-bit aligned |
Patrik Hagglund | a832ab1 | 2013-01-30 09:02:06 +0000 | [diff] [blame] | 1442 | - ``f128:128:128`` - quad is 128-bit aligned |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 1443 | - ``v64:64:64`` - 64-bit vector is 64-bit aligned |
| 1444 | - ``v128:128:128`` - 128-bit vector is 128-bit aligned |
Rafael Espindola | e8f4d58 | 2013-12-12 17:21:51 +0000 | [diff] [blame] | 1445 | - ``a:0:64`` - aggregates are 64-bit aligned |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 1446 | |
| 1447 | When LLVM is determining the alignment for a given type, it uses the |
| 1448 | following rules: |
| 1449 | |
| 1450 | #. If the type sought is an exact match for one of the specifications, |
| 1451 | that specification is used. |
| 1452 | #. If no match is found, and the type sought is an integer type, then |
| 1453 | the smallest integer type that is larger than the bitwidth of the |
| 1454 | sought type is used. If none of the specifications are larger than |
| 1455 | the bitwidth then the largest integer type is used. For example, |
| 1456 | given the default specifications above, the i7 type will use the |
| 1457 | alignment of i8 (next largest) while both i65 and i256 will use the |
| 1458 | alignment of i64 (largest specified). |
| 1459 | #. If no match is found, and the type sought is a vector type, then the |
| 1460 | largest vector type that is smaller than the sought vector type will |
| 1461 | be used as a fallback. This happens because <128 x double> can be
| 1462 | implemented in terms of 64 <2 x double>, for example. |

The function of the data layout string may not be what you expect.
Notably, this is not a specification from the frontend of what alignment
the code generator should use.

Instead, if specified, the target data layout is required to match what
the ultimate *code generator* expects. This string is used by the
mid-level optimizers to improve code, and this only works if it matches
what the ultimate code generator uses. If you would like to generate IR
that does not embed this target-specific detail into the IR, then you
don't have to specify the string. This will disable some optimizations
that require precise layout information, but this also prevents those
optimizations from introducing target specificity into the IR.

.. _langref_triple:

Target Triple
-------------

A module may specify a target triple string that describes the target
host. The syntax for the target triple is simply:

.. code-block:: llvm

    target triple = "x86_64-apple-macosx10.7.0"

The *target triple* string consists of a series of identifiers delimited
by the minus sign character ('-'). The canonical forms are:

::

    ARCHITECTURE-VENDOR-OPERATING_SYSTEM
    ARCHITECTURE-VENDOR-OPERATING_SYSTEM-ENVIRONMENT

This information is passed along to the backend so that it generates
code for the proper architecture. It's possible to override this on the
command line with the ``-mtriple`` command line option.
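
As an illustration of the second canonical form, a four-component triple
might be written as follows. The particular triple is only an example chosen
for this sketch; the comment maps each component onto the fields named above.

.. code-block:: llvm

    ; ARCHITECTURE     = x86_64
    ; VENDOR           = unknown
    ; OPERATING_SYSTEM = linux
    ; ENVIRONMENT      = gnu
    target triple = "x86_64-unknown-linux-gnu"

Assuming the module lives in a hypothetical file ``example.ll``, running
``llc -mtriple=x86_64-apple-macosx10.7.0 example.ll`` would be expected to
take precedence over the triple embedded in the module.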

.. _pointeraliasing:

Pointer Aliasing Rules
----------------------

Any memory access must be done through a pointer value associated with
an address range of the memory access, otherwise the behavior is
undefined. Pointer values are associated with address ranges according
to the following rules:

-  A pointer value is associated with the addresses associated with any
   value it is *based* on.
-  An address of a global variable is associated with the address range
   of the variable's storage.
-  The result value of an allocation instruction is associated with the
   address range of the allocated storage.
-  A null pointer in the default address-space is associated with no
   address.
-  An integer constant other than zero or a pointer value returned from
   a function not defined within LLVM may be associated with address
   ranges allocated through mechanisms other than those provided by
   LLVM. Such ranges shall not overlap with any ranges of addresses
   allocated by mechanisms provided by LLVM.

A pointer value is *based* on another pointer value according to the
following rules:

-  A pointer value formed from a ``getelementptr`` operation is *based*
   on the first operand of the ``getelementptr``.
-  The result value of a ``bitcast`` is *based* on the operand of the
   ``bitcast``.
-  A pointer value formed by an ``inttoptr`` is *based* on all pointer
   values that contribute (directly or indirectly) to the computation of
   the pointer's value.
-  The "*based* on" relationship is transitive.

Note that this definition of *"based"* is intentionally similar to the
definition of *"based"* in C99, though it is slightly weaker.
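
The *based* on rules can be followed through a short chain of instructions.
This is a sketch rather than normative text: the names ``@g`` and
``@based_example`` are hypothetical, and the instructions are spelled in the
typed-pointer syntax used by this revision of the document (other LLVM
releases may spell ``load`` and ``getelementptr`` differently).

.. code-block:: llvm

    @g = global [4 x i32] zeroinitializer

    define i32 @based_example() {
    entry:
      ; %p is based on @g: a getelementptr result is based on its first
      ; operand, and @g is associated with the storage of the global.
      %p = getelementptr [4 x i32]* @g, i64 0, i64 1
      ; %q is based on %p and, by transitivity, on @g.
      %q = bitcast i32* %p to i8*
      ; %r is based on %q because %q contributes to the integer that the
      ; inttoptr converts back into a pointer.
      %i = ptrtoint i8* %q to i64
      %r = inttoptr i64 %i to i32*
      ; The access goes through a pointer associated with @g's storage.
      %v = load i32* %r
      ret i32 %v
    }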

LLVM IR does not associate types with memory. The result type of a
``load`` merely indicates the size and alignment of the memory from
which to load, as well as the interpretation of the value. The first
operand type of a ``store`` similarly only indicates the size and
alignment of the store.

Consequently, type-based alias analysis, aka TBAA, aka
``-fstrict-aliasing``, is not applicable to general unadorned LLVM IR.
:ref:`Metadata <metadata>` may be used to encode additional information
which specialized optimization passes may use to implement type-based
alias analysis.
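
Because memory itself is untyped, storing through one pointer type and
loading through another is an ordinary reinterpretation of the stored bytes.
The sketch below assumes a hypothetical function ``@float_bits`` and uses
the typed-pointer syntax of this revision.

.. code-block:: llvm

    define i32 @float_bits(float %f) {
    entry:
      %slot = alloca float
      ; The store only fixes the size and alignment of the write.
      store float %f, float* %slot
      ; Reading the same bytes as an i32 is permitted in unadorned IR;
      ; the load type only selects how the bytes are interpreted.
      %as_int = bitcast float* %slot to i32*
      %bits = load i32* %as_int
      ret i32 %bits
    }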

.. _volatile:

Volatile Memory Accesses
------------------------

Certain memory accesses, such as :ref:`loads <i_load>`,
:ref:`stores <i_store>`, and :ref:`llvm.memcpy <int_memcpy>` calls, may
be marked ``volatile``. The optimizers must not change the number of
volatile operations or change their order of execution relative to other
volatile operations. The optimizers *may* change the order of volatile
operations relative to non-volatile operations. This is not Java's
"volatile" and has no cross-thread synchronization behavior.

IR-level volatile loads and stores cannot safely be optimized into
``llvm.memcpy`` or ``llvm.memmove`` intrinsics even when those intrinsics
are flagged volatile. Likewise, the backend should never split or merge
target-legal volatile load/store instructions.

.. admonition:: Rationale

   Platforms may rely on volatile loads and stores of natively supported
   data width to be executed as a single instruction. For example, in C
   this holds for an l-value of volatile primitive type with native
   hardware support, but not necessarily for aggregate types. The
   frontend upholds these expectations, which are intentionally
   unspecified in the IR. The rules above ensure that IR transformations
   do not violate the frontend's contract with the language.
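
A small sketch of what these rules imply in practice follows. The function
``@poll`` and its ``%reg`` parameter are hypothetical, standing in for
something like a memory-mapped device register, and the accesses use the
typed-pointer syntax of this revision.

.. code-block:: llvm

    define void @poll(i32* %reg) {
    entry:
      ; Two volatile loads: the optimizers must keep both, in this order,
      ; even though the first result is unused.
      %first = load volatile i32* %reg
      %second = load volatile i32* %reg
      ; The volatile store must stay after both volatile loads.
      store volatile i32 %second, i32* %reg
      ret void
    }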

.. _memmodel:

Memory Model for Concurrent Operations
--------------------------------------

The LLVM IR does not define any way to start parallel threads of
execution or to register signal handlers. Nonetheless, there are
platform-specific ways to create them, and we define LLVM IR's behavior
in their presence. This model is inspired by the C++0x memory model.

For a more informal introduction to this model, see the :doc:`Atomics`.

We define a *happens-before* partial order as the least partial order
that

-  Is a superset of single-thread program order, and
-  When a *synchronizes-with* ``b``, includes an edge from ``a`` to
   ``b``. *Synchronizes-with* pairs are introduced by platform-specific
   techniques, like pthread locks, thread creation, thread joining,
   etc., and by atomic instructions. (See also :ref:`Atomic Memory Ordering
   Constraints <ordering>`).

Note that program order does not introduce *happens-before* edges
between a thread and signals executing inside that thread.

Every (defined) read operation (load instructions, memcpy, atomic
loads/read-modify-writes, etc.) R reads a series of bytes written by
(defined) write operations (store instructions, atomic
stores/read-modify-writes, memcpy, etc.). For the purposes of this
section, initialized globals are considered to have a write of the
initializer which is atomic and happens before any other read or write
of the memory in question. For each byte of a read R, R\ :sub:`byte`
may see any write to the same byte, except:

-  If write\ :sub:`1` happens before write\ :sub:`2`, and
   write\ :sub:`2` happens before R\ :sub:`byte`, then
   R\ :sub:`byte` does not see write\ :sub:`1`.
-  If R\ :sub:`byte` happens before write\ :sub:`3`, then
   R\ :sub:`byte` does not see write\ :sub:`3`.

Given that definition, R\ :sub:`byte` is defined as follows:

-  If R is volatile, the result is target-dependent. (Volatile is
   supposed to give guarantees which can support ``sig_atomic_t`` in
   C/C++, and may be used for accesses to addresses that do not behave
   like normal memory. It does not generally provide cross-thread
   synchronization.)
-  Otherwise, if there is no write to the same byte that happens before
   R\ :sub:`byte`, R\ :sub:`byte` returns ``undef`` for that byte.
-  Otherwise, if R\ :sub:`byte` may see exactly one write,
   R\ :sub:`byte` returns the value written by that write.
-  Otherwise, if R is atomic, and all the writes R\ :sub:`byte` may
   see are atomic, it chooses one of the values written. See the :ref:`Atomic
   Memory Ordering Constraints <ordering>` section for additional
   constraints on how the choice is made.
-  Otherwise R\ :sub:`byte` returns ``undef``.

R returns the value composed of the series of bytes it read. This
implies that some bytes within the value may be ``undef`` **without**
the entire value being ``undef``. Note that this only defines the
semantics of the operation; it doesn't mean that targets will emit more
than one instruction to read the series of bytes.

Note that in cases where none of the atomic intrinsics are used, this
model places only one restriction on IR transformations on top of what
is required for single-threaded execution: introducing a store to a byte
which might not otherwise be stored is not allowed in general.
(Specifically, in the case where another thread might write to and read
from an address, introducing a store can change a load that may see
exactly one write into a load that may see multiple writes.)
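
The restriction on introduced stores can be made concrete with a pair of
functions. This is a sketch only: both function names are hypothetical, and
the second function shows the kind of rewrite that the model rules out in
general, not something any particular pass is claimed to produce.

.. code-block:: llvm

    ; Original: the byte range at %p is written only when %c is true.
    define void @cond_store(i1 %c, i32* %p) {
    entry:
      br i1 %c, label %do, label %done
    do:
      store i32 1, i32* %p
      br label %done
    done:
      ret void
    }

    ; NOT a valid replacement for @cond_store in general: the store is now
    ; unconditional. When %c is false it writes back %old, which can
    ; overwrite (or race with) a write made by another thread between the
    ; load and the store.
    define void @cond_store_speculated(i1 %c, i32* %p) {
    entry:
      %old = load i32* %p
      %new = select i1 %c, i32 1, i32 %old
      store i32 %new, i32* %p
      ret void
    }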

.. _ordering:

Atomic Memory Ordering Constraints
----------------------------------

Atomic instructions (:ref:`cmpxchg <i_cmpxchg>`,
:ref:`atomicrmw <i_atomicrmw>`, :ref:`fence <i_fence>`,
:ref:`atomic load <i_load>`, and :ref:`atomic store <i_store>`) take
ordering parameters that determine which other atomic instructions on
the same address they *synchronize with*. These semantics are borrowed
from Java and C++0x, but are somewhat more colloquial. If these
descriptions aren't precise enough, check those specs (see spec
references in the :doc:`atomics guide <Atomics>`).
:ref:`fence <i_fence>` instructions treat these orderings somewhat
differently since they don't take an address. See that instruction's
documentation for details.

For a simpler introduction to the ordering constraints, see the
:doc:`Atomics`.

``unordered``
    The set of values that can be read is governed by the happens-before
    partial order. A value cannot be read unless some operation wrote
    it. This is intended to provide a guarantee strong enough to model
    Java's non-volatile shared variables. This ordering cannot be
    specified for read-modify-write operations; it is not strong enough
    to make them atomic in any interesting way.
``monotonic``
    In addition to the guarantees of ``unordered``, there is a single
    total order for modifications by ``monotonic`` operations on each
    address. All modification orders must be compatible with the
    happens-before order. There is no guarantee that the modification
    orders can be combined to a global total order for the whole program
    (and this often will not be possible). The read in an atomic
    read-modify-write operation (:ref:`cmpxchg <i_cmpxchg>` and
    :ref:`atomicrmw <i_atomicrmw>`) reads the value in the modification
    order immediately before the value it writes. If one atomic read
    happens before another atomic read of the same address, the later
    read must see the same value or a later value in the address's
    modification order. This disallows reordering of ``monotonic`` (or
    stronger) operations on the same address. If an address is written
    ``monotonic``-ally by one thread, and other threads ``monotonic``-ally
    read that address repeatedly, the other threads must eventually see
    the write. This corresponds to the C++0x/C1x
    ``memory_order_relaxed``.
``acquire``
    In addition to the guarantees of ``monotonic``, a
    *synchronizes-with* edge may be formed with a ``release`` operation.
    This is intended to model C++'s ``memory_order_acquire``.
``release``
    In addition to the guarantees of ``monotonic``, if this operation
    writes a value which is subsequently read by an ``acquire``
    operation, it *synchronizes-with* that operation. (This isn't a
    complete description; see the C++0x definition of a release
    sequence.) This corresponds to the C++0x/C1x
    ``memory_order_release``.
``acq_rel`` (acquire+release)
    Acts as both an ``acquire`` and ``release`` operation on its
    address. This corresponds to the C++0x/C1x ``memory_order_acq_rel``.
``seq_cst`` (sequentially consistent)
    In addition to the guarantees of ``acq_rel`` (``acquire`` for an
    operation that only reads, ``release`` for an operation that only
    writes), there is a global total order on all
    sequentially-consistent operations on all addresses, which is
    consistent with the *happens-before* partial order and with the
    modification orders of all the affected addresses. Each
    sequentially-consistent read sees the last preceding write to the
    same address in this global order. This corresponds to the C++0x/C1x
    ``memory_order_seq_cst`` and Java volatile.
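
A common use of the ``release`` and ``acquire`` orderings is to publish
ordinary data through an atomic flag. The sketch below assumes two
hypothetical globals, ``@data`` and ``@flag``, and two functions that would
run on different threads; thread creation itself is outside the IR. It uses
the typed-pointer syntax of this revision.

.. code-block:: llvm

    @data = global i32 0
    @flag = global i32 0

    ; Writer thread: fill in the data, then publish it.
    define void @producer() {
    entry:
      store i32 42, i32* @data
      ; The release store orders the plain store above before it.
      store atomic i32 1, i32* @flag release, align 4
      ret void
    }

    ; Reader thread: wait for the flag, then read the data.
    define i32 @consumer() {
    entry:
      br label %spin
    spin:
      ; Once this acquire load reads the value written by the release
      ; store, that store synchronizes-with this load.
      %f = load atomic i32* @flag acquire, align 4
      %ready = icmp eq i32 %f, 1
      br i1 %ready, label %read, label %spin
    read:
      ; The store of 42 happens before this load, so it is the only write
      ; this load may see besides none at all being excluded.
      %v = load i32* @data
      ret i32 %v
    }

With ``monotonic`` in place of ``release`` and ``acquire``, no
*synchronizes-with* edge would be formed, and the final load of ``@data``
would no longer be ordered after the store of 42.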

.. _singlethread:

If an atomic operation is marked ``singlethread``, it only *synchronizes
with* or participates in modification and seq\_cst total orderings with
other operations running in the same thread (for example, in signal
handlers).
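
A minimal sketch of the ``singlethread`` marker follows, assuming a
hypothetical global ``@handler_flag`` that is shared only between a thread
and a signal handler running in that same thread. The keyword is spelled as
in this revision of the document; other LLVM releases express the same idea
with a different syntax.

.. code-block:: llvm

    @handler_flag = global i32 0

    define void @notify_own_handler() {
    entry:
      ; Ordered only with respect to other operations in this thread,
      ; such as a signal handler that reads @handler_flag.
      store atomic i32 1, i32* @handler_flag singlethread release, align 4
      ; A fence may carry the same marker.
      fence singlethread seq_cst
      ret void
    }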

.. _fastmath:

Fast-Math Flags
---------------

LLVM IR floating-point binary ops (:ref:`fadd <i_fadd>`,
:ref:`fsub <i_fsub>`, :ref:`fmul <i_fmul>`, :ref:`fdiv <i_fdiv>`,
:ref:`frem <i_frem>`) have the following flags that can be set to enable
otherwise unsafe floating point operations:

``nnan``
   No NaNs - Allow optimizations to assume the arguments and result are not
   NaN. Such optimizations are required to retain defined behavior over
   NaNs, but the value of the result is undefined.

``ninf``
   No Infs - Allow optimizations to assume the arguments and result are not
   +/-Inf. Such optimizations are required to retain defined behavior over
   +/-Inf, but the value of the result is undefined.

``nsz``
   No Signed Zeros - Allow optimizations to treat the sign of a zero
   argument or result as insignificant.

``arcp``
   Allow Reciprocal - Allow optimizations to use the reciprocal of an
   argument rather than perform division.

``fast``
   Fast - Allow algebraically equivalent transformations that may
   dramatically change results in floating point (e.g. reassociate). This
   flag implies all the others.
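
The flags are written between the opcode and the type. The following sketch
uses a hypothetical function ``@fmf_example`` to show a few combinations;
the comments restate what each flag permits and do not promise that any
particular optimization will fire.

.. code-block:: llvm

    define float @fmf_example(float %a, float %b, float %c) {
    entry:
      ; nnan + ninf: optimizations may assume no NaN or infinity here.
      %x = fadd nnan ninf float %a, %b
      ; arcp: the division may become a multiply by the reciprocal of %c.
      %y = fdiv arcp float %x, %c
      ; fast: implies all the other flags and allows reassociation.
      %z = fmul fast float %y, %a
      ret float %z
    }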

.. _uselistorder:

Use-list Order Directives
-------------------------

Use-list directives encode the in-memory order of each use-list, allowing the
order to be recreated. ``<order-indexes>`` is a comma-separated list of
indexes that are assigned to the referenced value's uses. The referenced
value's use-list is immediately sorted by these indexes.

Use-list directives may appear at function scope or global scope. They are not
instructions, and have no effect on the semantics of the IR. When they're at
function scope, they must appear after the terminator of the final basic block.

If basic blocks have their address taken via ``blockaddress()`` expressions,
``uselistorder_bb`` can be used to reorder their use-lists from outside their
function's scope.

:Syntax:

::

    uselistorder <ty> <value>, { <order-indexes> }
    uselistorder_bb @function, %block, { <order-indexes> }

:Examples:

::

    define void @foo(i32 %arg1, i32 %arg2) {
    entry:
      ; ... instructions ...
    bb:
      ; ... instructions ...

      ; At function scope.
      uselistorder i32 %arg1, { 1, 0, 2 }
      uselistorder label %bb, { 1, 0 }
    }

    ; At global scope.
    uselistorder i32* @global, { 1, 2, 0 }
    uselistorder i32 7, { 1, 0 }
    uselistorder i32 (i32) @bar, { 1, 0 }
    uselistorder_bb @foo, %bb, { 5, 1, 3, 2, 0, 4 }

.. _typesystem:

Type System
===========

The LLVM type system is one of the most important features of the
intermediate representation. Being typed enables a number of
optimizations to be performed on the intermediate representation
directly, without having to do extra analyses on the side before the
transformation. A strong type system makes it easier to read the
generated code and enables novel analyses and transformations that are
not feasible to perform on normal three address code representations.

.. _t_void:

Void Type
---------

:Overview:

The void type does not represent any value and has no size.

:Syntax:

::

    void

.. _t_function:

Function Type
-------------

:Overview:

The function type can be thought of as a function signature. It consists of a
return type and a list of formal parameter types. The return type of a function
type is a void type or first class type, except for :ref:`label <t_label>`
and :ref:`metadata <t_metadata>` types.

:Syntax:

::

    <returntype> (<parameter list>)

...where '``<parameter list>``' is a comma-separated list of type
specifiers. Optionally, the parameter list may include a type ``...``, which
indicates that the function takes a variable number of arguments. Variable
argument functions can access their arguments with the :ref:`variable argument
handling intrinsic <int_varargs>` functions. '``<returntype>``' is any type
except :ref:`label <t_label>` and :ref:`metadata <t_metadata>`.

:Examples:

+--------------------------+----------------------------------------------------------------------+
| ``i32 (i32)``            | function taking an ``i32``, returning an ``i32``                     |
+--------------------------+----------------------------------------------------------------------+
| ``float (i16, i32 *) *`` | :ref:`Pointer <t_pointer>` to a function that takes an ``i16`` and   |
|                          | a :ref:`pointer <t_pointer>` to ``i32``, returning ``float``.        |
+--------------------------+----------------------------------------------------------------------+
| ``i32 (i8*, ...)``       | A vararg function that takes at least one                            |
|                          | :ref:`pointer <t_pointer>` to ``i8`` (char in C), which returns an   |
|                          | integer. This is the signature for ``printf`` in LLVM.               |
+--------------------------+----------------------------------------------------------------------+
| ``{i32, i32} (i32)``     | A function taking an ``i32``, returning a                            |
|                          | :ref:`structure <t_struct>` containing two ``i32`` values            |
+--------------------------+----------------------------------------------------------------------+
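
To show a function type in use, the sketch below declares ``printf`` with
the vararg signature from the table and calls it. The wrapper
``@call_printf`` is hypothetical, and the call is written in the
typed-pointer syntax of this revision, in which a call to a vararg function
spells out the pointer-to-function type; other LLVM releases spell the call
differently.

.. code-block:: llvm

    ; Function type: i32 (i8*, ...)
    declare i32 @printf(i8*, ...)

    ; Function type: i32 (i8*, i32)
    define i32 @call_printf(i8* %fmt, i32 %x) {
    entry:
      ; The callee is used through a value of type i32 (i8*, ...)*.
      %r = call i32 (i8*, ...)* @printf(i8* %fmt, i32 %x)
      ret i32 %r
    }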

.. _t_firstclass:

First Class Types
-----------------

The :ref:`first class <t_firstclass>` types are perhaps the most important.
Values of these types are the only ones which can be produced by
instructions.

.. _t_single_value:

Single Value Types
^^^^^^^^^^^^^^^^^^

These are the types that are valid in registers from CodeGen's perspective.

.. _t_integer:

Integer Type
""""""""""""

:Overview:

The integer type is a simple type that specifies an arbitrary bit width
for the integer type desired. Any bit width from 1 bit to
2\ :sup:`23`\ -1 (about 8 million) can be specified.

:Syntax:

::

    iN

The number of bits the integer will occupy is specified by the ``N``
value.

:Examples:

+----------------+------------------------------------------------+
| ``i1``         | a single-bit integer.                          |
+----------------+------------------------------------------------+
| ``i32``        | a 32-bit integer.                              |
+----------------+------------------------------------------------+
| ``i1942652``   | a really big integer of over 1 million bits.   |
+----------------+------------------------------------------------+
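
Bit widths that do not correspond to a machine register size can be used
like any other integer type. As a sketch, the hypothetical function below
round-trips a value through ``i7`` and reports with an ``i1`` whether
anything was lost.

.. code-block:: llvm

    define i1 @fits_in_seven_bits(i32 %x) {
    entry:
      ; Keep only the low 7 bits of %x.
      %low = trunc i32 %x to i7
      ; Widen back to 32 bits, filling the upper bits with zeros.
      %wide = zext i7 %low to i32
      ; True exactly when the upper 25 bits of %x were already zero.
      %same = icmp eq i32 %wide, %x
      ret i1 %same
    }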

.. _t_floating:

Floating Point Types
""""""""""""""""""""

.. list-table::
   :header-rows: 1

   * - Type
     - Description

   * - ``half``
     - 16-bit floating point value

   * - ``float``
     - 32-bit floating point value

   * - ``double``
     - 64-bit floating point value

   * - ``fp128``
     - 128-bit floating point value (112-bit mantissa)

   * - ``x86_fp80``
     - 80-bit floating point value (X87)

   * - ``ppc_fp128``
     - 128-bit floating point value (two 64-bits)

X86_mmx Type
""""""""""""

:Overview:

The x86_mmx type represents a value held in an MMX register on an x86
machine. The operations allowed on it are quite limited: parameters and
return values, load and store, and bitcast. User-specified MMX
instructions are represented as intrinsic or asm calls with arguments
and/or results of this type. There are no arrays, vectors or constants
of this type.

:Syntax:

::

    x86_mmx

.. _t_pointer:

Pointer Type
""""""""""""

:Overview:

The pointer type is used to specify memory locations. Pointers are
commonly used to reference objects in memory.

Pointer types may have an optional address space attribute defining the
numbered address space where the pointed-to object resides. The default
address space is number zero. The semantics of non-zero address spaces
are target-specific.

Note that LLVM does not permit pointers to void (``void*``) nor does it
permit pointers to labels (``label*``). Use ``i8*`` instead.

:Syntax:

::

    <type> *

:Examples:

+-------------------------+----------------------------------------------------------------------+
| ``[4 x i32]*``          | A :ref:`pointer <t_pointer>` to :ref:`array <t_array>` of four       |
|                         | ``i32`` values.                                                      |
+-------------------------+----------------------------------------------------------------------+
| ``i32 (i32*) *``        | A :ref:`pointer <t_pointer>` to a :ref:`function <t_function>` that  |
|                         | takes an ``i32*``, returning an ``i32``.                             |
+-------------------------+----------------------------------------------------------------------+
| ``i32 addrspace(5)*``   | A :ref:`pointer <t_pointer>` to an ``i32`` value that resides in     |
|                         | address space #5.                                                    |
+-------------------------+----------------------------------------------------------------------+
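
The sketch below combines two points from this section: an ``i8*`` standing
in where C would use ``void*``, and a pointer into a non-default address
space. The function and parameter names are hypothetical, the meaning of
address space 5 depends on the target, and the loads use the typed-pointer
syntax of this revision.

.. code-block:: llvm

    define i32 @read_both(i8* %opaque, i32 addrspace(5)* %remote) {
    entry:
      ; There is no void*; an i8* is passed around and cast before use.
      %typed = bitcast i8* %opaque to i32*
      %a = load i32* %typed
      ; The address space is part of the pointer type used by the load.
      %b = load i32 addrspace(5)* %remote
      %sum = add i32 %a, %b
      ret i32 %sum
    }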
| 2004 | |
| 2005 | .. _t_vector: |
| 2006 | |
| 2007 | Vector Type |
| 2008 | """"""""""" |
| 2009 | |
:Overview:

A vector type is a simple derived type that represents a vector of
elements. Vector types are used when multiple primitive data are
operated on in parallel using a single instruction (SIMD). A vector type
requires a size (number of elements) and an underlying primitive data
type. Vector types are considered :ref:`first class <t_firstclass>`.
| 2017 | |
Rafael Espindola | 2f6d7b9 | 2013-12-10 14:53:22 +0000 | [diff] [blame] | 2018 | :Syntax: |
Rafael Espindola | 0801334 | 2013-12-07 19:34:20 +0000 | [diff] [blame] | 2019 | |
| 2020 | :: |
| 2021 | |
| 2022 | < <# elements> x <elementtype> > |
| 2023 | |
The number of elements is a constant integer value larger than 0;
``elementtype`` may be any integer, floating point or pointer type.
Vectors of size zero are not allowed.
Rafael Espindola | 0801334 | 2013-12-07 19:34:20 +0000 | [diff] [blame] | 2027 | |
Rafael Espindola | 2f6d7b9 | 2013-12-10 14:53:22 +0000 | [diff] [blame] | 2028 | :Examples: |
Rafael Espindola | 0801334 | 2013-12-07 19:34:20 +0000 | [diff] [blame] | 2029 | |
| 2030 | +-------------------+--------------------------------------------------+ |
| 2031 | | ``<4 x i32>`` | Vector of 4 32-bit integer values. | |
| 2032 | +-------------------+--------------------------------------------------+ |
| 2033 | | ``<8 x float>`` | Vector of 8 32-bit floating-point values. | |
| 2034 | +-------------------+--------------------------------------------------+ |
| 2035 | | ``<2 x i64>`` | Vector of 2 64-bit integer values. | |
| 2036 | +-------------------+--------------------------------------------------+ |
| 2037 | | ``<4 x i64*>`` | Vector of 4 pointers to 64-bit integer values. | |
| 2038 | +-------------------+--------------------------------------------------+ |
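
The following sketch (with the hypothetical function name
``@add_lane0``) illustrates the SIMD intent: one ``add`` operates on all
four lanes, and a single lane is then read back with
``extractelement``:

.. code-block:: llvm

    define i32 @add_lane0(<4 x i32> %a, <4 x i32> %b) {
    entry:
      %sum = add <4 x i32> %a, %b                       ; element-wise add of all 4 lanes
      %lane0 = extractelement <4 x i32> %sum, i32 0     ; read lane 0 of the result
      ret i32 %lane0
    }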
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 2039 | |
| 2040 | .. _t_label: |
| 2041 | |
| 2042 | Label Type |
| 2043 | ^^^^^^^^^^ |
| 2044 | |
Rafael Espindola | 2f6d7b9 | 2013-12-10 14:53:22 +0000 | [diff] [blame] | 2045 | :Overview: |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 2046 | |
| 2047 | The label type represents code labels. |
| 2048 | |
Rafael Espindola | 2f6d7b9 | 2013-12-10 14:53:22 +0000 | [diff] [blame] | 2049 | :Syntax: |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 2050 | |
| 2051 | :: |
| 2052 | |
| 2053 | label |
| 2054 | |
| 2055 | .. _t_metadata: |
| 2056 | |
| 2057 | Metadata Type |
| 2058 | ^^^^^^^^^^^^^ |
| 2059 | |
Rafael Espindola | 2f6d7b9 | 2013-12-10 14:53:22 +0000 | [diff] [blame] | 2060 | :Overview: |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 2061 | |
| 2062 | The metadata type represents embedded metadata. No derived types may be |
| 2063 | created from metadata except for :ref:`function <t_function>` arguments. |
| 2064 | |
Rafael Espindola | 2f6d7b9 | 2013-12-10 14:53:22 +0000 | [diff] [blame] | 2065 | :Syntax: |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 2066 | |
| 2067 | :: |
| 2068 | |
| 2069 | metadata |
| 2070 | |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 2071 | .. _t_aggregate: |
| 2072 | |
| 2073 | Aggregate Types |
| 2074 | ^^^^^^^^^^^^^^^ |
| 2075 | |
| 2076 | Aggregate Types are a subset of derived types that can contain multiple |
| 2077 | member types. :ref:`Arrays <t_array>` and :ref:`structs <t_struct>` are |
| 2078 | aggregate types. :ref:`Vectors <t_vector>` are not considered to be |
| 2079 | aggregate types. |
| 2080 | |
| 2081 | .. _t_array: |
| 2082 | |
| 2083 | Array Type |
Rafael Espindola | 0801334 | 2013-12-07 19:34:20 +0000 | [diff] [blame] | 2084 | """""""""" |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 2085 | |
Rafael Espindola | 2f6d7b9 | 2013-12-10 14:53:22 +0000 | [diff] [blame] | 2086 | :Overview: |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 2087 | |
| 2088 | The array type is a very simple derived type that arranges elements |
| 2089 | sequentially in memory. The array type requires a size (number of |
| 2090 | elements) and an underlying data type. |
| 2091 | |
Rafael Espindola | 2f6d7b9 | 2013-12-10 14:53:22 +0000 | [diff] [blame] | 2092 | :Syntax: |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 2093 | |
| 2094 | :: |
| 2095 | |
| 2096 | [<# elements> x <elementtype>] |
| 2097 | |
| 2098 | The number of elements is a constant integer value; ``elementtype`` may |
| 2099 | be any type with a size. |
| 2100 | |
Rafael Espindola | 2f6d7b9 | 2013-12-10 14:53:22 +0000 | [diff] [blame] | 2101 | :Examples: |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 2102 | |
| 2103 | +------------------+--------------------------------------+ |
| 2104 | | ``[40 x i32]`` | Array of 40 32-bit integer values. | |
| 2105 | +------------------+--------------------------------------+ |
| 2106 | | ``[41 x i32]`` | Array of 41 32-bit integer values. | |
| 2107 | +------------------+--------------------------------------+ |
| 2108 | | ``[4 x i8]`` | Array of 4 8-bit integer values. | |
| 2109 | +------------------+--------------------------------------+ |
| 2110 | |
| 2111 | Here are some examples of multidimensional arrays: |
| 2112 | |
| 2113 | +-----------------------------+----------------------------------------------------------+ |
| 2114 | | ``[3 x [4 x i32]]`` | 3x4 array of 32-bit integer values. | |
| 2115 | +-----------------------------+----------------------------------------------------------+ |
| 2116 | | ``[12 x [10 x float]]`` | 12x10 array of single precision floating point values. | |
| 2117 | +-----------------------------+----------------------------------------------------------+ |
| 2118 | | ``[2 x [3 x [4 x i16]]]`` | 2x3x4 array of 16-bit integer values. | |
| 2119 | +-----------------------------+----------------------------------------------------------+ |
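
As an illustrative sketch only (``@grid`` and ``@grid_at`` are
hypothetical names), an element of a multidimensional array is usually
addressed with one ``getelementptr`` index per dimension, after the
leading index that steps through the pointer itself:

.. code-block:: llvm

    @grid = global [3 x [4 x i32]] zeroinitializer

    define i32 @grid_at(i32 %row, i32 %col) {
    entry:
      ; First index (0) selects the array object; the next two select row and column.
      %p = getelementptr inbounds [3 x [4 x i32]]* @grid, i32 0, i32 %row, i32 %col
      %v = load i32* %p
      ret i32 %v
    }

Because ``inbounds`` is used here, the caller is assumed to pass
``%row`` below 3 and ``%col`` below 4.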
| 2120 | |
There is no restriction on indexing beyond the end of the array implied
by a static type (though there are restrictions on indexing beyond the
bounds of an allocated object in some cases). This means that
single-dimension 'variable sized array' addressing can be implemented in
LLVM with a zero length array type. An implementation of 'Pascal-style
arrays' in LLVM could use the type "``{ i32, [0 x float]}``", for
example.
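
A minimal sketch of that idea follows. The names ``%pstring`` and
``@get_elem`` are hypothetical, and the caller is assumed to have
allocated enough storage after the length field for the elements it
indexes:

.. code-block:: llvm

    ; Length-prefixed array: an i32 count followed by a zero length array.
    %pstring = type { i32, [0 x float] }

    define float @get_elem(%pstring* %s, i32 %i) {
    entry:
      ; Index past the static bound of [0 x float]; the type does not forbid this.
      %eltptr = getelementptr %pstring* %s, i32 0, i32 1, i32 %i
      %elt = load float* %eltptr
      ret float %elt
    }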
| 2128 | |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 2129 | .. _t_struct: |
| 2130 | |
| 2131 | Structure Type |
Rafael Espindola | 0801334 | 2013-12-07 19:34:20 +0000 | [diff] [blame] | 2132 | """""""""""""" |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 2133 | |
Rafael Espindola | 2f6d7b9 | 2013-12-10 14:53:22 +0000 | [diff] [blame] | 2134 | :Overview: |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 2135 | |
| 2136 | The structure type is used to represent a collection of data members |
| 2137 | together in memory. The elements of a structure may be any type that has |
| 2138 | a size. |
| 2139 | |
| 2140 | Structures in memory are accessed using '``load``' and '``store``' by |
| 2141 | getting a pointer to a field with the '``getelementptr``' instruction. |
| 2142 | Structures in registers are accessed using the '``extractvalue``' and |
| 2143 | '``insertvalue``' instructions. |
| 2144 | |
Structures may optionally be "packed" structures, which indicates that
the alignment of the struct is one byte, and that there is no padding
between the elements. In non-packed structs, padding between field types
is inserted as defined by the DataLayout string in the module, which is
required to match what the underlying code generator expects.

Structures can either be "literal" or "identified". A literal structure
is defined inline with other types (e.g. ``{i32, i32}*``) whereas
identified types are always defined at the top level with a name.
Literal types are uniqued by their contents and can never be recursive
or opaque since there is no way to write one. Identified types can be
recursive, can be opaque, and are never uniqued.
| 2157 | |
Rafael Espindola | 2f6d7b9 | 2013-12-10 14:53:22 +0000 | [diff] [blame] | 2158 | :Syntax: |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 2159 | |
| 2160 | :: |
| 2161 | |
| 2162 | %T1 = type { <type list> } ; Identified normal struct type |
| 2163 | %T2 = type <{ <type list> }> ; Identified packed struct type |
| 2164 | |
Rafael Espindola | 2f6d7b9 | 2013-12-10 14:53:22 +0000 | [diff] [blame] | 2165 | :Examples: |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 2166 | |
| 2167 | +------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ |
| 2168 | | ``{ i32, i32, i32 }`` | A triple of three ``i32`` values | |
| 2169 | +------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ |
Daniel Dunbar | 1dc66ca | 2013-01-17 18:57:32 +0000 | [diff] [blame] | 2170 | | ``{ float, i32 (i32) * }`` | A pair, where the first element is a ``float`` and the second element is a :ref:`pointer <t_pointer>` to a :ref:`function <t_function>` that takes an ``i32``, returning an ``i32``. | |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 2171 | +------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ |
| 2172 | | ``<{ i8, i32 }>`` | A packed struct known to be 5 bytes in size. | |
| 2173 | +------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ |
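
The two access styles described in the overview can be sketched as
follows. The type ``%pair`` and the function names are hypothetical;
the first function reads a field of a structure in memory, and the
other two operate on a structure held in a register:

.. code-block:: llvm

    %pair = type { i32, float }

    ; In memory: compute the field address, then load from it.
    define float @second_from_memory(%pair* %p) {
    entry:
      %fieldptr = getelementptr inbounds %pair* %p, i32 0, i32 1
      %f = load float* %fieldptr
      ret float %f
    }

    ; In a register: read field 1 directly from the value.
    define float @second_from_value(%pair %v) {
    entry:
      %f = extractvalue %pair %v, 1
      ret float %f
    }

    ; In a register: produce a copy of %v with field 0 replaced by %x.
    define %pair @set_first(%pair %v, i32 %x) {
    entry:
      %updated = insertvalue %pair %v, i32 %x, 0
      ret %pair %updated
    }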
| 2174 | |
| 2175 | .. _t_opaque: |
| 2176 | |
| 2177 | Opaque Structure Types |
Rafael Espindola | 0801334 | 2013-12-07 19:34:20 +0000 | [diff] [blame] | 2178 | """""""""""""""""""""" |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 2179 | |
Rafael Espindola | 2f6d7b9 | 2013-12-10 14:53:22 +0000 | [diff] [blame] | 2180 | :Overview: |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 2181 | |
| 2182 | Opaque structure types are used to represent named structure types that |
| 2183 | do not have a body specified. This corresponds (for example) to the C |
| 2184 | notion of a forward declared structure. |
| 2185 | |
Rafael Espindola | 2f6d7b9 | 2013-12-10 14:53:22 +0000 | [diff] [blame] | 2186 | :Syntax: |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 2187 | |
| 2188 | :: |
| 2189 | |
| 2190 | %X = type opaque |
| 2191 | %52 = type opaque |
| 2192 | |
Rafael Espindola | 2f6d7b9 | 2013-12-10 14:53:22 +0000 | [diff] [blame] | 2193 | :Examples: |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 2194 | |
| 2195 | +--------------+-------------------+ |
| 2196 | | ``opaque`` | An opaque type. | |
| 2197 | +--------------+-------------------+ |
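
As a sketch of the forward-declaration analogy (the names
``%struct.FILE`` and ``@close_it`` are chosen for illustration), a
module can pass around pointers to an opaque type without ever knowing
its layout:

.. code-block:: llvm

    %struct.FILE = type opaque

    declare i32 @fclose(%struct.FILE*)

    define i32 @close_it(%struct.FILE* %f) {
    entry:
      ; Only the pointer is used; the body of %struct.FILE is never needed.
      %r = call i32 @fclose(%struct.FILE* %f)
      ret i32 %r
    }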
| 2198 | |
Sean Silva | 1703e70 | 2014-04-08 21:06:22 +0000 | [diff] [blame] | 2199 | .. _constants: |
| 2200 | |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 2201 | Constants |
| 2202 | ========= |
| 2203 | |
| 2204 | LLVM has several different basic types of constants. This section |
| 2205 | describes them all and their syntax. |
| 2206 | |
| 2207 | Simple Constants |
| 2208 | ---------------- |
| 2209 | |
| 2210 | **Boolean constants** |
| 2211 | The two strings '``true``' and '``false``' are both valid constants |
| 2212 | of the ``i1`` type. |
| 2213 | **Integer constants** |
| 2214 | Standard integers (such as '4') are constants of the |
| 2215 | :ref:`integer <t_integer>` type. Negative numbers may be used with |
| 2216 | integer types. |
| 2217 | **Floating point constants** |
| 2218 | Floating point constants use standard decimal notation (e.g. |
| 2219 | 123.421), exponential notation (e.g. 1.23421e+2), or a more precise |
| 2220 | hexadecimal notation (see below). The assembler requires the exact |
| 2221 | decimal value of a floating-point constant. For example, the |
| 2222 | assembler accepts 1.25 but rejects 1.3 because 1.3 is a repeating |
| 2223 | decimal in binary. Floating point constants must have a :ref:`floating |
| 2224 | point <t_floating>` type. |
| 2225 | **Null pointer constants** |
| 2226 | The identifier '``null``' is recognized as a null pointer constant |
| 2227 | and must be of :ref:`pointer type <t_pointer>`. |
| 2228 | |
| 2229 | The one non-intuitive notation for constants is the hexadecimal form of |
| 2230 | floating point constants. For example, the form |
| 2231 | '``double 0x432ff973cafa8000``' is equivalent to (but harder to read |
| 2232 | than) '``double 4.5e+15``'. The only time hexadecimal floating point |
| 2233 | constants are required (and the only time that they are generated by the |
| 2234 | disassembler) is when a floating point constant must be emitted but it |
| 2235 | cannot be represented as a decimal floating point number in a reasonable |
| 2236 | number of digits. For example, NaN's, infinities, and other special |
| 2237 | values are represented in their IEEE hexadecimal format so that assembly |
| 2238 | and disassembly do not cause any bits to change in the constants. |
| 2239 | |
When using the hexadecimal form, constants of types half, float, and
double are represented using the 16-digit form shown above (which
matches the IEEE 754 representation for double); half and float values
must, however, be exactly representable as IEEE 754 half and single
precision, respectively. Hexadecimal format is always used for long
double, and there are three forms of long double. The 80-bit format used
by x86 is represented as ``0xK`` followed by 20 hexadecimal digits. The
128-bit format used by PowerPC (two adjacent doubles) is represented by
``0xM`` followed by 32 hexadecimal digits. The IEEE 128-bit format is
represented by ``0xL`` followed by 32 hexadecimal digits. Long doubles
will only work if they match the long double format on your target.
The IEEE 16-bit format (half precision) is represented by ``0xH``
followed by 4 hexadecimal digits. All hexadecimal formats are big-endian
(sign bit at the left).
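
The forms above can be illustrated with a few global definitions. This
is a sketch: the global names are hypothetical, and the bit patterns
were worked out by hand from the IEEE 754 encodings rather than taken
from assembler output:

.. code-block:: llvm

    @one_d   = global double 0x3FF0000000000000          ; 1.0 as a double
    @inf_d   = global double 0x7FF0000000000000          ; +infinity
    @one_f   = global float 0x3FF0000000000000           ; 1.0; float uses the 16-digit double form
    @tenth_f = global float 0x3FB99999A0000000           ; nearest single-precision value to 0.1
    @one_h   = global half 0xH3C00                       ; 1.0 in the 4-digit half form
    @one_x87 = global x86_fp80 0xK3FFF8000000000000000   ; 1.0 in the 20-digit x86 form

Note that ``@tenth_f`` has its low 29 mantissa bits clear, which is what
makes the double-format pattern exactly representable as a ``float``.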
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 2254 | |
Reid Kleckner | 9a16d08 | 2014-03-05 02:41:37 +0000 | [diff] [blame] | 2255 | There are no constants of type x86_mmx. |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 2256 | |
Eli Bendersky | 0220e6b | 2013-06-07 20:24:43 +0000 | [diff] [blame] | 2257 | .. _complexconstants: |
| 2258 | |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 2259 | Complex Constants |
| 2260 | ----------------- |
| 2261 | |
| 2262 | Complex constants are a (potentially recursive) combination of simple |
| 2263 | constants and smaller complex constants. |
| 2264 | |
| 2265 | **Structure constants** |
| 2266 | Structure constants are represented with notation similar to |
| 2267 | structure type definitions (a comma separated list of elements, |
| 2268 | surrounded by braces (``{}``)). For example: |
| 2269 | "``{ i32 4, float 17.0, i32* @G }``", where "``@G``" is declared as |
| 2270 | "``@G = external global i32``". Structure constants must have |
| 2271 | :ref:`structure type <t_struct>`, and the number and types of elements |
| 2272 | must match those specified by the type. |
| 2273 | **Array constants** |
| 2274 | Array constants are represented with notation similar to array type |
| 2275 | definitions (a comma separated list of elements, surrounded by |
| 2276 | square brackets (``[]``)). For example: |
| 2277 | "``[ i32 42, i32 11, i32 74 ]``". Array constants must have |
| 2278 | :ref:`array type <t_array>`, and the number and types of elements must |
Daniel Sanders | f605184 | 2014-09-11 12:02:59 +0000 | [diff] [blame] | 2279 | match those specified by the type. As a special case, character array |
| 2280 | constants may also be represented as a double-quoted string using the ``c`` |
| 2281 | prefix. For example: "``c"Hello World\0A\00"``". |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 2282 | **Vector constants** |
| 2283 | Vector constants are represented with notation similar to vector |
| 2284 | type definitions (a comma separated list of elements, surrounded by |
| 2285 | less-than/greater-than's (``<>``)). For example: |
| 2286 | "``< i32 42, i32 11, i32 74, i32 100 >``". Vector constants |
| 2287 | must have :ref:`vector type <t_vector>`, and the number and types of |
| 2288 | elements must match those specified by the type. |
**Zero initialization**
    The string '``zeroinitializer``' can be used to initialize a value
    of *any* type to zero, including scalar and
    :ref:`aggregate <t_aggregate>` types. This is often used to avoid
    having to print large zero initializers (e.g. for large arrays) and
    is always exactly equivalent to using explicit zero initializers.
| 2295 | **Metadata node** |
| 2296 | A metadata node is a structure-like constant with :ref:`metadata |
| 2297 | type <t_metadata>`. For example: |
| 2298 | "``metadata !{ i32 0, metadata !"test" }``". Unlike other |
| 2299 | constants that are meant to be interpreted as part of the |
| 2300 | instruction stream, metadata is a place to attach additional |
| 2301 | information such as debug info. |
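
Put together, the structure, array, vector, and zero-initializer forms
above might appear in a module as follows. This is a sketch; all global
and type names other than ``@G`` are hypothetical:

.. code-block:: llvm

    %rec = type { i32, float, i32* }

    @G      = external global i32
    @record = global %rec { i32 4, float 17.0, i32* @G }              ; structure constant
    @table  = global [3 x i32] [ i32 42, i32 11, i32 74 ]             ; array constant
    @msg    = constant [13 x i8] c"Hello World\0A\00"                 ; 11 characters + newline + NUL
    @lanes  = global <4 x i32> < i32 42, i32 11, i32 74, i32 100 >    ; vector constant
    @zeros  = global [1024 x i32] zeroinitializer                     ; same as 1024 explicit zeros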
| 2302 | |
| 2303 | Global Variable and Function Addresses |
| 2304 | -------------------------------------- |
| 2305 | |
| 2306 | The addresses of :ref:`global variables <globalvars>` and |
| 2307 | :ref:`functions <functionstructure>` are always implicitly valid |
| 2308 | (link-time) constants. These constants are explicitly referenced when |
| 2309 | the :ref:`identifier for the global <identifiers>` is used and always have |
| 2310 | :ref:`pointer <t_pointer>` type. For example, the following is a legal LLVM |
| 2311 | file: |
| 2312 | |
| 2313 | .. code-block:: llvm |
| 2314 | |
| 2315 | @X = global i32 17 |
| 2316 | @Y = global i32 42 |
| 2317 | @Z = global [2 x i32*] [ i32* @X, i32* @Y ] |
| 2318 | |
| 2319 | .. _undefvalues: |
| 2320 | |
| 2321 | Undefined Values |
| 2322 | ---------------- |
| 2323 | |
The string '``undef``' can be used anywhere a constant is expected, and
indicates that the user of the value may receive an unspecified
bit-pattern. Undefined values may be of any type other than '``label``'
or '``void``'.
| 2328 | |
| 2329 | Undefined values are useful because they indicate to the compiler that |
| 2330 | the program is well defined no matter what value is used. This gives the |
| 2331 | compiler more freedom to optimize. Here are some examples of |
| 2332 | (potentially surprising) transformations that are valid (in pseudo IR): |
| 2333 | |
| 2334 | .. code-block:: llvm |
| 2335 | |
| 2336 | %A = add %X, undef |
| 2337 | %B = sub %X, undef |
| 2338 | %C = xor %X, undef |
| 2339 | Safe: |
| 2340 | %A = undef |
| 2341 | %B = undef |
| 2342 | %C = undef |
| 2343 | |
| 2344 | This is safe because all of the output bits are affected by the undef |
| 2345 | bits. Any output bit can have a zero or one depending on the input bits. |
| 2346 | |
| 2347 | .. code-block:: llvm |
| 2348 | |
| 2349 | %A = or %X, undef |
| 2350 | %B = and %X, undef |
| 2351 | Safe: |
| 2352 | %A = -1 |
| 2353 | %B = 0 |
| 2354 | Unsafe: |
| 2355 | %A = undef |
| 2356 | %B = undef |
| 2357 | |
| 2358 | These logical operations have bits that are not always affected by the |
| 2359 | input. For example, if ``%X`` has a zero bit, then the output of the |
| 2360 | '``and``' operation will always be a zero for that bit, no matter what |
| 2361 | the corresponding bit from the '``undef``' is. As such, it is unsafe to |
| 2362 | optimize or assume that the result of the '``and``' is '``undef``'. |
| 2363 | However, it is safe to assume that all bits of the '``undef``' could be |
| 2364 | 0, and optimize the '``and``' to 0. Likewise, it is safe to assume that |
| 2365 | all the bits of the '``undef``' operand to the '``or``' could be set, |
| 2366 | allowing the '``or``' to be folded to -1. |
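
Written as complete functions rather than pseudo IR, the same two folds
look like this (the function names are hypothetical, and the comments
describe transformations an optimizer is permitted to make, not ones it
is required to make):

.. code-block:: llvm

    define i32 @or_undef(i32 %x) {
    entry:
      %a = or i32 %x, undef    ; may be folded to -1, but not to undef
      ret i32 %a
    }

    define i32 @and_undef(i32 %x) {
    entry:
      %b = and i32 %x, undef   ; may be folded to 0, but not to undef
      ret i32 %b
    }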
| 2367 | |
| 2368 | .. code-block:: llvm |
| 2369 | |
| 2370 | %A = select undef, %X, %Y |
| 2371 | %B = select undef, 42, %Y |
| 2372 | %C = select %X, %Y, undef |
| 2373 | Safe: |
| 2374 | %A = %X (or %Y) |
| 2375 | %B = 42 (or %Y) |
| 2376 | %C = %Y |
| 2377 | Unsafe: |
| 2378 | %A = undef |
| 2379 | %B = undef |
| 2380 | %C = undef |
| 2381 | |
| 2382 | This set of examples shows that undefined '``select``' (and conditional |
| 2383 | branch) conditions can go *either way*, but they have to come from one |
| 2384 | of the two operands. In the ``%A`` example, if ``%X`` and ``%Y`` were |
| 2385 | both known to have a clear low bit, then ``%A`` would have to have a |
| 2386 | cleared low bit. However, in the ``%C`` example, the optimizer is |
| 2387 | allowed to assume that the '``undef``' operand could be the same as |
| 2388 | ``%Y``, allowing the whole '``select``' to be eliminated. |
| 2389 | |
.. code-block:: llvm

      %A = xor undef, undef

      %B = undef
      %C = xor %B, %B

      %D = undef
      %E = icmp slt %D, 4
      %F = icmp sge %D, 4

    Safe:
      %A = undef
      %B = undef
      %C = undef
      %D = undef
      %E = undef
      %F = undef
| 2408 | |
This example points out that two '``undef``' operands are not
necessarily the same. This can be surprising to people who assume that
"``X^X``" is always zero, even if ``X`` is undefined (the behavior also
matches C semantics). The assumption fails for a number of reasons, but
the short answer is that an '``undef``' "variable" can arbitrarily change
its value over its "live range". This is true because the variable
doesn't actually *have a live range*. Instead, the value is logically
read from arbitrary registers that happen to be around when needed, so
the value is not necessarily consistent over time. In fact, ``%A`` and
``%C`` need to have the same semantics or the core LLVM "replace all
uses with" concept would not hold.
| 2420 | |
| 2421 | .. code-block:: llvm |
| 2422 | |
| 2423 | %A = fdiv undef, %X |
| 2424 | %B = fdiv %X, undef |
| 2425 | Safe: |
| 2426 | %A = undef |
| 2427 | b: unreachable |
| 2428 | |
| 2429 | These examples show the crucial difference between an *undefined value* |
| 2430 | and *undefined behavior*. An undefined value (like '``undef``') is |
| 2431 | allowed to have an arbitrary bit-pattern. This means that the ``%A`` |
| 2432 | operation can be constant folded to '``undef``', because the '``undef``' |
| 2433 | could be an SNaN, and ``fdiv`` is not (currently) defined on SNaN's. |
| 2434 | However, in the second example, we can make a more aggressive |
| 2435 | assumption: because the ``undef`` is allowed to be an arbitrary value, |
| 2436 | we are allowed to assume that it could be zero. Since a divide by zero |
| 2437 | has *undefined behavior*, we are allowed to assume that the operation |
| 2438 | does not execute at all. This allows us to delete the divide and all |
| 2439 | code after it. Because the undefined operation "can't happen", the |
| 2440 | optimizer can assume that it occurs in dead code. |
| 2441 | |
| 2442 | .. code-block:: llvm |
| 2443 | |
| 2444 | a: store undef -> %X |
| 2445 | b: store %X -> undef |
| 2446 | Safe: |
| 2447 | a: <deleted> |
| 2448 | b: unreachable |
| 2449 | |
| 2450 | These examples reiterate the ``fdiv`` example: a store *of* an undefined |
| 2451 | value can be assumed to not have any effect; we can assume that the |
| 2452 | value is overwritten with bits that happen to match what was already |
| 2453 | there. However, a store *to* an undefined location could clobber |
| 2454 | arbitrary memory, therefore, it has undefined behavior. |
| 2455 | |
| 2456 | .. _poisonvalues: |
| 2457 | |
| 2458 | Poison Values |
| 2459 | ------------- |
| 2460 | |
| 2461 | Poison values are similar to :ref:`undef values <undefvalues>`, however |
| 2462 | they also represent the fact that an instruction or constant expression |
Richard Smith | 32dbdf6 | 2014-07-31 04:25:36 +0000 | [diff] [blame] | 2463 | that cannot evoke side effects has nevertheless detected a condition |
| 2464 | that results in undefined behavior. |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 2465 | |
| 2466 | There is currently no way of representing a poison value in the IR; they |
| 2467 | only exist when produced by operations such as :ref:`add <i_add>` with |
| 2468 | the ``nsw`` flag. |
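
As a small sketch (the function name ``@inc_nsw`` is hypothetical), the
following ``add`` produces an ordinary value for most inputs and a
poison value only when the signed addition overflows:

.. code-block:: llvm

    define i32 @inc_nsw(i32 %x) {
    entry:
      ; With nsw, the result is poison when %x is 2147483647 (signed overflow).
      ; Without nsw, the same add would simply wrap to -2147483648.
      %r = add nsw i32 %x, 1
      ret i32 %r
    }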
| 2469 | |
| 2470 | Poison value behavior is defined in terms of value *dependence*: |
| 2471 | |
| 2472 | - Values other than :ref:`phi <i_phi>` nodes depend on their operands. |
| 2473 | - :ref:`Phi <i_phi>` nodes depend on the operand corresponding to |
| 2474 | their dynamic predecessor basic block. |
| 2475 | - Function arguments depend on the corresponding actual argument values |
| 2476 | in the dynamic callers of their functions. |
| 2477 | - :ref:`Call <i_call>` instructions depend on the :ref:`ret <i_ret>` |
| 2478 | instructions that dynamically transfer control back to them. |
| 2479 | - :ref:`Invoke <i_invoke>` instructions depend on the |
| 2480 | :ref:`ret <i_ret>`, :ref:`resume <i_resume>`, or exception-throwing |
| 2481 | call instructions that dynamically transfer control back to them. |
| 2482 | - Non-volatile loads and stores depend on the most recent stores to all |
| 2483 | of the referenced memory addresses, following the order in the IR |
| 2484 | (including loads and stores implied by intrinsics such as |
| 2485 | :ref:`@llvm.memcpy <int_memcpy>`.) |
| 2486 | - An instruction with externally visible side effects depends on the |
| 2487 | most recent preceding instruction with externally visible side |
| 2488 | effects, following the order in the IR. (This includes :ref:`volatile |
| 2489 | operations <volatile>`.) |
| 2490 | - An instruction *control-depends* on a :ref:`terminator |
| 2491 | instruction <terminators>` if the terminator instruction has |
| 2492 | multiple successors and the instruction is always executed when |
| 2493 | control transfers to one of the successors, and may not be executed |
| 2494 | when control is transferred to another. |
| 2495 | - Additionally, an instruction also *control-depends* on a terminator |
| 2496 | instruction if the set of instructions it otherwise depends on would |
| 2497 | be different if the terminator had transferred control to a different |
| 2498 | successor. |
| 2499 | - Dependence is transitive. |
| 2500 | |
Richard Smith | 32dbdf6 | 2014-07-31 04:25:36 +0000 | [diff] [blame] | 2501 | Poison values have the same behavior as :ref:`undef values <undefvalues>`, |
| 2502 | with the additional effect that any instruction that has a *dependence* |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 2503 | on a poison value has undefined behavior. |
| 2504 | |
| 2505 | Here are some examples: |
| 2506 | |
| 2507 | .. code-block:: llvm |
| 2508 | |
| 2509 | entry: |
| 2510 | %poison = sub nuw i32 0, 1 ; Results in a poison value. |
| 2511 | %still_poison = and i32 %poison, 0 ; 0, but also poison. |
| 2512 | %poison_yet_again = getelementptr i32* @h, i32 %still_poison |
| 2513 | store i32 0, i32* %poison_yet_again ; memory at @h[0] is poisoned |
| 2514 | |
| 2515 | store i32 %poison, i32* @g ; Poison value stored to memory. |
| 2516 | %poison2 = load i32* @g ; Poison value loaded back from memory. |
| 2517 | |
| 2518 | store volatile i32 %poison, i32* @g ; External observation; undefined behavior. |
| 2519 | |
| 2520 | %narrowaddr = bitcast i32* @g to i16* |
| 2521 | %wideaddr = bitcast i32* @g to i64* |
| 2522 | %poison3 = load i16* %narrowaddr ; Returns a poison value. |
| 2523 | %poison4 = load i64* %wideaddr ; Returns a poison value. |
| 2524 | |
| 2525 | %cmp = icmp slt i32 %poison, 0 ; Returns a poison value. |
| 2526 | br i1 %cmp, label %true, label %end ; Branch to either destination. |
| 2527 | |
| 2528 | true: |
| 2529 | store volatile i32 0, i32* @g ; This is control-dependent on %cmp, so |
| 2530 | ; it has undefined behavior. |
| 2531 | br label %end |
| 2532 | |
| 2533 | end: |
| 2534 | %p = phi i32 [ 0, %entry ], [ 1, %true ] |
| 2535 | ; Both edges into this PHI are |
| 2536 | ; control-dependent on %cmp, so this |
| 2537 | ; always results in a poison value. |
| 2538 | |
| 2539 | store volatile i32 0, i32* @g ; This would depend on the store in %true |
| 2540 | ; if %cmp is true, or the store in %entry |
| 2541 | ; otherwise, so this is undefined behavior. |
| 2542 | |
| 2543 | br i1 %cmp, label %second_true, label %second_end |
| 2544 | ; The same branch again, but this time the |
| 2545 | ; true block doesn't have side effects. |
| 2546 | |
| 2547 | second_true: |
| 2548 | ; No side effects! |
| 2549 | ret void |
| 2550 | |
| 2551 | second_end: |
| 2552 | store volatile i32 0, i32* @g ; This time, the instruction always depends |
| 2553 | ; on the store in %end. Also, it is |
| 2554 | ; control-equivalent to %end, so this is |
| 2555 | ; well-defined (ignoring earlier undefined |
| 2556 | ; behavior in this example). |
| 2557 | |
| 2558 | .. _blockaddress: |
| 2559 | |
| 2560 | Addresses of Basic Blocks |
| 2561 | ------------------------- |
| 2562 | |
| 2563 | ``blockaddress(@function, %block)`` |
| 2564 | |
| 2565 | The '``blockaddress``' constant computes the address of the specified |
| 2566 | basic block in the specified function, and always has an ``i8*`` type. |
| 2567 | Taking the address of the entry block is illegal. |
| 2568 | |
This value only has defined behavior when used as an operand to the
':ref:`indirectbr <i_indirectbr>`' instruction, or for comparisons
against null. Pointer equality tests between label addresses result in
undefined behavior, though, again, comparison against null is OK, and
no label is equal to the null pointer. This value may be passed around
as an opaque pointer-sized value as long as the bits are not inspected.
This allows ``ptrtoint`` and arithmetic to be performed on these values
so long as the original value is reconstituted before the ``indirectbr``
instruction.
| 2578 | |
| 2579 | Finally, some targets may provide defined semantics when using the value |
| 2580 | as the operand to an inline assembly, but that is target specific. |
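
A minimal sketch of the intended use follows. The function and block
names are hypothetical; note that neither target is the entry block, and
that every possible destination is listed on the ``indirectbr``:

.. code-block:: llvm

    define i32 @dispatch(i1 %c) {
    entry:
      ; Pick one of two block addresses; both have type i8*.
      %target = select i1 %c, i8* blockaddress(@dispatch, %one),
                              i8* blockaddress(@dispatch, %two)
      indirectbr i8* %target, [ label %one, label %two ]

    one:
      ret i32 1

    two:
      ret i32 2
    }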
| 2581 | |
.. _constantexprs:

Constant Expressions
--------------------
| 2586 | |
| 2587 | Constant expressions are used to allow expressions involving other |
| 2588 | constants to be used as constants. Constant expressions may be of any |
| 2589 | :ref:`first class <t_firstclass>` type and may involve any LLVM operation |
| 2590 | that does not have side effects (e.g. load and call are not supported). |
| 2591 | The following is the syntax for constant expressions: |
| 2592 | |
| 2593 | ``trunc (CST to TYPE)`` |
| 2594 | Truncate a constant to another type. The bit size of CST must be |
| 2595 | larger than the bit size of TYPE. Both types must be integers. |
| 2596 | ``zext (CST to TYPE)`` |
| 2597 | Zero extend a constant to another type. The bit size of CST must be |
| 2598 | smaller than the bit size of TYPE. Both types must be integers. |
| 2599 | ``sext (CST to TYPE)`` |
| 2600 | Sign extend a constant to another type. The bit size of CST must be |
| 2601 | smaller than the bit size of TYPE. Both types must be integers. |
| 2602 | ``fptrunc (CST to TYPE)`` |
| 2603 | Truncate a floating point constant to another floating point type. |
| 2604 | The size of CST must be larger than the size of TYPE. Both types |
| 2605 | must be floating point. |
``fpext (CST to TYPE)``
    Floating point extend a constant to another type. The size of CST
    must be smaller than or equal to the size of TYPE. Both types must be
    floating point.
| 2610 | ``fptoui (CST to TYPE)`` |
| 2611 | Convert a floating point constant to the corresponding unsigned |
| 2612 | integer constant. TYPE must be a scalar or vector integer type. CST |
| 2613 | must be of scalar or vector floating point type. Both CST and TYPE |
| 2614 | must be scalars, or vectors of the same number of elements. If the |
| 2615 | value won't fit in the integer type, the results are undefined. |
| 2616 | ``fptosi (CST to TYPE)`` |
| 2617 | Convert a floating point constant to the corresponding signed |
| 2618 | integer constant. TYPE must be a scalar or vector integer type. CST |
| 2619 | must be of scalar or vector floating point type. Both CST and TYPE |
| 2620 | must be scalars, or vectors of the same number of elements. If the |
| 2621 | value won't fit in the integer type, the results are undefined. |
| 2622 | ``uitofp (CST to TYPE)`` |
| 2623 | Convert an unsigned integer constant to the corresponding floating |
| 2624 | point constant. TYPE must be a scalar or vector floating point type. |
| 2625 | CST must be of scalar or vector integer type. Both CST and TYPE must |
| 2626 | be scalars, or vectors of the same number of elements. If the value |
| 2627 | won't fit in the floating point type, the results are undefined. |
| 2628 | ``sitofp (CST to TYPE)`` |
| 2629 | Convert a signed integer constant to the corresponding floating |
| 2630 | point constant. TYPE must be a scalar or vector floating point type. |
| 2631 | CST must be of scalar or vector integer type. Both CST and TYPE must |
| 2632 | be scalars, or vectors of the same number of elements. If the value |
| 2633 | won't fit in the floating point type, the results are undefined. |
``ptrtoint (CST to TYPE)``
    Convert a pointer typed constant to the corresponding integer
    constant. ``TYPE`` must be an integer type. ``CST`` must be of
    pointer type. The ``CST`` value is zero extended, truncated, or
    unchanged to make it fit in ``TYPE``.
| 2639 | ``inttoptr (CST to TYPE)`` |
| 2640 | Convert an integer constant to a pointer constant. TYPE must be a |
| 2641 | pointer type. CST must be of integer type. The CST value is zero |
| 2642 | extended, truncated, or unchanged to make it fit in a pointer size. |
| 2643 | This one is *really* dangerous! |
| 2644 | ``bitcast (CST to TYPE)`` |
| 2645 | Convert a constant, CST, to another TYPE. The constraints of the |
| 2646 | operands are the same as those for the :ref:`bitcast |
| 2647 | instruction <i_bitcast>`. |
``addrspacecast (CST to TYPE)``
    Convert a constant pointer or constant vector of pointers, CST, to another
    TYPE in a different address space. The constraints of the operands are the
    same as those for the :ref:`addrspacecast instruction <i_addrspacecast>`.
``getelementptr (CSTPTR, IDX0, IDX1, ...)``, ``getelementptr inbounds (CSTPTR, IDX0, IDX1, ...)``
    Perform the :ref:`getelementptr operation <i_getelementptr>` on
    constants. As with the :ref:`getelementptr <i_getelementptr>`
    instruction, the index list may have zero or more indexes, which are
    required to make sense for the type of "CSTPTR".
| 2657 | ``select (COND, VAL1, VAL2)`` |
| 2658 | Perform the :ref:`select operation <i_select>` on constants. |
| 2659 | ``icmp COND (VAL1, VAL2)`` |
| 2660 | Performs the :ref:`icmp operation <i_icmp>` on constants. |
| 2661 | ``fcmp COND (VAL1, VAL2)`` |
| 2662 | Performs the :ref:`fcmp operation <i_fcmp>` on constants. |
| 2663 | ``extractelement (VAL, IDX)`` |
| 2664 | Perform the :ref:`extractelement operation <i_extractelement>` on |
| 2665 | constants. |
| 2666 | ``insertelement (VAL, ELT, IDX)`` |
| 2667 | Perform the :ref:`insertelement operation <i_insertelement>` on |
| 2668 | constants. |
| 2669 | ``shufflevector (VEC1, VEC2, IDXMASK)`` |
| 2670 | Perform the :ref:`shufflevector operation <i_shufflevector>` on |
| 2671 | constants. |
| 2672 | ``extractvalue (VAL, IDX0, IDX1, ...)`` |
| 2673 | Perform the :ref:`extractvalue operation <i_extractvalue>` on |
| 2674 | constants. The index list is interpreted in a similar manner as |
| 2675 | indices in a ':ref:`getelementptr <i_getelementptr>`' operation. At |
| 2676 | least one index value must be specified. |
| 2677 | ``insertvalue (VAL, ELT, IDX0, IDX1, ...)`` |
| 2678 | Perform the :ref:`insertvalue operation <i_insertvalue>` on constants. |
| 2679 | The index list is interpreted in a similar manner as indices in a |
| 2680 | ':ref:`getelementptr <i_getelementptr>`' operation. At least one index |
| 2681 | value must be specified. |
``OPCODE (LHS, RHS)``
    Perform the specified operation on the LHS and RHS constants. OPCODE
    may be any of the :ref:`binary <binaryops>` or :ref:`bitwise
    binary <bitwiseops>` operations. The constraints on operands are
    the same as those for the corresponding instruction (e.g. no bitwise
    operations on floating point values are allowed).
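
Constant expressions most often appear in global initializers. The
following sketch shows a few of the forms listed above; the global names
are hypothetical and the values are arbitrary:

.. code-block:: llvm

    @array = global [4 x i32] [i32 1, i32 2, i32 3, i32 4]

    ; Address of the third element of @array.
    @third = global i32* getelementptr inbounds ([4 x i32]* @array, i64 0, i64 2)

    ; The address of @array as an integer, and that address plus 8.
    @addr  = global i64 ptrtoint ([4 x i32]* @array to i64)
    @addr8 = global i64 add (i64 ptrtoint ([4 x i32]* @array to i64), i64 8)

    ; A cast applied to a simple constant.
    @wide  = global i64 zext (i32 7 to i64)

Expressions whose operands are simple constants, such as the ``zext``
above, are typically folded to a plain constant when the module is parsed.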
| 2688 | |
| 2689 | Other Values |
| 2690 | ============ |
| 2691 | |
.. _inlineasmexprs:

Inline Assembler Expressions
----------------------------
| 2696 | |
| 2697 | LLVM supports inline assembler expressions (as opposed to :ref:`Module-Level |
| 2698 | Inline Assembly <moduleasm>`) through the use of a special value. This |
| 2699 | value represents the inline assembler as a string (containing the |
| 2700 | instructions to emit), a list of operand constraints (stored as a |
| 2701 | string), a flag that indicates whether or not the inline asm expression |
| 2702 | has side effects, and a flag indicating whether the function containing |
| 2703 | the asm needs to align its stack conservatively. An example inline |
| 2704 | assembler expression is: |
| 2705 | |
| 2706 | .. code-block:: llvm |
| 2707 | |
| 2708 | i32 (i32) asm "bswap $0", "=r,r" |
| 2709 | |
| 2710 | Inline assembler expressions may **only** be used as the callee operand |
| 2711 | of a :ref:`call <i_call>` or an :ref:`invoke <i_invoke>` instruction. |
| 2712 | Thus, typically we have: |
| 2713 | |
| 2714 | .. code-block:: llvm |
| 2715 | |
| 2716 | %X = call i32 asm "bswap $0", "=r,r"(i32 %Y) |
| 2717 | |
| 2718 | Inline asms with side effects not visible in the constraint list must be |
| 2719 | marked as having side effects. This is done through the use of the |
| 2720 | '``sideeffect``' keyword, like so: |
| 2721 | |
| 2722 | .. code-block:: llvm |
| 2723 | |
| 2724 | call void asm sideeffect "eieio", ""() |
| 2725 | |
| 2726 | In some cases inline asms will contain code that will not work unless |
| 2727 | the stack is aligned in some way, such as calls or SSE instructions on |
| 2728 | x86, yet will not contain code that does that alignment within the asm. |
| 2729 | The compiler should make conservative assumptions about what the asm |
| 2730 | might contain and should generate its usual stack alignment code in the |
| 2731 | prologue if the '``alignstack``' keyword is present: |
| 2732 | |
| 2733 | .. code-block:: llvm |
| 2734 | |
| 2735 | call void asm alignstack "eieio", ""() |
| 2736 | |
| 2737 | Inline asms also support using non-standard assembly dialects. The |
| 2738 | assumed dialect is ATT. When the '``inteldialect``' keyword is present, |
| 2739 | the inline asm is using the Intel dialect. Currently, ATT and Intel are |
| 2740 | the only supported dialects. An example is: |
| 2741 | |
| 2742 | .. code-block:: llvm |
| 2743 | |
| 2744 | call void asm inteldialect "eieio", ""() |
| 2745 | |
| 2746 | If multiple keywords appear the '``sideeffect``' keyword must come |
| 2747 | first, the '``alignstack``' keyword second and the '``inteldialect``' |
| 2748 | keyword last. |
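
A minimal sketch of all three keywords used together, in the required
order. The ``nop`` instruction is only a placeholder chosen for
illustration:

.. code-block:: llvm

    call void asm sideeffect alignstack inteldialect "nop", ""()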
| 2749 | |
| 2750 | Inline Asm Metadata |
| 2751 | ^^^^^^^^^^^^^^^^^^^ |
| 2752 | |
The call instructions that wrap inline asm nodes may have a
"``!srcloc``" MDNode attached to them that contains a list of constant
integers. If present, the code generator will use the integer as the
location cookie value when reporting errors through the ``LLVMContext``
error reporting mechanisms. This allows a front-end to correlate backend
errors that occur with inline asm back to the source code that produced
it. For example:

.. code-block:: llvm

    call void asm sideeffect "something bad", ""(), !srcloc !42
    ...
    !42 = metadata !{ i32 1234567 }
| 2766 | |
| 2767 | It is up to the front-end to make sense of the magic numbers it places |
| 2768 | in the IR. If the MDNode contains multiple constants, the code generator |
| 2769 | will use the one that corresponds to the line of the asm that the error |
| 2770 | occurs on. |
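
As a sketch of the multi-constant case, the following asm string contains
two lines (separated by the ``\0A`` newline escape), and the ``!srcloc``
node supplies one cookie per line. The cookie values ``101`` and ``102``
and the asm text are placeholders invented for this example:

.. code-block:: llvm

    call void asm sideeffect "first line\0Asecond line", ""(), !srcloc !7
    ...
    !7 = metadata !{ i32 101, i32 102 }

Under the rule above, an error on the second line of the asm would be
reported with the second constant.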
| 2771 | |
| 2772 | .. _metadata: |
| 2773 | |
| 2774 | Metadata Nodes and Metadata Strings |
| 2775 | ----------------------------------- |
| 2776 | |
| 2777 | LLVM IR allows metadata to be attached to instructions in the program |
| 2778 | that can convey extra information about the code to the optimizers and |
| 2779 | code generator. One example application of metadata is source-level |
| 2780 | debug information. There are two metadata primitives: strings and nodes. |
| 2781 | All metadata has the ``metadata`` type and is identified in syntax by a |
| 2782 | preceding exclamation point ('``!``'). |
| 2783 | |
| 2784 | A metadata string is a string surrounded by double quotes. It can |
| 2785 | contain any character by escaping non-printable characters with |
| 2786 | "``\xx``" where "``xx``" is the two digit hex code. For example: |
| 2787 | "``!"test\00"``". |
| 2788 | |
| 2789 | Metadata nodes are represented with notation similar to structure |
| 2790 | constants (a comma separated list of elements, surrounded by braces and |
| 2791 | preceded by an exclamation point). Metadata nodes can have any values as |
| 2792 | their operand. For example: |
| 2793 | |
| 2794 | .. code-block:: llvm |
| 2795 | |
| 2796 | !{ metadata !"test\00", i32 10} |
| 2797 | |
A :ref:`named metadata <namedmetadatastructure>` is a collection of
metadata nodes, which can be looked up in the module symbol table. For
example:

.. code-block:: llvm

    !foo = !{!4, !3}

Metadata can be used as function arguments. Here the ``llvm.dbg.value``
function is using two metadata arguments:

.. code-block:: llvm

    call void @llvm.dbg.value(metadata !24, i64 0, metadata !25)

Metadata can be attached to an instruction. Here metadata ``!21`` is
attached to the ``add`` instruction using the ``!dbg`` identifier:

.. code-block:: llvm

    %indvar.next = add i64 %indvar, 1, !dbg !21
| 2819 | |
| 2820 | More information about specific metadata nodes recognized by the |
| 2821 | optimizers and code generator is found below. |
| 2822 | |
| 2823 | '``tbaa``' Metadata |
| 2824 | ^^^^^^^^^^^^^^^^^^^ |
| 2825 | |
| 2826 | In LLVM IR, memory does not have types, so LLVM's own type system is not |
| 2827 | suitable for doing TBAA. Instead, metadata is added to the IR to |
| 2828 | describe a type system of a higher level language. This can be used to |
| 2829 | implement typical C/C++ TBAA, but it can also be used to implement |
| 2830 | custom alias analysis behavior for other languages. |
| 2831 | |
| 2832 | The current metadata format is very simple. TBAA metadata nodes have up |
| 2833 | to three fields, e.g.: |
| 2834 | |
| 2835 | .. code-block:: llvm |
| 2836 | |
| 2837 | !0 = metadata !{ metadata !"an example type tree" } |
| 2838 | !1 = metadata !{ metadata !"int", metadata !0 } |
| 2839 | !2 = metadata !{ metadata !"float", metadata !0 } |
| 2840 | !3 = metadata !{ metadata !"const float", metadata !2, i64 1 } |
| 2841 | |
| 2842 | The first field is an identity field. It can be any value, usually a |
| 2843 | metadata string, which uniquely identifies the type. The most important |
| 2844 | name in the tree is the name of the root node. Two trees with different |
| 2845 | root node names are entirely disjoint, even if they have leaves with |
| 2846 | common names. |
| 2847 | |
| 2848 | The second field identifies the type's parent node in the tree, or is |
| 2849 | null or omitted for a root node. A type is considered to alias all of |
| 2850 | its descendants and all of its ancestors in the tree. Also, a type is |
| 2851 | considered to alias all types in other trees, so that bitcode produced |
| 2852 | from multiple front-ends is handled conservatively. |
| 2853 | |
| 2854 | If the third field is present, it's an integer which if equal to 1 |
| 2855 | indicates that the type is "constant" (meaning |
| 2856 | ``pointsToConstantMemory`` should return true; see `other useful |
| 2857 | AliasAnalysis methods <AliasAnalysis.html#OtherItfs>`_). |
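
As a sketch of how these nodes are used, the accesses below are tagged
with the ``int`` and ``float`` nodes from the example tree above. The
pointer names ``%ip`` and ``%fp`` are hypothetical:

.. code-block:: llvm

    %i = load i32* %ip, align 4, !tbaa !1
    store float 1.000000e+00, float* %fp, align 4, !tbaa !2
    ...
    !0 = metadata !{ metadata !"an example type tree" }
    !1 = metadata !{ metadata !"int", metadata !0 }
    !2 = metadata !{ metadata !"float", metadata !0 }

Because ``int`` and ``float`` share a root but neither is an ancestor of
the other, an alias analysis that consults this metadata may conclude that
the load and the store do not alias.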
| 2858 | |
| 2859 | '``tbaa.struct``' Metadata |
| 2860 | ^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 2861 | |
The :ref:`llvm.memcpy <int_memcpy>` intrinsic is often used to implement
aggregate assignment operations in C and similar languages. However, it
is defined to copy a contiguous region of memory, which is more than
strictly necessary for aggregate types which contain holes due to
padding. Also, it doesn't contain any TBAA information about the fields
of the aggregate.
| 2868 | |
| 2869 | ``!tbaa.struct`` metadata can describe which memory subregions in a |
| 2870 | memcpy are padding and what the TBAA tags of the struct are. |
| 2871 | |
The current metadata format is very simple. ``!tbaa.struct`` metadata
nodes are a list of operands which are in conceptual groups of three.
For each group of three, the first operand gives the byte offset of a
field, the second gives its size in bytes, and the third gives
its tbaa tag. For example:
| 2877 | |
| 2878 | .. code-block:: llvm |
| 2879 | |
| 2880 | !4 = metadata !{ i64 0, i64 4, metadata !1, i64 8, i64 4, metadata !2 } |
| 2881 | |
| 2882 | This describes a struct with two fields. The first is at offset 0 bytes |
| 2883 | with size 4 bytes, and has tbaa tag !1. The second is at offset 8 bytes |
| 2884 | and has size 4 bytes and has tbaa tag !2. |
| 2885 | |
| 2886 | Note that the fields need not be contiguous. In this example, there is a |
| 2887 | 4 byte gap between the two fields. This gap represents padding which |
| 2888 | does not carry useful data and need not be preserved. |
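
A sketch of how such a node might be attached to a copy of this struct,
assuming the five-operand ``llvm.memcpy`` form (destination, source,
length, alignment, volatile flag). The pointer names ``%dst`` and ``%src``
are hypothetical, and the 12-byte length covers both fields and the gap
between them:

.. code-block:: llvm

    call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dst, i8* %src, i64 12,
                                         i32 4, i1 false), !tbaa.struct !4
    ...
    !4 = metadata !{ i64 0, i64 4, metadata !1, i64 8, i64 4, metadata !2 }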
| 2889 | |
'``noalias``' and '``alias.scope``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``noalias`` and ``alias.scope`` metadata provide the ability to specify generic
noalias memory-access sets. This means that some collection of memory access
instructions (loads, stores, memory-accessing calls, etc.) that carry
``noalias`` metadata can be declared not to alias with some other
collection of memory access instructions that carry ``alias.scope`` metadata.
Each type of metadata specifies a list of scopes where each scope has an id and
a domain. When evaluating an aliasing query, if for some domain, the set
of scopes with that domain in one instruction's ``alias.scope`` list is a
subset of (or equal to) the set of scopes for that domain in another
instruction's ``noalias`` list, then the two memory accesses are assumed not to
alias.

The metadata identifying each domain is itself a list containing one or two
entries. The first entry is the name of the domain. Note that if the name is a
string then it can be combined across functions and translation units. A
self-reference can be used to create globally unique domain names. A
descriptive string may optionally be provided as a second list entry.

The metadata identifying each scope is also itself a list containing two or
three entries. The first entry is the name of the scope. Note that if the name
is a string then it can be combined across functions and translation units. A
self-reference can be used to create globally unique scope names. A metadata
reference to the scope's domain is the second entry. A descriptive string may
optionally be provided as a third list entry.
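
A minimal sketch of the optional descriptive strings, using
self-referential names. The strings themselves are placeholders invented
for this example:

.. code-block:: llvm

    ; A domain with a description as its second entry:
    !0 = metadata !{metadata !0, metadata !"an example domain"}

    ; A scope in that domain with a description as its third entry:
    !1 = metadata !{metadata !1, metadata !0, metadata !"an example scope"}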
Hal Finkel | 9414665 | 2014-07-24 14:25:39 +0000 | [diff] [blame] | 2917 | |
For example,

.. code-block:: llvm

    ; Two scope domains:
    !0 = metadata !{metadata !0}
    !1 = metadata !{metadata !1}

    ; Some scopes in these domains:
    !2 = metadata !{metadata !2, metadata !0}
    !3 = metadata !{metadata !3, metadata !0}
    !4 = metadata !{metadata !4, metadata !1}

    ; Some scope lists:
    !5 = metadata !{metadata !4} ; A list containing only scope !4
    !6 = metadata !{metadata !4, metadata !3, metadata !2}
    !7 = metadata !{metadata !3}

    ; These two instructions don't alias:
    %0 = load float* %c, align 4, !alias.scope !5
    store float %0, float* %arrayidx.i, align 4, !noalias !5

    ; These two instructions also don't alias (for domain !1, the set of scopes
    ; in the !alias.scope list equals that in the !noalias list):
    %2 = load float* %c, align 4, !alias.scope !5
    store float %2, float* %arrayidx.i2, align 4, !noalias !6

    ; These two instructions may alias (for domain !0, the set of scopes in
    ; the !noalias list is not a superset of, or equal to, the scopes in the
    ; !alias.scope list):
    %2 = load float* %c, align 4, !alias.scope !6
    store float %0, float* %arrayidx.i, align 4, !noalias !7

'``fpmath``' Metadata
^^^^^^^^^^^^^^^^^^^^^
| 2953 | |
| 2954 | ``fpmath`` metadata may be attached to any instruction of floating point |
| 2955 | type. It can be used to express the maximum acceptable error in the |
| 2956 | result of that instruction, in ULPs, thus potentially allowing the |
| 2957 | compiler to use a more efficient but less accurate method of computing |
| 2958 | it. ULP is defined as follows: |
| 2959 | |
| 2960 | If ``x`` is a real number that lies between two finite consecutive |
| 2961 | floating-point numbers ``a`` and ``b``, without being equal to one |
| 2962 | of them, then ``ulp(x) = |b - a|``, otherwise ``ulp(x)`` is the |
| 2963 | distance between the two non-equal finite floating-point numbers |
| 2964 | nearest ``x``. Moreover, ``ulp(NaN)`` is ``NaN``. |
| 2965 | |
| 2966 | The metadata node shall consist of a single positive floating point |
| 2967 | number representing the maximum relative error, for example: |
| 2968 | |
| 2969 | .. code-block:: llvm |
| 2970 | |
| 2971 | !0 = metadata !{ float 2.5 } ; maximum acceptable inaccuracy is 2.5 ULPs |
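
As a sketch of how the node is attached, the division below is tagged with
``!fpmath``. The operand names ``%a`` and ``%b`` are hypothetical:

.. code-block:: llvm

    %q = fdiv float %a, %b, !fpmath !0
    ...
    !0 = metadata !{ float 2.5 }

With this annotation the compiler may choose a faster division sequence,
provided the result is within 2.5 ULPs of the exact result.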
| 2972 | |
| 2973 | '``range``' Metadata |
| 2974 | ^^^^^^^^^^^^^^^^^^^^ |
| 2975 | |
``range`` metadata may be attached only to ``load``, ``call`` and ``invoke`` of
integer types. It expresses the possible ranges the loaded value or the value
returned by the called function at this call site is in. The ranges are
represented with a flattened list of integers. The loaded value or the value
returned is known to be in the union of the ranges defined by each consecutive
pair. Each pair has the following properties:
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 2982 | |
| 2983 | - The type must match the type loaded by the instruction. |
| 2984 | - The pair ``a,b`` represents the range ``[a,b)``. |
| 2985 | - Both ``a`` and ``b`` are constants. |
| 2986 | - The range is allowed to wrap. |
| 2987 | - The range should not represent the full or empty set. That is, |
| 2988 | ``a!=b``. |
| 2989 | |
| 2990 | In addition, the pairs must be in signed order of the lower bound and |
| 2991 | they must be non-contiguous. |
| 2992 | |
| 2993 | Examples: |
| 2994 | |
.. code-block:: llvm

    %a = load i8* %x, align 1, !range !0 ; Can only be 0 or 1
    %b = load i8* %y, align 1, !range !1 ; Can only be 255 (-1), 0 or 1
    %c = call i8 @foo(), !range !2 ; Can only be 0, 1, 3, 4 or 5
    %d = invoke i8 @bar() to label %cont
           unwind label %lpad, !range !3 ; Can only be -2, -1, 3, 4 or 5
    ...
    !0 = metadata !{ i8 0, i8 2 }
    !1 = metadata !{ i8 255, i8 2 }
    !2 = metadata !{ i8 0, i8 2, i8 3, i8 6 }
    !3 = metadata !{ i8 -2, i8 0, i8 3, i8 6 }
| 3007 | |
'``llvm.loop``'
^^^^^^^^^^^^^^^

It is sometimes useful to attach information to loop constructs. Currently,
loop metadata is implemented as metadata attached to the branch instruction
in the loop latch block. This type of metadata refers to a metadata node that is
guaranteed to be separate for each loop. The loop identifier metadata is
specified with the name ``llvm.loop``.

The loop identifier metadata is implemented using a metadata that refers to
itself to avoid merging it with any other identifier metadata, e.g.,
during module linkage or function inlining. That is, each loop should refer
to its own identification metadata even if it resides in a separate function.
The following example contains loop identifier metadata for two separate loop
constructs:

.. code-block:: llvm

    !0 = metadata !{ metadata !0 }
    !1 = metadata !{ metadata !1 }

The loop identifier metadata can be used to specify additional
per-loop metadata. Any operands after the first operand can be treated
as user-defined metadata. For example, the ``llvm.loop.unroll.count``
metadata suggests an unroll factor to the loop unroller:

.. code-block:: llvm

      br i1 %exitcond, label %._crit_edge, label %.lr.ph, !llvm.loop !0
    ...
    !0 = metadata !{ metadata !0, metadata !1 }
    !1 = metadata !{ metadata !"llvm.loop.unroll.count", i32 4 }
Mark Heffernan | 893752a | 2014-07-18 19:24:51 +0000 | [diff] [blame] | 3040 | |
'``llvm.loop.vectorize``' and '``llvm.loop.interleave``'
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Metadata prefixed with ``llvm.loop.vectorize`` or ``llvm.loop.interleave`` are
used to control per-loop vectorization and interleaving parameters such as
vectorization width and interleave count. These metadata should be used in
conjunction with ``llvm.loop`` loop identification metadata. The
``llvm.loop.vectorize`` and ``llvm.loop.interleave`` metadata are only
optimization hints and the optimizer will only interleave and vectorize loops if
it believes it is safe to do so. The ``llvm.mem.parallel_loop_access`` metadata,
which contains information about loop-carried memory dependencies, can be helpful
in determining the safety of these transformations.
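
A sketch of how these hints combine with the loop identifier: the
self-referential ``llvm.loop`` node lists each hint as an additional
operand. The block and value names are hypothetical, and the width and
count values are arbitrary:

.. code-block:: llvm

    br i1 %exitcond, label %exit, label %loop, !llvm.loop !0
    ...
    !0 = metadata !{ metadata !0, metadata !1, metadata !2 }
    !1 = metadata !{ metadata !"llvm.loop.vectorize.width", i32 4 }
    !2 = metadata !{ metadata !"llvm.loop.interleave.count", i32 2 }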
Mark Heffernan | 893752a | 2014-07-18 19:24:51 +0000 | [diff] [blame] | 3053 | |
'``llvm.loop.interleave.count``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata suggests an interleave count to the loop interleaver.
The first operand is the string ``llvm.loop.interleave.count`` and the
second operand is an integer specifying the interleave count. For
example:

.. code-block:: llvm

    !0 = metadata !{ metadata !"llvm.loop.interleave.count", i32 4 }

Note that setting ``llvm.loop.interleave.count`` to 1 disables interleaving
multiple iterations of the loop. If ``llvm.loop.interleave.count`` is set to 0
then the interleave count will be determined automatically.

'``llvm.loop.vectorize.enable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata selectively enables or disables vectorization for the loop. The
first operand is the string ``llvm.loop.vectorize.enable`` and the second operand
is a bit. If the bit operand value is 1 vectorization is enabled. A value of
0 disables vectorization:

.. code-block:: llvm

    !0 = metadata !{ metadata !"llvm.loop.vectorize.enable", i1 0 }
    !1 = metadata !{ metadata !"llvm.loop.vectorize.enable", i1 1 }
Mark Heffernan | 893752a | 2014-07-18 19:24:51 +0000 | [diff] [blame] | 3082 | |
| 3083 | '``llvm.loop.vectorize.width``' Metadata |
| 3084 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 3085 | |
| 3086 | This metadata sets the target width of the vectorizer. The first |
| 3087 | operand is the string ``llvm.loop.vectorize.width`` and the second |
| 3088 | operand is an integer specifying the width. For example: |
| 3089 | |
| 3090 | .. code-block:: llvm |
| 3091 | |
| 3092 | !0 = metadata !{ metadata !"llvm.loop.vectorize.width", i32 4 } |
| 3093 | |
| 3094 | Note that setting ``llvm.loop.vectorize.width`` to 1 disables |
| 3095 | vectorization of the loop. If ``llvm.loop.vectorize.width`` is set to |
| 3096 | 0 or if the loop does not have this metadata the width will be |
| 3097 | determined automatically. |
| 3098 | |
| 3099 | '``llvm.loop.unroll``' |
| 3100 | ^^^^^^^^^^^^^^^^^^^^^^ |
| 3101 | |
| 3102 | Metadata prefixed with ``llvm.loop.unroll`` are loop unrolling |
| 3103 | optimization hints such as the unroll factor. ``llvm.loop.unroll`` |
| 3104 | metadata should be used in conjunction with ``llvm.loop`` loop |
| 3105 | identification metadata. The ``llvm.loop.unroll`` metadata are only |
| 3106 | optimization hints and the unrolling will only be performed if the |
| 3107 | optimizer believes it is safe to do so. |
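
A hedged sketch of that conjunction: the unroll hint is an operand of the
loop identifier node, and the loop's back-edge branch refers to that node
with ``!llvm.loop``. The block and value names are illustrative assumptions:

.. code-block:: llvm

      for.body:
        ...
        br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
      ...
      !0 = metadata !{ metadata !0, metadata !1 }   ; loop identifier plus one hint
      !1 = metadata !{ metadata !"llvm.loop.unroll.count", i32 4 }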
| 3108 | |
Mark Heffernan | 893752a | 2014-07-18 19:24:51 +0000 | [diff] [blame] | 3109 | '``llvm.loop.unroll.count``' Metadata |
| 3110 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 3111 | |
| 3112 | This metadata suggests an unroll factor to the loop unroller. The |
| 3113 | first operand is the string ``llvm.loop.unroll.count`` and the second |
| 3114 | operand is a positive integer specifying the unroll factor. For |
| 3115 | example: |
| 3116 | |
| 3117 | .. code-block:: llvm |
| 3118 | |
| 3119 | !0 = metadata !{ metadata !"llvm.loop.unroll.count", i32 4 } |
| 3120 | |
| 3121 | If the trip count of the loop is less than the unroll count the loop |
| 3122 | will be partially unrolled. |
| 3123 | |
Mark Heffernan | e6b4ba1 | 2014-07-23 17:31:37 +0000 | [diff] [blame] | 3124 | '``llvm.loop.unroll.disable``' Metadata |
| 3125 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 3126 | |
This metadata disables loop unrolling. The metadata has a single operand
which is the string ``llvm.loop.unroll.disable``. For example:
| 3129 | |
| 3130 | .. code-block:: llvm |
| 3131 | |
| 3132 | !0 = metadata !{ metadata !"llvm.loop.unroll.disable" } |
| 3133 | |
| 3134 | '``llvm.loop.unroll.full``' Metadata |
| 3135 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 3136 | |
This metadata suggests that the loop should be unrolled fully. The
metadata has a single operand which is the string ``llvm.loop.unroll.full``.
For example:
| 3140 | |
| 3141 | .. code-block:: llvm |
| 3142 | |
| 3143 | !0 = metadata !{ metadata !"llvm.loop.unroll.full" } |
Pekka Jaaskelainen | 0d23725 | 2013-02-13 18:08:57 +0000 | [diff] [blame] | 3144 | |
| 3145 | '``llvm.mem``' |
| 3146 | ^^^^^^^^^^^^^^^ |
| 3147 | |
| 3148 | Metadata types used to annotate memory accesses with information helpful |
| 3149 | for optimizations are prefixed with ``llvm.mem``. |
| 3150 | |
| 3151 | '``llvm.mem.parallel_loop_access``' Metadata |
| 3152 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 3153 | |
The ``llvm.mem.parallel_loop_access`` metadata refers to a loop identifier,
or metadata containing a list of loop identifiers for nested loops.
The metadata is attached to memory accessing instructions and denotes that
no loop carried memory dependences exist between it and other instructions
denoted with the same loop identifier.
| 3159 | |
| 3160 | Precisely, given two instructions ``m1`` and ``m2`` that both have the |
| 3161 | ``llvm.mem.parallel_loop_access`` metadata, with ``L1`` and ``L2`` being the |
| 3162 | set of loops associated with that metadata, respectively, then there is no loop |
Pekka Jaaskelainen | a304408 | 2014-06-06 11:21:44 +0000 | [diff] [blame] | 3163 | carried dependence between ``m1`` and ``m2`` for loops in both ``L1`` and |
Pekka Jaaskelainen | 23b222cc | 2014-05-23 11:35:46 +0000 | [diff] [blame] | 3164 | ``L2``. |
| 3165 | |
| 3166 | As a special case, if all memory accessing instructions in a loop have |
| 3167 | ``llvm.mem.parallel_loop_access`` metadata that refers to that loop, then the |
| 3168 | loop has no loop carried memory dependences and is considered to be a parallel |
| 3169 | loop. |
| 3170 | |
Note that if not all memory access instructions have such metadata referring to
the loop, then the loop is not considered trivially parallel, and additional
memory dependence analysis is required to make that determination. This acts
as a fail-safe mechanism: if optimization passes that are unaware of the
parallel semantics insert new memory instructions into the loop body, a loop
that was originally parallel is conservatively treated as sequential.
Pekka Jaaskelainen | 0d23725 | 2013-02-13 18:08:57 +0000 | [diff] [blame] | 3177 | |
| 3178 | Example of a loop that is considered parallel due to its correct use of |
Paul Redmond | 5fdf836 | 2013-05-28 20:00:34 +0000 | [diff] [blame] | 3179 | both ``llvm.loop`` and ``llvm.mem.parallel_loop_access`` |
Pekka Jaaskelainen | 0d23725 | 2013-02-13 18:08:57 +0000 | [diff] [blame] | 3180 | metadata types that refer to the same loop identifier metadata. |
| 3181 | |
| 3182 | .. code-block:: llvm |
| 3183 | |
| 3184 | for.body: |
Paul Redmond | 5fdf836 | 2013-05-28 20:00:34 +0000 | [diff] [blame] | 3185 | ... |
Tobias Grosser | fbe95dc | 2014-03-05 13:36:04 +0000 | [diff] [blame] | 3186 | %val0 = load i32* %arrayidx, !llvm.mem.parallel_loop_access !0 |
Paul Redmond | 5fdf836 | 2013-05-28 20:00:34 +0000 | [diff] [blame] | 3187 | ... |
Tobias Grosser | fbe95dc | 2014-03-05 13:36:04 +0000 | [diff] [blame] | 3188 | store i32 %val0, i32* %arrayidx1, !llvm.mem.parallel_loop_access !0 |
Paul Redmond | 5fdf836 | 2013-05-28 20:00:34 +0000 | [diff] [blame] | 3189 | ... |
| 3190 | br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0 |
Pekka Jaaskelainen | 0d23725 | 2013-02-13 18:08:57 +0000 | [diff] [blame] | 3191 | |
| 3192 | for.end: |
| 3193 | ... |
| 3194 | !0 = metadata !{ metadata !0 } |
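
By contrast, the following sketch illustrates the fail-safe case described
above. The second store lacks the annotation (here it is assumed to have been
inserted by a pass unaware of the parallel semantics, and ``%spill`` is a
hypothetical pointer), so the loop can no longer be treated as trivially
parallel:

.. code-block:: llvm

      for.body:
        ...
        %val0 = load i32* %arrayidx, !llvm.mem.parallel_loop_access !0
        ...
        store i32 %val0, i32* %arrayidx1, !llvm.mem.parallel_loop_access !0
        ...
        store i32 %val0, i32* %spill   ; no parallel_loop_access metadata
        ...
        br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0

      for.end:
      ...
      !0 = metadata !{ metadata !0 }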
| 3195 | |
| 3196 | It is also possible to have nested parallel loops. In that case the |
| 3197 | memory accesses refer to a list of loop identifier metadata nodes instead of |
| 3198 | the loop identifier metadata node directly: |
| 3199 | |
| 3200 | .. code-block:: llvm |
| 3201 | |
| 3202 | outer.for.body: |
Tobias Grosser | fbe95dc | 2014-03-05 13:36:04 +0000 | [diff] [blame] | 3203 | ... |
| 3204 | %val1 = load i32* %arrayidx3, !llvm.mem.parallel_loop_access !2 |
| 3205 | ... |
| 3206 | br label %inner.for.body |
Pekka Jaaskelainen | 0d23725 | 2013-02-13 18:08:57 +0000 | [diff] [blame] | 3207 | |
| 3208 | inner.for.body: |
Paul Redmond | 5fdf836 | 2013-05-28 20:00:34 +0000 | [diff] [blame] | 3209 | ... |
Tobias Grosser | fbe95dc | 2014-03-05 13:36:04 +0000 | [diff] [blame] | 3210 | %val0 = load i32* %arrayidx1, !llvm.mem.parallel_loop_access !0 |
Paul Redmond | 5fdf836 | 2013-05-28 20:00:34 +0000 | [diff] [blame] | 3211 | ... |
Tobias Grosser | fbe95dc | 2014-03-05 13:36:04 +0000 | [diff] [blame] | 3212 | store i32 %val0, i32* %arrayidx2, !llvm.mem.parallel_loop_access !0 |
Paul Redmond | 5fdf836 | 2013-05-28 20:00:34 +0000 | [diff] [blame] | 3213 | ... |
| 3214 | br i1 %exitcond, label %inner.for.end, label %inner.for.body, !llvm.loop !1 |
Pekka Jaaskelainen | 0d23725 | 2013-02-13 18:08:57 +0000 | [diff] [blame] | 3215 | |
| 3216 | inner.for.end: |
Paul Redmond | 5fdf836 | 2013-05-28 20:00:34 +0000 | [diff] [blame] | 3217 | ... |
Tobias Grosser | fbe95dc | 2014-03-05 13:36:04 +0000 | [diff] [blame] | 3218 | store i32 %val1, i32* %arrayidx4, !llvm.mem.parallel_loop_access !2 |
Paul Redmond | 5fdf836 | 2013-05-28 20:00:34 +0000 | [diff] [blame] | 3219 | ... |
| 3220 | br i1 %exitcond, label %outer.for.end, label %outer.for.body, !llvm.loop !2 |
Pekka Jaaskelainen | 0d23725 | 2013-02-13 18:08:57 +0000 | [diff] [blame] | 3221 | |
| 3222 | outer.for.end: ; preds = %for.body |
| 3223 | ... |
Paul Redmond | 5fdf836 | 2013-05-28 20:00:34 +0000 | [diff] [blame] | 3224 | !0 = metadata !{ metadata !1, metadata !2 } ; a list of loop identifiers |
| 3225 | !1 = metadata !{ metadata !1 } ; an identifier for the inner loop |
| 3226 | !2 = metadata !{ metadata !2 } ; an identifier for the outer loop |
Pekka Jaaskelainen | 0d23725 | 2013-02-13 18:08:57 +0000 | [diff] [blame] | 3227 | |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 3228 | Module Flags Metadata |
| 3229 | ===================== |
| 3230 | |
Information about the module as a whole is difficult to convey to LLVM's
subsystems. The LLVM IR isn't sufficient to transmit this information.
The ``llvm.module.flags`` named metadata exists in order to facilitate
this. These flags are in the form of key / value pairs --- much like a
dictionary --- making it easy for any subsystem that cares about a flag to
look it up.
| 3237 | |
| 3238 | The ``llvm.module.flags`` metadata contains a list of metadata triplets. |
| 3239 | Each triplet has the following form: |
| 3240 | |
-  The first element is a *behavior* flag, which specifies the behavior
   when two (or more) modules are merged together and the linker encounters
   two (or more) metadata with the same ID. The supported behaviors are
   described below.
-  The second element is a metadata string that is a unique ID for the
   metadata. Each module may only have one flag entry for each unique ID (not
   including entries with the **Require** behavior).
-  The third element is the value of the flag.
| 3249 | |
When two (or more) modules are merged together, the resulting
``llvm.module.flags`` metadata is the union of the modules' flags. That is, for
each unique metadata ID string, there will be exactly one entry in the merged
module's ``llvm.module.flags`` metadata table, and the value for that entry will
be determined by the merge behavior flag, as described below. The only exception
is that entries with the **Require** behavior are always preserved.
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 3256 | |
| 3257 | The following behaviors are supported: |
| 3258 | |
| 3259 | .. list-table:: |
| 3260 | :header-rows: 1 |
| 3261 | :widths: 10 90 |
| 3262 | |
| 3263 | * - Value |
| 3264 | - Behavior |
| 3265 | |
| 3266 | * - 1 |
| 3267 | - **Error** |
Daniel Dunbar | 25c4b57 | 2013-01-15 01:22:53 +0000 | [diff] [blame] | 3268 | Emits an error if two values disagree, otherwise the resulting value |
| 3269 | is that of the operands. |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 3270 | |
| 3271 | * - 2 |
| 3272 | - **Warning** |
Daniel Dunbar | 25c4b57 | 2013-01-15 01:22:53 +0000 | [diff] [blame] | 3273 | Emits a warning if two values disagree. The result value will be the |
| 3274 | operand for the flag from the first module being linked. |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 3275 | |
| 3276 | * - 3 |
| 3277 | - **Require** |
Daniel Dunbar | 25c4b57 | 2013-01-15 01:22:53 +0000 | [diff] [blame] | 3278 | Adds a requirement that another module flag be present and have a |
| 3279 | specified value after linking is performed. The value must be a |
| 3280 | metadata pair, where the first element of the pair is the ID of the |
| 3281 | module flag to be restricted, and the second element of the pair is |
| 3282 | the value the module flag should be restricted to. This behavior can |
| 3283 | be used to restrict the allowable results (via triggering of an |
| 3284 | error) of linking IDs with the **Override** behavior. |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 3285 | |
| 3286 | * - 4 |
| 3287 | - **Override** |
Daniel Dunbar | 25c4b57 | 2013-01-15 01:22:53 +0000 | [diff] [blame] | 3288 | Uses the specified value, regardless of the behavior or value of the |
| 3289 | other module. If both modules specify **Override**, but the values |
| 3290 | differ, an error will be emitted. |
| 3291 | |
Daniel Dunbar | d77d9fb | 2013-01-16 21:38:56 +0000 | [diff] [blame] | 3292 | * - 5 |
| 3293 | - **Append** |
| 3294 | Appends the two values, which are required to be metadata nodes. |
| 3295 | |
| 3296 | * - 6 |
| 3297 | - **AppendUnique** |
| 3298 | Appends the two values, which are required to be metadata |
| 3299 | nodes. However, duplicate entries in the second list are dropped |
| 3300 | during the append operation. |
| 3301 | |
Daniel Dunbar | 25c4b57 | 2013-01-15 01:22:53 +0000 | [diff] [blame] | 3302 | It is an error for a particular unique flag ID to have multiple behaviors, |
| 3303 | except in the case of **Require** (which adds restrictions on another metadata |
| 3304 | value) or **Override**. |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 3305 | |
| 3306 | An example of module flags: |
| 3307 | |
| 3308 | .. code-block:: llvm |
| 3309 | |
| 3310 | !0 = metadata !{ i32 1, metadata !"foo", i32 1 } |
| 3311 | !1 = metadata !{ i32 4, metadata !"bar", i32 37 } |
| 3312 | !2 = metadata !{ i32 2, metadata !"qux", i32 42 } |
| 3313 | !3 = metadata !{ i32 3, metadata !"qux", |
| 3314 | metadata !{ |
| 3315 | metadata !"foo", i32 1 |
| 3316 | } |
| 3317 | } |
| 3318 | !llvm.module.flags = !{ !0, !1, !2, !3 } |
| 3319 | |
| 3320 | - Metadata ``!0`` has the ID ``!"foo"`` and the value '1'. The behavior |
| 3321 | if two or more ``!"foo"`` flags are seen is to emit an error if their |
| 3322 | values are not equal. |
| 3323 | |
| 3324 | - Metadata ``!1`` has the ID ``!"bar"`` and the value '37'. The |
| 3325 | behavior if two or more ``!"bar"`` flags are seen is to use the value |
Daniel Dunbar | 25c4b57 | 2013-01-15 01:22:53 +0000 | [diff] [blame] | 3326 | '37'. |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 3327 | |
| 3328 | - Metadata ``!2`` has the ID ``!"qux"`` and the value '42'. The |
| 3329 | behavior if two or more ``!"qux"`` flags are seen is to emit a |
| 3330 | warning if their values are not equal. |
| 3331 | |
| 3332 | - Metadata ``!3`` has the ID ``!"qux"`` and the value: |
| 3333 | |
| 3334 | :: |
| 3335 | |
| 3336 | metadata !{ metadata !"foo", i32 1 } |
| 3337 | |
Daniel Dunbar | 25c4b57 | 2013-01-15 01:22:53 +0000 | [diff] [blame] | 3338 | The behavior is to emit an error if the ``llvm.module.flags`` does not |
| 3339 | contain a flag with the ID ``!"foo"`` that has the value '1' after linking is |
| 3340 | performed. |
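
To make the merge rules concrete, here is a sketch of two hypothetical
modules and the flags that would be expected after linking them, following
the behavior table above. The three parts are shown in one listing for
comparison only; they are separate modules, not a single valid file, and the
flag names are illustrative:

.. code-block:: llvm

      ; Module A (linked first)
      !0 = metadata !{ i32 1, metadata !"foo", i32 1 }    ; Error
      !1 = metadata !{ i32 2, metadata !"qux", i32 42 }   ; Warning
      !2 = metadata !{ i32 4, metadata !"bar", i32 37 }   ; Override
      !llvm.module.flags = !{ !0, !1, !2 }

      ; Module B
      !0 = metadata !{ i32 1, metadata !"foo", i32 1 }    ; agrees with A
      !1 = metadata !{ i32 2, metadata !"qux", i32 7 }    ; disagrees with A
      !2 = metadata !{ i32 1, metadata !"bar", i32 5 }    ; overridden by A
      !llvm.module.flags = !{ !0, !1, !2 }

      ; Expected result: one entry per ID. A warning is emitted for "qux"
      ; and the value from the first module is kept.
      !0 = metadata !{ i32 1, metadata !"foo", i32 1 }
      !1 = metadata !{ i32 2, metadata !"qux", i32 42 }
      !2 = metadata !{ i32 4, metadata !"bar", i32 37 }
      !llvm.module.flags = !{ !0, !1, !2 }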
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 3341 | |
| 3342 | Objective-C Garbage Collection Module Flags Metadata |
| 3343 | ---------------------------------------------------- |
| 3344 | |
| 3345 | On the Mach-O platform, Objective-C stores metadata about garbage |
| 3346 | collection in a special section called "image info". The metadata |
| 3347 | consists of a version number and a bitmask specifying what types of |
| 3348 | garbage collection are supported (if any) by the file. If two or more |
| 3349 | modules are linked together their garbage collection metadata needs to |
| 3350 | be merged rather than appended together. |
| 3351 | |
| 3352 | The Objective-C garbage collection module flags metadata consists of the |
| 3353 | following key-value pairs: |
| 3354 | |
| 3355 | .. list-table:: |
| 3356 | :header-rows: 1 |
| 3357 | :widths: 30 70 |
| 3358 | |
| 3359 | * - Key |
| 3360 | - Value |
| 3361 | |
Daniel Dunbar | 1dc66ca | 2013-01-17 18:57:32 +0000 | [diff] [blame] | 3362 | * - ``Objective-C Version`` |
Dmitri Gribenko | e813112 | 2013-01-19 20:34:20 +0000 | [diff] [blame] | 3363 | - **[Required]** --- The Objective-C ABI version. Valid values are 1 and 2. |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 3364 | |
Daniel Dunbar | 1dc66ca | 2013-01-17 18:57:32 +0000 | [diff] [blame] | 3365 | * - ``Objective-C Image Info Version`` |
Dmitri Gribenko | e813112 | 2013-01-19 20:34:20 +0000 | [diff] [blame] | 3366 | - **[Required]** --- The version of the image info section. Currently |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 3367 | always 0. |
| 3368 | |
Daniel Dunbar | 1dc66ca | 2013-01-17 18:57:32 +0000 | [diff] [blame] | 3369 | * - ``Objective-C Image Info Section`` |
Dmitri Gribenko | e813112 | 2013-01-19 20:34:20 +0000 | [diff] [blame] | 3370 | - **[Required]** --- The section to place the metadata. Valid values are |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 3371 | ``"__OBJC, __image_info, regular"`` for Objective-C ABI version 1, and |
| 3372 | ``"__DATA,__objc_imageinfo, regular, no_dead_strip"`` for |
| 3373 | Objective-C ABI version 2. |
| 3374 | |
Daniel Dunbar | 1dc66ca | 2013-01-17 18:57:32 +0000 | [diff] [blame] | 3375 | * - ``Objective-C Garbage Collection`` |
Dmitri Gribenko | e813112 | 2013-01-19 20:34:20 +0000 | [diff] [blame] | 3376 | - **[Required]** --- Specifies whether garbage collection is supported or |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 3377 | not. Valid values are 0, for no garbage collection, and 2, for garbage |
| 3378 | collection supported. |
| 3379 | |
Daniel Dunbar | 1dc66ca | 2013-01-17 18:57:32 +0000 | [diff] [blame] | 3380 | * - ``Objective-C GC Only`` |
Dmitri Gribenko | e813112 | 2013-01-19 20:34:20 +0000 | [diff] [blame] | 3381 | - **[Optional]** --- Specifies that only garbage collection is supported. |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 3382 | If present, its value must be 6. This flag requires that the |
| 3383 | ``Objective-C Garbage Collection`` flag have the value 2. |
| 3384 | |
| 3385 | Some important flag interactions: |
| 3386 | |
| 3387 | - If a module with ``Objective-C Garbage Collection`` set to 0 is |
| 3388 | merged with a module with ``Objective-C Garbage Collection`` set to |
| 3389 | 2, then the resulting module has the |
| 3390 | ``Objective-C Garbage Collection`` flag set to 0. |
| 3391 | - A module with ``Objective-C Garbage Collection`` set to 0 cannot be |
| 3392 | merged with a module with ``Objective-C GC Only`` set to 6. |
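
As an illustration, a module built for Objective-C ABI version 2 without
garbcollection support might carry flags along the following lines. The keys
and values come from the table above; the behavior values chosen here (1 for
**Error**, 4 for **Override**) are an assumption made for this sketch rather
than something this section prescribes:

.. code-block:: llvm

      !0 = metadata !{ i32 1, metadata !"Objective-C Version", i32 2 }
      !1 = metadata !{ i32 1, metadata !"Objective-C Image Info Version", i32 0 }
      !2 = metadata !{ i32 1, metadata !"Objective-C Image Info Section",
                       metadata !"__DATA,__objc_imageinfo, regular, no_dead_strip" }
      !3 = metadata !{ i32 4, metadata !"Objective-C Garbage Collection", i32 0 }
      !llvm.module.flags = !{ !0, !1, !2, !3 }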
| 3393 | |
Daniel Dunbar | 252bedc | 2013-01-17 00:16:27 +0000 | [diff] [blame] | 3394 | Automatic Linker Flags Module Flags Metadata |
| 3395 | -------------------------------------------- |
| 3396 | |
| 3397 | Some targets support embedding flags to the linker inside individual object |
| 3398 | files. Typically this is used in conjunction with language extensions which |
| 3399 | allow source files to explicitly declare the libraries they depend on, and have |
| 3400 | these automatically be transmitted to the linker via object files. |
| 3401 | |
| 3402 | These flags are encoded in the IR using metadata in the module flags section, |
Daniel Dunbar | 1dc66ca | 2013-01-17 18:57:32 +0000 | [diff] [blame] | 3403 | using the ``Linker Options`` key. The merge behavior for this flag is required |
Daniel Dunbar | 252bedc | 2013-01-17 00:16:27 +0000 | [diff] [blame] | 3404 | to be ``AppendUnique``, and the value for the key is expected to be a metadata |
| 3405 | node which should be a list of other metadata nodes, each of which should be a |
| 3406 | list of metadata strings defining linker options. |
| 3407 | |
| 3408 | For example, the following metadata section specifies two separate sets of |
| 3409 | linker options, presumably to link against ``libz`` and the ``Cocoa`` |
| 3410 | framework:: |
| 3411 | |
Michael Liao | a769908 | 2013-03-06 18:24:34 +0000 | [diff] [blame] | 3412 | !0 = metadata !{ i32 6, metadata !"Linker Options", |
Daniel Dunbar | 252bedc | 2013-01-17 00:16:27 +0000 | [diff] [blame] | 3413 | metadata !{ |
Daniel Dunbar | 9585612 | 2013-01-18 19:37:00 +0000 | [diff] [blame] | 3414 | metadata !{ metadata !"-lz" }, |
| 3415 | metadata !{ metadata !"-framework", metadata !"Cocoa" } } } |
Daniel Dunbar | 252bedc | 2013-01-17 00:16:27 +0000 | [diff] [blame] | 3416 | !llvm.module.flags = !{ !0 } |
| 3417 | |
| 3418 | The metadata encoding as lists of lists of options, as opposed to a collapsed |
| 3419 | list of options, is chosen so that the IR encoding can use multiple option |
| 3420 | strings to specify e.g., a single library, while still having that specifier be |
| 3421 | preserved as an atomic element that can be recognized by a target specific |
| 3422 | assembly writer or object file emitter. |
| 3423 | |
| 3424 | Each individual option is required to be either a valid option for the target's |
| 3425 | linker, or an option that is reserved by the target specific assembly writer or |
| 3426 | object file emitter. No other aspect of these options is defined by the IR. |
| 3427 | |
Oliver Stannard | 5dc2934 | 2014-06-20 10:08:11 +0000 | [diff] [blame] | 3428 | C type width Module Flags Metadata |
| 3429 | ---------------------------------- |
| 3430 | |
| 3431 | The ARM backend emits a section into each generated object file describing the |
| 3432 | options that it was compiled with (in a compiler-independent way) to prevent |
| 3433 | linking incompatible objects, and to allow automatic library selection. Some |
| 3434 | of these options are not visible at the IR level, namely wchar_t width and enum |
| 3435 | width. |
| 3436 | |
| 3437 | To pass this information to the backend, these options are encoded in module |
| 3438 | flags metadata, using the following key-value pairs: |
| 3439 | |
| 3440 | .. list-table:: |
| 3441 | :header-rows: 1 |
| 3442 | :widths: 30 70 |
| 3443 | |
| 3444 | * - Key |
| 3445 | - Value |
| 3446 | |
| 3447 | * - short_wchar |
| 3448 | - * 0 --- sizeof(wchar_t) == 4 |
| 3449 | * 1 --- sizeof(wchar_t) == 2 |
| 3450 | |
| 3451 | * - short_enum |
| 3452 | - * 0 --- Enums are at least as large as an ``int``. |
| 3453 | * 1 --- Enums are stored in the smallest integer type which can |
| 3454 | represent all of its values. |
| 3455 | |
For example, the following metadata section specifies that the module was
compiled with a ``wchar_t`` width of 4 bytes, and the underlying type of an
enum is the smallest type which can represent all of its values::

    !llvm.module.flags = !{!0, !1}
    !0 = metadata !{i32 1, metadata !"short_wchar", i32 0}
    !1 = metadata !{i32 1, metadata !"short_enum", i32 1}
| 3463 | |
Eli Bendersky | 0220e6b | 2013-06-07 20:24:43 +0000 | [diff] [blame] | 3464 | .. _intrinsicglobalvariables: |
| 3465 | |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 3466 | Intrinsic Global Variables |
| 3467 | ========================== |
| 3468 | |
| 3469 | LLVM has a number of "magic" global variables that contain data that |
| 3470 | affect code generation or other IR semantics. These are documented here. |
| 3471 | All globals of this sort should have a section specified as |
| 3472 | "``llvm.metadata``". This section and all globals that start with |
| 3473 | "``llvm.``" are reserved for use by LLVM. |
| 3474 | |
Eli Bendersky | 0220e6b | 2013-06-07 20:24:43 +0000 | [diff] [blame] | 3475 | .. _gv_llvmused: |
| 3476 | |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 3477 | The '``llvm.used``' Global Variable |
| 3478 | ----------------------------------- |
| 3479 | |
Rafael Espindola | 74f2e46 | 2013-04-22 14:58:02 +0000 | [diff] [blame] | 3480 | The ``@llvm.used`` global is an array which has |
Paul Redmond | 219ef81 | 2013-05-30 17:24:32 +0000 | [diff] [blame] | 3481 | :ref:`appending linkage <linkage_appending>`. This array contains a list of |
Rafael Espindola | 70a729d | 2013-06-11 13:18:13 +0000 | [diff] [blame] | 3482 | pointers to named global variables, functions and aliases which may optionally |
| 3483 | have a pointer cast formed of bitcast or getelementptr. For example, a legal |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 3484 | use of it is: |
| 3485 | |
| 3486 | .. code-block:: llvm |
| 3487 | |
| 3488 | @X = global i8 4 |
| 3489 | @Y = global i32 123 |
| 3490 | |
| 3491 | @llvm.used = appending global [2 x i8*] [ |
| 3492 | i8* @X, |
| 3493 | i8* bitcast (i32* @Y to i8*) |
| 3494 | ], section "llvm.metadata" |
| 3495 | |
If a symbol appears in the ``@llvm.used`` list, then the compiler, assembler,
and linker are required to treat the symbol as if there is a reference to the
symbol that they cannot see (which is why the symbols have to be named). For
example, if a variable has internal linkage and no references other than that
from the ``@llvm.used`` list, it cannot be deleted. This is commonly used to
represent references from inline asms and other things the compiler cannot
"see", and corresponds to "``__attribute__((used))``" in GNU C.
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 3503 | |
On some targets, the code generator must emit a directive to the
assembler or object file to prevent the assembler and linker from
altering or discarding the symbol.
| 3507 | |
Eli Bendersky | 0220e6b | 2013-06-07 20:24:43 +0000 | [diff] [blame] | 3508 | .. _gv_llvmcompilerused: |
| 3509 | |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 3510 | The '``llvm.compiler.used``' Global Variable |
| 3511 | -------------------------------------------- |
| 3512 | |
| 3513 | The ``@llvm.compiler.used`` directive is the same as the ``@llvm.used`` |
| 3514 | directive, except that it only prevents the compiler from touching the |
| 3515 | symbol. On targets that support it, this allows an intelligent linker to |
| 3516 | optimize references to the symbol without being impeded as it would be |
| 3517 | by ``@llvm.used``. |
| 3518 | |
This construct should only be used in exceptional circumstances, and
should not be exposed to source languages.
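
A minimal sketch, mirroring the ``@llvm.used`` example above; the global
``@Z`` is a hypothetical name. The compiler must keep ``@Z`` even though it
has internal linkage and no other references, while the linker remains free
to optimize it:

.. code-block:: llvm

      @Z = internal global i32 7

      @llvm.compiler.used = appending global [1 x i8*] [
         i8* bitcast (i32* @Z to i8*)
      ], section "llvm.metadata"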
| 3521 | |
Eli Bendersky | 0220e6b | 2013-06-07 20:24:43 +0000 | [diff] [blame] | 3522 | .. _gv_llvmglobalctors: |
| 3523 | |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 3524 | The '``llvm.global_ctors``' Global Variable |
| 3525 | ------------------------------------------- |
| 3526 | |
| 3527 | .. code-block:: llvm |
| 3528 | |
Reid Kleckner | fceb76f | 2014-05-16 20:39:27 +0000 | [diff] [blame] | 3529 | %0 = type { i32, void ()*, i8* } |
| 3530 | @llvm.global_ctors = appending global [1 x %0] [%0 { i32 65535, void ()* @ctor, i8* @data }] |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 3531 | |
| 3532 | The ``@llvm.global_ctors`` array contains a list of constructor |
Reid Kleckner | fceb76f | 2014-05-16 20:39:27 +0000 | [diff] [blame] | 3533 | functions, priorities, and an optional associated global or function. |
| 3534 | The functions referenced by this array will be called in ascending order |
| 3535 | of priority (i.e. lowest first) when the module is loaded. The order of |
| 3536 | functions with the same priority is not defined. |
| 3537 | |
| 3538 | If the third field is present, non-null, and points to a global variable |
| 3539 | or function, the initializer function will only run if the associated |
| 3540 | data from the current module is not discarded. |
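
The ordering rule can be sketched with two constructors of different
priority. The function names ``@init_early`` and ``@init_late`` are
hypothetical, and the third field is null so neither entry is tied to
associated data. With ascending priority order, ``@init_early`` is expected
to run before ``@init_late``:

.. code-block:: llvm

      %0 = type { i32, void ()*, i8* }

      declare void @init_early()
      declare void @init_late()

      @llvm.global_ctors = appending global [2 x %0] [
        %0 { i32 101,   void ()* @init_early, i8* null },
        %0 { i32 65535, void ()* @init_late,  i8* null }
      ]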
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 3541 | |
Eli Bendersky | 0220e6b | 2013-06-07 20:24:43 +0000 | [diff] [blame] | 3542 | .. _llvmglobaldtors: |
| 3543 | |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 3544 | The '``llvm.global_dtors``' Global Variable |
| 3545 | ------------------------------------------- |
| 3546 | |
| 3547 | .. code-block:: llvm |
| 3548 | |
Reid Kleckner | fceb76f | 2014-05-16 20:39:27 +0000 | [diff] [blame] | 3549 | %0 = type { i32, void ()*, i8* } |
| 3550 | @llvm.global_dtors = appending global [1 x %0] [%0 { i32 65535, void ()* @dtor, i8* @data }] |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 3551 | |
Reid Kleckner | fceb76f | 2014-05-16 20:39:27 +0000 | [diff] [blame] | 3552 | The ``@llvm.global_dtors`` array contains a list of destructor |
| 3553 | functions, priorities, and an optional associated global or function. |
| 3554 | The functions referenced by this array will be called in descending |
Reid Kleckner | bffbcc5 | 2014-05-27 21:35:17 +0000 | [diff] [blame] | 3555 | order of priority (i.e. highest first) when the module is unloaded. The |
Reid Kleckner | fceb76f | 2014-05-16 20:39:27 +0000 | [diff] [blame] | 3556 | order of functions with the same priority is not defined. |
| 3557 | |
| 3558 | If the third field is present, non-null, and points to a global variable |
| 3559 | or function, the destructor function will only run if the associated |
| 3560 | data from the current module is not discarded. |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 3561 | |
| 3562 | Instruction Reference |
| 3563 | ===================== |
| 3564 | |
| 3565 | The LLVM instruction set consists of several different classifications |
| 3566 | of instructions: :ref:`terminator instructions <terminators>`, :ref:`binary |
| 3567 | instructions <binaryops>`, :ref:`bitwise binary |
| 3568 | instructions <bitwiseops>`, :ref:`memory instructions <memoryops>`, and |
| 3569 | :ref:`other instructions <otherops>`. |
| 3570 | |
| 3571 | .. _terminators: |
| 3572 | |
| 3573 | Terminator Instructions |
| 3574 | ----------------------- |
| 3575 | |
| 3576 | As mentioned :ref:`previously <functionstructure>`, every basic block in a |
| 3577 | program ends with a "Terminator" instruction, which indicates which |
| 3578 | block should be executed after the current block is finished. These |
| 3579 | terminator instructions typically yield a '``void``' value: they produce |
| 3580 | control flow, not values (the one exception being the |
| 3581 | ':ref:`invoke <i_invoke>`' instruction). |
| 3582 | |
| 3583 | The terminator instructions are: ':ref:`ret <i_ret>`', |
| 3584 | ':ref:`br <i_br>`', ':ref:`switch <i_switch>`', |
| 3585 | ':ref:`indirectbr <i_indirectbr>`', ':ref:`invoke <i_invoke>`', |
| 3586 | ':ref:`resume <i_resume>`', and ':ref:`unreachable <i_unreachable>`'. |
| 3587 | |
| 3588 | .. _i_ret: |
| 3589 | |
| 3590 | '``ret``' Instruction |
| 3591 | ^^^^^^^^^^^^^^^^^^^^^ |
| 3592 | |
| 3593 | Syntax: |
| 3594 | """"""" |
| 3595 | |
| 3596 | :: |
| 3597 | |
| 3598 | ret <type> <value> ; Return a value from a non-void function |
| 3599 | ret void ; Return from void function |
| 3600 | |
| 3601 | Overview: |
| 3602 | """"""""" |
| 3603 | |
| 3604 | The '``ret``' instruction is used to return control flow (and optionally |
| 3605 | a value) from a function back to the caller. |
| 3606 | |
There are two forms of the '``ret``' instruction: one that returns a
value and then transfers control flow, and one that only transfers
control flow.
| 3610 | |
| 3611 | Arguments: |
| 3612 | """""""""" |
| 3613 | |
| 3614 | The '``ret``' instruction optionally accepts a single argument, the |
| 3615 | return value. The type of the return value must be a ':ref:`first |
| 3616 | class <t_firstclass>`' type. |
| 3617 | |
A function is not :ref:`well formed <wellformed>` if it has a non-void
return type and contains a '``ret``' instruction with no return value or
a return value with a type that does not match its type, or if it has a
void return type and contains a '``ret``' instruction with a return
value.
| 3623 | |
| 3624 | Semantics: |
| 3625 | """""""""" |
| 3626 | |
When the '``ret``' instruction is executed, control flow returns back to
the calling function's context. If the function was called by a
":ref:`call <i_call>`" instruction, execution continues at the
instruction after the call. If it was called by an
":ref:`invoke <i_invoke>`" instruction, execution continues at the
beginning of the "normal" destination block. If the instruction returns
a value, that value becomes the result of the call or invoke
instruction.
| 3635 | |
| 3636 | Example: |
| 3637 | """""""" |
| 3638 | |
| 3639 | .. code-block:: llvm |
| 3640 | |
| 3641 | ret i32 5 ; Return an integer value of 5 |
| 3642 | ret void ; Return from a void function |
| 3643 | ret { i32, i8 } { i32 4, i8 2 } ; Return a struct of values 4 and 2 |
| 3644 | |
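As a minimal sketch of both forms in context, the hypothetical functions
``@clamp_at_zero`` and ``@do_nothing`` below (the names are illustrative
assumptions, not part of LLVM) show a non-void function in which every
'``ret``' carries a value of the declared return type, and a void
function that uses '``ret void``':

.. code-block:: llvm

    ; Hypothetical example: returns 0 for negative inputs, %x otherwise.
    define i32 @clamp_at_zero(i32 %x) {
    entry:
      %isneg = icmp slt i32 %x, 0
      br i1 %isneg, label %negative, label %done
    negative:
      ret i32 0                ; the value type matches the i32 return type
    done:
      ret i32 %x
    }

    ; Hypothetical example: a void function returns no value.
    define void @do_nothing() {
      ret void
    }
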
| 3645 | .. _i_br: |
| 3646 | |
| 3647 | '``br``' Instruction |
| 3648 | ^^^^^^^^^^^^^^^^^^^^ |
| 3649 | |
| 3650 | Syntax: |
| 3651 | """"""" |
| 3652 | |
| 3653 | :: |
| 3654 | |
| 3655 | br i1 <cond>, label <iftrue>, label <iffalse> |
| 3656 | br label <dest> ; Unconditional branch |
| 3657 | |
| 3658 | Overview: |
| 3659 | """"""""" |
| 3660 | |
| 3661 | The '``br``' instruction is used to cause control flow to transfer to a |
| 3662 | different basic block in the current function. There are two forms of |
| 3663 | this instruction, corresponding to a conditional branch and an |
| 3664 | unconditional branch. |
| 3665 | |
| 3666 | Arguments: |
| 3667 | """""""""" |
| 3668 | |
| 3669 | The conditional branch form of the '``br``' instruction takes a single |
| 3670 | '``i1``' value and two '``label``' values. The unconditional form of the |
| 3671 | '``br``' instruction takes a single '``label``' value as a target. |
| 3672 | |
| 3673 | Semantics: |
| 3674 | """""""""" |
| 3675 | |
Upon execution of a conditional '``br``' instruction, the '``i1``'
argument '``cond``' is evaluated. If the value is ``true``, control flows
to the '``iftrue``' ``label`` argument. If the value is ``false``,
control flows to the '``iffalse``' ``label`` argument. An unconditional
'``br``' instruction always transfers control to the '``dest``'
``label`` argument.
| 3680 | |
| 3681 | Example: |
| 3682 | """""""" |
| 3683 | |
| 3684 | .. code-block:: llvm |
| 3685 | |
| 3686 | Test: |
| 3687 | %cond = icmp eq i32 %a, %b |
| 3688 | br i1 %cond, label %IfEqual, label %IfUnequal |
| 3689 | IfEqual: |
| 3690 | ret i32 1 |
| 3691 | IfUnequal: |
| 3692 | ret i32 0 |
| 3693 | |
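The two forms are often combined to build loops. The sketch below is a
hypothetical function (``@sum_below`` is an assumed name) that adds the
integers ``0`` to ``%n - 1``, assuming ``%n`` is at least 1. It enters
the loop with an unconditional branch and leaves it with a conditional
one:

.. code-block:: llvm

    ; Hypothetical example: sum of 0 .. %n-1, assuming %n >= 1.
    define i32 @sum_below(i32 %n) {
    entry:
      br label %loop                           ; unconditional form
    loop:
      %i = phi i32 [ 0, %entry ], [ %i.next, %loop ]
      %acc = phi i32 [ 0, %entry ], [ %acc.next, %loop ]
      %acc.next = add i32 %acc, %i
      %i.next = add i32 %i, 1
      %done = icmp eq i32 %i.next, %n
      br i1 %done, label %exit, label %loop    ; conditional form
    exit:
      ret i32 %acc.next
    }
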
| 3694 | .. _i_switch: |
| 3695 | |
| 3696 | '``switch``' Instruction |
| 3697 | ^^^^^^^^^^^^^^^^^^^^^^^^ |
| 3698 | |
| 3699 | Syntax: |
| 3700 | """"""" |
| 3701 | |
| 3702 | :: |
| 3703 | |
| 3704 | switch <intty> <value>, label <defaultdest> [ <intty> <val>, label <dest> ... ] |
| 3705 | |
| 3706 | Overview: |
| 3707 | """"""""" |
| 3708 | |
| 3709 | The '``switch``' instruction is used to transfer control flow to one of |
| 3710 | several different places. It is a generalization of the '``br``' |
| 3711 | instruction, allowing a branch to occur to one of many possible |
| 3712 | destinations. |
| 3713 | |
| 3714 | Arguments: |
| 3715 | """""""""" |
| 3716 | |
| 3717 | The '``switch``' instruction uses three parameters: an integer |
| 3718 | comparison value '``value``', a default '``label``' destination, and an |
| 3719 | array of pairs of comparison value constants and '``label``'s. The table |
| 3720 | is not allowed to contain duplicate constant entries. |
| 3721 | |
| 3722 | Semantics: |
| 3723 | """""""""" |
| 3724 | |
| 3725 | The ``switch`` instruction specifies a table of values and destinations. |
| 3726 | When the '``switch``' instruction is executed, this table is searched |
| 3727 | for the given value. If the value is found, control flow is transferred |
| 3728 | to the corresponding destination; otherwise, control flow is transferred |
| 3729 | to the default destination. |
| 3730 | |
| 3731 | Implementation: |
| 3732 | """"""""""""""" |
| 3733 | |
| 3734 | Depending on properties of the target machine and the particular |
| 3735 | ``switch`` instruction, this instruction may be code generated in |
| 3736 | different ways. For example, it could be generated as a series of |
| 3737 | chained conditional branches or with a lookup table. |
| 3738 | |
| 3739 | Example: |
| 3740 | """""""" |
| 3741 | |
| 3742 | .. code-block:: llvm |
| 3743 | |
| 3744 | ; Emulate a conditional br instruction |
| 3745 | %Val = zext i1 %value to i32 |
| 3746 | switch i32 %Val, label %truedest [ i32 0, label %falsedest ] |
| 3747 | |
| 3748 | ; Emulate an unconditional br instruction |
| 3749 | switch i32 0, label %dest [ ] |
| 3750 | |
| 3751 | ; Implement a jump table: |
| 3752 | switch i32 %val, label %otherwise [ i32 0, label %onzero |
| 3753 | i32 1, label %onone |
| 3754 | i32 2, label %ontwo ] |
| 3755 | |
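For a self-contained view of the table lookup, the following sketch uses
a hypothetical function ``@classify`` (the function name, labels, and
return values are assumptions made for illustration). Values 0 and 1
select their own destinations, and every other value falls through to
the default destination:

.. code-block:: llvm

    ; Hypothetical example: map 0 -> 10, 1 -> 11, anything else -> -1.
    define i32 @classify(i32 %val) {
    entry:
      switch i32 %val, label %otherwise [ i32 0, label %onzero
                                          i32 1, label %onone ]
    onzero:
      ret i32 10
    onone:
      ret i32 11
    otherwise:                     ; default destination
      ret i32 -1
    }
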
| 3756 | .. _i_indirectbr: |
| 3757 | |
| 3758 | '``indirectbr``' Instruction |
| 3759 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 3760 | |
| 3761 | Syntax: |
| 3762 | """"""" |
| 3763 | |
| 3764 | :: |
| 3765 | |
| 3766 | indirectbr <somety>* <address>, [ label <dest1>, label <dest2>, ... ] |
| 3767 | |
| 3768 | Overview: |
| 3769 | """"""""" |
| 3770 | |
The '``indirectbr``' instruction implements an indirect branch to a
label within the current function, whose address is specified by
"``address``". The address must be derived from a
:ref:`blockaddress <blockaddress>` constant.
| 3775 | |
| 3776 | Arguments: |
| 3777 | """""""""" |
| 3778 | |
| 3779 | The '``address``' argument is the address of the label to jump to. The |
| 3780 | rest of the arguments indicate the full set of possible destinations |
| 3781 | that the address may point to. Blocks are allowed to occur multiple |
| 3782 | times in the destination list, though this isn't particularly useful. |
| 3783 | |
| 3784 | This destination list is required so that dataflow analysis has an |
| 3785 | accurate understanding of the CFG. |
| 3786 | |
| 3787 | Semantics: |
| 3788 | """""""""" |
| 3789 | |
| 3790 | Control transfers to the block specified in the address argument. All |
| 3791 | possible destination blocks must be listed in the label list, otherwise |
| 3792 | this instruction has undefined behavior. This implies that jumps to |
| 3793 | labels defined in other functions have undefined behavior as well. |
| 3794 | |
| 3795 | Implementation: |
| 3796 | """"""""""""""" |
| 3797 | |
| 3798 | This is typically implemented with a jump through a register. |
| 3799 | |
| 3800 | Example: |
| 3801 | """""""" |
| 3802 | |
| 3803 | .. code-block:: llvm |
| 3804 | |
| 3805 | indirectbr i8* %Addr, [ label %bb1, label %bb2, label %bb3 ] |
| 3806 | |
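The sketch below shows one way the address operand might be produced
from ``blockaddress`` constants. The function ``@dispatch`` and its
labels are hypothetical, and the pointer syntax follows this revision of
the manual. Both blocks whose address can reach the '``indirectbr``' are
named in its destination list:

.. code-block:: llvm

    ; Hypothetical example: pick one of two block addresses, then jump to it.
    define i32 @dispatch(i1 %flag) {
    entry:
      %addr = select i1 %flag, i8* blockaddress(@dispatch, %bb1),
                               i8* blockaddress(@dispatch, %bb2)
      indirectbr i8* %addr, [ label %bb1, label %bb2 ]
    bb1:
      ret i32 1
    bb2:
      ret i32 2
    }
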
| 3807 | .. _i_invoke: |
| 3808 | |
| 3809 | '``invoke``' Instruction |
| 3810 | ^^^^^^^^^^^^^^^^^^^^^^^^ |
| 3811 | |
| 3812 | Syntax: |
| 3813 | """"""" |
| 3814 | |
| 3815 | :: |
| 3816 | |
| 3817 | <result> = invoke [cconv] [ret attrs] <ptr to function ty> <function ptr val>(<function args>) [fn attrs] |
| 3818 | to label <normal label> unwind label <exception label> |
| 3819 | |
| 3820 | Overview: |
| 3821 | """"""""" |
| 3822 | |
| 3823 | The '``invoke``' instruction causes control to transfer to a specified |
| 3824 | function, with the possibility of control flow transfer to either the |
| 3825 | '``normal``' label or the '``exception``' label. If the callee function |
| 3826 | returns with the "``ret``" instruction, control flow will return to the |
| 3827 | "normal" label. If the callee (or any indirect callees) returns via the |
| 3828 | ":ref:`resume <i_resume>`" instruction or other exception handling |
| 3829 | mechanism, control is interrupted and continued at the dynamically |
| 3830 | nearest "exception" label. |
| 3831 | |
The '``exception``' label is a `landing
pad <ExceptionHandling.html#overview>`_ for the exception. As such, the
'``exception``' label is required to have the
":ref:`landingpad <i_landingpad>`" instruction, which contains the
information about the behavior of the program after unwinding happens,
as its first non-PHI instruction. The restrictions on the
"``landingpad``" instruction tightly couple it to the "``invoke``"
instruction, so that the important information contained within the
"``landingpad``" instruction can't be lost through normal code motion.
| 3841 | |
| 3842 | Arguments: |
| 3843 | """""""""" |
| 3844 | |
| 3845 | This instruction requires several arguments: |
| 3846 | |
| 3847 | #. The optional "cconv" marker indicates which :ref:`calling |
| 3848 | convention <callingconv>` the call should use. If none is |
| 3849 | specified, the call defaults to using C calling conventions. |
| 3850 | #. The optional :ref:`Parameter Attributes <paramattrs>` list for return |
| 3851 | values. Only '``zeroext``', '``signext``', and '``inreg``' attributes |
| 3852 | are valid here. |
| 3853 | #. '``ptr to function ty``': shall be the signature of the pointer to |
| 3854 | function value being invoked. In most cases, this is a direct |
| 3855 | function invocation, but indirect ``invoke``'s are just as possible, |
| 3856 | branching off an arbitrary pointer to function value. |
| 3857 | #. '``function ptr val``': An LLVM value containing a pointer to a |
| 3858 | function to be invoked. |
| 3859 | #. '``function args``': argument list whose types match the function |
| 3860 | signature argument types and parameter attributes. All arguments must |
| 3861 | be of :ref:`first class <t_firstclass>` type. If the function signature |
| 3862 | indicates the function accepts a variable number of arguments, the |
| 3863 | extra arguments can be specified. |
| 3864 | #. '``normal label``': the label reached when the called function |
| 3865 | executes a '``ret``' instruction. |
| 3866 | #. '``exception label``': the label reached when a callee returns via |
| 3867 | the :ref:`resume <i_resume>` instruction or other exception handling |
| 3868 | mechanism. |
| 3869 | #. The optional :ref:`function attributes <fnattrs>` list. Only |
| 3870 | '``noreturn``', '``nounwind``', '``readonly``' and '``readnone``' |
| 3871 | attributes are valid here. |
| 3872 | |
| 3873 | Semantics: |
| 3874 | """""""""" |
| 3875 | |
| 3876 | This instruction is designed to operate as a standard '``call``' |
| 3877 | instruction in most regards. The primary difference is that it |
| 3878 | establishes an association with a label, which is used by the runtime |
| 3879 | library to unwind the stack. |
| 3880 | |
| 3881 | This instruction is used in languages with destructors to ensure that |
| 3882 | proper cleanup is performed in the case of either a ``longjmp`` or a |
| 3883 | thrown exception. Additionally, this is important for implementation of |
| 3884 | '``catch``' clauses in high-level languages that support them. |
| 3885 | |
| 3886 | For the purposes of the SSA form, the definition of the value returned |
| 3887 | by the '``invoke``' instruction is deemed to occur on the edge from the |
| 3888 | current block to the "normal" label. If the callee unwinds then no |
| 3889 | return value is available. |
| 3890 | |
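The SSA rule above can be sketched as follows. The functions
``@compute`` and ``@safe_compute`` are hypothetical, the personality
routine ``@__gxx_personality_v0`` is assumed to be the one supplied by
an Itanium C++ ABI runtime, and the '``landingpad``' syntax follows this
revision of the manual. The result ``%r`` is used only in the "normal"
destination, because it is not defined on the unwind edge:

.. code-block:: llvm

    declare i32 @compute(i32)                  ; hypothetical callee that may unwind
    declare i32 @__gxx_personality_v0(...)     ; assumed personality routine

    define i32 @safe_compute(i32 %x) {
    entry:
      %r = invoke i32 @compute(i32 %x)
              to label %ok unwind label %lpad
    ok:
      ret i32 %r                               ; %r is available here only
    lpad:
      %exn = landingpad { i8*, i32 }
              personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*)
              catch i8* null
      ret i32 -1                               ; %r must not be used on this path
    }

This illustrates the IR-level rules only. A real front end would also
emit the runtime calls that its exception handling ABI requires in the
landing pad.
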
| 3891 | Example: |
| 3892 | """""""" |
| 3893 | |
| 3894 | .. code-block:: llvm |
| 3895 | |
    %retval = invoke i32 @Test(i32 15) to label %Continue
                unwind label %TestCleanup              ; i32:retval set
    %retval = invoke coldcc i32 %Testfnptr(i32 15) to label %Continue
                unwind label %TestCleanup              ; i32:retval set

| 3901 | .. _i_resume: |
| 3902 | |
| 3903 | '``resume``' Instruction |
| 3904 | ^^^^^^^^^^^^^^^^^^^^^^^^ |
| 3905 | |
| 3906 | Syntax: |
| 3907 | """"""" |
| 3908 | |
| 3909 | :: |
| 3910 | |
| 3911 | resume <type> <value> |
| 3912 | |
| 3913 | Overview: |
| 3914 | """"""""" |
| 3915 | |
| 3916 | The '``resume``' instruction is a terminator instruction that has no |
| 3917 | successors. |
| 3918 | |
| 3919 | Arguments: |
| 3920 | """""""""" |
| 3921 | |
| 3922 | The '``resume``' instruction requires one argument, which must have the |
| 3923 | same type as the result of any '``landingpad``' instruction in the same |
| 3924 | function. |
| 3925 | |
| 3926 | Semantics: |
| 3927 | """""""""" |
| 3928 | |
| 3929 | The '``resume``' instruction resumes propagation of an existing |
| 3930 | (in-flight) exception whose unwinding was interrupted with a |
| 3931 | :ref:`landingpad <i_landingpad>` instruction. |
| 3932 | |
| 3933 | Example: |
| 3934 | """""""" |
| 3935 | |
| 3936 | .. code-block:: llvm |
| 3937 | |
| 3938 | resume { i8*, i32 } %exn |
| 3939 | |
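A typical use is a cleanup-only landing pad that runs some cleanup code
and then lets the exception continue to propagate. The sketch below
assumes hypothetical functions ``@may_throw`` and ``@release``, and an
Itanium C++ ABI personality routine. The operand of '``resume``' has the
same type as the result of the '``landingpad``' instruction:

.. code-block:: llvm

    declare void @may_throw()                  ; hypothetical
    declare void @release()                    ; hypothetical cleanup routine
    declare i32 @__gxx_personality_v0(...)     ; assumed personality routine

    define void @with_cleanup() {
    entry:
      invoke void @may_throw()
              to label %cont unwind label %lpad
    cont:
      call void @release()
      ret void
    lpad:
      %exn = landingpad { i8*, i32 }
              personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*)
              cleanup
      call void @release()
      resume { i8*, i32 } %exn                 ; continue unwinding
    }
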
| 3940 | .. _i_unreachable: |
| 3941 | |
| 3942 | '``unreachable``' Instruction |
| 3943 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 3944 | |
| 3945 | Syntax: |
| 3946 | """"""" |
| 3947 | |
| 3948 | :: |
| 3949 | |
| 3950 | unreachable |
| 3951 | |
| 3952 | Overview: |
| 3953 | """"""""" |
| 3954 | |
The '``unreachable``' instruction has no defined semantics. This
instruction is used to inform the optimizer that a particular portion of
the code is not reachable. This can be used, for example, to indicate
that the code after a no-return function cannot be reached.
| 3959 | |
| 3960 | Semantics: |
| 3961 | """""""""" |
| 3962 | |
| 3963 | The '``unreachable``' instruction has no defined semantics. |
| 3964 | |
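The no-return case mentioned in the overview might look like the sketch
below. The function ``@checked`` is hypothetical, and ``@abort`` is
assumed to be declared by the C library and never to return:

.. code-block:: llvm

    declare void @abort() noreturn             ; assumed never to return

    ; Hypothetical example: abort on negative input, otherwise return %x.
    define i32 @checked(i32 %x) {
    entry:
      %bad = icmp slt i32 %x, 0
      br i1 %bad, label %fail, label %ok
    fail:
      call void @abort() noreturn
      unreachable                              ; control never gets past the call
    ok:
      ret i32 %x
    }

Every basic block must end in a terminator, so '``unreachable``' closes
the ``%fail`` block without inventing a return value.
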
| 3965 | .. _binaryops: |
| 3966 | |
| 3967 | Binary Operations |
| 3968 | ----------------- |
| 3969 | |
| 3970 | Binary operators are used to do most of the computation in a program. |
| 3971 | They require two operands of the same type, execute an operation on |
| 3972 | them, and produce a single value. The operands might represent multiple |
| 3973 | data, as is the case with the :ref:`vector <t_vector>` data type. The |
| 3974 | result value has the same type as its operands. |
| 3975 | |
| 3976 | There are several different binary operators: |
| 3977 | |
| 3978 | .. _i_add: |
| 3979 | |
| 3980 | '``add``' Instruction |
| 3981 | ^^^^^^^^^^^^^^^^^^^^^ |
| 3982 | |
| 3983 | Syntax: |
| 3984 | """"""" |
| 3985 | |
| 3986 | :: |
| 3987 | |
    <result> = add <ty> <op1>, <op2>          ; yields ty:result
    <result> = add nuw <ty> <op1>, <op2>      ; yields ty:result
    <result> = add nsw <ty> <op1>, <op2>      ; yields ty:result
    <result> = add nuw nsw <ty> <op1>, <op2>  ; yields ty:result

| 3993 | Overview: |
| 3994 | """"""""" |
| 3995 | |
| 3996 | The '``add``' instruction returns the sum of its two operands. |
| 3997 | |
| 3998 | Arguments: |
| 3999 | """""""""" |
| 4000 | |
| 4001 | The two arguments to the '``add``' instruction must be |
| 4002 | :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both |
| 4003 | arguments must have identical types. |
| 4004 | |
| 4005 | Semantics: |
| 4006 | """""""""" |
| 4007 | |
| 4008 | The value produced is the integer sum of the two operands. |
| 4009 | |
| 4010 | If the sum has unsigned overflow, the result returned is the |
| 4011 | mathematical result modulo 2\ :sup:`n`\ , where n is the bit width of |
| 4012 | the result. |
| 4013 | |
| 4014 | Because LLVM integers use a two's complement representation, this |
| 4015 | instruction is appropriate for both signed and unsigned integers. |
| 4016 | |
| 4017 | ``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap", |
| 4018 | respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the |
| 4019 | result value of the ``add`` is a :ref:`poison value <poisonvalues>` if |
| 4020 | unsigned and/or signed overflow, respectively, occurs. |
| 4021 | |
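As a small worked illustration of the wrapping and poison rules above,
the hypothetical function below uses ``i8`` constants so that the
arithmetic is easy to check by hand (the function and value names are
assumptions made for the example):

.. code-block:: llvm

    ; Hypothetical example: i8 -1 has the same bit pattern as unsigned 255.
    define void @add_examples() {
      %a = add i8 -1, 1          ; 255 + 1 wraps modulo 2^8 to 0
      %b = add nuw i8 -1, 1      ; unsigned overflow (255 + 1): poison
      %c = add nsw i8 -1, 1      ; no signed overflow (-1 + 1): 0
      %d = add nsw i8 127, 1     ; signed overflow (127 + 1): poison
      %e = add nuw i8 127, 1     ; no unsigned overflow: 128, i.e. -128 as signed
      ret void
    }
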
| 4022 | Example: |
| 4023 | """""""" |
| 4024 | |
| 4025 | .. code-block:: llvm |
| 4026 | |
    <result> = add i32 4, %var          ; yields i32:result = 4 + %var

| 4029 | .. _i_fadd: |
| 4030 | |
| 4031 | '``fadd``' Instruction |
| 4032 | ^^^^^^^^^^^^^^^^^^^^^^ |
| 4033 | |
| 4034 | Syntax: |
| 4035 | """"""" |
| 4036 | |
| 4037 | :: |
| 4038 | |
    <result> = fadd [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result

| 4041 | Overview: |
| 4042 | """"""""" |
| 4043 | |
| 4044 | The '``fadd``' instruction returns the sum of its two operands. |
| 4045 | |
| 4046 | Arguments: |
| 4047 | """""""""" |
| 4048 | |
| 4049 | The two arguments to the '``fadd``' instruction must be :ref:`floating |
| 4050 | point <t_floating>` or :ref:`vector <t_vector>` of floating point values. |
| 4051 | Both arguments must have identical types. |
| 4052 | |
| 4053 | Semantics: |
| 4054 | """""""""" |
| 4055 | |
The value produced is the floating point sum of the two operands. This
instruction can also take any number of :ref:`fast-math flags <fastmath>`,
which are optimization hints to enable otherwise unsafe floating point
optimizations.
| 4060 | |
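A minimal sketch of how the flags are written is shown below. The
function ``@sum3`` is hypothetical, and the flag names ``fast``,
``nnan``, and ``ninf`` are assumed to be among those defined in the
fast-math flags section:

.. code-block:: llvm

    ; Hypothetical example: flags appear between the opcode and the type.
    define float @sum3(float %a, float %b, float %c) {
      %t = fadd fast float %a, %b          ; all fast-math flags set
      %s = fadd nnan ninf float %t, %c     ; assume no NaNs and no infinities
      ret float %s
    }
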
| 4061 | Example: |
| 4062 | """""""" |
| 4063 | |
| 4064 | .. code-block:: llvm |
| 4065 | |
    <result> = fadd float 4.0, %var          ; yields float:result = 4.0 + %var

| 4068 | '``sub``' Instruction |
| 4069 | ^^^^^^^^^^^^^^^^^^^^^ |
| 4070 | |
| 4071 | Syntax: |
| 4072 | """"""" |
| 4073 | |
| 4074 | :: |
| 4075 | |
    <result> = sub <ty> <op1>, <op2>          ; yields ty:result
    <result> = sub nuw <ty> <op1>, <op2>      ; yields ty:result
    <result> = sub nsw <ty> <op1>, <op2>      ; yields ty:result
    <result> = sub nuw nsw <ty> <op1>, <op2>  ; yields ty:result

| 4081 | Overview: |
| 4082 | """"""""" |
| 4083 | |
| 4084 | The '``sub``' instruction returns the difference of its two operands. |
| 4085 | |
| 4086 | Note that the '``sub``' instruction is used to represent the '``neg``' |
| 4087 | instruction present in most other intermediate representations. |
| 4088 | |
| 4089 | Arguments: |
| 4090 | """""""""" |
| 4091 | |
| 4092 | The two arguments to the '``sub``' instruction must be |
| 4093 | :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both |
| 4094 | arguments must have identical types. |
| 4095 | |
| 4096 | Semantics: |
| 4097 | """""""""" |
| 4098 | |
| 4099 | The value produced is the integer difference of the two operands. |
| 4100 | |
| 4101 | If the difference has unsigned overflow, the result returned is the |
| 4102 | mathematical result modulo 2\ :sup:`n`\ , where n is the bit width of |
| 4103 | the result. |
| 4104 | |
| 4105 | Because LLVM integers use a two's complement representation, this |
| 4106 | instruction is appropriate for both signed and unsigned integers. |
| 4107 | |
| 4108 | ``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap", |
| 4109 | respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the |
| 4110 | result value of the ``sub`` is a :ref:`poison value <poisonvalues>` if |
| 4111 | unsigned and/or signed overflow, respectively, occurs. |
| 4112 | |
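The same rules can be checked by hand on ``i8`` constants. The function
below is a hypothetical illustration, and its names are assumptions:

.. code-block:: llvm

    ; Hypothetical example of wrapping and poison results for sub.
    define void @sub_examples() {
      %a = sub i8 0, 1           ; wraps modulo 2^8: 255, i.e. -1 as signed
      %b = sub nuw i8 0, 1       ; unsigned overflow (0 - 1): poison
      %c = sub nsw i8 0, 1       ; no signed overflow: -1
      %d = sub nsw i8 -128, 1    ; signed overflow (-128 - 1): poison
      ret void
    }
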
| 4113 | Example: |
| 4114 | """""""" |
| 4115 | |
| 4116 | .. code-block:: llvm |
| 4117 | |
    <result> = sub i32 4, %var          ; yields i32:result = 4 - %var
    <result> = sub i32 0, %val          ; yields i32:result = -%val

| 4121 | .. _i_fsub: |
| 4122 | |
| 4123 | '``fsub``' Instruction |
| 4124 | ^^^^^^^^^^^^^^^^^^^^^^ |
| 4125 | |
| 4126 | Syntax: |
| 4127 | """"""" |
| 4128 | |
| 4129 | :: |
| 4130 | |
    <result> = fsub [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result

| 4133 | Overview: |
| 4134 | """"""""" |
| 4135 | |
| 4136 | The '``fsub``' instruction returns the difference of its two operands. |
| 4137 | |
| 4138 | Note that the '``fsub``' instruction is used to represent the '``fneg``' |
| 4139 | instruction present in most other intermediate representations. |
| 4140 | |
| 4141 | Arguments: |
| 4142 | """""""""" |
| 4143 | |
| 4144 | The two arguments to the '``fsub``' instruction must be :ref:`floating |
| 4145 | point <t_floating>` or :ref:`vector <t_vector>` of floating point values. |
| 4146 | Both arguments must have identical types. |
| 4147 | |
| 4148 | Semantics: |
| 4149 | """""""""" |
| 4150 | |
The value produced is the floating point difference of the two operands.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating point optimizations.
| 4155 | |
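When '``fsub``' is used for negation, the first operand is ``-0.0``
rather than ``0.0``. Under IEEE 754 round-to-nearest arithmetic,
``0.0 - 0.0`` is ``+0.0``, so subtracting from positive zero would not
flip the sign of a zero operand. A minimal sketch, using a hypothetical
function ``@negate``:

.. code-block:: llvm

    ; Hypothetical example: negation that also flips the sign of zero.
    define float @negate(float %x) {
      %neg = fsub float -0.0, %x     ; -0.0 - (+0.0) = -0.0
      ret float %neg
    }
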
| 4156 | Example: |
| 4157 | """""""" |
| 4158 | |
| 4159 | .. code-block:: llvm |
| 4160 | |
    <result> = fsub float 4.0, %var           ; yields float:result = 4.0 - %var
    <result> = fsub float -0.0, %val          ; yields float:result = -%val

| 4164 | '``mul``' Instruction |
| 4165 | ^^^^^^^^^^^^^^^^^^^^^ |
| 4166 | |
| 4167 | Syntax: |
| 4168 | """"""" |
| 4169 | |
| 4170 | :: |
| 4171 | |
    <result> = mul <ty> <op1>, <op2>          ; yields ty:result
    <result> = mul nuw <ty> <op1>, <op2>      ; yields ty:result
    <result> = mul nsw <ty> <op1>, <op2>      ; yields ty:result
    <result> = mul nuw nsw <ty> <op1>, <op2>  ; yields ty:result

| 4177 | Overview: |
| 4178 | """"""""" |
| 4179 | |
| 4180 | The '``mul``' instruction returns the product of its two operands. |
| 4181 | |
| 4182 | Arguments: |
| 4183 | """""""""" |
| 4184 | |
| 4185 | The two arguments to the '``mul``' instruction must be |
| 4186 | :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both |
| 4187 | arguments must have identical types. |
| 4188 | |
| 4189 | Semantics: |
| 4190 | """""""""" |
| 4191 | |
| 4192 | The value produced is the integer product of the two operands. |
| 4193 | |
| 4194 | If the result of the multiplication has unsigned overflow, the result |
| 4195 | returned is the mathematical result modulo 2\ :sup:`n`\ , where n is the |
| 4196 | bit width of the result. |
| 4197 | |
| 4198 | Because LLVM integers use a two's complement representation, and the |
| 4199 | result is the same width as the operands, this instruction returns the |
| 4200 | correct result for both signed and unsigned integers. If a full product |
| 4201 | (e.g. ``i32`` * ``i32`` -> ``i64``) is needed, the operands should be |
| 4202 | sign-extended or zero-extended as appropriate to the width of the full |
| 4203 | product. |
| 4204 | |
| 4205 | ``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap", |
| 4206 | respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the |
| 4207 | result value of the ``mul`` is a :ref:`poison value <poisonvalues>` if |
| 4208 | unsigned and/or signed overflow, respectively, occurs. |
| 4209 | |
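The full-product advice above can be sketched with two hypothetical
helper functions (the names are assumptions). The operands are extended
to ``i64`` before multiplying, with the extension chosen to match the
intended signedness. The wrap flags are safe here because the product of
two extended 32-bit values always fits in 64 bits:

.. code-block:: llvm

    ; Hypothetical example: signed i32 * i32 -> i64.
    define i64 @mul_wide_signed(i32 %a, i32 %b) {
      %a.ext = sext i32 %a to i64
      %b.ext = sext i32 %b to i64
      %prod = mul nsw i64 %a.ext, %b.ext     ; magnitude is at most 2^62
      ret i64 %prod
    }

    ; Hypothetical example: unsigned i32 * i32 -> i64.
    define i64 @mul_wide_unsigned(i32 %a, i32 %b) {
      %a.ext = zext i32 %a to i64
      %b.ext = zext i32 %b to i64
      %prod = mul nuw i64 %a.ext, %b.ext     ; at most (2^32 - 1)^2 < 2^64
      ret i64 %prod
    }
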
| 4210 | Example: |
| 4211 | """""""" |
| 4212 | |
| 4213 | .. code-block:: llvm |
| 4214 | |
    <result> = mul i32 4, %var          ; yields i32:result = 4 * %var

| 4217 | .. _i_fmul: |
| 4218 | |
| 4219 | '``fmul``' Instruction |
| 4220 | ^^^^^^^^^^^^^^^^^^^^^^ |
| 4221 | |
| 4222 | Syntax: |
| 4223 | """"""" |
| 4224 | |
| 4225 | :: |
| 4226 | |
    <result> = fmul [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result

| 4229 | Overview: |
| 4230 | """"""""" |
| 4231 | |
| 4232 | The '``fmul``' instruction returns the product of its two operands. |
| 4233 | |
| 4234 | Arguments: |
| 4235 | """""""""" |
| 4236 | |
| 4237 | The two arguments to the '``fmul``' instruction must be :ref:`floating |
| 4238 | point <t_floating>` or :ref:`vector <t_vector>` of floating point values. |
| 4239 | Both arguments must have identical types. |
| 4240 | |
| 4241 | Semantics: |
| 4242 | """""""""" |
| 4243 | |
The value produced is the floating point product of the two operands.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating point optimizations.
| 4248 | |
| 4249 | Example: |
| 4250 | """""""" |
| 4251 | |
| 4252 | .. code-block:: llvm |
| 4253 | |
    <result> = fmul float 4.0, %var          ; yields float:result = 4.0 * %var

| 4256 | '``udiv``' Instruction |
| 4257 | ^^^^^^^^^^^^^^^^^^^^^^ |
| 4258 | |
| 4259 | Syntax: |
| 4260 | """"""" |
| 4261 | |
| 4262 | :: |
| 4263 | |
    <result> = udiv <ty> <op1>, <op2>         ; yields ty:result
    <result> = udiv exact <ty> <op1>, <op2>   ; yields ty:result

| 4267 | Overview: |
| 4268 | """"""""" |
| 4269 | |
| 4270 | The '``udiv``' instruction returns the quotient of its two operands. |
| 4271 | |
| 4272 | Arguments: |
| 4273 | """""""""" |
| 4274 | |
| 4275 | The two arguments to the '``udiv``' instruction must be |
| 4276 | :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both |
| 4277 | arguments must have identical types. |
| 4278 | |
| 4279 | Semantics: |
| 4280 | """""""""" |
| 4281 | |
| 4282 | The value produced is the unsigned integer quotient of the two operands. |
| 4283 | |
| 4284 | Note that unsigned integer division and signed integer division are |
| 4285 | distinct operations; for signed integer division, use '``sdiv``'. |
| 4286 | |
| 4287 | Division by zero leads to undefined behavior. |
| 4288 | |
| 4289 | If the ``exact`` keyword is present, the result value of the ``udiv`` is |
| 4290 | a :ref:`poison value <poisonvalues>` if %op1 is not a multiple of %op2 (as |
| 4291 | such, "((a udiv exact b) mul b) == a"). |
| 4292 | |
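A few constant cases, worked by hand, illustrate the unsigned
interpretation and the ``exact`` keyword. The function name is a
hypothetical chosen for the example:

.. code-block:: llvm

    ; Hypothetical example of udiv results on constants.
    define void @udiv_examples() {
      %a = udiv i32 7, 2           ; 3, the fractional part is discarded
      %b = udiv i32 -8, 2          ; operands are unsigned: 4294967288 / 2 = 2147483644
      %c = udiv exact i32 8, 2     ; 4, since 8 is a multiple of 2
      %d = udiv exact i32 7, 2     ; poison, since 7 is not a multiple of 2
      ret void
    }
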
| 4293 | Example: |
| 4294 | """""""" |
| 4295 | |
| 4296 | .. code-block:: llvm |
| 4297 | |
    <result> = udiv i32 4, %var          ; yields i32:result = 4 / %var

| 4300 | '``sdiv``' Instruction |
| 4301 | ^^^^^^^^^^^^^^^^^^^^^^ |
| 4302 | |
| 4303 | Syntax: |
| 4304 | """"""" |
| 4305 | |
| 4306 | :: |
| 4307 | |
    <result> = sdiv <ty> <op1>, <op2>         ; yields ty:result
    <result> = sdiv exact <ty> <op1>, <op2>   ; yields ty:result

| 4311 | Overview: |
| 4312 | """"""""" |
| 4313 | |
| 4314 | The '``sdiv``' instruction returns the quotient of its two operands. |
| 4315 | |
| 4316 | Arguments: |
| 4317 | """""""""" |
| 4318 | |
| 4319 | The two arguments to the '``sdiv``' instruction must be |
| 4320 | :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both |
| 4321 | arguments must have identical types. |
| 4322 | |
| 4323 | Semantics: |
| 4324 | """""""""" |
| 4325 | |
| 4326 | The value produced is the signed integer quotient of the two operands |
| 4327 | rounded towards zero. |
| 4328 | |
| 4329 | Note that signed integer division and unsigned integer division are |
| 4330 | distinct operations; for unsigned integer division, use '``udiv``'. |
| 4331 | |
| 4332 | Division by zero leads to undefined behavior. Overflow also leads to |
| 4333 | undefined behavior; this is a rare case, but can occur, for example, by |
| 4334 | doing a 32-bit division of -2147483648 by -1. |
| 4335 | |
| 4336 | If the ``exact`` keyword is present, the result value of the ``sdiv`` is |
| 4337 | a :ref:`poison value <poisonvalues>` if the result would be rounded. |
| 4338 | |
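The rounding direction and the ``exact`` keyword can be illustrated with
constants worked by hand. The function name below is a hypothetical
chosen for the example:

.. code-block:: llvm

    ; Hypothetical example of sdiv results on constants.
    define void @sdiv_examples() {
      %a = sdiv i32 -7, 2          ; -3 (rounded towards zero, not -4)
      %b = sdiv i32 7, -2          ; -3
      %c = sdiv exact i32 -8, 2    ; -4, no rounding needed
      %d = sdiv exact i32 -7, 2    ; poison, the result would be rounded
      ret void
    }
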
| 4339 | Example: |
| 4340 | """""""" |
| 4341 | |
| 4342 | .. code-block:: llvm |
| 4343 | |
    <result> = sdiv i32 4, %var          ; yields i32:result = 4 / %var

| 4346 | .. _i_fdiv: |
| 4347 | |
| 4348 | '``fdiv``' Instruction |
| 4349 | ^^^^^^^^^^^^^^^^^^^^^^ |
| 4350 | |
| 4351 | Syntax: |
| 4352 | """"""" |
| 4353 | |
| 4354 | :: |
| 4355 | |
    <result> = fdiv [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result

| 4358 | Overview: |
| 4359 | """"""""" |
| 4360 | |
| 4361 | The '``fdiv``' instruction returns the quotient of its two operands. |
| 4362 | |
| 4363 | Arguments: |
| 4364 | """""""""" |
| 4365 | |
| 4366 | The two arguments to the '``fdiv``' instruction must be :ref:`floating |
| 4367 | point <t_floating>` or :ref:`vector <t_vector>` of floating point values. |
| 4368 | Both arguments must have identical types. |
| 4369 | |
| 4370 | Semantics: |
| 4371 | """""""""" |
| 4372 | |
The value produced is the floating point quotient of the two operands.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating point optimizations.
| 4377 | |
| 4378 | Example: |
| 4379 | """""""" |
| 4380 | |
| 4381 | .. code-block:: llvm |
| 4382 | |
    <result> = fdiv float 4.0, %var          ; yields float:result = 4.0 / %var

| 4385 | '``urem``' Instruction |
| 4386 | ^^^^^^^^^^^^^^^^^^^^^^ |
| 4387 | |
| 4388 | Syntax: |
| 4389 | """"""" |
| 4390 | |
| 4391 | :: |
| 4392 | |
    <result> = urem <ty> <op1>, <op2>   ; yields ty:result

| 4395 | Overview: |
| 4396 | """"""""" |
| 4397 | |
| 4398 | The '``urem``' instruction returns the remainder from the unsigned |
| 4399 | division of its two arguments. |
| 4400 | |
| 4401 | Arguments: |
| 4402 | """""""""" |
| 4403 | |
| 4404 | The two arguments to the '``urem``' instruction must be |
| 4405 | :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both |
| 4406 | arguments must have identical types. |
| 4407 | |
| 4408 | Semantics: |
| 4409 | """""""""" |
| 4410 | |
| 4411 | This instruction returns the unsigned integer *remainder* of a division. |
| 4412 | This instruction always performs an unsigned division to get the |
| 4413 | remainder. |
| 4414 | |
| 4415 | Note that unsigned integer remainder and signed integer remainder are |
| 4416 | distinct operations; for signed integer remainder, use '``srem``'. |
| 4417 | |
| 4418 | Taking the remainder of a division by zero leads to undefined behavior. |
| 4419 | |
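Two constant cases, worked by hand, show the unsigned interpretation of
the operands. The function name is a hypothetical chosen for the
example:

.. code-block:: llvm

    ; Hypothetical example of urem results on constants.
    define void @urem_examples() {
      %a = urem i32 7, 3           ; 1
      %b = urem i32 -1, 16         ; operands are unsigned: 4294967295 % 16 = 15
      ret void
    }

For a nonzero divisor, the quotient and remainder satisfy
``op1 == (op1 udiv op2) * op2 + (op1 urem op2)``.
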
| 4420 | Example: |
| 4421 | """""""" |
| 4422 | |
| 4423 | .. code-block:: llvm |
| 4424 | |
| 4425 | <result> = urem i32 4, %var ; yields i32:result = 4 % %var |
| 4426 | |
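The following sketch is not part of the original reference; the function name
``@urem_example`` is hypothetical. It illustrates that ``urem`` interprets both
operands as unsigned, so the ``i32`` constant ``-1`` is read as 4294967295.

.. code-block:: llvm

   define i32 @urem_example(i32 %n) {
   entry:
     %a = urem i32 %n, 10      ; unsigned remainder of %n divided by 10
     %b = urem i32 -1, 10      ; -1 is read as 4294967295, so %b is 5
     %sum = add i32 %a, %b
     ret i32 %sum
   }
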
| 4427 | '``srem``' Instruction |
| 4428 | ^^^^^^^^^^^^^^^^^^^^^^ |
| 4429 | |
| 4430 | Syntax: |
| 4431 | """"""" |
| 4432 | |
| 4433 | :: |
| 4434 | |
| 4435 | <result> = srem <ty> <op1>, <op2> ; yields ty:result |
| 4436 | |
| 4437 | Overview: |
| 4438 | """"""""" |
| 4439 | |
| 4440 | The '``srem``' instruction returns the remainder from the signed |
| 4441 | division of its two operands. This instruction can also take |
| 4442 | :ref:`vector <t_vector>` versions of the values in which case the elements |
| 4443 | must be integers. |
| 4444 | |
| 4445 | Arguments: |
| 4446 | """""""""" |
| 4447 | |
| 4448 | The two arguments to the '``srem``' instruction must be |
| 4449 | :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both |
| 4450 | arguments must have identical types. |
| 4451 | |
| 4452 | Semantics: |
| 4453 | """""""""" |
| 4454 | |
| 4455 | This instruction returns the *remainder* of a division (where the result |
| 4456 | is either zero or has the same sign as the dividend, ``op1``), not the |
| 4457 | *modulo* operator (where the result is either zero or has the same sign |
| 4458 | as the divisor, ``op2``) of a value. For more information about the |
| 4459 | difference, see `The Math |
| 4460 | Forum <http://mathforum.org/dr.math/problems/anne.4.28.99.html>`_. For a |
| 4461 | table of how this is implemented in various languages, please see |
| 4462 | `Wikipedia: modulo |
| 4463 | operation <http://en.wikipedia.org/wiki/Modulo_operation>`_. |
| 4464 | |
| 4465 | Note that signed integer remainder and unsigned integer remainder are |
| 4466 | distinct operations; for unsigned integer remainder, use '``urem``'. |
| 4467 | |
| 4468 | Taking the remainder of a division by zero leads to undefined behavior. |
| 4469 | Overflow also leads to undefined behavior; this is a rare case, but can |
| 4470 | occur, for example, by taking the remainder of a 32-bit division of |
| 4471 | -2147483648 by -1. (The remainder doesn't actually overflow, but this |
| 4472 | rule lets ``srem`` be implemented using instructions that return both the |
| 4473 | result of the division and the remainder.) |
| 4474 | |
| 4475 | Example: |
| 4476 | """""""" |
| 4477 | |
| 4478 | .. code-block:: llvm |
| 4479 | |
| 4480 | <result> = srem i32 4, %var ; yields i32:result = 4 % %var |
| 4481 | |
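As an illustrative sketch (the function name ``@srem_example`` is an assumption,
not part of the original reference), the comments below list how the sign of
the result follows the dividend, and what a modulo operator would return
instead.

.. code-block:: llvm

   define i32 @srem_example(i32 %x, i32 %y) {
   entry:
     ; srem i32  7,  3 yields  1
     ; srem i32 -7,  3 yields -1   (a modulo operator would give 2)
     ; srem i32  7, -3 yields  1   (a modulo operator would give -2)
     ; srem i32 -7, -3 yields -1
     ; Undefined behavior if %y is 0, or if %x is -2147483648 and %y is -1.
     %r = srem i32 %x, %y
     ret i32 %r
   }
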
| 4482 | .. _i_frem: |
| 4483 | |
| 4484 | '``frem``' Instruction |
| 4485 | ^^^^^^^^^^^^^^^^^^^^^^ |
| 4486 | |
| 4487 | Syntax: |
| 4488 | """"""" |
| 4489 | |
| 4490 | :: |
| 4491 | |
| 4492 | <result> = frem [fast-math flags]* <ty> <op1>, <op2> ; yields ty:result |
| 4493 | |
| 4494 | Overview: |
| 4495 | """"""""" |
| 4496 | |
| 4497 | The '``frem``' instruction returns the remainder from the division of |
| 4498 | its two operands. |
| 4499 | |
| 4500 | Arguments: |
| 4501 | """""""""" |
| 4502 | |
| 4503 | The two arguments to the '``frem``' instruction must be :ref:`floating |
| 4504 | point <t_floating>` or :ref:`vector <t_vector>` of floating point values. |
| 4505 | Both arguments must have identical types. |
| 4506 | |
| 4507 | Semantics: |
| 4508 | """""""""" |
| 4509 | |
| 4510 | This instruction returns the *remainder* of a division. The remainder |
| 4511 | has the same sign as the dividend. This instruction can also take any |
| 4512 | number of :ref:`fast-math flags <fastmath>`, which are optimization hints |
| 4513 | to enable otherwise unsafe floating point optimizations. |
| 4514 | |
| 4515 | Example: |
| 4516 | """""""" |
| 4517 | |
| 4518 | .. code-block:: llvm |
| 4519 | |
| 4520 | <result> = frem float 4.0, %var ; yields float:result = 4.0 % %var |
| 4521 | |
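A minimal sketch of ``frem`` in a function body follows. The function name
``@frem_example`` is hypothetical, and the ``nnan`` flag is used only to show
where a fast-math flag is written.

.. code-block:: llvm

   define float @frem_example(float %x, float %y) {
   entry:
     %a = frem float 5.5, 2.0        ; yields 1.5
     %b = frem float -5.5, 2.0       ; yields -1.5: the sign follows the dividend
     %r = frem nnan float %x, %y     ; fast-math flags precede the type
     ret float %r
   }
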
| 4522 | .. _bitwiseops: |
| 4523 | |
| 4524 | Bitwise Binary Operations |
| 4525 | ------------------------- |
| 4526 | |
| 4527 | Bitwise binary operators are used to do various forms of bit-twiddling |
| 4528 | in a program. They are generally very efficient instructions and can |
| 4529 | commonly be strength reduced from other instructions. They require two |
| 4530 | operands of the same type, execute an operation on them, and produce a |
| 4531 | single value. The resulting value is the same type as its operands. |
| 4532 | |
| 4533 | '``shl``' Instruction |
| 4534 | ^^^^^^^^^^^^^^^^^^^^^ |
| 4535 | |
| 4536 | Syntax: |
| 4537 | """"""" |
| 4538 | |
| 4539 | :: |
| 4540 | |
| 4541 | <result> = shl <ty> <op1>, <op2> ; yields ty:result |
| 4542 | <result> = shl nuw <ty> <op1>, <op2> ; yields ty:result |
| 4543 | <result> = shl nsw <ty> <op1>, <op2> ; yields ty:result |
| 4544 | <result> = shl nuw nsw <ty> <op1>, <op2> ; yields ty:result |
| 4545 | |
| 4546 | Overview: |
| 4547 | """"""""" |
| 4548 | |
| 4549 | The '``shl``' instruction returns the first operand shifted to the left |
| 4550 | a specified number of bits. |
| 4551 | |
| 4552 | Arguments: |
| 4553 | """""""""" |
| 4554 | |
| 4555 | Both arguments to the '``shl``' instruction must be the same |
| 4556 | :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type. |
| 4557 | '``op2``' is treated as an unsigned value. |
| 4558 | |
| 4559 | Semantics: |
| 4560 | """""""""" |
| 4561 | |
| 4562 | The value produced is ``op1`` \* 2\ :sup:`op2` mod 2\ :sup:`n`, |
| 4563 | where ``n`` is the width of the result. If ``op2`` is (statically or |
| 4564 | dynamically) negative or equal to or larger than the number of bits in |
| 4565 | ``op1``, the result is undefined. If the arguments are vectors, each |
| 4566 | vector element of ``op1`` is shifted by the corresponding shift amount |
| 4567 | in ``op2``. |
| 4568 | |
| 4569 | If the ``nuw`` keyword is present, then the shift produces a :ref:`poison |
| 4570 | value <poisonvalues>` if it shifts out any non-zero bits. If the |
| 4571 | ``nsw`` keyword is present, then the shift produces a :ref:`poison |
| 4572 | value <poisonvalues>` if it shifts out any bits that disagree with the |
| 4573 | resultant sign bit. As such, ``nuw``/``nsw`` have the same semantics as they |
| 4574 | would if the shift were expressed as a ``mul`` instruction with the same |
| 4575 | ``nuw``/``nsw`` bits in ``(mul %op1, (shl 1, %op2))``. |
| 4576 | |
| 4577 | Example: |
| 4578 | """""""" |
| 4579 | |
| 4580 | .. code-block:: llvm |
| 4581 | |
| 4582 | <result> = shl i32 4, %var ; yields i32: 4 << %var |
| 4583 | <result> = shl i32 4, 2 ; yields i32: 16 |
| 4584 | <result> = shl i32 1, 10 ; yields i32: 1024 |
| 4585 | <result> = shl i32 1, 32 ; undefined |
| 4586 | <result> = shl <2 x i32> < i32 1, i32 1>, < i32 1, i32 2> ; yields: result=<2 x i32> < i32 2, i32 4> |
| 4587 | |
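The ``nuw`` and ``nsw`` rules can be checked on small ``i8`` constants. This is
a sketch for illustration only; the function name ``@shl_flags`` is
hypothetical.

.. code-block:: llvm

   define i8 @shl_flags(i8 %x, i8 %n) {
   entry:
     %a = shl nuw i8 64, 1      ; 0x80: only a zero bit is shifted out
     %b = shl nuw i8 -128, 1    ; poison: a non-zero bit is shifted out
     %c = shl nsw i8 -1, 1      ; -2: the shifted-out bit (1) agrees with the resulting sign bit
     %d = shl nsw i8 64, 1      ; poison: the shifted-out bit (0) disagrees with the resulting sign bit (1)
     %r = shl i8 %x, %n         ; no flags: wraps modulo 2^8; undefined result if %n is 8 or more
     ret i8 %r
   }

Each flagged case matches the equivalent ``mul``: for example, ``64 * 2``
overflows a signed ``i8`` but fits in an unsigned one.
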
| 4588 | '``lshr``' Instruction |
| 4589 | ^^^^^^^^^^^^^^^^^^^^^^ |
| 4590 | |
| 4591 | Syntax: |
| 4592 | """"""" |
| 4593 | |
| 4594 | :: |
| 4595 | |
| 4596 | <result> = lshr <ty> <op1>, <op2> ; yields ty:result |
| 4597 | <result> = lshr exact <ty> <op1>, <op2> ; yields ty:result |
| 4598 | |
| 4599 | Overview: |
| 4600 | """"""""" |
| 4601 | |
| 4602 | The '``lshr``' instruction (logical shift right) returns the first |
| 4603 | operand shifted to the right a specified number of bits with zero fill. |
| 4604 | |
| 4605 | Arguments: |
| 4606 | """""""""" |
| 4607 | |
| 4608 | Both arguments to the '``lshr``' instruction must be the same |
| 4609 | :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type. |
| 4610 | '``op2``' is treated as an unsigned value. |
| 4611 | |
| 4612 | Semantics: |
| 4613 | """""""""" |
| 4614 | |
| 4615 | This instruction always performs a logical shift right operation. The |
| 4616 | most significant bits of the result will be filled with zero bits after |
| 4617 | the shift. If ``op2`` is (statically or dynamically) equal to or larger |
| 4618 | than the number of bits in ``op1``, the result is undefined. If the |
| 4619 | arguments are vectors, each vector element of ``op1`` is shifted by the |
| 4620 | corresponding shift amount in ``op2``. |
| 4621 | |
| 4622 | If the ``exact`` keyword is present, the result value of the ``lshr`` is |
| 4623 | a :ref:`poison value <poisonvalues>` if any of the bits shifted out are |
| 4624 | non-zero. |
| 4625 | |
| 4626 | Example: |
| 4627 | """""""" |
| 4628 | |
| 4629 | .. code-block:: llvm |
| 4630 | |
| 4631 | <result> = lshr i32 4, 1 ; yields i32:result = 2 |
| 4632 | <result> = lshr i32 4, 2 ; yields i32:result = 1 |
| 4633 | <result> = lshr i8 4, 3 ; yields i8:result = 0 |
| 4634 | <result> = lshr i8 -2, 1 ; yields i8:result = 0x7F |
| 4635 | <result> = lshr i32 1, 32 ; undefined |
| 4636 | <result> = lshr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 2> ; yields: result=<2 x i32> < i32 0x7FFFFFFF, i32 1> |
| 4637 | |
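The effect of ``exact`` can be sketched as follows. The function name
``@lshr_example`` is an assumption made for illustration.

.. code-block:: llvm

   define i32 @lshr_example(i32 %x) {
   entry:
     %a = lshr exact i32 8, 2    ; 2: both shifted-out bits are zero
     %b = lshr exact i32 9, 2    ; poison: a set bit is shifted out
     %c = lshr i32 9, 2          ; 2: without exact, shifted-out bits are discarded
     %hi = lshr i32 %x, 16       ; upper 16 bits of %x, with zero fill
     ret i32 %hi
   }
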
| 4638 | '``ashr``' Instruction |
| 4639 | ^^^^^^^^^^^^^^^^^^^^^^ |
| 4640 | |
| 4641 | Syntax: |
| 4642 | """"""" |
| 4643 | |
| 4644 | :: |
| 4645 | |
| 4646 | <result> = ashr <ty> <op1>, <op2> ; yields ty:result |
| 4647 | <result> = ashr exact <ty> <op1>, <op2> ; yields ty:result |
| 4648 | |
| 4649 | Overview: |
| 4650 | """"""""" |
| 4651 | |
| 4652 | The '``ashr``' instruction (arithmetic shift right) returns the first |
| 4653 | operand shifted to the right a specified number of bits with sign |
| 4654 | extension. |
| 4655 | |
| 4656 | Arguments: |
| 4657 | """""""""" |
| 4658 | |
| 4659 | Both arguments to the '``ashr``' instruction must be the same |
| 4660 | :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type. |
| 4661 | '``op2``' is treated as an unsigned value. |
| 4662 | |
| 4663 | Semantics: |
| 4664 | """""""""" |
| 4665 | |
| 4666 | This instruction always performs an arithmetic shift right operation. |
| 4667 | The most significant bits of the result will be filled with the sign bit |
| 4668 | of ``op1``. If ``op2`` is (statically or dynamically) equal to or larger |
| 4669 | than the number of bits in ``op1``, the result is undefined. If the |
| 4670 | arguments are vectors, each vector element of ``op1`` is shifted by the |
| 4671 | corresponding shift amount in ``op2``. |
| 4672 | |
| 4673 | If the ``exact`` keyword is present, the result value of the ``ashr`` is |
| 4674 | a :ref:`poison value <poisonvalues>` if any of the bits shifted out are |
| 4675 | non-zero. |
| 4676 | |
| 4677 | Example: |
| 4678 | """""""" |
| 4679 | |
| 4680 | .. code-block:: llvm |
| 4681 | |
| 4682 | <result> = ashr i32 4, 1 ; yields i32:result = 2 |
| 4683 | <result> = ashr i32 4, 2 ; yields i32:result = 1 |
| 4684 | <result> = ashr i8 4, 3 ; yields i8:result = 0 |
| 4685 | <result> = ashr i8 -2, 1 ; yields i8:result = -1 |
| 4686 | <result> = ashr i32 1, 32 ; undefined |
| 4687 | <result> = ashr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 3> ; yields: result=<2 x i32> < i32 -1, i32 0> |
| 4688 | |
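To contrast sign fill with zero fill, the sketch below applies ``ashr`` and
``lshr`` to the same negative value. The function name ``@ashr_vs_lshr`` is
hypothetical.

.. code-block:: llvm

   define i32 @ashr_vs_lshr(i32 %x) {
   entry:
     %a = ashr i32 -8, 2        ; -2: vacated bits are copies of the sign bit
     %l = lshr i32 -8, 2        ; 1073741822 (0x3FFFFFFE): vacated bits are zero
     %sign = ashr i32 %x, 31    ; 0 if %x is non-negative, -1 otherwise
     ret i32 %sign
   }
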
| 4689 | '``and``' Instruction |
| 4690 | ^^^^^^^^^^^^^^^^^^^^^ |
| 4691 | |
| 4692 | Syntax: |
| 4693 | """"""" |
| 4694 | |
| 4695 | :: |
| 4696 | |
| 4697 | <result> = and <ty> <op1>, <op2> ; yields ty:result |
| 4698 | |
| 4699 | Overview: |
| 4700 | """"""""" |
| 4701 | |
| 4702 | The '``and``' instruction returns the bitwise logical and of its two |
| 4703 | operands. |
| 4704 | |
| 4705 | Arguments: |
| 4706 | """""""""" |
| 4707 | |
| 4708 | The two arguments to the '``and``' instruction must be |
| 4709 | :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both |
| 4710 | arguments must have identical types. |
| 4711 | |
| 4712 | Semantics: |
| 4713 | """""""""" |
| 4714 | |
| 4715 | The truth table used for the '``and``' instruction is: |
| 4716 | |
| 4717 | +-----+-----+-----+ |
| 4718 | | In0 | In1 | Out | |
| 4719 | +-----+-----+-----+ |
| 4720 | | 0 | 0 | 0 | |
| 4721 | +-----+-----+-----+ |
| 4722 | | 0 | 1 | 0 | |
| 4723 | +-----+-----+-----+ |
| 4724 | | 1 | 0 | 0 | |
| 4725 | +-----+-----+-----+ |
| 4726 | | 1 | 1 | 1 | |
| 4727 | +-----+-----+-----+ |
| 4728 | |
| 4729 | Example: |
| 4730 | """""""" |
| 4731 | |
| 4732 | .. code-block:: llvm |
| 4733 | |
| 4734 | <result> = and i32 4, %var ; yields i32:result = 4 & %var |
| 4735 | <result> = and i32 15, 40 ; yields i32:result = 8 |
| 4736 | <result> = and i32 4, 8 ; yields i32:result = 0 |
| 4737 | |
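A common use of ``and`` is masking off low bits. The sketch below tests whether
a value is a multiple of 16; the function name ``@is_multiple_of_16`` is an
assumption for illustration.

.. code-block:: llvm

   define i1 @is_multiple_of_16(i32 %value) {
   entry:
     %low = and i32 %value, 15      ; keep only the four low bits
     %ok = icmp eq i32 %low, 0      ; true when all four low bits are clear
     ret i1 %ok
   }
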
| 4738 | '``or``' Instruction |
| 4739 | ^^^^^^^^^^^^^^^^^^^^ |
| 4740 | |
| 4741 | Syntax: |
| 4742 | """"""" |
| 4743 | |
| 4744 | :: |
| 4745 | |
| 4746 | <result> = or <ty> <op1>, <op2> ; yields ty:result |
| 4747 | |
| 4748 | Overview: |
| 4749 | """"""""" |
| 4750 | |
| 4751 | The '``or``' instruction returns the bitwise logical inclusive or of its |
| 4752 | two operands. |
| 4753 | |
| 4754 | Arguments: |
| 4755 | """""""""" |
| 4756 | |
| 4757 | The two arguments to the '``or``' instruction must be |
| 4758 | :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both |
| 4759 | arguments must have identical types. |
| 4760 | |
| 4761 | Semantics: |
| 4762 | """""""""" |
| 4763 | |
| 4764 | The truth table used for the '``or``' instruction is: |
| 4765 | |
| 4766 | +-----+-----+-----+ |
| 4767 | | In0 | In1 | Out | |
| 4768 | +-----+-----+-----+ |
| 4769 | | 0 | 0 | 0 | |
| 4770 | +-----+-----+-----+ |
| 4771 | | 0 | 1 | 1 | |
| 4772 | +-----+-----+-----+ |
| 4773 | | 1 | 0 | 1 | |
| 4774 | +-----+-----+-----+ |
| 4775 | | 1 | 1 | 1 | |
| 4776 | +-----+-----+-----+ |
| 4777 | |
| 4778 | Example: |
| 4779 | """""""" |
| 4780 | |
| 4781 | .. code-block:: llvm |
| 4782 | |
| 4783 | <result> = or i32 4, %var ; yields i32:result = 4 | %var |
| 4784 | <result> = or i32 15, 40 ; yields i32:result = 47 |
| 4785 | <result> = or i32 4, 8 ; yields i32:result = 12 |
| 4786 | |
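As a sketch of how ``or`` combines disjoint bit fields, the hypothetical
function ``@pack16`` below packs two 16-bit quantities into one ``i32``.

.. code-block:: llvm

   define i32 @pack16(i32 %hi, i32 %lo) {
   entry:
     %hi.shifted = shl i32 %hi, 16             ; move %hi into the upper half
     %lo.masked = and i32 %lo, 65535           ; keep the lower 16 bits of %lo
     %packed = or i32 %hi.shifted, %lo.masked  ; the two halves do not overlap
     ret i32 %packed
   }
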
| 4787 | '``xor``' Instruction |
| 4788 | ^^^^^^^^^^^^^^^^^^^^^ |
| 4789 | |
| 4790 | Syntax: |
| 4791 | """"""" |
| 4792 | |
| 4793 | :: |
| 4794 | |
| 4795 | <result> = xor <ty> <op1>, <op2> ; yields ty:result |
| 4796 | |
| 4797 | Overview: |
| 4798 | """"""""" |
| 4799 | |
| 4800 | The '``xor``' instruction returns the bitwise logical exclusive or of |
| 4801 | its two operands. The ``xor`` is used to implement the "one's |
| 4802 | complement" operation, which is the "~" operator in C. |
| 4803 | |
| 4804 | Arguments: |
| 4805 | """""""""" |
| 4806 | |
| 4807 | The two arguments to the '``xor``' instruction must be |
| 4808 | :ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both |
| 4809 | arguments must have identical types. |
| 4810 | |
| 4811 | Semantics: |
| 4812 | """""""""" |
| 4813 | |
| 4814 | The truth table used for the '``xor``' instruction is: |
| 4815 | |
| 4816 | +-----+-----+-----+ |
| 4817 | | In0 | In1 | Out | |
| 4818 | +-----+-----+-----+ |
| 4819 | | 0 | 0 | 0 | |
| 4820 | +-----+-----+-----+ |
| 4821 | | 0 | 1 | 1 | |
| 4822 | +-----+-----+-----+ |
| 4823 | | 1 | 0 | 1 | |
| 4824 | +-----+-----+-----+ |
| 4825 | | 1 | 1 | 0 | |
| 4826 | +-----+-----+-----+ |
| 4827 | |
| 4828 | Example: |
| 4829 | """""""" |
| 4830 | |
| 4831 | .. code-block:: llvm |
| 4832 | |
| 4833 | <result> = xor i32 4, %var ; yields i32:result = 4 ^ %var |
| 4834 | <result> = xor i32 15, 40 ; yields i32:result = 39 |
| 4835 | <result> = xor i32 4, 8 ; yields i32:result = 12 |
| 4836 | <result> = xor i32 %V, -1 ; yields i32:result = ~%V |
| 4837 | |
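The one's complement idiom also applies element-wise to vectors. The following
is an illustrative sketch; the function name ``@not_v4`` is hypothetical.

.. code-block:: llvm

   define <4 x i32> @not_v4(<4 x i32> %v) {
   entry:
     ; xor with all-ones inverts every bit of every element.
     %r = xor <4 x i32> %v, <i32 -1, i32 -1, i32 -1, i32 -1>
     ret <4 x i32> %r
   }
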
| 4838 | Vector Operations |
| 4839 | ----------------- |
| 4840 | |
| 4841 | LLVM supports several instructions to represent vector operations in a |
| 4842 | target-independent manner. These instructions cover the element-access |
| 4843 | and vector-specific operations needed to process vectors effectively. |
| 4844 | While LLVM does directly support these vector operations, many |
| 4845 | sophisticated algorithms will want to use target-specific intrinsics to |
| 4846 | take full advantage of a specific target. |
| 4847 | |
| 4848 | .. _i_extractelement: |
| 4849 | |
| 4850 | '``extractelement``' Instruction |
| 4851 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 4852 | |
| 4853 | Syntax: |
| 4854 | """"""" |
| 4855 | |
| 4856 | :: |
| 4857 | |
| 4858 | <result> = extractelement <n x <ty>> <val>, <ty2> <idx> ; yields <ty> |
| 4859 | |
| 4860 | Overview: |
| 4861 | """"""""" |
| 4862 | |
| 4863 | The '``extractelement``' instruction extracts a single scalar element |
| 4864 | from a vector at a specified index. |
| 4865 | |
| 4866 | Arguments: |
| 4867 | """""""""" |
| 4868 | |
| 4869 | The first operand of an '``extractelement``' instruction is a value of |
| 4870 | :ref:`vector <t_vector>` type. The second operand is an index indicating |
| 4871 | the position from which to extract the element. The index may be a |
| 4872 | variable of any integer type. |
| 4873 | |
| 4874 | Semantics: |
| 4875 | """""""""" |
| 4876 | |
| 4877 | The result is a scalar of the same type as the element type of ``val``. |
| 4878 | Its value is the value at position ``idx`` of ``val``. If ``idx`` |
| 4879 | exceeds the length of ``val``, the results are undefined. |
| 4880 | |
| 4881 | Example: |
| 4882 | """""""" |
| 4883 | |
| 4884 | .. code-block:: llvm |
| 4885 | |
| 4886 | <result> = extractelement <4 x i32> %vec, i32 0 ; yields i32 |
| 4887 | |
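The sketch below shows a constant index, an index of a different integer type,
and a run-time index. The function name ``@get_lanes`` is an assumption for
illustration.

.. code-block:: llvm

   define i32 @get_lanes(<4 x i32> %vec, i32 %i) {
   entry:
     %first = extractelement <4 x i32> %vec, i32 0    ; constant index
     %second = extractelement <4 x i32> %vec, i64 1   ; the index may be any integer type
     %dyn = extractelement <4 x i32> %vec, i32 %i     ; results are undefined if %i is out of range
     %t = add i32 %first, %second
     %sum = add i32 %t, %dyn
     ret i32 %sum
   }
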
| 4888 | .. _i_insertelement: |
| 4889 | |
| 4890 | '``insertelement``' Instruction |
| 4891 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 4892 | |
| 4893 | Syntax: |
| 4894 | """"""" |
| 4895 | |
| 4896 | :: |
| 4897 | |
| 4898 | <result> = insertelement <n x <ty>> <val>, <ty> <elt>, <ty2> <idx> ; yields <n x <ty>> |
| 4899 | |
| 4900 | Overview: |
| 4901 | """"""""" |
| 4902 | |
| 4903 | The '``insertelement``' instruction inserts a scalar element into a |
| 4904 | vector at a specified index. |
| 4905 | |
| 4906 | Arguments: |
| 4907 | """""""""" |
| 4908 | |
| 4909 | The first operand of an '``insertelement``' instruction is a value of |
| 4910 | :ref:`vector <t_vector>` type. The second operand is a scalar value whose |
| 4911 | type must equal the element type of the first operand. The third operand |
| 4912 | is an index indicating the position at which to insert the value. The |
| 4913 | index may be a variable of any integer type. |
| 4914 | |
| 4915 | Semantics: |
| 4916 | """""""""" |
| 4917 | |
| 4918 | The result is a vector of the same type as ``val``. Its element values |
| 4919 | are those of ``val`` except at position ``idx``, where it gets the value |
| 4920 | ``elt``. If ``idx`` exceeds the length of ``val``, the results are |
| 4921 | undefined. |
| 4922 | |
| 4923 | Example: |
| 4924 | """""""" |
| 4925 | |
| 4926 | .. code-block:: llvm |
| 4927 | |
| 4928 | <result> = insertelement <4 x i32> %vec, i32 1, i32 0 ; yields <4 x i32> |
| 4929 | |
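Because every value is in SSA form, a vector is typically built by chaining
``insertelement`` instructions, each producing a new vector. A minimal sketch,
with the hypothetical function name ``@make_pair``:

.. code-block:: llvm

   define <2 x float> @make_pair(float %x, float %y) {
   entry:
     %v0 = insertelement <2 x float> undef, float %x, i32 0   ; <%x, undef>
     %v1 = insertelement <2 x float> %v0, float %y, i32 1     ; <%x, %y>
     ret <2 x float> %v1
   }
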
| 4930 | .. _i_shufflevector: |
| 4931 | |
| 4932 | '``shufflevector``' Instruction |
| 4933 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 4934 | |
| 4935 | Syntax: |
| 4936 | """"""" |
| 4937 | |
| 4938 | :: |
| 4939 | |
| 4940 | <result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <m x i32> <mask> ; yields <m x <ty>> |
| 4941 | |
| 4942 | Overview: |
| 4943 | """"""""" |
| 4944 | |
| 4945 | The '``shufflevector``' instruction constructs a permutation of elements |
| 4946 | from two input vectors, returning a vector with the same element type as |
| 4947 | the input and length that is the same as the shuffle mask. |
| 4948 | |
| 4949 | Arguments: |
| 4950 | """""""""" |
| 4951 | |
| 4952 | The first two operands of a '``shufflevector``' instruction are vectors |
| 4953 | with the same type. The third argument is a shuffle mask whose element |
| 4954 | type is always 'i32'. The result of the instruction is a vector whose |
| 4955 | length is the same as the shuffle mask and whose element type is the |
| 4956 | same as the element type of the first two operands. |
| 4957 | |
| 4958 | The shuffle mask operand is required to be a constant vector with either |
| 4959 | constant integer or undef values. |
| 4960 | |
| 4961 | Semantics: |
| 4962 | """""""""" |
| 4963 | |
| 4964 | The elements of the two input vectors are numbered from left to right |
| 4965 | across both of the vectors. The shuffle mask operand specifies, for each |
| 4966 | element of the result vector, which element of the two input vectors the |
| 4967 | result element gets. The element selector may be undef (meaning "don't |
| 4968 | care") and the second operand may be undef if performing a shuffle from |
| 4969 | only one vector. |
| 4970 | |
| 4971 | Example: |
| 4972 | """""""" |
| 4973 | |
| 4974 | .. code-block:: llvm |
| 4975 | |
| 4976 | <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2, |
| 4977 | <4 x i32> <i32 0, i32 4, i32 1, i32 5> ; yields <4 x i32> |
| 4978 | <result> = shufflevector <4 x i32> %v1, <4 x i32> undef, |
| 4979 | <4 x i32> <i32 0, i32 1, i32 2, i32 3> ; yields <4 x i32> - Identity shuffle. |
| 4980 | <result> = shufflevector <8 x i32> %v1, <8 x i32> undef, |
| 4981 | <4 x i32> <i32 0, i32 1, i32 2, i32 3> ; yields <4 x i32> |
| 4982 | <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2, |
| 4983 | <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7 > ; yields <8 x i32> |
| 4984 | |
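The numbering rule can be made concrete with two four-element inputs: elements
of the first operand are numbered 0 to 3 and elements of the second are
numbered 4 to 7. The sketch below is illustrative only; the function name
``@shuffles`` is hypothetical.

.. code-block:: llvm

   define <4 x i32> @shuffles(<4 x i32> %v1, <4 x i32> %v2) {
   entry:
     ; Reverse %v1; the second operand is unused, so it may be undef.
     %rev = shufflevector <4 x i32> %v1, <4 x i32> undef,
                          <4 x i32> <i32 3, i32 2, i32 1, i32 0>
     ; Broadcast element 0 of %v1 to every position.
     %splat = shufflevector <4 x i32> %v1, <4 x i32> undef,
                            <4 x i32> <i32 0, i32 0, i32 0, i32 0>
     ; Take %v1[0], %v1[1], %v2[0], %v2[1].
     %lo = shufflevector <4 x i32> %v1, <4 x i32> %v2,
                         <4 x i32> <i32 0, i32 1, i32 4, i32 5>
     ret <4 x i32> %lo
   }
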
| 4985 | Aggregate Operations |
| 4986 | -------------------- |
| 4987 | |
| 4988 | LLVM supports several instructions for working with |
| 4989 | :ref:`aggregate <t_aggregate>` values. |
| 4990 | |
| 4991 | .. _i_extractvalue: |
| 4992 | |
| 4993 | '``extractvalue``' Instruction |
| 4994 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 4995 | |
| 4996 | Syntax: |
| 4997 | """"""" |
| 4998 | |
| 4999 | :: |
| 5000 | |
| 5001 | <result> = extractvalue <aggregate type> <val>, <idx>{, <idx>}* |
| 5002 | |
| 5003 | Overview: |
| 5004 | """"""""" |
| 5005 | |
| 5006 | The '``extractvalue``' instruction extracts the value of a member field |
| 5007 | from an :ref:`aggregate <t_aggregate>` value. |
| 5008 | |
| 5009 | Arguments: |
| 5010 | """""""""" |
| 5011 | |
| 5012 | The first operand of an '``extractvalue``' instruction is a value of |
| 5013 | :ref:`struct <t_struct>` or :ref:`array <t_array>` type. The other operands are |
| 5014 | constant indices to specify which value to extract in a similar manner |
| 5015 | as indices in a '``getelementptr``' instruction. |
| 5016 | |
| 5017 | The major differences to ``getelementptr`` indexing are: |
| 5018 | |
| 5019 | - Since the value being indexed is not a pointer, the first index is |
| 5020 | omitted and assumed to be zero. |
| 5021 | - At least one index must be specified. |
| 5022 | - Not only struct indices but also array indices must be in bounds. |
| 5023 | |
| 5024 | Semantics: |
| 5025 | """""""""" |
| 5026 | |
| 5027 | The result is the value at the position in the aggregate specified by |
| 5028 | the index operands. |
| 5029 | |
| 5030 | Example: |
| 5031 | """""""" |
| 5032 | |
| 5033 | .. code-block:: llvm |
| 5034 | |
| 5035 | <result> = extractvalue {i32, float} %agg, 0 ; yields i32 |
| 5036 | |
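Multiple indices walk into nested aggregates. The following sketch uses a
struct that contains an array; the function name ``@nested_extract`` is an
assumption for illustration.

.. code-block:: llvm

   define float @nested_extract({i32, [2 x float]} %agg) {
   entry:
     %arr = extractvalue {i32, [2 x float]} %agg, 1      ; yields [2 x float]
     %f0 = extractvalue {i32, [2 x float]} %agg, 1, 0    ; yields float: field 1, element 0
     %f1 = extractvalue [2 x float] %arr, 1              ; yields float: element 1
     %sum = fadd float %f0, %f1
     ret float %sum
   }
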
| 5037 | .. _i_insertvalue: |
| 5038 | |
| 5039 | '``insertvalue``' Instruction |
| 5040 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 5041 | |
| 5042 | Syntax: |
| 5043 | """"""" |
| 5044 | |
| 5045 | :: |
| 5046 | |
| 5047 | <result> = insertvalue <aggregate type> <val>, <ty> <elt>, <idx>{, <idx>}* ; yields <aggregate type> |
| 5048 | |
| 5049 | Overview: |
| 5050 | """"""""" |
| 5051 | |
| 5052 | The '``insertvalue``' instruction inserts a value into a member field in |
| 5053 | an :ref:`aggregate <t_aggregate>` value. |
| 5054 | |
| 5055 | Arguments: |
| 5056 | """""""""" |
| 5057 | |
| 5058 | The first operand of an '``insertvalue``' instruction is a value of |
| 5059 | :ref:`struct <t_struct>` or :ref:`array <t_array>` type. The second operand is |
| 5060 | a first-class value to insert. The following operands are constant |
| 5061 | indices indicating the position at which to insert the value in a |
| 5062 | similar manner as indices in an '``extractvalue``' instruction. The value |
| 5063 | to insert must have the same type as the value identified by the |
| 5064 | indices. |
| 5065 | |
| 5066 | Semantics: |
| 5067 | """""""""" |
| 5068 | |
| 5069 | The result is an aggregate of the same type as ``val``. Its value is |
| 5070 | that of ``val`` except that the value at the position specified by the |
| 5071 | indices is that of ``elt``. |
| 5072 | |
| 5073 | Example: |
| 5074 | """""""" |
| 5075 | |
| 5076 | .. code-block:: llvm |
| 5077 | |
| 5078 | %agg1 = insertvalue {i32, float} undef, i32 1, 0 ; yields {i32 1, float undef} |
| 5079 | %agg2 = insertvalue {i32, float} %agg1, float %val, 1 ; yields {i32 1, float %val} |
| 5080 | %agg3 = insertvalue {i32, {float}} undef, float %val, 1, 0 ; yields {i32 undef, {float %val}} |
| 5081 | |
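One use of ``insertvalue`` is assembling a struct to return from a function.
This is a sketch under that assumption; the function name ``@make_struct`` is
hypothetical.

.. code-block:: llvm

   define {i32, float} @make_struct(i32 %i, float %f) {
   entry:
     %s0 = insertvalue {i32, float} undef, i32 %i, 0     ; {i32 %i, float undef}
     %s1 = insertvalue {i32, float} %s0, float %f, 1     ; {i32 %i, float %f}
     ret {i32, float} %s1
   }
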
| 5082 | .. _memoryops: |
| 5083 | |
| 5084 | Memory Access and Addressing Operations |
| 5085 | --------------------------------------- |
| 5086 | |
| 5087 | A key design point of an SSA-based representation is how it represents |
| 5088 | memory. In LLVM, no memory locations are in SSA form, which makes things |
| 5089 | very simple. This section describes how to read, write, and allocate |
| 5090 | memory in LLVM. |
| 5091 | |
| 5092 | .. _i_alloca: |
| 5093 | |
| 5094 | '``alloca``' Instruction |
| 5095 | ^^^^^^^^^^^^^^^^^^^^^^^^ |
| 5096 | |
| 5097 | Syntax: |
| 5098 | """"""" |
| 5099 | |
| 5100 | :: |
| 5101 | |
| 5102 | <result> = alloca [inalloca] <type> [, <ty> <NumElements>] [, align <alignment>] ; yields type*:result |
| 5103 | |
| 5104 | Overview: |
| 5105 | """"""""" |
| 5106 | |
| 5107 | The '``alloca``' instruction allocates memory on the stack frame of the |
| 5108 | currently executing function, to be automatically released when this |
| 5109 | function returns to its caller. The object is always allocated in the |
| 5110 | generic address space (address space zero). |
| 5111 | |
| 5112 | Arguments: |
| 5113 | """""""""" |
| 5114 | |
| 5115 | The '``alloca``' instruction allocates ``sizeof(<type>)*NumElements`` |
| 5116 | bytes of memory on the runtime stack, returning a pointer of the |
| 5117 | appropriate type to the program. If "NumElements" is specified, it is |
| 5118 | the number of elements allocated; otherwise, "NumElements" defaults |
| 5119 | to one. If a constant alignment is specified, the result of the |
| 5120 | allocation is guaranteed to be aligned to at least that boundary. The |
| 5121 | alignment may not be greater than ``1 << 29``. If not specified, or if |
| 5122 | zero, the target can choose to align the allocation on any convenient |
| 5123 | boundary compatible with the type. |
| 5124 | |
| 5125 | '``type``' may be any sized type. |
| 5126 | |
| 5127 | Semantics: |
| 5128 | """""""""" |
| 5129 | |
| 5130 | Memory is allocated; a pointer is returned. The operation is undefined |
| 5131 | if there is insufficient stack space for the allocation. '``alloca``'d |
| 5132 | memory is automatically released when the function returns. The |
| 5133 | '``alloca``' instruction is commonly used to represent automatic |
| 5134 | variables that must have an address available. When the function returns |
| 5135 | (either with the ``ret`` or ``resume`` instructions), the memory is |
| 5136 | reclaimed. Allocating zero bytes is legal, but the result is undefined. |
| 5137 | The order in which memory is allocated (i.e., which way the stack grows) |
| 5138 | is not specified. |
| 5139 | |
| 5140 | Example: |
| 5141 | """""""" |
| 5142 | |
| 5143 | .. code-block:: llvm |
| 5144 | |
| 5145 | %ptr = alloca i32 ; yields i32*:ptr |
| 5146 | %ptr = alloca i32, i32 4 ; yields i32*:ptr |
| 5147 | %ptr = alloca i32, i32 4, align 1024 ; yields i32*:ptr |
| 5148 | %ptr = alloca i32, align 1024 ; yields i32*:ptr |
| 5149 | |
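The sketch below shows the typical pattern of an automatic variable: a stack
slot is allocated, written, and read back within one function. It follows the
``load`` and ``store`` syntax given in this document, and the function name
``@sum_two`` is hypothetical.

.. code-block:: llvm

   define i32 @sum_two(i32 %a, i32 %b) {
   entry:
     %slot = alloca i32, align 4             ; one i32 on the stack, yields i32*
     %buf = alloca i32, i32 4                ; space for four i32 elements (unused here)
     store i32 %a, i32* %slot, align 4
     %tmp = load i32* %slot, align 4
     %sum = add i32 %tmp, %b
     ret i32 %sum                            ; both allocations are released on return
   }
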
| 5150 | .. _i_load: |
| 5151 | |
| 5152 | '``load``' Instruction |
| 5153 | ^^^^^^^^^^^^^^^^^^^^^^ |
| 5154 | |
| 5155 | Syntax: |
| 5156 | """"""" |
| 5157 | |
| 5158 | :: |
| 5159 | |
| 5160 | <result> = load [volatile] <ty>* <pointer>[, align <alignment>][, !nontemporal !<index>][, !invariant.load !<index>][, !nonnull !<index>] |
| 5161 | <result> = load atomic [volatile] <ty>* <pointer> [singlethread] <ordering>, align <alignment> |
| 5162 | !<index> = !{ i32 1 } |
| 5163 | |
| 5164 | Overview: |
| 5165 | """"""""" |
| 5166 | |
| 5167 | The '``load``' instruction is used to read from memory. |
| 5168 | |
| 5169 | Arguments: |
| 5170 | """""""""" |
| 5171 | |
| 5172 | The argument to the ``load`` instruction specifies the memory address |
| 5173 | from which to load. The pointer must point to a :ref:`first |
| 5174 | class <t_firstclass>` type. If the ``load`` is marked as ``volatile``, |
| 5175 | then the optimizer is not allowed to modify the number or order of |
| 5176 | execution of this ``load`` with other :ref:`volatile |
| 5177 | operations <volatile>`. |
| 5178 | |
| 5179 | If the ``load`` is marked as ``atomic``, it takes an extra |
| 5180 | :ref:`ordering <ordering>` and optional ``singlethread`` argument. The |
| 5181 | ``release`` and ``acq_rel`` orderings are not valid on ``load`` |
| 5182 | instructions. Atomic loads produce :ref:`defined <memmodel>` results |
| 5183 | when they may see multiple atomic stores. The type of the pointee must |
| 5184 | be an integer type whose bit width is a power of two greater than or |
| 5185 | equal to eight and less than or equal to a target-specific size limit. |
| 5186 | ``align`` must be explicitly specified on atomic loads, and the load has |
| 5187 | undefined behavior if the alignment is not set to a value which is at |
| 5188 | least the size in bytes of the pointee. ``!nontemporal`` does not have |
| 5189 | any defined semantics for atomic loads. |
| 5190 | |
| 5191 | The optional constant ``align`` argument specifies the alignment of the |
| 5192 | operation (that is, the alignment of the memory address). A value of 0 |
Eli Bendersky | 239a78b | 2013-04-17 20:17:08 +0000 | [diff] [blame] | 5193 | or an omitted ``align`` argument means that the operation has the ABI |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 5194 | alignment for the target. It is the responsibility of the code emitter |
| 5195 | to ensure that the alignment information is correct. Overestimating the |
| 5196 | alignment results in undefined behavior. Underestimating the alignment |
Reid Kleckner | 15fe7a5 | 2014-07-15 01:16:09 +0000 | [diff] [blame] | 5197 | may produce less efficient code. An alignment of 1 is always safe. The |
| 5198 | maximum possible alignment is ``1 << 29``. |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 5199 | |
| 5200 | The optional ``!nontemporal`` metadata must reference a single |
Stefanus Du Toit | 736e2e2 | 2013-06-20 14:02:44 +0000 | [diff] [blame] | 5201 | metadata name ``<index>`` corresponding to a metadata node with one |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 5202 | ``i32`` entry of value 1. The existence of the ``!nontemporal`` |
Stefanus Du Toit | 736e2e2 | 2013-06-20 14:02:44 +0000 | [diff] [blame] | 5203 | metadata on the instruction tells the optimizer and code generator |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 5204 | that this load is not expected to be reused in the cache. The code |
| 5205 | generator may select special instructions to save cache bandwidth, such |
| 5206 | as the ``MOVNT`` instruction on x86. |
| 5207 | |
| 5208 | The optional ``!invariant.load`` metadata must reference a single |
Stefanus Du Toit | 736e2e2 | 2013-06-20 14:02:44 +0000 | [diff] [blame] | 5209 | metadata name ``<index>`` corresponding to a metadata node with no |
| 5210 | entries. The existence of the ``!invariant.load`` metadata on the |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 5211 | instruction tells the optimizer and code generator that this load |
| 5212 | address points to memory which does not change value during program |
| 5213 | execution. The optimizer may then move this load around, for example, by |
| 5214 | hoisting it out of loops using loop invariant code motion. |
| 5215 | |
Philip Reames | cdb72f3 | 2014-10-20 22:40:55 +0000 | [diff] [blame] | 5216 | The optional ``!nonnull`` metadata must reference a single |
| 5217 | metadata name ``<index>`` corresponding to a metadata node with no |
| 5218 | entries. The existence of the ``!nonnull`` metadata on the |
| 5219 | instruction tells the optimizer that the value loaded is known to |
| 5220 | never be null. This is analogous to the ''nonnull'' attribute |
| 5221 | on parameters and return values. This metadata can only be applied |
| 5222 | to loads of a pointer type. |
| 5223 | |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 5224 | Semantics: |
| 5225 | """""""""" |
| 5226 | |
| 5227 | The location of memory pointed to is loaded. If the value being loaded |
| 5228 | is of scalar type then the number of bytes read does not exceed the |
| 5229 | minimum number of bytes needed to hold all bits of the type. For |
| 5230 | example, loading an ``i24`` reads at most three bytes. When loading a |
| 5231 | value of a type like ``i20`` with a size that is not an integral number |
| 5232 | of bytes, the result is undefined if the value was not originally |
| 5233 | written using a store of the same type. |
| 5234 | |
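As a rough illustration of the byte-count rule above, the following C
sketch computes the smallest number of bytes that can hold an N-bit
scalar. The helper name ``min_bytes_for_bits`` is hypothetical and is not
part of any LLVM API.

.. code-block:: c

    #include <stdint.h>

    /* Hypothetical helper: minimum number of bytes needed to hold all
       bits of an N-bit scalar (the bit width rounded up to whole bytes). */
    uint32_t min_bytes_for_bits(uint32_t bits) {
        return (bits + 7u) / 8u;
    }

Under this sketch an ``i24`` needs three bytes, and an ``i20`` also needs
three bytes, four bits of which do not belong to the type.
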
Examples:
"""""""""

.. code-block:: llvm

    %ptr = alloca i32                               ; yields i32*:ptr
    store i32 3, i32* %ptr                          ; yields void
    %val = load i32* %ptr                           ; yields i32:val = i32 3

.. _i_store:

'``store``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    store [volatile] <ty> <value>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<index>]        ; yields void
    store atomic [volatile] <ty> <value>, <ty>* <pointer> [singlethread] <ordering>, align <alignment>  ; yields void

Overview:
"""""""""

The '``store``' instruction is used to write to memory.

Arguments:
""""""""""

There are two arguments to the ``store`` instruction: a value to store
and an address at which to store it. The type of the ``<pointer>``
operand must be a pointer to the :ref:`first class <t_firstclass>` type of
the ``<value>`` operand. If the ``store`` is marked as ``volatile``,
then the optimizer is not allowed to modify the number or order of
execution of this ``store`` with other :ref:`volatile
operations <volatile>`.

If the ``store`` is marked as ``atomic``, it takes an extra
:ref:`ordering <ordering>` and optional ``singlethread`` argument. The
``acquire`` and ``acq_rel`` orderings are not valid on ``store``
instructions. Atomic loads produce :ref:`defined <memmodel>` results
when they may see multiple atomic stores. The type of the pointee must
be an integer type whose bit width is a power of two greater than or
equal to eight and less than or equal to a target-specific size limit.
``align`` must be explicitly specified on atomic stores, and the store
has undefined behavior if the alignment is not set to a value which is
at least the size in bytes of the pointee. ``!nontemporal`` does not
have any defined semantics for atomic stores.

The optional constant ``align`` argument specifies the alignment of the
operation (that is, the alignment of the memory address). A value of 0
or an omitted ``align`` argument means that the operation has the ABI
alignment for the target. It is the responsibility of the code emitter
to ensure that the alignment information is correct. Overestimating the
alignment results in undefined behavior. Underestimating the
alignment may produce less efficient code. An alignment of 1 is always
safe. The maximum possible alignment is ``1 << 29``.

The optional ``!nontemporal`` metadata must reference a single metadata
name ``<index>`` corresponding to a metadata node with one ``i32`` entry of
value 1. The existence of the ``!nontemporal`` metadata on the instruction
tells the optimizer and code generator that this store is not expected to
be reused in the cache. The code generator may select special
instructions to save cache bandwidth, such as the ``MOVNT`` instruction on
x86.

Semantics:
""""""""""

The contents of memory are updated to contain ``<value>`` at the
location specified by the ``<pointer>`` operand. If ``<value>`` is
of scalar type then the number of bytes written does not exceed the
minimum number of bytes needed to hold all bits of the type. For
example, storing an ``i24`` writes at most three bytes. When writing a
value of a type like ``i20`` with a size that is not an integral number
of bytes, it is unspecified what happens to the extra bits that do not
belong to the type, but they will typically be overwritten.

Example:
""""""""

.. code-block:: llvm

    %ptr = alloca i32                               ; yields i32*:ptr
    store i32 3, i32* %ptr                          ; yields void
    %val = load i32* %ptr                           ; yields i32:val = i32 3

.. _i_fence:

'``fence``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    fence [singlethread] <ordering>                   ; yields void

Overview:
"""""""""

The '``fence``' instruction is used to introduce happens-before edges
between operations.

Arguments:
""""""""""

'``fence``' instructions take an :ref:`ordering <ordering>` argument which
defines what *synchronizes-with* edges they add. They can only be given
``acquire``, ``release``, ``acq_rel``, and ``seq_cst`` orderings.

Semantics:
""""""""""

A fence A which has (at least) ``release`` ordering semantics
*synchronizes with* a fence B with (at least) ``acquire`` ordering
semantics if and only if there exist atomic operations X and Y, both
operating on some atomic object M, such that A is sequenced before X, X
modifies M (either directly or through some side effect of a sequence
headed by X), Y is sequenced before B, and Y observes M. This provides a
*happens-before* dependency between A and B. Rather than an explicit
``fence``, one (but not both) of the atomic operations X or Y might
provide a ``release`` or ``acquire`` (resp.) ordering constraint and
still *synchronize-with* the explicit ``fence`` and establish the
*happens-before* edge.

A ``fence`` which has ``seq_cst`` ordering, in addition to having both
``acquire`` and ``release`` semantics specified above, participates in
the global program order of other ``seq_cst`` operations and/or fences.

The optional ":ref:`singlethread <singlethread>`" argument specifies
that the fence only synchronizes with other fences in the same thread.
(This is useful for interacting with signal handlers.)

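The A/X/Y/B pattern described above can be sketched with C11 fences. This
is only an illustration under assumptions: the names ``payload``,
``ready``, ``publish`` and ``try_consume`` are hypothetical, and C11's
``memory_order_relaxed`` stands in for LLVM's ``monotonic`` ordering.

.. code-block:: c

    #include <stdatomic.h>
    #include <stdbool.h>

    /* Hypothetical shared state: a payload guarded by an atomic flag M. */
    int payload;
    atomic_bool ready;

    /* Writer: the release fence (A) is sequenced before the store (X). */
    void publish(int value) {
        payload = value;
        atomic_thread_fence(memory_order_release);                 /* fence A */
        atomic_store_explicit(&ready, true, memory_order_relaxed); /* X modifies M */
    }

    /* Reader: the load (Y) is sequenced before the acquire fence (B). */
    bool try_consume(int *out) {
        if (!atomic_load_explicit(&ready, memory_order_relaxed))   /* Y observes M */
            return false;
        atomic_thread_fence(memory_order_acquire);                 /* fence B */
        *out = payload;  /* ordered after the writer's store to payload */
        return true;
    }

When ``try_consume`` returns true, fence A synchronizes with fence B, so
the read of ``payload`` happens after the write in ``publish``. For the
``singlethread`` case, the closest C11 counterpart is probably
``atomic_signal_fence``.
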
Example:
""""""""

.. code-block:: llvm

    fence acquire                          ; yields void
    fence singlethread seq_cst             ; yields void

.. _i_cmpxchg:

'``cmpxchg``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    cmpxchg [weak] [volatile] <ty>* <pointer>, <ty> <cmp>, <ty> <new> [singlethread] <success ordering> <failure ordering> ; yields  { ty, i1 }

Overview:
"""""""""

The '``cmpxchg``' instruction is used to atomically modify memory. It
loads a value in memory and compares it to a given value. If they are
equal, it tries to store a new value into the memory.

Arguments:
""""""""""

There are three arguments to the '``cmpxchg``' instruction: an address
to operate on, a value to compare to the value currently at that
address, and a new value to place at that address if the compared values
are equal. The type of '``<cmp>``' must be an integer type whose bit width
is a power of two greater than or equal to eight and less than or equal
to a target-specific size limit. '``<cmp>``' and '``<new>``' must have the
same type, and the type of '``<pointer>``' must be a pointer to that type.
If the ``cmpxchg`` is marked as ``volatile``, then the optimizer is not
allowed to modify the number or order of execution of this ``cmpxchg`` with
other :ref:`volatile operations <volatile>`.

The success and failure :ref:`ordering <ordering>` arguments specify how this
``cmpxchg`` synchronizes with other atomic operations. Both ordering parameters
must be at least ``monotonic``, the ordering constraint on failure must be no
stronger than that on success, and the failure ordering cannot be either
``release`` or ``acq_rel``.

The optional "``singlethread``" argument declares that the ``cmpxchg``
is only atomic with respect to code (usually signal handlers) running in
the same thread as the ``cmpxchg``. Otherwise the ``cmpxchg`` is atomic
with respect to all other code in the system.

The pointer passed into ``cmpxchg`` must have alignment greater than or
equal to the size in memory of the operand.

Semantics:
""""""""""

The contents of memory at the location specified by the '``<pointer>``' operand
are read and compared to '``<cmp>``'; if the read value is equal,
'``<new>``' is written. The original value at the location is returned, together
with a flag indicating success (true) or failure (false).

If the ``cmpxchg`` operation is marked as ``weak`` then a spurious failure is
permitted: the operation may not write ``<new>`` even if the comparison
matched.

If the ``cmpxchg`` operation is strong (the default), the ``i1`` value is 1 if
and only if the value loaded equals ``cmp``.

A successful ``cmpxchg`` is a read-modify-write instruction for the purpose of
identifying release sequences. A failed ``cmpxchg`` is equivalent to an atomic
load with an ordering parameter determined by the second ordering parameter.

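The ``{ ty, i1 }`` result described above can be modeled with the C11
strong compare-and-exchange. This is a sketch, not a definition of the
instruction: ``struct cmpxchg_result`` and ``cmpxchg_i32`` are
hypothetical names, and ``memory_order_relaxed`` is used as the C11
counterpart of ``monotonic``.

.. code-block:: c

    #include <stdatomic.h>
    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical mirror of the { i32, i1 } result pair. */
    struct cmpxchg_result {
        int32_t loaded;   /* value observed at the address */
        bool success;     /* true if the new value was written */
    };

    /* Strong compare-and-exchange with acq_rel success ordering and
       relaxed (monotonic) failure ordering. */
    struct cmpxchg_result cmpxchg_i32(_Atomic int32_t *ptr, int32_t cmp,
                                      int32_t new_value) {
        struct cmpxchg_result r;
        r.loaded = cmp;  /* on failure, C11 overwrites this with the observed value */
        r.success = atomic_compare_exchange_strong_explicit(
            ptr, &r.loaded, new_value,
            memory_order_acq_rel, memory_order_relaxed);
        return r;
    }

On success ``loaded`` equals ``cmp``, which is also the original value in
memory; on failure it holds the value that was found instead, and memory
is left unchanged.
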
Example:
""""""""

.. code-block:: llvm

    entry:
      %orig = load atomic i32* %ptr unordered, align 4                      ; yields i32
      br label %loop

    loop:
      %cmp = phi i32 [ %orig, %entry ], [ %value_loaded, %loop ]
      %squared = mul i32 %cmp, %cmp
      %val_success = cmpxchg i32* %ptr, i32 %cmp, i32 %squared acq_rel monotonic ; yields  { i32, i1 }
      %value_loaded = extractvalue { i32, i1 } %val_success, 0
      %success = extractvalue { i32, i1 } %val_success, 1
      br i1 %success, label %done, label %loop

    done:
      ...

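For comparison, the same retry loop can be sketched in C11. The function
name ``atomic_square`` is hypothetical, and because C11 has no
``unordered`` ordering the initial load uses ``memory_order_relaxed``
here.

.. code-block:: c

    #include <stdatomic.h>
    #include <stdint.h>

    /* Atomically replaces *ptr with its square and returns the previous
       value. uint32_t is used so that the multiplication wraps. */
    uint32_t atomic_square(_Atomic uint32_t *ptr) {
        uint32_t cmp = atomic_load_explicit(ptr, memory_order_relaxed);
        /* On failure, cmp is refreshed with the value currently in memory,
           which plays the role of the phi node in the IR loop. */
        while (!atomic_compare_exchange_strong_explicit(
                   ptr, &cmp, cmp * cmp,
                   memory_order_acq_rel, memory_order_relaxed)) {
        }
        return cmp;
    }

The loop exits only when the exchange succeeds, so the returned value is
the one that was actually squared.
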
.. _i_atomicrmw:

'``atomicrmw``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    atomicrmw [volatile] <operation> <ty>* <pointer>, <ty> <value> [singlethread] <ordering>                   ; yields ty

Overview:
"""""""""

The '``atomicrmw``' instruction is used to atomically modify memory.

Arguments:
""""""""""

There are three arguments to the '``atomicrmw``' instruction: an
operation to apply, an address whose value to modify, and an argument to
the operation. The operation must be one of the following keywords:

-  xchg
-  add
-  sub
-  and
-  nand
-  or
-  xor
-  max
-  min
-  umax
-  umin

The type of '``<value>``' must be an integer type whose bit width is a
power of two greater than or equal to eight and less than or equal to a
target-specific size limit. The type of the '``<pointer>``' operand must
be a pointer to that type. If the ``atomicrmw`` is marked as
``volatile``, then the optimizer is not allowed to modify the number or
order of execution of this ``atomicrmw`` with other :ref:`volatile
operations <volatile>`.

Semantics:
""""""""""

The contents of memory at the location specified by the '``<pointer>``'
operand are atomically read, modified, and written back. The original
value at the location is returned. The modification is specified by the
operation argument:

-  xchg: ``*ptr = val``
-  add: ``*ptr = *ptr + val``
-  sub: ``*ptr = *ptr - val``
-  and: ``*ptr = *ptr & val``
-  nand: ``*ptr = ~(*ptr & val)``
-  or: ``*ptr = *ptr | val``
-  xor: ``*ptr = *ptr ^ val``
-  max: ``*ptr = *ptr > val ? *ptr : val`` (using a signed comparison)
-  min: ``*ptr = *ptr < val ? *ptr : val`` (using a signed comparison)
-  umax: ``*ptr = *ptr > val ? *ptr : val`` (using an unsigned
   comparison)
-  umin: ``*ptr = *ptr < val ? *ptr : val`` (using an unsigned
   comparison)

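The table of modifications above can be sketched as a plain C function
over 32-bit values. This models only the arithmetic, not the atomicity;
the ``rmw_op`` enumeration and the ``rmw_apply`` and ``signed_key`` names
are hypothetical, and wrapping ``add``/``sub`` is an assumption of the
sketch.

.. code-block:: c

    #include <stdint.h>

    /* Hypothetical operation tags mirroring the atomicrmw keywords. */
    enum rmw_op { RMW_XCHG, RMW_ADD, RMW_SUB, RMW_AND, RMW_NAND, RMW_OR,
                  RMW_XOR, RMW_MAX, RMW_MIN, RMW_UMAX, RMW_UMIN };

    /* Flipping the sign bit maps signed order onto unsigned order, which
       avoids implementation-defined conversions to int32_t. */
    uint32_t signed_key(uint32_t x) { return x ^ 0x80000000u; }

    /* Returns the value that would be written back for a 32-bit operand. */
    uint32_t rmw_apply(enum rmw_op op, uint32_t old, uint32_t val) {
        switch (op) {
        case RMW_XCHG: return val;
        case RMW_ADD:  return old + val;
        case RMW_SUB:  return old - val;
        case RMW_AND:  return old & val;
        case RMW_NAND: return ~(old & val);
        case RMW_OR:   return old | val;
        case RMW_XOR:  return old ^ val;
        case RMW_MAX:  return signed_key(old) > signed_key(val) ? old : val;
        case RMW_MIN:  return signed_key(old) < signed_key(val) ? old : val;
        case RMW_UMAX: return old > val ? old : val;
        case RMW_UMIN: return old < val ? old : val;
        }
        return old; /* not reached for valid tags */
    }

The difference between the signed and unsigned variants shows up for the
bit pattern ``0xFFFFFFFF``: it is the smallest candidate for ``max``
(it reads as -1) but the largest for ``umax``.
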
Example:
""""""""

.. code-block:: llvm

    %old = atomicrmw add i32* %ptr, i32 1 acquire                        ; yields i32

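A C11 counterpart of this example, given as a sketch with the hypothetical
function name ``increment``, is an explicit fetch-and-add:

.. code-block:: c

    #include <stdatomic.h>
    #include <stdint.h>

    /* Atomically adds 1 to *ptr with acquire ordering and returns the
       value that was stored there before the addition. */
    int32_t increment(_Atomic int32_t *ptr) {
        return atomic_fetch_add_explicit(ptr, 1, memory_order_acquire);
    }

As with ``atomicrmw``, the return value is the original contents of the
location, not the updated one.
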
.. _i_getelementptr:

'``getelementptr``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = getelementptr <pty>* <ptrval>{, <ty> <idx>}*
    <result> = getelementptr inbounds <pty>* <ptrval>{, <ty> <idx>}*
    <result> = getelementptr <ptr vector> ptrval, <vector index type> idx

Overview:
"""""""""

The '``getelementptr``' instruction is used to get the address of a
subelement of an :ref:`aggregate <t_aggregate>` data structure. It performs
address calculation only and does not access memory.

Arguments:
""""""""""

The first argument is always a pointer or a vector of pointers, and
forms the basis of the calculation. The remaining arguments are indices
that indicate which of the elements of the aggregate object are indexed.
The interpretation of each index is dependent on the type being indexed
into. The first index always indexes the pointer value given as the
first argument, the second index indexes a value of the type pointed to
(not necessarily the value directly pointed to, since the first index
can be non-zero), etc. The first type indexed into must be a pointer
value, subsequent types can be arrays, vectors, and structs. Note that
subsequent types being indexed into can never be pointers, since that
would require loading the pointer before continuing calculation.

The type of each index argument depends on the type it is indexing into.
When indexing into an (optionally packed) structure, only ``i32`` integer
**constants** are allowed (when using a vector of indices they must all
be the **same** ``i32`` integer constant). When indexing into an array,
pointer or vector, integers of any width are allowed, and they are not
required to be constant. These integers are treated as signed values
where relevant.

For example, let's consider a C code fragment and how it gets compiled
to LLVM:

.. code-block:: c

    struct RT {
      char A;
      int B[10][20];
      char C;
    };
    struct ST {
      int X;
      double Y;
      struct RT Z;
    };

    int *foo(struct ST *s) {
      return &s[1].Z.B[5][13];
    }

The LLVM code generated by Clang is:

.. code-block:: llvm

    %struct.RT = type { i8, [10 x [20 x i32]], i8 }
    %struct.ST = type { i32, double, %struct.RT }

    define i32* @foo(%struct.ST* %s) nounwind uwtable readnone optsize ssp {
    entry:
      %arrayidx = getelementptr inbounds %struct.ST* %s, i64 1, i32 2, i32 1, i64 5, i64 13
      ret i32* %arrayidx
    }

Semantics:
""""""""""

In the example above, the first index is indexing into the
'``%struct.ST*``' type, which is a pointer, yielding a '``%struct.ST``'
= '``{ i32, double, %struct.RT }``' type, a structure. The second index
indexes into the third element of the structure, yielding a
'``%struct.RT``' = '``{ i8 , [10 x [20 x i32]], i8 }``' type, another
structure. The third index indexes into the second element of the
structure, yielding a '``[10 x [20 x i32]]``' type, an array. The two
dimensions of the array are subscripted into, yielding an '``i32``'
type. The '``getelementptr``' instruction returns a pointer to this
element, thus computing a value of '``i32*``' type.

Note that it is perfectly legal to index partially through a structure,
returning a pointer to an inner element. Because of this, the LLVM code
for the given testcase is equivalent to:

.. code-block:: llvm

    define i32* @foo(%struct.ST* %s) {
      %t1 = getelementptr %struct.ST* %s, i32 1                 ; yields %struct.ST*:%t1
      %t2 = getelementptr %struct.ST* %t1, i32 0, i32 2         ; yields %struct.RT*:%t2
      %t3 = getelementptr %struct.RT* %t2, i32 0, i32 1         ; yields [10 x [20 x i32]]*:%t3
      %t4 = getelementptr [10 x [20 x i32]]* %t3, i32 0, i32 5  ; yields [20 x i32]*:%t4
      %t5 = getelementptr [20 x i32]* %t4, i32 0, i32 13        ; yields i32*:%t5
      ret i32* %t5
    }

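The index walk described above amounts to summing one byte offset per
index. The following C sketch makes that sum explicit for the ``foo``
example using ``sizeof`` and ``offsetof``; the helper name ``foo_offset``
is hypothetical, and the actual byte values depend on the target's layout
rules.

.. code-block:: c

    #include <stddef.h>

    struct RT {
        char A;
        int B[10][20];
        char C;
    };
    struct ST {
        int X;
        double Y;
        struct RT Z;
    };

    int *foo(struct ST *s) {
        return &s[1].Z.B[5][13];
    }

    /* Byte offset implied by the indices 1, 2, 1, 5, 13. */
    size_t foo_offset(void) {
        return 1 * sizeof(struct ST)      /* i64 1: step over one struct ST */
             + offsetof(struct ST, Z)     /* i32 2: third field of struct ST */
             + offsetof(struct RT, B)     /* i32 1: second field of struct RT */
             + 5 * sizeof(int[20])        /* i64 5: sixth row of B */
             + 13 * sizeof(int);          /* i64 13: fourteenth element of the row */
    }

For any ``struct ST arr[2]``, the distance in bytes from ``arr`` to
``foo(arr)`` equals ``foo_offset()``, and no memory is read to compute
either.
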
If the ``inbounds`` keyword is present, the result value of the
``getelementptr`` is a :ref:`poison value <poisonvalues>` if the base
pointer is not an *in bounds* address of an allocated object, or if any
of the addresses that would be formed by successive addition of the
offsets implied by the indices to the base address with infinitely
precise signed arithmetic are not an *in bounds* address of that
allocated object. The *in bounds* addresses for an allocated object are
all the addresses that point into the object, plus the address one byte
past the end. In cases where the base is a vector of pointers the
``inbounds`` keyword applies to each of the computations element-wise.

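The definition of an *in bounds* address can be written as a small
predicate. This is a sketch with assumed names: ``is_in_bounds`` is
hypothetical, and the object is modeled as ``size`` bytes starting at
``base``.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical model: an allocated object occupies the bytes
       [base, base + size). */
    bool is_in_bounds(uint64_t base, uint64_t size, uint64_t addr) {
        /* The one-past-the-end address (addr - base == size) also counts. */
        return addr >= base && addr - base <= size;
    }

For a 16-byte object at address 100, the addresses 100 through 116 are in
bounds under this model, while 99 and 117 are not.
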
If the ``inbounds`` keyword is not present, the offsets are added to the
base address with silently-wrapping two's complement arithmetic. If the
offsets have a different width from the pointer, they are sign-extended
or truncated to the width of the pointer. The result value of the
``getelementptr`` may be outside the object pointed to by the base
pointer. The result value may not necessarily be used to access memory
though, even if it happens to point into allocated storage. See the
:ref:`Pointer Aliasing Rules <pointeraliasing>` section for more
information.

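The wrapping and sign-extension rules can be sketched numerically. The
function below is a hypothetical model that assumes 64-bit pointers and a
single ``i32`` index scaled by an element size; it is not how any
particular target lowers the instruction.

.. code-block:: c

    #include <stdint.h>

    /* Hypothetical model of a non-inbounds address computation with
       64-bit pointers: the i32 index is sign-extended to pointer width
       and the arithmetic wraps modulo 2^64. */
    uint64_t gep_no_inbounds(uint64_t base, int32_t index, uint64_t elem_size) {
        uint64_t extended = (uint64_t)(int64_t)index;  /* sign extension */
        return base + extended * elem_size;            /* wraps silently */
    }

With a base of 16, an index of -1 and 4-byte elements, the model yields
address 12. With a base of 0 the same index wraps around to the top of the
address space instead of producing an error.
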
The ``getelementptr`` instruction is often confusing. For some more insight
into how it works, see :doc:`the getelementptr FAQ <GetElementPtr>`.

Example:
""""""""

.. code-block:: llvm

        ; yields [12 x i8]*:aptr
        %aptr = getelementptr {i32, [12 x i8]}* %saptr, i64 0, i32 1
        ; yields i8*:vptr
        %vptr = getelementptr {i32, <2 x i8>}* %svptr, i64 0, i32 1, i32 1
        ; yields i8*:eptr
        %eptr = getelementptr [12 x i8]* %aptr, i64 0, i32 1
        ; yields i32*:iptr
        %iptr = getelementptr [10 x i32]* @arr, i16 0, i16 0

In cases where the pointer argument is a vector of pointers, each index
must be a vector with the same number of elements. For example:

.. code-block:: llvm

     %A = getelementptr <4 x i8*> %ptrs, <4 x i64> %offsets

Conversion Operations
---------------------

The instructions in this category are the conversion instructions
(casting) which all take a single operand and a type. They perform
various bit conversions on the operand.

'``trunc .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = trunc <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``trunc``' instruction truncates its operand to the type ``ty2``.

Arguments:
""""""""""

The '``trunc``' instruction takes a value to truncate and a type to
truncate it to. Both types must be :ref:`integer <t_integer>` types, or
vectors of the same number of integers. The bit size of the ``value`` must
be larger than the bit size of the destination type, ``ty2``. Equal sized
types are not allowed.

Semantics:
""""""""""

The '``trunc``' instruction truncates the high order bits in ``value``
and converts the remaining bits to ``ty2``. Since the source size must
be larger than the destination size, ``trunc`` cannot be a *no-op cast*.
It will always truncate bits.

Example:
""""""""

.. code-block:: llvm

    %X = trunc i32 257 to i8                        ; yields i8:1
    %Y = trunc i32 123 to i1                        ; yields i1:true
    %Z = trunc i32 122 to i1                        ; yields i1:false
    %W = trunc <2 x i16> <i16 8, i16 7> to <2 x i8> ; yields <i8 8, i8 7>

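The scalar results in the example above follow from keeping only the low
bits of the source value. A minimal C sketch of that rule, using the
hypothetical helper name ``trunc_u32`` and covering destination widths of
1 to 31 bits only:

.. code-block:: c

    #include <stdint.h>

    /* Keeps the low `bits` bits of v; valid for 1 <= bits <= 31. */
    uint32_t trunc_u32(uint32_t v, unsigned bits) {
        return v & ((UINT32_C(1) << bits) - 1u);
    }

Truncating 257 to 8 bits leaves 1, and truncating to 1 bit keeps only the
parity of the value, which is why 123 becomes ``true`` and 122 becomes
``false``.
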
'``zext .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = zext <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``zext``' instruction zero extends its operand to type ``ty2``.

Arguments:
""""""""""

The '``zext``' instruction takes a value to cast, and a type to cast it
to. Both types must be :ref:`integer <t_integer>` types, or vectors of
the same number of integers. The bit size of the ``value`` must be
smaller than the bit size of the destination type, ``ty2``.

Semantics:
""""""""""

The ``zext`` fills the high order bits of the ``value`` with zero bits
until it reaches the size of the destination type, ``ty2``.

When zero extending from ``i1``, the result will always be either 0 or 1.

Example:
""""""""

.. code-block:: llvm

    %X = zext i32 257 to i64              ; yields i64:257
    %Y = zext i1 true to i32              ; yields i32:1
    %Z = zext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>

| 5778 | '``sext .. to``' Instruction |
| 5779 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 5780 | |
| 5781 | Syntax: |
| 5782 | """"""" |
| 5783 | |
| 5784 | :: |
| 5785 | |
| 5786 | <result> = sext <ty> <value> to <ty2> ; yields ty2 |
| 5787 | |
| 5788 | Overview: |
| 5789 | """"""""" |
| 5790 | |
| 5791 | The '``sext``' sign extends ``value`` to the type ``ty2``. |
| 5792 | |
| 5793 | Arguments: |
| 5794 | """""""""" |
| 5795 | |
| 5796 | The '``sext``' instruction takes a value to cast, and a type to cast it |
| 5797 | to. Both types must be :ref:`integer <t_integer>` types, or vectors of |
| 5798 | the same number of integers. The bit size of the ``value`` must be |
| 5799 | smaller than the bit size of the destination type, ``ty2``. |
| 5800 | |
| 5801 | Semantics: |
| 5802 | """""""""" |
| 5803 | |
| 5804 | The '``sext``' instruction performs a sign extension by copying the sign |
| 5805 | bit (highest order bit) of the ``value`` until it reaches the bit size |
| 5806 | of the type ``ty2``. |
| 5807 | |
| 5808 | When sign extending from i1, the extension always results in -1 or 0. |
| 5809 | |
| 5810 | Example: |
| 5811 | """""""" |
| 5812 | |
| 5813 | .. code-block:: llvm |
| 5814 | |
| 5815 | %X = sext i8 -1 to i16 ; yields i16:-1 (bit pattern 0xFFFF, i.e. 65535 if read as unsigned) |
| 5816 | %Y = sext i1 true to i32 ; yields i32:-1 |
| 5817 | %Z = sext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7> |
| 5818 | |
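The difference between sign extension and zero extension can be sketched by
applying both to the same operand. The function name ``@extensions_agree``
is hypothetical and used only for illustration.

.. code-block:: llvm

    ; Hypothetical helper: compares the zero- and sign-extended forms of %b.
    define i1 @extensions_agree(i8 %b) {
    entry:
      %z = zext i8 %b to i32            ; high bits are always zero
      %s = sext i8 %b to i32            ; high bits are copies of bit 7 of %b
      %same = icmp eq i32 %z, %s        ; true exactly when bit 7 of %b is clear
      ret i1 %same
    }

For ``%b`` equal to ``i8 -1`` the two results are 255 and -1 respectively, so
the function returns ``false``; for ``i8 127`` both results are 127.
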
| 5819 | '``fptrunc .. to``' Instruction |
| 5820 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 5821 | |
| 5822 | Syntax: |
| 5823 | """"""" |
| 5824 | |
| 5825 | :: |
| 5826 | |
| 5827 | <result> = fptrunc <ty> <value> to <ty2> ; yields ty2 |
| 5828 | |
| 5829 | Overview: |
| 5830 | """"""""" |
| 5831 | |
| 5832 | The '``fptrunc``' instruction truncates ``value`` to type ``ty2``. |
| 5833 | |
| 5834 | Arguments: |
| 5835 | """""""""" |
| 5836 | |
| 5837 | The '``fptrunc``' instruction takes a :ref:`floating point <t_floating>` |
| 5838 | value to cast and a :ref:`floating point <t_floating>` type to cast it to. |
| 5839 | The size of ``value`` must be larger than the size of ``ty2``. This |
| 5840 | implies that ``fptrunc`` cannot be used to make a *no-op cast*. |
| 5841 | |
| 5842 | Semantics: |
| 5843 | """""""""" |
| 5844 | |
| 5845 | The '``fptrunc``' instruction truncates a ``value`` from a larger |
| 5846 | :ref:`floating point <t_floating>` type to a smaller :ref:`floating |
| 5847 | point <t_floating>` type. If the value cannot fit within the |
| 5848 | destination type, ``ty2``, then the results are undefined. |
| 5849 | |
| 5850 | Example: |
| 5851 | """""""" |
| 5852 | |
| 5853 | .. code-block:: llvm |
| 5854 | |
| 5855 | %X = fptrunc double 123.0 to float ; yields float:123.0 |
| 5856 | %Y = fptrunc double 1.0E+300 to float ; yields undefined |
| 5857 | |
| 5858 | '``fpext .. to``' Instruction |
| 5859 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 5860 | |
| 5861 | Syntax: |
| 5862 | """"""" |
| 5863 | |
| 5864 | :: |
| 5865 | |
| 5866 | <result> = fpext <ty> <value> to <ty2> ; yields ty2 |
| 5867 | |
| 5868 | Overview: |
| 5869 | """"""""" |
| 5870 | |
| 5871 | The '``fpext``' instruction extends a floating point ``value`` to a larger floating |
| 5872 | point value. |
| 5873 | |
| 5874 | Arguments: |
| 5875 | """""""""" |
| 5876 | |
| 5877 | The '``fpext``' instruction takes a :ref:`floating point <t_floating>` |
| 5878 | ``value`` to cast, and a :ref:`floating point <t_floating>` type to cast it |
| 5879 | to. The source type must be smaller than the destination type. |
| 5880 | |
| 5881 | Semantics: |
| 5882 | """""""""" |
| 5883 | |
| 5884 | The '``fpext``' instruction extends the ``value`` from a smaller |
| 5885 | :ref:`floating point <t_floating>` type to a larger :ref:`floating |
| 5886 | point <t_floating>` type. The ``fpext`` cannot be used to make a |
| 5887 | *no-op cast* because it always changes bits. Use ``bitcast`` to make a |
| 5888 | *no-op cast* for a floating point cast. |
| 5889 | |
| 5890 | Example: |
| 5891 | """""""" |
| 5892 | |
| 5893 | .. code-block:: llvm |
| 5894 | |
| 5895 | %X = fpext float 3.125 to double ; yields double:3.125000e+00 |
| 5896 | %Y = fpext double %X to fp128 ; yields fp128:0xL00000000000000004000900000000000 |
| 5897 | |
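A short sketch of how ``fpext`` and ``fptrunc`` pair up is given below. It
assumes the usual IEEE formats for ``float`` and ``double``; the function
name ``@roundtrip`` is hypothetical.

.. code-block:: llvm

    ; Hypothetical helper: widens a float to double and narrows it again.
    define float @roundtrip(float %f) {
    entry:
      %d = fpext float %f to double       ; every float value is representable as a double
      %back = fptrunc double %d to float  ; the value fits, so nothing is lost
      ret float %back
    }

For any non-NaN input the returned value equals ``%f``, because the widened
value always fits in the smaller type.
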
| 5898 | '``fptoui .. to``' Instruction |
| 5899 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 5900 | |
| 5901 | Syntax: |
| 5902 | """"""" |
| 5903 | |
| 5904 | :: |
| 5905 | |
| 5906 | <result> = fptoui <ty> <value> to <ty2> ; yields ty2 |
| 5907 | |
| 5908 | Overview: |
| 5909 | """"""""" |
| 5910 | |
| 5911 | The '``fptoui``' instruction converts a floating point ``value`` to its unsigned |
| 5912 | integer equivalent of type ``ty2``. |
| 5913 | |
| 5914 | Arguments: |
| 5915 | """""""""" |
| 5916 | |
| 5917 | The '``fptoui``' instruction takes a value to cast, which must be a |
| 5918 | scalar or vector :ref:`floating point <t_floating>` value, and a type to |
| 5919 | cast it to, ``ty2``, which must be an :ref:`integer <t_integer>` type. If |
| 5920 | ``ty`` is a vector floating point type, ``ty2`` must be a vector integer |
| 5921 | type with the same number of elements as ``ty``. |
| 5922 | |
| 5923 | Semantics: |
| 5924 | """""""""" |
| 5925 | |
| 5926 | The '``fptoui``' instruction converts its :ref:`floating |
| 5927 | point <t_floating>` operand into the nearest (rounding towards zero) |
| 5928 | unsigned integer value. If the value cannot fit in ``ty2``, the results |
| 5929 | are undefined. |
| 5930 | |
| 5931 | Example: |
| 5932 | """""""" |
| 5933 | |
| 5934 | .. code-block:: llvm |
| 5935 | |
| 5936 | %X = fptoui double 123.0 to i32 ; yields i32:123 |
| 5937 | %Y = fptoui double 1.0E+300 to i1 ; yields undefined |
| 5938 | %Z = fptoui double 1.04E+17 to i8 ; yields undefined |
| 5939 | |
| 5940 | '``fptosi .. to``' Instruction |
| 5941 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 5942 | |
| 5943 | Syntax: |
| 5944 | """"""" |
| 5945 | |
| 5946 | :: |
| 5947 | |
| 5948 | <result> = fptosi <ty> <value> to <ty2> ; yields ty2 |
| 5949 | |
| 5950 | Overview: |
| 5951 | """"""""" |
| 5952 | |
| 5953 | The '``fptosi``' instruction converts :ref:`floating point <t_floating>` |
| 5954 | ``value`` to type ``ty2``. |
| 5955 | |
| 5956 | Arguments: |
| 5957 | """""""""" |
| 5958 | |
| 5959 | The '``fptosi``' instruction takes a value to cast, which must be a |
| 5960 | scalar or vector :ref:`floating point <t_floating>` value, and a type to |
| 5961 | cast it to, ``ty2``, which must be an :ref:`integer <t_integer>` type. If |
| 5962 | ``ty`` is a vector floating point type, ``ty2`` must be a vector integer |
| 5963 | type with the same number of elements as ``ty``. |
| 5964 | |
| 5965 | Semantics: |
| 5966 | """""""""" |
| 5967 | |
| 5968 | The '``fptosi``' instruction converts its :ref:`floating |
| 5969 | point <t_floating>` operand into the nearest (rounding towards zero) |
| 5970 | signed integer value. If the value cannot fit in ``ty2``, the results |
| 5971 | are undefined. |
| 5972 | |
| 5973 | Example: |
| 5974 | """""""" |
| 5975 | |
| 5976 | .. code-block:: llvm |
| 5977 | |
| 5978 | %X = fptosi double -123.0 to i32 ; yields i32:-123 |
| 5979 | %Y = fptosi double 1.0E+300 to i1 ; yields undefined |
| 5980 | %Z = fptosi double 1.04E+17 to i8 ; yields undefined |
| 5981 | |
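The rounding rule shared by ``fptoui`` and ``fptosi`` (towards zero) can be
sketched as follows. The function name ``@truncate_toward_zero`` is
hypothetical.

.. code-block:: llvm

    ; Hypothetical helper: drops the fractional part of %d.
    define i32 @truncate_toward_zero(double %d) {
    entry:
      %i = fptosi double %d to i32      ; rounds towards zero, not towards -infinity
      ret i32 %i
    }

An input of 2.75 gives 2 and an input of -2.75 gives -2. Inputs whose
truncated value does not fit in ``i32`` give undefined results, as described
above.
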
| 5982 | '``uitofp .. to``' Instruction |
| 5983 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 5984 | |
| 5985 | Syntax: |
| 5986 | """"""" |
| 5987 | |
| 5988 | :: |
| 5989 | |
| 5990 | <result> = uitofp <ty> <value> to <ty2> ; yields ty2 |
| 5991 | |
| 5992 | Overview: |
| 5993 | """"""""" |
| 5994 | |
| 5995 | The '``uitofp``' instruction regards ``value`` as an unsigned integer |
| 5996 | and converts that value to the ``ty2`` type. |
| 5997 | |
| 5998 | Arguments: |
| 5999 | """""""""" |
| 6000 | |
| 6001 | The '``uitofp``' instruction takes a value to cast, which must be a |
| 6002 | scalar or vector :ref:`integer <t_integer>` value, and a type to cast it to, |
| 6003 | ``ty2``, which must be a :ref:`floating point <t_floating>` type. If |
| 6004 | ``ty`` is a vector integer type, ``ty2`` must be a vector floating point |
| 6005 | type with the same number of elements as ``ty``. |
| 6006 | |
| 6007 | Semantics: |
| 6008 | """""""""" |
| 6009 | |
| 6010 | The '``uitofp``' instruction interprets its operand as an unsigned |
| 6011 | integer quantity and converts it to the corresponding floating point |
| 6012 | value. If the value cannot fit in the floating point value, the results |
| 6013 | are undefined. |
| 6014 | |
| 6015 | Example: |
| 6016 | """""""" |
| 6017 | |
| 6018 | .. code-block:: llvm |
| 6019 | |
| 6020 | %X = uitofp i32 257 to float ; yields float:257.0 |
| 6021 | %Y = uitofp i8 -1 to double ; yields double:255.0 |
| 6022 | |
| 6023 | '``sitofp .. to``' Instruction |
| 6024 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 6025 | |
| 6026 | Syntax: |
| 6027 | """"""" |
| 6028 | |
| 6029 | :: |
| 6030 | |
| 6031 | <result> = sitofp <ty> <value> to <ty2> ; yields ty2 |
| 6032 | |
| 6033 | Overview: |
| 6034 | """"""""" |
| 6035 | |
| 6036 | The '``sitofp``' instruction regards ``value`` as a signed integer and |
| 6037 | converts that value to the ``ty2`` type. |
| 6038 | |
| 6039 | Arguments: |
| 6040 | """""""""" |
| 6041 | |
| 6042 | The '``sitofp``' instruction takes a value to cast, which must be a |
| 6043 | scalar or vector :ref:`integer <t_integer>` value, and a type to cast it to, |
| 6044 | ``ty2``, which must be a :ref:`floating point <t_floating>` type. If |
| 6045 | ``ty`` is a vector integer type, ``ty2`` must be a vector floating point |
| 6046 | type with the same number of elements as ``ty``. |
| 6047 | |
| 6048 | Semantics: |
| 6049 | """""""""" |
| 6050 | |
| 6051 | The '``sitofp``' instruction interprets its operand as a signed integer |
| 6052 | quantity and converts it to the corresponding floating point value. If |
| 6053 | the value cannot fit in the floating point value, the results are |
| 6054 | undefined. |
| 6055 | |
| 6056 | Example: |
| 6057 | """""""" |
| 6058 | |
| 6059 | .. code-block:: llvm |
| 6060 | |
| 6061 | %X = sitofp i32 257 to float ; yields float:257.0 |
| 6062 | %Y = sitofp i8 -1 to double ; yields double:-1.0 |
| 6063 | |
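Since ``uitofp`` and ``sitofp`` differ only in how they interpret the operand
bits, converting the same value both ways makes the distinction visible. This
is a sketch only; the function name ``@unsigned_minus_signed`` is
hypothetical.

.. code-block:: llvm

    ; Hypothetical helper: converts %b both ways and returns the difference.
    define double @unsigned_minus_signed(i8 %b) {
    entry:
      %u = uitofp i8 %b to double       ; reads %b as a value in 0 .. 255
      %s = sitofp i8 %b to double       ; reads %b as a value in -128 .. 127
      %diff = fsub double %u, %s        ; 256.0 if the sign bit of %b is set, else 0.0
      ret double %diff
    }

For ``i8 -1`` the two conversions give 255.0 and -1.0, so the result is 256.0.
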
| 6064 | .. _i_ptrtoint: |
| 6065 | |
| 6066 | '``ptrtoint .. to``' Instruction |
| 6067 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 6068 | |
| 6069 | Syntax: |
| 6070 | """"""" |
| 6071 | |
| 6072 | :: |
| 6073 | |
| 6074 | <result> = ptrtoint <ty> <value> to <ty2> ; yields ty2 |
| 6075 | |
| 6076 | Overview: |
| 6077 | """"""""" |
| 6078 | |
| 6079 | The '``ptrtoint``' instruction converts the pointer or a vector of |
| 6080 | pointers ``value`` to the integer (or vector of integers) type ``ty2``. |
| 6081 | |
| 6082 | Arguments: |
| 6083 | """""""""" |
| 6084 | |
| 6085 | The '``ptrtoint``' instruction takes a ``value`` to cast, which must be |
| 6086 | a value of type :ref:`pointer <t_pointer>` or a vector of pointers, and a |
| 6087 | type to cast it to, ``ty2``, which must be an :ref:`integer <t_integer>` |
| 6088 | type or a vector of integers. |
| 6089 | |
| 6090 | Semantics: |
| 6091 | """""""""" |
| 6092 | |
| 6093 | The '``ptrtoint``' instruction converts ``value`` to integer type |
| 6094 | ``ty2`` by interpreting the pointer value as an integer and either |
| 6095 | truncating or zero extending that value to the size of the integer type. |
| 6096 | If ``value`` is smaller than ``ty2`` then a zero extension is done. If |
| 6097 | ``value`` is larger than ``ty2`` then a truncation is done. If they are |
| 6098 | the same size, then nothing is done (*no-op cast*) other than a type |
| 6099 | change. |
| 6100 | |
| 6101 | Example: |
| 6102 | """""""" |
| 6103 | |
| 6104 | .. code-block:: llvm |
| 6105 | |
| 6106 | %X = ptrtoint i32* %P to i8 ; yields truncation on 32-bit architecture |
| 6107 | %Y = ptrtoint i32* %P to i64 ; yields zero extension on 32-bit architecture |
| 6108 | %Z = ptrtoint <4 x i32*> %PV to <4 x i64> ; yields vector zero extension for a vector of addresses on 32-bit architecture |
| 6109 | |
| 6110 | .. _i_inttoptr: |
| 6111 | |
| 6112 | '``inttoptr .. to``' Instruction |
| 6113 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 6114 | |
| 6115 | Syntax: |
| 6116 | """"""" |
| 6117 | |
| 6118 | :: |
| 6119 | |
| 6120 | <result> = inttoptr <ty> <value> to <ty2> ; yields ty2 |
| 6121 | |
| 6122 | Overview: |
| 6123 | """"""""" |
| 6124 | |
| 6125 | The '``inttoptr``' instruction converts an integer ``value`` to a |
| 6126 | pointer type, ``ty2``. |
| 6127 | |
| 6128 | Arguments: |
| 6129 | """""""""" |
| 6130 | |
| 6131 | The '``inttoptr``' instruction takes an :ref:`integer <t_integer>` value to |
| 6132 | cast, and a type to cast it to, which must be a :ref:`pointer <t_pointer>` |
| 6133 | type. |
| 6134 | |
| 6135 | Semantics: |
| 6136 | """""""""" |
| 6137 | |
| 6138 | The '``inttoptr``' instruction converts ``value`` to type ``ty2`` by |
| 6139 | applying either a zero extension or a truncation depending on the size |
| 6140 | of the integer ``value``. If ``value`` is larger than the size of a |
| 6141 | pointer then a truncation is done. If ``value`` is smaller than the size |
| 6142 | of a pointer then a zero extension is done. If they are the same size, |
| 6143 | nothing is done (*no-op cast*). |
| 6144 | |
| 6145 | Example: |
| 6146 | """""""" |
| 6147 | |
| 6148 | .. code-block:: llvm |
| 6149 | |
| 6150 | %X = inttoptr i32 255 to i32* ; yields zero extension on 64-bit architecture |
| 6151 | %Y = inttoptr i32 255 to i32* ; yields no-op on 32-bit architecture |
| 6152 | %Z = inttoptr i64 0 to i32* ; yields truncation on 32-bit architecture |
| 6153 | %W = inttoptr <4 x i32> %G to <4 x i8*> ; yields a vector of four pointers, one per element of G |
| 6154 | |
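``ptrtoint`` and ``inttoptr`` are commonly used as a pair. The sketch below
assumes a target whose pointers are no wider than 64 bits, so that no address
bits are truncated; the function name ``@roundtrip_ptr`` is hypothetical.

.. code-block:: llvm

    ; Hypothetical helper: converts a pointer to an integer and back.
    ; Assumes pointers are at most 64 bits wide on the target.
    define i8* @roundtrip_ptr(i8* %p) {
    entry:
      %addr = ptrtoint i8* %p to i64    ; zero extended (or a no-op) under the assumption above
      %q = inttoptr i64 %addr to i8*    ; truncated back (or a no-op) to pointer width
      ret i8* %q
    }

On a target with wider pointers the first conversion would truncate, and the
round trip would no longer preserve the address.
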
| 6155 | .. _i_bitcast: |
| 6156 | |
| 6157 | '``bitcast .. to``' Instruction |
| 6158 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 6159 | |
| 6160 | Syntax: |
| 6161 | """"""" |
| 6162 | |
| 6163 | :: |
| 6164 | |
| 6165 | <result> = bitcast <ty> <value> to <ty2> ; yields ty2 |
| 6166 | |
| 6167 | Overview: |
| 6168 | """"""""" |
| 6169 | |
| 6170 | The '``bitcast``' instruction converts ``value`` to type ``ty2`` without |
| 6171 | changing any bits. |
| 6172 | |
| 6173 | Arguments: |
| 6174 | """""""""" |
| 6175 | |
| 6176 | The '``bitcast``' instruction takes a value to cast, which must be a |
| 6177 | non-aggregate first class value, and a type to cast it to, which must |
| 6178 | also be a non-aggregate :ref:`first class <t_firstclass>` type. The |
| 6179 | bit sizes of ``value`` and the destination type, ``ty2``, must be |
| 6180 | identical. If the source type is a pointer, the destination type must |
| 6181 | also be a pointer of the same size. This instruction supports bitwise |
| 6182 | conversion of vectors to integers and to vectors of other types (as |
| 6183 | long as they have the same size). |
| 6184 | |
| 6185 | Semantics: |
| 6186 | """""""""" |
| 6187 | |
| 6188 | The '``bitcast``' instruction converts ``value`` to type ``ty2``. It |
| 6189 | is always a *no-op cast* because no bits change with this |
| 6190 | conversion. The conversion is done as if the ``value`` had been stored |
| 6191 | to memory and read back as type ``ty2``. Pointer (or vector of |
| 6192 | pointers) types may only be converted to other pointer (or vector of |
| 6193 | pointers) types with the same address space through this instruction. |
| 6194 | To convert pointers to other types, use the :ref:`inttoptr <i_inttoptr>` |
| 6195 | or :ref:`ptrtoint <i_ptrtoint>` instructions first. |
| 6196 | |
| 6197 | Example: |
| 6198 | """""""" |
| 6199 | |
| 6200 | .. code-block:: llvm |
| 6201 | |
| 6202 | %X = bitcast i8 255 to i8 ; yields i8:-1 |
| 6203 | %Y = bitcast i32* %x to i16* ; yields i16*:%x |
| 6204 | %Z = bitcast <2 x i32> %V to i64 ; yields i64:%V |
| 6205 | %W = bitcast <2 x i32*> %VP to <2 x i64*> ; yields <2 x i64*> |
| 6206 | |
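A typical use of ``bitcast`` between non-pointer types is to inspect the bit
pattern of a floating point value. The following is a minimal sketch; the
function name ``@float_bits`` is hypothetical.

.. code-block:: llvm

    ; Hypothetical helper: returns the 32 bits of %f as an integer.
    define i32 @float_bits(float %f) {
    entry:
      %bits = bitcast float %f to i32   ; both types are 32 bits wide; no bits change
      ret i32 %bits
    }

For an input of ``1.0`` the result is 1065353216 (``0x3F800000``), the IEEE
single precision encoding of that value. A ``fptoui`` of the same input would
instead yield 1, since that instruction converts the value rather than
reinterpreting the bits.
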
| 6207 | .. _i_addrspacecast: |
| 6208 | |
| 6209 | '``addrspacecast .. to``' Instruction |
| 6210 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 6211 | |
| 6212 | Syntax: |
| 6213 | """"""" |
| 6214 | |
| 6215 | :: |
| 6216 | |
| 6217 | <result> = addrspacecast <pty> <ptrval> to <pty2> ; yields pty2 |
| 6218 | |
| 6219 | Overview: |
| 6220 | """"""""" |
| 6221 | |
| 6222 | The '``addrspacecast``' instruction converts ``ptrval`` from ``pty`` in |
| 6223 | address space ``n`` to type ``pty2`` in address space ``m``. |
| 6224 | |
| 6225 | Arguments: |
| 6226 | """""""""" |
| 6227 | |
| 6228 | The '``addrspacecast``' instruction takes a pointer or vector of pointers value |
| 6229 | to cast and a pointer (or vector of pointers) type to cast it to, which must |
| 6230 | have a different address space. |
| 6231 | |
| 6232 | Semantics: |
| 6233 | """""""""" |
| 6234 | |
| 6235 | The '``addrspacecast``' instruction converts the pointer value |
| 6236 | ``ptrval`` to type ``pty2``. It can be a *no-op cast* or a complex |
| 6237 | value modification, depending on the target and the address space |
| 6238 | pair. Pointer conversions within the same address space must be |
| 6239 | performed with the ``bitcast`` instruction. Note that if the address space |
| 6240 | conversion is legal then both result and operand refer to the same memory |
| 6241 | location. |
| 6242 | |
| 6243 | Example: |
| 6244 | """""""" |
| 6245 | |
| 6246 | .. code-block:: llvm |
| 6247 | |
| 6248 | %X = addrspacecast i32* %x to i32 addrspace(1)* ; yields i32 addrspace(1)*:%x |
| 6249 | %Y = addrspacecast i32 addrspace(1)* %y to i64 addrspace(2)* ; yields i64 addrspace(2)*:%y |
| 6250 | %Z = addrspacecast <4 x i32*> %z to <4 x float addrspace(3)*> ; yields <4 x float addrspace(3)*>:%z |
| 6251 | |
| 6252 | .. _otherops: |
| 6253 | |
| 6254 | Other Operations |
| 6255 | ---------------- |
| 6256 | |
| 6257 | The instructions in this category are the "miscellaneous" instructions, |
| 6258 | which defy better classification. |
| 6259 | |
| 6260 | .. _i_icmp: |
| 6261 | |
| 6262 | '``icmp``' Instruction |
| 6263 | ^^^^^^^^^^^^^^^^^^^^^^ |
| 6264 | |
| 6265 | Syntax: |
| 6266 | """"""" |
| 6267 | |
| 6268 | :: |
| 6269 | |
| 6270 | <result> = icmp <cond> <ty> <op1>, <op2> ; yields i1 or <N x i1>:result |
| 6271 | |
| 6272 | Overview: |
| 6273 | """"""""" |
| 6274 | |
| 6275 | The '``icmp``' instruction returns a boolean value or a vector of |
| 6276 | boolean values based on comparison of its two integer, integer vector, |
| 6277 | pointer, or pointer vector operands. |
| 6278 | |
| 6279 | Arguments: |
| 6280 | """""""""" |
| 6281 | |
| 6282 | The '``icmp``' instruction takes three operands. The first operand is |
| 6283 | the condition code indicating the kind of comparison to perform. It is |
| 6284 | not a value, just a keyword. The possible condition codes are: |
| 6285 | |
| 6286 | #. ``eq``: equal |
| 6287 | #. ``ne``: not equal |
| 6288 | #. ``ugt``: unsigned greater than |
| 6289 | #. ``uge``: unsigned greater or equal |
| 6290 | #. ``ult``: unsigned less than |
| 6291 | #. ``ule``: unsigned less or equal |
| 6292 | #. ``sgt``: signed greater than |
| 6293 | #. ``sge``: signed greater or equal |
| 6294 | #. ``slt``: signed less than |
| 6295 | #. ``sle``: signed less or equal |
| 6296 | |
| 6297 | The remaining two arguments must be :ref:`integer <t_integer>`, |
| 6298 | :ref:`pointer <t_pointer>`, or integer or pointer :ref:`vector <t_vector>` |
| 6299 | typed. They must also be identical types. |
| 6300 | |
| 6301 | Semantics: |
| 6302 | """""""""" |
| 6303 | |
| 6304 | The '``icmp``' compares ``op1`` and ``op2`` according to the condition |
| 6305 | code given as ``cond``. The comparison performed always yields either an |
| 6306 | :ref:`i1 <t_integer>` or vector of ``i1`` result, as follows: |
| 6307 | |
| 6308 | #. ``eq``: yields ``true`` if the operands are equal, ``false`` |
| 6309 | otherwise. No sign interpretation is necessary or performed. |
| 6310 | #. ``ne``: yields ``true`` if the operands are unequal, ``false`` |
| 6311 | otherwise. No sign interpretation is necessary or performed. |
| 6312 | #. ``ugt``: interprets the operands as unsigned values and yields |
| 6313 | ``true`` if ``op1`` is greater than ``op2``. |
| 6314 | #. ``uge``: interprets the operands as unsigned values and yields |
| 6315 | ``true`` if ``op1`` is greater than or equal to ``op2``. |
| 6316 | #. ``ult``: interprets the operands as unsigned values and yields |
| 6317 | ``true`` if ``op1`` is less than ``op2``. |
| 6318 | #. ``ule``: interprets the operands as unsigned values and yields |
| 6319 | ``true`` if ``op1`` is less than or equal to ``op2``. |
| 6320 | #. ``sgt``: interprets the operands as signed values and yields ``true`` |
| 6321 | if ``op1`` is greater than ``op2``. |
| 6322 | #. ``sge``: interprets the operands as signed values and yields ``true`` |
| 6323 | if ``op1`` is greater than or equal to ``op2``. |
| 6324 | #. ``slt``: interprets the operands as signed values and yields ``true`` |
| 6325 | if ``op1`` is less than ``op2``. |
| 6326 | #. ``sle``: interprets the operands as signed values and yields ``true`` |
| 6327 | if ``op1`` is less than or equal to ``op2``. |
| 6328 | |
| 6329 | If the operands are :ref:`pointer <t_pointer>` typed, the pointer values |
| 6330 | are compared as if they were integers. |
| 6331 | |
| 6332 | If the operands are integer vectors, then they are compared element by |
| 6333 | element. The result is an ``i1`` vector with the same number of elements |
| 6334 | as the values being compared. Otherwise, the result is an ``i1``. |
| 6335 | |
| 6336 | Example: |
| 6337 | """""""" |
| 6338 | |
| 6339 | .. code-block:: llvm |
| 6340 | |
| 6341 | <result> = icmp eq i32 4, 5 ; yields: result=false |
| 6342 | <result> = icmp ne float* %X, %X ; yields: result=false |
| 6343 | <result> = icmp ult i16 4, 5 ; yields: result=true |
| 6344 | <result> = icmp sgt i16 4, 5 ; yields: result=false |
| 6345 | <result> = icmp ule i16 -4, 5 ; yields: result=false |
| 6346 | <result> = icmp sge i16 4, 5 ; yields: result=false |
| 6347 | |
| 6348 | Note that the code generator does not yet support vector types with the |
| 6349 | ``icmp`` instruction. |
| 6350 | |
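The practical effect of choosing an unsigned rather than a signed condition
code can be sketched with a bounds check. The function name ``@in_range`` is
hypothetical, and the sketch assumes ``%limit`` is non-negative when read as
a signed value.

.. code-block:: llvm

    ; Hypothetical helper: tests 0 <= %x < %limit with a single comparison.
    ; Assumes %limit is non-negative when interpreted as signed.
    define i1 @in_range(i32 %x, i32 %limit) {
    entry:
      %ok = icmp ult i32 %x, %limit     ; a negative %x reads as a large unsigned value
      ret i1 %ok
    }

With ``%x`` equal to -1 and ``%limit`` equal to 10 the result is ``false``,
whereas ``icmp slt`` on the same operands would yield ``true``.
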
| 6351 | .. _i_fcmp: |
| 6352 | |
| 6353 | '``fcmp``' Instruction |
| 6354 | ^^^^^^^^^^^^^^^^^^^^^^ |
| 6355 | |
| 6356 | Syntax: |
| 6357 | """"""" |
| 6358 | |
| 6359 | :: |
| 6360 | |
| 6361 | <result> = fcmp <cond> <ty> <op1>, <op2> ; yields i1 or <N x i1>:result |
| 6362 | |
| 6363 | Overview: |
| 6364 | """"""""" |
| 6365 | |
| 6366 | The '``fcmp``' instruction returns a boolean value or vector of boolean |
| 6367 | values based on comparison of its operands. |
| 6368 | |
| 6369 | If the operands are floating point scalars, then the result type is a |
| 6370 | boolean (:ref:`i1 <t_integer>`). |
| 6371 | |
| 6372 | If the operands are floating point vectors, then the result type is a |
| 6373 | vector of boolean with the same number of elements as the operands being |
| 6374 | compared. |
| 6375 | |
| 6376 | Arguments: |
| 6377 | """""""""" |
| 6378 | |
| 6379 | The '``fcmp``' instruction takes three operands. The first operand is |
| 6380 | the condition code indicating the kind of comparison to perform. It is |
| 6381 | not a value, just a keyword. The possible condition codes are: |
| 6382 | |
| 6383 | #. ``false``: no comparison, always returns false |
| 6384 | #. ``oeq``: ordered and equal |
| 6385 | #. ``ogt``: ordered and greater than |
| 6386 | #. ``oge``: ordered and greater than or equal |
| 6387 | #. ``olt``: ordered and less than |
| 6388 | #. ``ole``: ordered and less than or equal |
| 6389 | #. ``one``: ordered and not equal |
| 6390 | #. ``ord``: ordered (no nans) |
| 6391 | #. ``ueq``: unordered or equal |
| 6392 | #. ``ugt``: unordered or greater than |
| 6393 | #. ``uge``: unordered or greater than or equal |
| 6394 | #. ``ult``: unordered or less than |
| 6395 | #. ``ule``: unordered or less than or equal |
| 6396 | #. ``une``: unordered or not equal |
| 6397 | #. ``uno``: unordered (either nans) |
| 6398 | #. ``true``: no comparison, always returns true |
| 6399 | |
| 6400 | *Ordered* means that neither operand is a QNAN while *unordered* means |
| 6401 | that either operand may be a QNAN. |
| 6402 | |
| 6403 | Each of the ``op1`` and ``op2`` arguments must be of either a :ref:`floating |
| 6404 | point <t_floating>` type or a :ref:`vector <t_vector>` of floating point |
| 6405 | type. They must have identical types. |
| 6406 | |
| 6407 | Semantics: |
| 6408 | """""""""" |
| 6409 | |
| 6410 | The '``fcmp``' instruction compares ``op1`` and ``op2`` according to the |
| 6411 | condition code given as ``cond``. If the operands are vectors, then the |
| 6412 | vectors are compared element by element. Each comparison performed |
| 6413 | always yields an :ref:`i1 <t_integer>` result, as follows: |
| 6414 | |
| 6415 | #. ``false``: always yields ``false``, regardless of operands. |
| 6416 | #. ``oeq``: yields ``true`` if both operands are not a QNAN and ``op1`` |
| 6417 | is equal to ``op2``. |
| 6418 | #. ``ogt``: yields ``true`` if both operands are not a QNAN and ``op1`` |
| 6419 | is greater than ``op2``. |
| 6420 | #. ``oge``: yields ``true`` if both operands are not a QNAN and ``op1`` |
| 6421 | is greater than or equal to ``op2``. |
| 6422 | #. ``olt``: yields ``true`` if both operands are not a QNAN and ``op1`` |
| 6423 | is less than ``op2``. |
| 6424 | #. ``ole``: yields ``true`` if both operands are not a QNAN and ``op1`` |
| 6425 | is less than or equal to ``op2``. |
| 6426 | #. ``one``: yields ``true`` if both operands are not a QNAN and ``op1`` |
| 6427 | is not equal to ``op2``. |
| 6428 | #. ``ord``: yields ``true`` if both operands are not a QNAN. |
| 6429 | #. ``ueq``: yields ``true`` if either operand is a QNAN or ``op1`` is |
| 6430 | equal to ``op2``. |
| 6431 | #. ``ugt``: yields ``true`` if either operand is a QNAN or ``op1`` is |
| 6432 | greater than ``op2``. |
| 6433 | #. ``uge``: yields ``true`` if either operand is a QNAN or ``op1`` is |
| 6434 | greater than or equal to ``op2``. |
| 6435 | #. ``ult``: yields ``true`` if either operand is a QNAN or ``op1`` is |
| 6436 | less than ``op2``. |
| 6437 | #. ``ule``: yields ``true`` if either operand is a QNAN or ``op1`` is |
| 6438 | less than or equal to ``op2``. |
| 6439 | #. ``une``: yields ``true`` if either operand is a QNAN or ``op1`` is |
| 6440 | not equal to ``op2``. |
| 6441 | #. ``uno``: yields ``true`` if either operand is a QNAN. |
| 6442 | #. ``true``: always yields ``true``, regardless of operands. |
| 6443 | |
| 6444 | Example: |
| 6445 | """""""" |
| 6446 | |
| 6447 | .. code-block:: llvm |
| 6448 | |
| 6449 | <result> = fcmp oeq float 4.0, 5.0 ; yields: result=false |
| 6450 | <result> = fcmp one float 4.0, 5.0 ; yields: result=true |
| 6451 | <result> = fcmp olt float 4.0, 5.0 ; yields: result=true |
| 6452 | <result> = fcmp ueq double 1.0, 2.0 ; yields: result=false |
| 6453 | |
| 6454 | Note that the code generator does not yet support vector types with the |
| 6455 | ``fcmp`` instruction. |
| 6456 | |
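The ordered and unordered condition codes differ only when a NaN is involved.
One way to see this is a NaN test that compares a value with itself; the
function name ``@is_nan`` is hypothetical and the block is a sketch rather
than a recommended idiom.

.. code-block:: llvm

    ; Hypothetical helper: returns true only if %x is a NaN.
    define i1 @is_nan(double %x) {
    entry:
      %nan = fcmp uno double %x, %x     ; unordered: true if either operand is a NaN
      ret i1 %nan
    }

With the same operands, ``fcmp oeq double %x, %x`` yields ``false`` for a NaN
and ``true`` otherwise, while ``fcmp ueq double %x, %x`` yields ``true`` for
every input.
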
| 6457 | .. _i_phi: |
| 6458 | |
| 6459 | '``phi``' Instruction |
| 6460 | ^^^^^^^^^^^^^^^^^^^^^ |
| 6461 | |
| 6462 | Syntax: |
| 6463 | """"""" |
| 6464 | |
| 6465 | :: |
| 6466 | |
| 6467 | <result> = phi <ty> [ <val0>, <label0>], ... |
| 6468 | |
| 6469 | Overview: |
| 6470 | """"""""" |
| 6471 | |
| 6472 | The '``phi``' instruction is used to implement the φ node in the SSA |
| 6473 | graph representing the function. |
| 6474 | |
| 6475 | Arguments: |
| 6476 | """""""""" |
| 6477 | |
| 6478 | The type of the incoming values is specified with the first type field. |
| 6479 | After this, the '``phi``' instruction takes a list of pairs as |
| 6480 | arguments, with one pair for each predecessor basic block of the current |
| 6481 | block. Only values of :ref:`first class <t_firstclass>` type may be used as |
| 6482 | the value arguments to the PHI node. Only labels may be used as the |
| 6483 | label arguments. |
| 6484 | |
| 6485 | There must be no non-phi instructions between the start of a basic block |
| 6486 | and the PHI instructions: i.e. PHI instructions must be first in a basic |
| 6487 | block. |
| 6488 | |
| 6489 | For the purposes of the SSA form, the use of each incoming value is |
| 6490 | deemed to occur on the edge from the corresponding predecessor block to |
| 6491 | the current block (but after any definition of an '``invoke``' |
| 6492 | instruction's return value on the same edge). |
| 6493 | |
| 6494 | Semantics: |
| 6495 | """""""""" |
| 6496 | |
| 6497 | At runtime, the '``phi``' instruction logically takes on the value |
| 6498 | specified by the pair corresponding to the predecessor basic block that |
| 6499 | executed just prior to the current block. |
| 6500 | |
| 6501 | Example: |
| 6502 | """""""" |
| 6503 | |
| 6504 | .. code-block:: llvm |
| 6505 | |
| 6506 | Loop: ; Infinite loop that counts from 0 on up... |
| 6507 | %indvar = phi i32 [ 0, %LoopHeader ], [ %nextindvar, %Loop ] |
| 6508 | %nextindvar = add i32 %indvar, 1 |
| 6509 | br label %Loop |
| 6510 | |
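For a loop that terminates, each value carried around the loop gets its own
``phi``, and a further ``phi`` in the exit block merges the paths that reach
it. The following complete function is a sketch; the name ``@sum_below`` and
its block labels are hypothetical.

.. code-block:: llvm

    ; Hypothetical function: returns 0 + 1 + ... + (%n - 1), or 0 if %n <= 0.
    define i32 @sum_below(i32 %n) {
    entry:
      %enter = icmp sgt i32 %n, 0
      br i1 %enter, label %loop, label %exit

    loop:
      ; One incoming pair per predecessor: %entry and the back edge from %loop.
      %i = phi i32 [ 0, %entry ], [ %i.next, %loop ]
      %acc = phi i32 [ 0, %entry ], [ %acc.next, %loop ]
      %acc.next = add i32 %acc, %i
      %i.next = add i32 %i, 1
      %again = icmp slt i32 %i.next, %n
      br i1 %again, label %loop, label %exit

    exit:
      ; Reached either directly from %entry or after the final iteration.
      %result = phi i32 [ 0, %entry ], [ %acc.next, %loop ]
      ret i32 %result
    }

For ``%n`` equal to 3 the loop body runs three times and the function returns
3. Note that all ``phi`` instructions appear before any other instruction in
their block.
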
| 6511 | .. _i_select: |
| 6512 | |
| 6513 | '``select``' Instruction |
| 6514 | ^^^^^^^^^^^^^^^^^^^^^^^^ |
| 6515 | |
| 6516 | Syntax: |
| 6517 | """"""" |
| 6518 | |
| 6519 | :: |
| 6520 | |
| 6521 | <result> = select selty <cond>, <ty> <val1>, <ty> <val2> ; yields ty |
| 6522 | |
| 6523 | selty is either i1 or {<N x i1>} |
| 6524 | |
| 6525 | Overview: |
| 6526 | """"""""" |
| 6527 | |
| 6528 | The '``select``' instruction is used to choose one value based on a |
| 6529 | condition, without IR-level branching. |
| 6530 | |
| 6531 | Arguments: |
| 6532 | """""""""" |
| 6533 | |
| 6534 | The '``select``' instruction requires an ``i1`` value or a vector of ``i1`` |
| 6535 | values indicating the condition, and two values of the same :ref:`first |
| 6536 | class <t_firstclass>` type. If ``val1`` and ``val2`` are vectors and the |
| 6537 | condition is a scalar, then entire vectors are selected, not individual |
| 6538 | elements. |
| 6539 | |
| 6540 | Semantics: |
| 6541 | """""""""" |
| 6542 | |
| 6543 | If the condition is an i1 and it evaluates to 1, the instruction returns |
| 6544 | the first value argument; otherwise, it returns the second value |
| 6545 | argument. |
| 6546 | |
| 6547 | If the condition is a vector of i1, then the value arguments must be |
| 6548 | vectors of the same size, and the selection is done element by element. |
| 6549 | |
| 6550 | Example: |
| 6551 | """""""" |
| 6552 | |
| 6553 | .. code-block:: llvm |
| 6554 | |
| 6555 | %X = select i1 true, i8 17, i8 42 ; yields i8:17 |
| 6556 | |
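In practice the condition usually comes from a comparison. The sketch below
shows the scalar and the element-wise vector form side by side; the function
names ``@smax`` and ``@vmax`` are hypothetical.

.. code-block:: llvm

    ; Hypothetical helper: signed maximum of two scalars, without a branch.
    define i32 @smax(i32 %a, i32 %b) {
    entry:
      %a.gt = icmp sgt i32 %a, %b
      %max = select i1 %a.gt, i32 %a, i32 %b
      ret i32 %max
    }

    ; Hypothetical helper: the same operation applied element by element.
    define <4 x i32> @vmax(<4 x i32> %a, <4 x i32> %b) {
    entry:
      %gt = icmp sgt <4 x i32> %a, %b                        ; yields <4 x i1>
      %max = select <4 x i1> %gt, <4 x i32> %a, <4 x i32> %b
      ret <4 x i32> %max
    }

In ``@vmax`` each result element is chosen independently according to the
corresponding element of ``%gt``.
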
| 6557 | .. _i_call: |
| 6558 | |
| 6559 | '``call``' Instruction |
| 6560 | ^^^^^^^^^^^^^^^^^^^^^^ |
| 6561 | |
| 6562 | Syntax: |
| 6563 | """"""" |
| 6564 | |
| 6565 | :: |
| 6566 | |
| 6567 | <result> = [tail | musttail] call [cconv] [ret attrs] <ty> [<fnty>*] <fnptrval>(<function args>) [fn attrs] |
| 6568 | |
| 6569 | Overview: |
| 6570 | """"""""" |
| 6571 | |
| 6572 | The '``call``' instruction represents a simple function call. |
| 6573 | |
| 6574 | Arguments: |
| 6575 | """""""""" |
| 6576 | |
| 6577 | This instruction requires several arguments: |
| 6578 | |
Reid Kleckner | 5772b77 | 2014-04-24 20:14:34 +0000 | [diff] [blame] | 6579 | #. The optional ``tail`` and ``musttail`` markers indicate that the optimizers |
| 6580 | should perform tail call optimization. The ``tail`` marker is a hint that |
| 6581 | `can be ignored <CodeGenerator.html#sibcallopt>`_. The ``musttail`` marker |
| 6582 | means that the call must be tail call optimized in order for the program to |
| 6583 | be correct. The ``musttail`` marker provides these guarantees: |
| 6584 | |
| 6585 | #. The call will not cause unbounded stack growth if it is part of a |
| 6586 | recursive cycle in the call graph. |
| 6587 | #. Arguments with the :ref:`inalloca <attr_inalloca>` attribute are |
| 6588 | forwarded in place. |
| 6589 | |
| 6590 | Both markers imply that the callee does not access allocas or varargs from |
| 6591 | the caller. Calls marked ``musttail`` must obey the following additional |
| 6592 | rules: |
| 6593 | |
| 6594 | - The call must immediately precede a :ref:`ret <i_ret>` instruction, |
| 6595 | or a pointer bitcast followed by a ret instruction. |
| 6596 | - The ret instruction must return the (possibly bitcasted) value |
| 6597 | produced by the call or void. |
| 6598 | - The caller and callee prototypes must match. Pointer types of |
| 6599 | parameters or return types may differ in pointee type, but not |
| 6600 | in address space. |
| 6601 | - The calling conventions of the caller and callee must match. |
| 6602 | - All ABI-impacting function attributes, such as sret, byval, inreg, |
| 6603 | returned, and inalloca, must match. |
Reid Kleckner | 8349864 | 2014-08-26 00:33:28 +0000 | [diff] [blame] | 6604 | - The callee must be varargs iff the caller is varargs. Bitcasting a |
| 6605 | non-varargs function to the appropriate varargs type is legal so |
| 6606 | long as the non-varargs prefixes obey the other rules. |
Reid Kleckner | 5772b77 | 2014-04-24 20:14:34 +0000 | [diff] [blame] | 6607 | |
| 6608 | Tail call optimization for calls marked ``tail`` is guaranteed to occur if |
| 6609 | the following conditions are met: |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 6610 | |
| 6611 | - Caller and callee both have the calling convention ``fastcc``. |
| 6612 | - The call is in tail position (ret immediately follows call and ret |
| 6613 | uses value of call or is void). |
| 6614 | - Option ``-tailcallopt`` is enabled, or |
| 6615 | ``llvm::GuaranteedTailCallOpt`` is ``true``. |
Alp Toker | cf21875 | 2014-06-30 18:57:16 +0000 | [diff] [blame] | 6616 | - `Platform-specific constraints are |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 6617 | met. <CodeGenerator.html#tailcallopt>`_ |
| 6618 | |
| 6619 | #. The optional "cconv" marker indicates which :ref:`calling |
| 6620 | convention <callingconv>` the call should use. If none is |
| 6621 | specified, the call defaults to using C calling conventions. The |
| 6622 | calling convention of the call must match the calling convention of |
| 6623 | the target function, or else the behavior is undefined. |
| 6624 | #. The optional :ref:`Parameter Attributes <paramattrs>` list for return |
| 6625 | values. Only '``zeroext``', '``signext``', and '``inreg``' attributes |
| 6626 | are valid here. |
| 6627 | #. '``ty``': the type of the call instruction itself which is also the |
| 6628 | type of the return value. Functions that return no value are marked |
| 6629 | ``void``. |
| 6630 | #. '``fnty``': shall be the signature of the pointer to function value |
| 6631 | being invoked. The argument types must match the types implied by |
| 6632 | this signature. This type can be omitted if the function is not |
| 6633 | varargs and if the function type does not return a pointer to a |
| 6634 | function. |
| 6635 | #. '``fnptrval``': An LLVM value containing a pointer to a function to |
| 6636 | be invoked. In most cases, this is a direct function invocation, but |
| 6637 | indirect ``call``'s are just as possible, calling an arbitrary pointer |
| 6638 | to function value. |
| 6639 | #. '``function args``': argument list whose types match the function |
| 6640 | signature argument types and parameter attributes. All arguments must |
| 6641 | be of :ref:`first class <t_firstclass>` type. If the function signature |
| 6642 | indicates the function accepts a variable number of arguments, the |
| 6643 | extra arguments can be specified. |
| 6644 | #. The optional :ref:`function attributes <fnattrs>` list. Only |
| 6645 | '``noreturn``', '``nounwind``', '``readonly``' and '``readnone``' |
| 6646 | attributes are valid here. |
| 6647 | |
Semantics:
""""""""""

The '``call``' instruction is used to cause control flow to transfer to
a specified function, with its incoming arguments bound to the specified
values. Upon a '``ret``' instruction in the called function, control
flow continues with the instruction after the function call, and the
return value of the function is bound to the result argument.

Example:
""""""""

.. code-block:: llvm

      %retval = call i32 @test(i32 %argc)
      call i32 (i8*, ...)* @printf(i8* %msg, i32 12, i8 42)        ; yields i32
      %X = tail call i32 @foo()                                    ; yields i32
      %Y = tail call fastcc i32 @foo()                             ; yields i32
      call void %foo(i8 97 signext)

      %struct.A = type { i32, i8 }
      %r = call %struct.A @foo()                        ; yields { i32, i8 }
      %gr = extractvalue %struct.A %r, 0                ; yields i32
      %gr1 = extractvalue %struct.A %r, 1               ; yields i8
      call void @foo() noreturn                         ; indicates that @foo never returns normally
      %ZZ = call zeroext i32 @bar()                     ; return value is zero extended

LLVM treats calls to some functions with names and arguments that match
the standard C99 library as being the C99 library functions, and may
perform optimizations or generate code for them under that assumption.
This is something we'd like to change in the future to provide better
support for freestanding environments and non-C-based languages.

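The ``musttail`` rules listed under Arguments can be illustrated with a
minimal sketch. The functions ``@caller`` and ``@callee`` are hypothetical;
they are assumed to have identical prototypes and the default calling
convention, and the call is immediately followed by a ``ret`` of its result.

.. code-block:: llvm

      declare i32 @callee(i32, i8*)

      define i32 @caller(i32 %n, i8* %p) {
        ; Prototypes and calling conventions match, and the ret below
        ; returns the value produced by the call.
        %r = musttail call i32 @callee(i32 %n, i8* %p)
        ret i32 %r
      }
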
.. _i_va_arg:

'``va_arg``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <resultval> = va_arg <va_list*> <arglist>, <argty>

Overview:
"""""""""

The '``va_arg``' instruction is used to access arguments passed through
the "variable argument" area of a function call. It is used to implement
the ``va_arg`` macro in C.

Arguments:
""""""""""

This instruction takes a ``va_list*`` value and the type of the
argument. It returns a value of the specified argument type and
increments the ``va_list`` to point to the next argument. The actual
type of ``va_list`` is target specific.

Semantics:
""""""""""

The '``va_arg``' instruction loads an argument of the specified type
from the specified ``va_list`` and causes the ``va_list`` to point to
the next argument. For more information, see the variable argument
handling :ref:`Intrinsic Functions <int_varargs>`.

It is legal for this instruction to be called in a function which does
not take a variable number of arguments, for example, the ``vfprintf``
function.

``va_arg`` is an LLVM instruction instead of an :ref:`intrinsic
function <intrinsics>` because it takes a type as an argument.

Example:
""""""""

See the :ref:`variable argument processing <int_varargs>` section.

Note that the code generator does not yet fully support ``va_arg`` on many
targets. Also, it does not currently support ``va_arg`` with aggregate
types on any target.

.. _i_landingpad:

'``landingpad``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <resultval> = landingpad <resultty> personality <type> <pers_fn> <clause>+
      <resultval> = landingpad <resultty> personality <type> <pers_fn> cleanup <clause>*

      <clause> := catch <type> <value>
      <clause> := filter <array constant type> <array constant>

Overview:
"""""""""

The '``landingpad``' instruction is used by `LLVM's exception handling
system <ExceptionHandling.html#overview>`_ to specify that a basic block
is a landing pad: the block where the exception lands, which corresponds
to the code found in the ``catch`` portion of a ``try``/``catch`` sequence.
It defines values supplied by the personality function (``pers_fn``) upon
re-entry to the function. The ``resultval`` has the type ``resultty``.

Arguments:
""""""""""

This instruction takes a ``pers_fn`` value. This is the personality
function associated with the unwinding mechanism. The optional
``cleanup`` flag indicates that the landing pad block is a cleanup.

A ``clause`` begins with the clause type (``catch`` or ``filter``) and
contains the global variable representing the "type" that may be caught
or filtered respectively. Unlike the ``catch`` clause, the ``filter``
clause takes an array constant as its argument. Use
"``[0 x i8**] undef``" for a filter which cannot throw. The
'``landingpad``' instruction must contain *at least* one ``clause`` or
the ``cleanup`` flag.

Semantics:
""""""""""

The '``landingpad``' instruction defines the values which are set by the
personality function (``pers_fn``) upon re-entry to the function, and
therefore the "result type" of the ``landingpad`` instruction. As with
calling conventions, how the personality function results are
represented in LLVM IR is target specific.

The clauses are applied in order from top to bottom. If two
``landingpad`` instructions are merged together through inlining, the
clauses from the calling function are appended to the list of clauses.
When the call stack is being unwound due to an exception being thrown,
the exception is compared against each ``clause`` in turn. If it doesn't
match any of the clauses, and the ``cleanup`` flag is not set, then
unwinding continues further up the call stack.

The ``landingpad`` instruction has several restrictions:

-  A landing pad block is a basic block which is the unwind destination
   of an '``invoke``' instruction.
-  A landing pad block must have a '``landingpad``' instruction as its
   first non-PHI instruction.
-  There can be only one '``landingpad``' instruction within the landing
   pad block.
-  A basic block that is not a landing pad block may not include a
   '``landingpad``' instruction.
-  All '``landingpad``' instructions in a function must have the same
   personality function.

Example:
""""""""

.. code-block:: llvm

      ;; A landing pad which can catch an integer.
      %res = landingpad { i8*, i32 } personality i32 (...)* @__gxx_personality_v0
               catch i8** @_ZTIi
      ;; A landing pad that is a cleanup.
      %res = landingpad { i8*, i32 } personality i32 (...)* @__gxx_personality_v0
               cleanup
      ;; A landing pad which can catch an integer and can only throw a double.
      %res = landingpad { i8*, i32 } personality i32 (...)* @__gxx_personality_v0
               catch i8** @_ZTIi
               filter [1 x i8**] [@_ZTId]

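To show where a landing pad sits relative to the '``invoke``' that unwinds
to it, here is a hedged sketch of a complete function. The names ``@f`` and
``@may_throw`` are hypothetical, and the handler simply resumes unwinding
rather than inspecting the selector value as a real front-end would.

.. code-block:: llvm

      @_ZTIi = external constant i8*

      declare void @may_throw()
      declare i32 @__gxx_personality_v0(...)

      define void @f() {
      entry:
        invoke void @may_throw()
                to label %cont unwind label %lpad

      cont:
        ret void

      lpad:
        ; The landingpad is the first non-PHI instruction of the block
        ; that is the unwind destination of the invoke above.
        %exn = landingpad { i8*, i32 } personality i32 (...)* @__gxx_personality_v0
                 catch i8** @_ZTIi
        resume { i8*, i32 } %exn
      }
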
.. _intrinsics:

Intrinsic Functions
===================

LLVM supports the notion of an "intrinsic function". These functions
have well known names and semantics and are required to follow certain
restrictions. Overall, these intrinsics represent an extension mechanism
for the LLVM language that does not require changing all of the
transformations in LLVM when adding to the language (or the bitcode
reader/writer, the parser, etc.).

Intrinsic function names must all start with an "``llvm.``" prefix. This
prefix is reserved in LLVM for intrinsic names; thus, function names may
not begin with this prefix. Intrinsic functions must always be external
functions: you cannot define the body of intrinsic functions. Intrinsic
functions may only be used in call or invoke instructions: it is illegal
to take the address of an intrinsic function. Additionally, because
intrinsic functions are part of the LLVM language, any that are added
must be documented here.

Some intrinsic functions can be overloaded, i.e., the intrinsic
represents a family of functions that perform the same operation but on
different data types. Because LLVM can represent over 8 million
different integer types, overloading is used commonly to allow an
intrinsic function to operate on any integer type. One or more of the
argument types or the result type can be overloaded to accept any
integer type. Argument types may also be defined as exactly matching a
previous argument's type or the result type. This allows an intrinsic
function which accepts multiple arguments, but needs all of them to be
of the same type, to only be overloaded with respect to a single
argument or the result.

An overloaded intrinsic has the names of its overloaded argument
types encoded into its function name, each preceded by a period. Only
those types which are overloaded result in a name suffix. Arguments
whose type is matched against another type do not. For example, the
``llvm.ctpop`` function can take an integer of any width and returns an
integer of exactly the same integer width. This leads to a family of
functions such as ``i8 @llvm.ctpop.i8(i8 %val)`` and
``i29 @llvm.ctpop.i29(i29 %val)``. Only one type, the return type, is
overloaded, and only one type suffix is required. Because the argument's
type is matched against the return type, it does not require its own
name suffix.

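As a small sketch of the naming scheme described above, the two
``llvm.ctpop`` variants mentioned in the text could be declared and used as
follows. The wrapper function ``@popcount8`` is a hypothetical name.

.. code-block:: llvm

      ; One suffix per overloaded type; here only the result type is
      ; overloaded, and the argument type is matched against it.
      declare i8 @llvm.ctpop.i8(i8)
      declare i29 @llvm.ctpop.i29(i29)

      define i8 @popcount8(i8 %val) {
        %n = call i8 @llvm.ctpop.i8(i8 %val)
        ret i8 %n
      }
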
To learn how to add an intrinsic function, please see the `Extending
LLVM Guide <ExtendingLLVM.html>`_.

.. _int_varargs:

Variable Argument Handling Intrinsics
-------------------------------------

Variable argument support is defined in LLVM with the
:ref:`va_arg <i_va_arg>` instruction and these three intrinsic
functions. These functions are related to the similarly named macros
defined in the ``<stdarg.h>`` header file.

All of these functions operate on arguments that use a target-specific
value type "``va_list``". The LLVM assembly language reference manual
does not define what this type is, so all transformations should be
prepared to handle these functions regardless of the type used.

This example shows how the :ref:`va_arg <i_va_arg>` instruction and the
variable argument handling intrinsic functions are used.

.. code-block:: llvm

    ; This struct is different for every platform. For most platforms,
    ; it is merely an i8*.
    %struct.va_list = type { i8* }

    ; For Unix x86_64 platforms, va_list is the following struct:
    ; %struct.va_list = type { i32, i32, i8*, i8* }

    define i32 @test(i32 %X, ...) {
      ; Initialize variable argument processing
      %ap = alloca %struct.va_list
      %ap2 = bitcast %struct.va_list* %ap to i8*
      call void @llvm.va_start(i8* %ap2)

      ; Read a single integer argument
      %tmp = va_arg i8* %ap2, i32

      ; Demonstrate usage of llvm.va_copy and llvm.va_end
      %aq = alloca i8*
      %aq2 = bitcast i8** %aq to i8*
      call void @llvm.va_copy(i8* %aq2, i8* %ap2)
      call void @llvm.va_end(i8* %aq2)

      ; Stop processing of arguments.
      call void @llvm.va_end(i8* %ap2)
      ret i32 %tmp
    }

    declare void @llvm.va_start(i8*)
    declare void @llvm.va_copy(i8*, i8*)
    declare void @llvm.va_end(i8*)

.. _int_va_start:

'``llvm.va_start``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.va_start(i8* <arglist>)

Overview:
"""""""""

The '``llvm.va_start``' intrinsic initializes ``*<arglist>`` for
subsequent use by ``va_arg``.

Arguments:
""""""""""

The argument is a pointer to a ``va_list`` element to initialize.

Semantics:
""""""""""

The '``llvm.va_start``' intrinsic works just like the ``va_start`` macro
available in C. In a target-dependent way, it initializes the
``va_list`` element to which the argument points, so that the next call
to ``va_arg`` will produce the first variable argument passed to the
function. Unlike the C ``va_start`` macro, this intrinsic does not need
to know the last argument of the function as the compiler can figure
that out.

'``llvm.va_end``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.va_end(i8* <arglist>)

Overview:
"""""""""

The '``llvm.va_end``' intrinsic destroys ``*<arglist>``, which has been
initialized previously with ``llvm.va_start`` or ``llvm.va_copy``.

Arguments:
""""""""""

The argument is a pointer to a ``va_list`` to destroy.

Semantics:
""""""""""

The '``llvm.va_end``' intrinsic works just like the ``va_end`` macro
available in C. In a target-dependent way, it destroys the ``va_list``
element to which the argument points. Calls to
:ref:`llvm.va_start <int_va_start>` and
:ref:`llvm.va_copy <int_va_copy>` must be matched exactly with calls to
``llvm.va_end``.

.. _int_va_copy:

'``llvm.va_copy``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.va_copy(i8* <destarglist>, i8* <srcarglist>)

Overview:
"""""""""

The '``llvm.va_copy``' intrinsic copies the current argument position
from the source argument list to the destination argument list.

Arguments:
""""""""""

The first argument is a pointer to a ``va_list`` element to initialize.
The second argument is a pointer to a ``va_list`` element to copy from.

Semantics:
""""""""""

The '``llvm.va_copy``' intrinsic works just like the ``va_copy`` macro
available in C. In a target-dependent way, it copies the source
``va_list`` element into the destination ``va_list`` element. This
intrinsic is necessary because the ``llvm.va_start`` intrinsic may be
arbitrarily complex and require, for example, memory allocation.

Accurate Garbage Collection Intrinsics
--------------------------------------

LLVM support for `Accurate Garbage Collection <GarbageCollection.html>`_
(GC) requires the implementation and generation of these intrinsics.
These intrinsics allow identification of :ref:`GC roots on the
stack <int_gcroot>`, as well as garbage collector implementations that
require :ref:`read <int_gcread>` and :ref:`write <int_gcwrite>` barriers.
Front-ends for type-safe garbage collected languages should generate
these intrinsics to make use of the LLVM garbage collectors. For more
details, see `Accurate Garbage Collection with
LLVM <GarbageCollection.html>`_.

The garbage collection intrinsics only operate on objects in the generic
address space (address space zero).

.. _int_gcroot:

'``llvm.gcroot``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.gcroot(i8** %ptrloc, i8* %metadata)

Overview:
"""""""""

The '``llvm.gcroot``' intrinsic declares the existence of a GC root to
the code generator, and allows some metadata to be associated with it.

Arguments:
""""""""""

The first argument specifies the address of a stack object that contains
the root pointer. The second pointer (which must be either a constant or
a global value address) contains the metadata to be associated with the
root.

Semantics:
""""""""""

At runtime, a call to this intrinsic stores a null pointer into the
"ptrloc" location. At compile-time, the code generator generates
information to allow the runtime to find the pointer at GC safe points.
The '``llvm.gcroot``' intrinsic may only be used in a function which
:ref:`specifies a GC algorithm <gc>`.

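A minimal sketch of registering a root, assuming a hypothetical function
``@f`` that uses the ``"shadow-stack"`` collector name and attaches no
metadata to the root:

.. code-block:: llvm

      declare void @llvm.gcroot(i8**, i8*)

      define void @f(i8* %obj) gc "shadow-stack" {
      entry:
        ; %root is the stack object that holds the root pointer.
        %root = alloca i8*
        call void @llvm.gcroot(i8** %root, i8* null)
        store i8* %obj, i8** %root
        ret void
      }
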
.. _int_gcread:

'``llvm.gcread``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.gcread(i8* %ObjPtr, i8** %Ptr)

Overview:
"""""""""

The '``llvm.gcread``' intrinsic identifies reads of references from heap
locations, allowing garbage collector implementations that require read
barriers.

Arguments:
""""""""""

The second argument is the address to read from, which should be an
address allocated from the garbage collector. The first argument is a
pointer to the start of the referenced object, if needed by the language
runtime (otherwise null).

Semantics:
""""""""""

The '``llvm.gcread``' intrinsic has the same semantics as a load
instruction, but may be replaced with substantially more complex code by
the garbage collector runtime, as needed. The '``llvm.gcread``'
intrinsic may only be used in a function which :ref:`specifies a GC
algorithm <gc>`.

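A hedged sketch of a read through the barrier. The function
``@read_field`` and the choice of the ``"shadow-stack"`` collector name are
assumptions made for illustration.

.. code-block:: llvm

      declare i8* @llvm.gcread(i8*, i8**)

      define i8* @read_field(i8* %obj, i8** %field) gc "shadow-stack" {
        ; Behaves like a load from %field; %obj is the start of the
        ; object that contains the field.
        %ref = call i8* @llvm.gcread(i8* %obj, i8** %field)
        ret i8* %ref
      }
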
.. _int_gcwrite:

'``llvm.gcwrite``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.gcwrite(i8* %P1, i8* %Obj, i8** %P2)

Overview:
"""""""""

The '``llvm.gcwrite``' intrinsic identifies writes of references to heap
locations, allowing garbage collector implementations that require write
barriers (such as generational or reference counting collectors).

Arguments:
""""""""""

The first argument is the reference to store, the second is the start of
the object to store it to, and the third is the address of the field of
Obj to store to. If the runtime does not require a pointer to the
object, Obj may be null.

Semantics:
""""""""""

The '``llvm.gcwrite``' intrinsic has the same semantics as a store
instruction, but may be replaced with substantially more complex code by
the garbage collector runtime, as needed. The '``llvm.gcwrite``'
intrinsic may only be used in a function which :ref:`specifies a GC
algorithm <gc>`.

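The corresponding write might be sketched as follows. As before,
``@write_field`` and the ``"shadow-stack"`` collector name are hypothetical
choices.

.. code-block:: llvm

      declare void @llvm.gcwrite(i8*, i8*, i8**)

      define void @write_field(i8* %ref, i8* %obj, i8** %field) gc "shadow-stack" {
        ; Behaves like a store of %ref to %field, a field of %obj.
        call void @llvm.gcwrite(i8* %ref, i8* %obj, i8** %field)
        ret void
      }
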
Code Generator Intrinsics
-------------------------

These intrinsics are provided by LLVM to expose special features that
may only be implemented with code generator support.

'``llvm.returnaddress``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.returnaddress(i32 <level>)

Overview:
"""""""""

The '``llvm.returnaddress``' intrinsic attempts to compute a
target-specific value indicating the return address of the current
function or one of its callers.

Arguments:
""""""""""

The argument to this intrinsic indicates which function to return the
address for. Zero indicates the calling function, one indicates its
caller, etc. The argument is **required** to be a constant integer
value.

Semantics:
""""""""""

The '``llvm.returnaddress``' intrinsic either returns a pointer
indicating the return address of the specified call frame, or zero if it
cannot be identified. The value returned by this intrinsic is likely to
be incorrect or 0 for arguments other than zero, so it should only be
used for debugging purposes.

Note that calling this intrinsic does not prevent function inlining or
other aggressive transformations, so the value returned may not be that
of the obvious source-language caller.

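For illustration, a hypothetical helper ``@my_return_address`` could query
its own return address with a constant level of zero:

.. code-block:: llvm

      declare i8* @llvm.returnaddress(i32)

      define i8* @my_return_address() {
        ; Level 0 refers to the function containing the call.
        %ra = call i8* @llvm.returnaddress(i32 0)
        ret i8* %ra
      }
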
'``llvm.frameaddress``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.frameaddress(i32 <level>)

Overview:
"""""""""

The '``llvm.frameaddress``' intrinsic attempts to return the
target-specific frame pointer value for the specified stack frame.

Arguments:
""""""""""

The argument to this intrinsic indicates which function to return the
frame pointer for. Zero indicates the calling function, one indicates
its caller, etc. The argument is **required** to be a constant integer
value.

Semantics:
""""""""""

The '``llvm.frameaddress``' intrinsic either returns a pointer
indicating the frame address of the specified call frame, or zero if it
cannot be identified. The value returned by this intrinsic is likely to
be incorrect or 0 for arguments other than zero, so it should only be
used for debugging purposes.

Note that calling this intrinsic does not prevent function inlining or
other aggressive transformations, so the value returned may not be that
of the obvious source-language caller.

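A comparable sketch for the frame pointer, again with a hypothetical helper
name (``@my_frame_address``) and a constant level of zero:

.. code-block:: llvm

      declare i8* @llvm.frameaddress(i32)

      define i8* @my_frame_address() {
        ; Level 0 refers to the function containing the call.
        %fp = call i8* @llvm.frameaddress(i32 0)
        ret i8* %fp
      }
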
Renato Golin | c7aea40 | 2014-05-06 16:51:25 +0000 | [diff] [blame] | 7220 | .. _int_read_register: |
| 7221 | .. _int_write_register: |
| 7222 | |
| 7223 | '``llvm.read_register``' and '``llvm.write_register``' Intrinsics |
| 7224 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 7225 | |
| 7226 | Syntax: |
| 7227 | """"""" |
| 7228 | |
| 7229 | :: |
| 7230 | |
| 7231 | declare i32 @llvm.read_register.i32(metadata) |
| 7232 | declare i64 @llvm.read_register.i64(metadata) |
| 7233 | declare void @llvm.write_register.i32(metadata, i32 @value) |
| 7234 | declare void @llvm.write_register.i64(metadata, i64 @value) |
| 7235 | !0 = metadata !{metadata !"sp\00"} |
| 7236 | |
| 7237 | Overview: |
| 7238 | """"""""" |
| 7239 | |
| 7240 | The '``llvm.read_register``' and '``llvm.write_register``' intrinsics |
| 7241 | provides access to the named register. The register must be valid on |
| 7242 | the architecture being compiled to. The type needs to be compatible |
| 7243 | with the register being read. |
| 7244 | |
| 7245 | Semantics: |
| 7246 | """""""""" |
| 7247 | |
| 7248 | The '``llvm.read_register``' intrinsic returns the current value of the |
| 7249 | register, where possible. The '``llvm.write_register``' intrinsic sets |
| 7250 | the current value of the register, where possible. |
| 7251 | |
| 7252 | This is useful to implement named register global variables that need |
| 7253 | to always be mapped to a specific register, as is common practice on |
| 7254 | bare-metal programs including OS kernels. |
| 7255 | |
| 7256 | The compiler doesn't check for register availability or use of the used |
| 7257 | register in surrounding code, including inline assembly. Because of that, |
| 7258 | allocatable registers are not supported. |
| 7259 | |
| 7260 | Warning: So far it only works with the stack pointer on selected |
Tim Northover | 3b0846e | 2014-05-24 12:50:23 +0000 | [diff] [blame] | 7261 | architectures (ARM, AArch64, PowerPC and x86_64). Significant amount of |
Renato Golin | c7aea40 | 2014-05-06 16:51:25 +0000 | [diff] [blame] | 7262 | work is needed to support other registers and even more so, allocatable |
| 7263 | registers. |
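As an illustration, the following sketch reads and writes the stack pointer
through these intrinsics. It assumes a 64-bit target on which ``sp`` names the
stack pointer; the function names ``@get_sp`` and ``@set_sp`` are hypothetical
and not part of LLVM.

```llvm
; Sketch only: assumes a 64-bit target where "sp" is the stack pointer.
declare i64 @llvm.read_register.i64(metadata)
declare void @llvm.write_register.i64(metadata, i64)

; Hypothetical helper: return the current value of the named register.
define i64 @get_sp() {
entry:
  %sp = call i64 @llvm.read_register.i64(metadata !0)
  ret i64 %sp
}

; Hypothetical helper: overwrite the named register with %value.
define void @set_sp(i64 %value) {
entry:
  call void @llvm.write_register.i64(metadata !0, i64 %value)
  ret void
}

; The register is named by a metadata string, as in the syntax above.
!0 = metadata !{metadata !"sp\00"}
```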
| 7264 | |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 7265 | .. _int_stacksave: |
| 7266 | |
| 7267 | '``llvm.stacksave``' Intrinsic |
| 7268 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 7269 | |
| 7270 | Syntax: |
| 7271 | """"""" |
| 7272 | |
| 7273 | :: |
| 7274 | |
| 7275 | declare i8* @llvm.stacksave() |
| 7276 | |
| 7277 | Overview: |
| 7278 | """"""""" |
| 7279 | |
| 7280 | The '``llvm.stacksave``' intrinsic is used to remember the current state |
| 7281 | of the function stack, for use with |
| 7282 | :ref:`llvm.stackrestore <int_stackrestore>`. This is useful for |
| 7283 | implementing language features like scoped automatic variable sized |
| 7284 | arrays in C99. |
| 7285 | |
| 7286 | Semantics: |
| 7287 | """""""""" |
| 7288 | |
| 7289 | This intrinsic returns an opaque pointer value that can be passed to |
| 7290 | :ref:`llvm.stackrestore <int_stackrestore>`. When an |
| 7291 | ``llvm.stackrestore`` intrinsic is executed with a value saved from |
| 7292 | ``llvm.stacksave``, it effectively restores the state of the stack to |
| 7293 | the state it was in when the ``llvm.stacksave`` intrinsic executed. In |
| 7294 | practice, this pops any :ref:`alloca <i_alloca>` blocks from the stack that |
| 7295 | were allocated after the ``llvm.stacksave`` was executed. |
| 7296 | |
| 7297 | .. _int_stackrestore: |
| 7298 | |
| 7299 | '``llvm.stackrestore``' Intrinsic |
| 7300 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 7301 | |
| 7302 | Syntax: |
| 7303 | """"""" |
| 7304 | |
| 7305 | :: |
| 7306 | |
| 7307 | declare void @llvm.stackrestore(i8* %ptr) |
| 7308 | |
| 7309 | Overview: |
| 7310 | """"""""" |
| 7311 | |
| 7312 | The '``llvm.stackrestore``' intrinsic is used to restore the state of |
| 7313 | the function stack to the state it was in when the corresponding |
| 7314 | :ref:`llvm.stacksave <int_stacksave>` intrinsic executed. This is |
| 7315 | useful for implementing language features like scoped automatic variable |
| 7316 | sized arrays in C99. |
| 7317 | |
| 7318 | Semantics: |
| 7319 | """""""""" |
| 7320 | |
| 7321 | See the description for :ref:`llvm.stacksave <int_stacksave>`. |
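A minimal sketch of how a front-end might pair the two intrinsics around a
scoped variable-length array. The function ``@scoped_vla`` and the external
``@use`` are hypothetical names chosen for illustration.

```llvm
declare i8* @llvm.stacksave()
declare void @llvm.stackrestore(i8*)
declare void @use(i8*)                      ; hypothetical consumer of the array

define void @scoped_vla(i32 %n) {
entry:
  ; Remember the stack state on entry to the scope.
  %saved = call i8* @llvm.stacksave()
  ; Dynamically sized allocation made after the save.
  %vla = alloca i8, i32 %n
  call void @use(i8* %vla)
  ; Leaving the scope: pop every alloca made since the save.
  call void @llvm.stackrestore(i8* %saved)
  ret void
}
```

After the ``llvm.stackrestore`` call, ``%vla`` must no longer be used, since
its storage has been popped.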
| 7322 | |
| 7323 | '``llvm.prefetch``' Intrinsic |
| 7324 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 7325 | |
| 7326 | Syntax: |
| 7327 | """"""" |
| 7328 | |
| 7329 | :: |
| 7330 | |
| 7331 | declare void @llvm.prefetch(i8* <address>, i32 <rw>, i32 <locality>, i32 <cache type>) |
| 7332 | |
| 7333 | Overview: |
| 7334 | """"""""" |
| 7335 | |
| 7336 | The '``llvm.prefetch``' intrinsic is a hint to the code generator to |
| 7337 | insert a prefetch instruction if supported; otherwise, it is a no-op. |
| 7338 | Prefetches have no effect on the behavior of the program but can change |
| 7339 | its performance characteristics. |
| 7340 | |
| 7341 | Arguments: |
| 7342 | """""""""" |
| 7343 | |
| 7344 | ``address`` is the address to be prefetched, ``rw`` is the specifier |
| 7345 | determining if the fetch should be for a read (0) or write (1), and |
| 7346 | ``locality`` is a temporal locality specifier ranging from 0 (no |
| 7347 | locality) to 3 (extremely local, keep in cache). The ``cache type`` |
| 7348 | specifies whether the prefetch is performed on the data (1) or |
| 7349 | instruction (0) cache. The ``rw``, ``locality`` and ``cache type`` |
| 7350 | arguments must be constant integers. |
| 7351 | |
| 7352 | Semantics: |
| 7353 | """""""""" |
| 7354 | |
| 7355 | This intrinsic does not modify the behavior of the program. In |
| 7356 | particular, prefetches cannot trap and do not produce a value. On |
| 7357 | targets that support this intrinsic, the prefetch can provide hints to |
| 7358 | the processor cache for better performance. |
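For example, a read prefetch into the data cache with the highest locality
could be requested as sketched below. The function name ``@warm`` is
hypothetical.

```llvm
declare void @llvm.prefetch(i8*, i32, i32, i32)

define void @warm(i8* %p) {
entry:
  ; rw = 0 (read), locality = 3 (keep in cache), cache type = 1 (data).
  ; The last three arguments must be constants.
  call void @llvm.prefetch(i8* %p, i32 0, i32 3, i32 1)
  ret void
}
```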
| 7359 | |
| 7360 | '``llvm.pcmarker``' Intrinsic |
| 7361 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 7362 | |
| 7363 | Syntax: |
| 7364 | """"""" |
| 7365 | |
| 7366 | :: |
| 7367 | |
| 7368 | declare void @llvm.pcmarker(i32 <id>) |
| 7369 | |
| 7370 | Overview: |
| 7371 | """"""""" |
| 7372 | |
| 7373 | The '``llvm.pcmarker``' intrinsic is a method to export a Program |
| 7374 | Counter (PC) in a region of code to simulators and other tools. The |
| 7375 | method is target specific, but it is expected that the marker will use |
| 7376 | exported symbols to transmit the PC of the marker. The marker makes no |
| 7377 | guarantees that it will remain with any specific instruction after |
| 7378 | optimizations. It is possible that the presence of a marker will inhibit |
| 7379 | optimizations. The intended use is to be inserted after optimizations to |
| 7380 | allow correlations of simulation runs. |
| 7381 | |
| 7382 | Arguments: |
| 7383 | """""""""" |
| 7384 | |
| 7385 | ``id`` is a numerical id identifying the marker. |
| 7386 | |
| 7387 | Semantics: |
| 7388 | """""""""" |
| 7389 | |
| 7390 | This intrinsic does not modify the behavior of the program. Backends |
| 7391 | that do not support this intrinsic may ignore it. |
| 7392 | |
| 7393 | '``llvm.readcyclecounter``' Intrinsic |
| 7394 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 7395 | |
| 7396 | Syntax: |
| 7397 | """"""" |
| 7398 | |
| 7399 | :: |
| 7400 | |
| 7401 | declare i64 @llvm.readcyclecounter() |
| 7402 | |
| 7403 | Overview: |
| 7404 | """"""""" |
| 7405 | |
| 7406 | The '``llvm.readcyclecounter``' intrinsic provides access to the cycle |
| 7407 | counter register (or similar low latency, high accuracy clocks) on those |
| 7408 | targets that support it. On X86, it should map to RDTSC. On Alpha, it |
| 7409 | should map to RPCC. As the backing counters overflow quickly (on the |
| 7410 | order of 9 seconds on Alpha), this should only be used for small |
| 7411 | timings. |
| 7412 | |
| 7413 | Semantics: |
| 7414 | """""""""" |
| 7415 | |
| 7416 | When directly supported, reading the cycle counter should not modify any |
| 7417 | memory. Implementations are allowed to return either an |
| 7418 | application-specific value or a system-wide value. On backends without |
| 7419 | support, this is lowered to a constant 0. |
| 7420 | |
Tim Northover | bc93308 | 2013-05-23 19:11:20 +0000 | [diff] [blame] | 7421 | Note that runtime support may be conditional on the privilege level the code |
| 7422 | is running at and on the host platform. |
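A small timing sketch, assuming a target with a usable cycle counter. The
names ``@time_work`` and ``@work`` are hypothetical. On a backend without
support both reads are the constant 0, so the difference is also 0.

```llvm
declare i64 @llvm.readcyclecounter()
declare void @work()                        ; hypothetical code being timed

define i64 @time_work() {
entry:
  %start = call i64 @llvm.readcyclecounter()
  call void @work()
  %end = call i64 @llvm.readcyclecounter()
  ; Elapsed cycles; only meaningful for short intervals, before overflow.
  %elapsed = sub i64 %end, %start
  ret i64 %elapsed
}
```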
| 7423 | |
Renato Golin | c0a3c1d | 2014-03-26 12:52:28 +0000 | [diff] [blame] | 7424 | '``llvm.clear_cache``' Intrinsic |
| 7425 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 7426 | |
| 7427 | Syntax: |
| 7428 | """"""" |
| 7429 | |
| 7430 | :: |
| 7431 | |
| 7432 | declare void @llvm.clear_cache(i8*, i8*) |
| 7433 | |
| 7434 | Overview: |
| 7435 | """"""""" |
| 7436 | |
Joerg Sonnenberger | 03014d6 | 2014-03-26 14:35:21 +0000 | [diff] [blame] | 7437 | The '``llvm.clear_cache``' intrinsic ensures visibility of modifications |
| 7438 | in the specified range to the execution unit of the processor. On |
| 7439 | targets with non-unified instruction and data caches, the implementation |
| 7440 | flushes the instruction cache. |
Renato Golin | c0a3c1d | 2014-03-26 12:52:28 +0000 | [diff] [blame] | 7441 | |
| 7442 | Semantics: |
| 7443 | """""""""" |
| 7444 | |
Joerg Sonnenberger | 03014d6 | 2014-03-26 14:35:21 +0000 | [diff] [blame] | 7445 | On platforms with coherent instruction and data caches (e.g. x86), this |
| 7446 | intrinsic is a nop. On platforms with non-coherent instruction and data |
Alp Toker | 16f98b2 | 2014-04-09 14:47:27 +0000 | [diff] [blame] | 7447 | caches (e.g. ARM, MIPS), the intrinsic is lowered either to appropriate |
Joerg Sonnenberger | 03014d6 | 2014-03-26 14:35:21 +0000 | [diff] [blame] | 7448 | instructions or a system call, if cache flushing requires special |
| 7449 | privileges. |
Renato Golin | c0a3c1d | 2014-03-26 12:52:28 +0000 | [diff] [blame] | 7450 | |
Sean Silva | d02bf3e | 2014-04-07 22:29:53 +0000 | [diff] [blame] | 7451 | The default behavior is to emit a call to ``__clear_cache`` from the run |
Joerg Sonnenberger | 03014d6 | 2014-03-26 14:35:21 +0000 | [diff] [blame] | 7452 | time library. |
Renato Golin | 93010e6 | 2014-03-26 14:01:32 +0000 | [diff] [blame] | 7453 | |
Joerg Sonnenberger | 03014d6 | 2014-03-26 14:35:21 +0000 | [diff] [blame] | 7454 | This intrinsic does *not* empty the instruction pipeline. Modifications |
| 7455 | of the current function are outside the scope of the intrinsic. |
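A sketch of a typical use after writing freshly generated code into a buffer.
It assumes, as with ``__clear_cache``, that the two arguments are the start and
the end of the modified range; the function name ``@publish_code`` is
hypothetical.

```llvm
declare void @llvm.clear_cache(i8*, i8*)

define void @publish_code(i8* %begin, i64 %size) {
entry:
  ; Assumed convention: the second argument points one past the last byte.
  %end = getelementptr i8* %begin, i64 %size
  call void @llvm.clear_cache(i8* %begin, i8* %end)
  ret void
}
```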
Renato Golin | c0a3c1d | 2014-03-26 12:52:28 +0000 | [diff] [blame] | 7456 | |
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 7457 | Standard C Library Intrinsics |
| 7458 | ----------------------------- |
| 7459 | |
| 7460 | LLVM provides intrinsics for a few important standard C library |
| 7461 | functions. These intrinsics allow source-language front-ends to pass |
| 7462 | information about the alignment of the pointer arguments to the code |
| 7463 | generator, providing opportunity for more efficient code generation. |
| 7464 | |
| 7465 | .. _int_memcpy: |
| 7466 | |
| 7467 | '``llvm.memcpy``' Intrinsic |
| 7468 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 7469 | |
| 7470 | Syntax: |
| 7471 | """"""" |
| 7472 | |
| 7473 | This is an overloaded intrinsic. You can use ``llvm.memcpy`` on any |
| 7474 | integer bit width and for different address spaces. Not all targets |
| 7475 | support all bit widths however. |
| 7476 | |
| 7477 | :: |
| 7478 | |
| 7479 | declare void @llvm.memcpy.p0i8.p0i8.i32(i8* <dest>, i8* <src>, |
| 7480 | i32 <len>, i32 <align>, i1 <isvolatile>) |
| 7481 | declare void @llvm.memcpy.p0i8.p0i8.i64(i8* <dest>, i8* <src>, |
| 7482 | i64 <len>, i32 <align>, i1 <isvolatile>) |
| 7483 | |
| 7484 | Overview: |
| 7485 | """"""""" |
| 7486 | |
| 7487 | The '``llvm.memcpy.*``' intrinsics copy a block of memory from the |
| 7488 | source location to the destination location. |
| 7489 | |
| 7490 | Note that, unlike the standard libc function, the ``llvm.memcpy.*`` |
| 7491 | intrinsics do not return a value, take extra alignment/isvolatile |
| 7492 | arguments, and the pointers can be in specified address spaces. |
| 7493 | |
| 7494 | Arguments: |
| 7495 | """""""""" |
| 7496 | |
| 7497 | The first argument is a pointer to the destination, the second is a |
| 7498 | pointer to the source. The third argument is an integer argument |
| 7499 | specifying the number of bytes to copy, the fourth argument is the |
| 7500 | alignment of the source and destination locations, and the fifth is a |
| 7501 | boolean indicating a volatile access. |
| 7502 | |
| 7503 | If the call to this intrinsic has an alignment value that is not 0 or 1, |
| 7504 | then the caller guarantees that both the source and destination pointers |
| 7505 | are aligned to that boundary. |
| 7506 | |
| 7507 | If the ``isvolatile`` parameter is ``true``, the ``llvm.memcpy`` call is |
| 7508 | a :ref:`volatile operation <volatile>`. The detailed access behavior is not |
| 7509 | very cleanly specified and it is unwise to depend on it. |
| 7510 | |
| 7511 | Semantics: |
| 7512 | """""""""" |
| 7513 | |
| 7514 | The '``llvm.memcpy.*``' intrinsics copy a block of memory from the |
| 7515 | source location to the destination location, which are not allowed to |
| 7516 | overlap. They copy ``len`` bytes of memory. If both pointers are known |
| 7517 | to be aligned to some boundary, this can be specified as the fourth |
Bill Wendling | 6116315 | 2013-10-18 23:26:55 +0000 | [diff] [blame] | 7518 | argument, otherwise it should be set to 0 or 1 (both meaning no alignment). |
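As a sketch, a non-volatile copy of 16 bytes between two buffers might look as
follows. The function name ``@copy16`` is hypothetical, and the alignment of 4
is an assumption that the caller would have to guarantee for both pointers.

```llvm
declare void @llvm.memcpy.p0i8.p0i8.i32(i8*, i8*, i32, i32, i1)

define void @copy16(i8* %dst, i8* %src) {
entry:
  ; len = 16 bytes, align = 4 (assumed for both pointers), isvolatile = false.
  ; %dst and %src must not overlap.
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dst, i8* %src, i32 16, i32 4, i1 false)
  ret void
}
```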
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 7519 | |
| 7520 | '``llvm.memmove``' Intrinsic |
| 7521 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 7522 | |
| 7523 | Syntax: |
| 7524 | """"""" |
| 7525 | |
| 7526 | This is an overloaded intrinsic. You can use ``llvm.memmove`` on any integer |
| 7527 | bit width and for different address spaces. Not all targets support all |
| 7528 | bit widths however. |
| 7529 | |
| 7530 | :: |
| 7531 | |
| 7532 | declare void @llvm.memmove.p0i8.p0i8.i32(i8* <dest>, i8* <src>, |
| 7533 | i32 <len>, i32 <align>, i1 <isvolatile>) |
| 7534 | declare void @llvm.memmove.p0i8.p0i8.i64(i8* <dest>, i8* <src>, |
| 7535 | i64 <len>, i32 <align>, i1 <isvolatile>) |
| 7536 | |
| 7537 | Overview: |
| 7538 | """"""""" |
| 7539 | |
| 7540 | The '``llvm.memmove.*``' intrinsics move a block of memory from the |
| 7541 | source location to the destination location. It is similar to the |
| 7542 | '``llvm.memcpy``' intrinsic but allows the two memory locations to |
| 7543 | overlap. |
| 7544 | |
| 7545 | Note that, unlike the standard libc function, the ``llvm.memmove.*`` |
| 7546 | intrinsics do not return a value, take extra alignment/isvolatile |
| 7547 | arguments, and the pointers can be in specified address spaces. |
| 7548 | |
| 7549 | Arguments: |
| 7550 | """""""""" |
| 7551 | |
| 7552 | The first argument is a pointer to the destination, the second is a |
| 7553 | pointer to the source. The third argument is an integer argument |
| 7554 | specifying the number of bytes to copy, the fourth argument is the |
| 7555 | alignment of the source and destination locations, and the fifth is a |
| 7556 | boolean indicating a volatile access. |
| 7557 | |
| 7558 | If the call to this intrinsic has an alignment value that is not 0 or 1, |
| 7559 | then the caller guarantees that the source and destination pointers are |
| 7560 | aligned to that boundary. |
| 7561 | |
| 7562 | If the ``isvolatile`` parameter is ``true``, the ``llvm.memmove`` call |
| 7563 | is a :ref:`volatile operation <volatile>`. The detailed access behavior is |
| 7564 | not very cleanly specified and it is unwise to depend on it. |
| 7565 | |
| 7566 | Semantics: |
| 7567 | """""""""" |
| 7568 | |
| 7569 | The '``llvm.memmove.*``' intrinsics copy a block of memory from the |
| 7570 | source location to the destination location, which may overlap. They |
| 7571 | copy ``len`` bytes of memory. If both pointers are known to be |
| 7572 | aligned to some boundary, this can be specified as the fourth argument, |
Bill Wendling | 6116315 | 2013-10-18 23:26:55 +0000 | [diff] [blame] | 7573 | otherwise it should be set to 0 or 1 (both meaning no alignment). |
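The overlapping case can be sketched by shifting the contents of a buffer down
by one byte. The function name ``@shift_left`` is hypothetical, and ``%len`` is
assumed to be at least 1.

```llvm
declare void @llvm.memmove.p0i8.p0i8.i64(i8*, i8*, i64, i32, i1)

define void @shift_left(i8* %buf, i64 %len) {
entry:
  ; Source starts one byte into the buffer, so the two ranges overlap.
  %src = getelementptr i8* %buf, i64 1
  %n = sub i64 %len, 1                      ; assumes %len >= 1
  ; align = 1 makes no alignment promise; isvolatile = false.
  call void @llvm.memmove.p0i8.p0i8.i64(i8* %buf, i8* %src, i64 %n, i32 1, i1 false)
  ret void
}
```

The same call through ``llvm.memcpy`` would not be valid, because its source
and destination are not allowed to overlap.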
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 7574 | |
| 7575 | '``llvm.memset.*``' Intrinsics |
| 7576 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 7577 | |
| 7578 | Syntax: |
| 7579 | """"""" |
| 7580 | |
| 7581 | This is an overloaded intrinsic. You can use ``llvm.memset`` on any integer |
| 7582 | bit width and for different address spaces. However, not all targets |
| 7583 | support all bit widths. |
| 7584 | |
| 7585 | :: |
| 7586 | |
| 7587 | declare void @llvm.memset.p0i8.i32(i8* <dest>, i8 <val>, |
| 7588 | i32 <len>, i32 <align>, i1 <isvolatile>) |
| 7589 | declare void @llvm.memset.p0i8.i64(i8* <dest>, i8 <val>, |
| 7590 | i64 <len>, i32 <align>, i1 <isvolatile>) |
| 7591 | |
| 7592 | Overview: |
| 7593 | """"""""" |
| 7594 | |
| 7595 | The '``llvm.memset.*``' intrinsics fill a block of memory with a |
| 7596 | particular byte value. |
| 7597 | |
| 7598 | Note that, unlike the standard libc function, the ``llvm.memset`` |
| 7599 | intrinsic does not return a value and takes extra alignment/volatile |
| 7600 | arguments. Also, the destination can be in an arbitrary address space. |
| 7601 | |
| 7602 | Arguments: |
| 7603 | """""""""" |
| 7604 | |
| 7605 | The first argument is a pointer to the destination to fill, the second |
| 7606 | is the byte value with which to fill it, the third is an integer |
| 7607 | specifying the number of bytes to fill, the fourth is the known alignment of |
| 7608 | the destination location, and the fifth is a boolean indicating a volatile access. |
| 7609 | |
| 7610 | If the call to this intrinsic has an alignment value that is not 0 or 1, |
| 7611 | then the caller guarantees that the destination pointer is aligned to |
| 7612 | that boundary. |
| 7613 | |
| 7614 | If the ``isvolatile`` parameter is ``true``, the ``llvm.memset`` call is |
| 7615 | a :ref:`volatile operation <volatile>`. The detailed access behavior is not |
| 7616 | very cleanly specified and it is unwise to depend on it. |
| 7617 | |
| 7618 | Semantics: |
| 7619 | """""""""" |
| 7620 | |
| 7621 | The '``llvm.memset.*``' intrinsics fill ``len`` bytes of memory starting |
| 7622 | at the destination location. If the destination is known to be aligned to |
| 7623 | some boundary, this can be specified as the fourth argument, otherwise |
Bill Wendling | 6116315 | 2013-10-18 23:26:55 +0000 | [diff] [blame] | 7624 | it should be set to 0 or 1 (both meaning no alignment). |
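For instance, zeroing a 32-byte buffer could be written as sketched below. The
function name ``@zero32`` is hypothetical, and the alignment of 8 is an
assumption about the destination that the caller would have to guarantee.

```llvm
declare void @llvm.memset.p0i8.i32(i8*, i8, i32, i32, i1)

define void @zero32(i8* %dst) {
entry:
  ; val = 0, len = 32 bytes, align = 8 (assumed), isvolatile = false.
  call void @llvm.memset.p0i8.i32(i8* %dst, i8 0, i32 32, i32 8, i1 false)
  ret void
}
```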
Sean Silva | b084af4 | 2012-12-07 10:36:55 +0000 | [diff] [blame] | 7625 | |
| 7626 | '``llvm.sqrt.*``' Intrinsic |
| 7627 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 7628 | |
| 7629 | Syntax: |
| 7630 | """"""" |
| 7631 | |
| 7632 | This is an overloaded intrinsic. You can use ``llvm.sqrt`` on any |
| 7633 | floating point or vector of floating point type. Not all targets support |
| 7634 | all types however. |
| 7635 | |
| 7636 | :: |
| 7637 | |
| 7638 | declare float @llvm.sqrt.f32(float %Val) |
| 7639 | declare double @llvm.sqrt.f64(double %Val) |
| 7640 | declare x86_fp80 @llvm.sqrt.f80(x86_fp80 %Val) |
| 7641 | declare fp128 @llvm.sqrt.f128(fp128 %Val) |
| 7642 | declare ppc_fp128 @llvm.sqrt.ppcf128(ppc_fp128 %Val) |
| 7643 | |
| 7644 | Overview: |
| 7645 | """"""""" |
| 7646 | |
| 7647 | The '``llvm.sqrt``' intrinsics return the square root of the specified operand, |
| 7648 | returning the same value as the libm '``sqrt``' functions would. Unlike |
| 7649 | ``sqrt`` in libm, however, ``llvm.sqrt`` has undefined behavior for |
| 7650 | negative numbers other than -0.0 (which allows for better optimization, |
| 7651 | because there is no need to worry about errno being set). |
| 7652 | ``llvm.sqrt(-0.0)`` is defined to return -0.0 like IEEE sqrt. |
| 7653 | |
| 7654 | Arguments: |
| 7655 | """""""""" |
| 7656 | |
| 7657 | The argument and return value are floating point numbers of the same |
| 7658 | type. |
| 7659 | |
| 7660 | Semantics: |
| 7661 | """""""""" |
| 7662 | |
| 7663 | This function returns the square root of the specified operand if it is a |
| 7664 | nonnegative floating point number. |
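A sketch of a use where the operand cannot be a negative number: the Euclidean
length of a 2-D vector. The function name ``@length2d`` is hypothetical.

```llvm
declare float @llvm.sqrt.f32(float)

define float @length2d(float %x, float %y) {
entry:
  %xx = fmul float %x, %x
  %yy = fmul float %y, %y
  ; A sum of squares, so never a negative number.
  %sum = fadd float %xx, %yy
  %len = call float @llvm.sqrt.f32(float %sum)
  ret float %len
}
```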
| 7665 | |
| 7666 | '``llvm.powi.*``' Intrinsic |
| 7667 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 7668 | |
| 7669 | Syntax: |
| 7670 | """"""" |
| 7671 | |
| 7672 | This is an overloaded intrinsic. You can use ``llvm.powi`` on any |
| 7673 | floating point or vector of floating point type. Not all targets support |
| 7674 | all types however. |
| 7675 | |
| 7676 | :: |
| 7677 | |
| 7678 | declare float @llvm.powi.f32(float %Val, i32 %power) |
| 7679 | declare double @llvm.powi.f64(double %Val, i32 %power) |
| 7680 | declare x86_fp80 @llvm.powi.f80(x86_fp80 %Val, i32 %power) |
| 7681 | declare fp128 @llvm.powi.f128(fp128 %Val, i32 %power) |
| 7682 | declare ppc_fp128 @llvm.powi.ppcf128(ppc_fp128 %Val, i32 %power) |
| 7683 | |
| 7684 | Overview: |
| 7685 | """"""""" |
| 7686 | |
| 7687 | The '``llvm.powi.*``' intrinsics return the first operand raised to the |
| 7688 | specified (positive or negative) power. The order of evaluation of |
| 7689 | multiplications is not defined. When a vector of floating point type is |
| 7690 | used, the second argument remains a scalar integer value. |
| 7691 | |
| 7692 | Arguments: |
| 7693 | """""""""" |
| 7694 | |
| 7695 | The second argument is an integer power, and the first is a value to |
| 7696 | raise to that power. |
| 7697 | |
| 7698 | Semantics: |
| 7699 | """""""""" |
| 7700 | |
| 7701 | This function returns the first value raised to the second power with an |
| 7702 | unspecified sequence of rounding operations. |
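As an illustration, the sketch below raises a value to a positive and to a
negative constant power. The function names ``@cube`` and ``@inv_square`` are
hypothetical.

```llvm
declare double @llvm.powi.f64(double, i32)

; x raised to the power 3.
define double @cube(double %x) {
entry:
  %r = call double @llvm.powi.f64(double %x, i32 3)
  ret double %r
}

; x raised to the power -2, i.e. the reciprocal of its square.
define double @inv_square(double %x) {
entry:
  %r = call double @llvm.powi.f64(double %x, i32 -2)
  ret double %r
}
```

Because the sequence of rounding operations is unspecified, the result need
not be bit-identical to an explicit chain of ``fmul`` instructions.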
| 7703 | |
| 7704 | '``llvm.sin.*``' Intrinsic |
| 7705 | ^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 7706 | |
| 7707 | Syntax: |
| 7708 | """"""" |
| 7709 | |
| 7710 | This is an overloaded intrinsic. You can use ``llvm.sin`` on any |
| 7711 | floating point or vector of floating point type. Not all targets support |
| 7712 | all types however. |
| 7713 | |
| 7714 | :: |
| 7715 | |
| 7716 | declare float @llvm.sin.f32(float %Val) |
| 7717 | declare double @llvm.sin.f64(double %Val) |
| 7718 | declare x86_fp80 @llvm.sin.f80(x86_fp80 %Val) |
| 7719 | declare fp128 @llvm.sin.f128(fp128 %Val) |
| 7720 | declare ppc_fp128 @llvm.sin.ppcf128(ppc_fp128 %Val) |
| 7721 | |
| 7722 | Overview: |
| 7723 | """"""""" |
| 7724 | |
| 7725 | The '``llvm.sin.*``' intrinsics return the sine of the operand. |
| 7726 | |
| 7727 | Arguments: |
| 7728 | """""""""" |
| 7729 | |
| 7730 | The argument and return value are floating point numbers of the same |
| 7731 | type. |
| 7732 | |
| 7733 | Semantics: |
| 7734 | """""""""" |
| 7735 | |
| 7736 | This function returns the sine of the specified operand, returning the |
| 7737 | same values as the libm ``sin`` functions would, and handles error |
| 7738 | conditions in the same way. |
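The declarations above are all scalar. A sketch of the vector form for four
``float`` elements follows; the function name ``@sin4`` is hypothetical.

```llvm
declare <4 x float> @llvm.sin.v4f32(<4 x float>)

; Hypothetical helper: element-wise sine of a four-element vector.
define <4 x float> @sin4(<4 x float> %v) {
entry:
  %r = call <4 x float> @llvm.sin.v4f32(<4 x float> %v)
  ret <4 x float> %r
}
```

The other overloaded floating point intrinsics in this section follow the same
pattern for vector operands.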
| 7739 | |
| 7740 | '``llvm.cos.*``' Intrinsic |
| 7741 | ^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 7742 | |
| 7743 | Syntax: |
| 7744 | """"""" |
| 7745 | |
| 7746 | This is an overloaded intrinsic. You can use ``llvm.cos`` on any |
| 7747 | floating point or vector of floating point type. Not all targets support |
| 7748 | all types however. |
| 7749 | |
| 7750 | :: |
| 7751 | |
| 7752 | declare float @llvm.cos.f32(float %Val) |
| 7753 | declare double @llvm.cos.f64(double %Val) |
| 7754 | declare x86_fp80 @llvm.cos.f80(x86_fp80 %Val) |
| 7755 | declare fp128 @llvm.cos.f128(fp128 %Val) |
| 7756 | declare ppc_fp128 @llvm.cos.ppcf128(ppc_fp128 %Val) |
| 7757 | |
| 7758 | Overview: |
| 7759 | """"""""" |
| 7760 | |
| 7761 | The '``llvm.cos.*``' intrinsics return the cosine of the operand. |
| 7762 | |
| 7763 | Arguments: |
| 7764 | """""""""" |
| 7765 | |
| 7766 | The argument and return value are floating point numbers of the same |
| 7767 | type. |
| 7768 | |
| 7769 | Semantics: |
| 7770 | """""""""" |
| 7771 | |
| 7772 | This function returns the cosine of the specified operand, returning the |
| 7773 | same values as the libm ``cos`` functions would, and handles error |
| 7774 | conditions in the same way. |
| 7775 | |
| 7776 | '``llvm.pow.*``' Intrinsic |
| 7777 | ^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 7778 | |
| 7779 | Syntax: |
| 7780 | """"""" |
| 7781 | |
| 7782 | This is an overloaded intrinsic. You can use ``llvm.pow`` on any |
| 7783 | floating point or vector of floating point type. Not all targets support |
| 7784 | all types however. |
| 7785 | |
| 7786 | :: |
| 7787 | |
| 7788 | declare float @llvm.pow.f32(float %Val, float %Power) |
| 7789 | declare double @llvm.pow.f64(double %Val, double %Power) |
| 7790 | declare x86_fp80 @llvm.pow.f80(x86_fp80 %Val, x86_fp80 %Power) |
| 7791 | declare fp128 @llvm.pow.f128(fp128 %Val, fp128 %Power) |
| 7792 | declare ppc_fp128 @llvm.pow.ppcf128(ppc_fp128 %Val, ppc_fp128 %Power) |
| 7793 | |
| 7794 | Overview: |
| 7795 | """"""""" |
| 7796 | |
| 7797 | The '``llvm.pow.*``' intrinsics return the first operand raised to the |
| 7798 | specified (positive or negative) power. |
| 7799 | |
| 7800 | Arguments: |
| 7801 | """""""""" |
| 7802 | |
| 7803 | The second argument is a floating point power, and the first is a value |
| 7804 | to raise to that power. |
| 7805 | |
| 7806 | Semantics: |
| 7807 | """""""""" |
| 7808 | |
| 7809 | This function returns the first value raised to the second power, |
| 7810 | returning the same values as the libm ``pow`` functions would, and |
| 7811 | handles error conditions in the same way. |
| 7812 | |
| 7813 | '``llvm.exp.*``' Intrinsic |
| 7814 | ^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 7815 | |
| 7816 | Syntax: |
| 7817 | """"""" |
| 7818 | |
| 7819 | This is an overloaded intrinsic. You can use ``llvm.exp`` on any |
| 7820 | floating point or vector of floating point type. Not all targets support |
| 7821 | all types however. |
| 7822 | |
| 7823 | :: |
| 7824 | |
| 7825 | declare float @llvm.exp.f32(float %Val) |
| 7826 | declare double @llvm.exp.f64(double %Val) |
| 7827 | declare x86_fp80 @llvm.exp.f80(x86_fp80 %Val) |
| 7828 | declare fp128 @llvm.exp.f128(fp128 %Val) |
| 7829 | declare ppc_fp128 @llvm.exp.ppcf128(ppc_fp128 %Val) |
| 7830 | |
| 7831 | Overview: |
| 7832 | """"""""" |
| 7833 | |
| 7834 | The '``llvm.exp.*``' intrinsics compute the base-e exponential of the operand. |
| 7835 | |
| 7836 | Arguments: |
| 7837 | """""""""" |
| 7838 | |
| 7839 | The argument and return value are floating point numbers of the same |
| 7840 | type. |
| 7841 | |
| 7842 | Semantics: |
| 7843 | """""""""" |
| 7844 | |
| 7845 | This function returns the same values as the libm ``exp`` functions |
| 7846 | would, and handles error conditions in the same way. |
| 7847 | |
| 7848 | '``llvm.exp2.*``' Intrinsic |
| 7849 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 7850 | |
| 7851 | Syntax: |
| 7852 | """"""" |
| 7853 | |
| 7854 | This is an overloaded intrinsic. You can use ``llvm.exp2`` on any |
| 7855 | floating point or vector of floating point type. Not all targets support |
| 7856 | all types however. |
| 7857 | |
| 7858 | :: |
| 7859 | |
| 7860 | declare float @llvm.exp2.f32(float %Val) |
| 7861 | declare double @llvm.exp2.f64(double %Val) |
| 7862 | declare x86_fp80 @llvm.exp2.f80(x86_fp80 %Val) |
| 7863 | declare fp128 @llvm.exp2.f128(fp128 %Val) |
| 7864 | declare ppc_fp128 @llvm.exp2.ppcf128(ppc_fp128 %Val) |
| 7865 | |
| 7866 | Overview: |
| 7867 | """"""""" |
| 7868 | |
| 7869 | The '``llvm.exp2.*``' intrinsics compute the base-2 exponential of the operand. |
| 7870 | |
| 7871 | Arguments: |
| 7872 | """""""""" |
| 7873 | |
| 7874 | The argument and return value are floating point numbers of the same |
| 7875 | type. |
| 7876 | |
| 7877 | Semantics: |
| 7878 | """""""""" |
| 7879 | |
| 7880 | This function returns the same values as the libm ``exp2`` functions |
| 7881 | would, and handles error conditions in the same way. |
| 7882 | |
| 7883 | '``llvm.log.*``' Intrinsic |
| 7884 | ^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 7885 | |
| 7886 | Syntax: |
| 7887 | """"""" |
| 7888 | |
| 7889 | This is an overloaded intrinsic. You can use ``llvm.log`` on any |
| 7890 | floating point or vector of floating point type. Not all targets support |
| 7891 | all types however. |
| 7892 | |
| 7893 | :: |
| 7894 | |
| 7895 | declare float @llvm.log.f32(float %Val) |
| 7896 | declare double @llvm.log.f64(double %Val) |
| 7897 | declare x86_fp80 @llvm.log.f80(x86_fp80 %Val) |
| 7898 | declare fp128 @llvm.log.f128(fp128 %Val) |
| 7899 | declare ppc_fp128 @llvm.log.ppcf128(ppc_fp128 %Val) |
| 7900 | |
| 7901 | Overview: |
| 7902 | """"""""" |
| 7903 | |
| 7904 | The '``llvm.log.*``' intrinsics compute the natural logarithm of the operand. |
| 7905 | |
| 7906 | Arguments: |
| 7907 | """""""""" |
| 7908 | |
| 7909 | The argument and return value are floating point numbers of the same |
| 7910 | type. |
| 7911 | |
| 7912 | Semantics: |
| 7913 | """""""""" |
| 7914 | |
| 7915 | This function returns the same values as the libm ``log`` functions |
| 7916 | would, and handles error conditions in the same way. |
| 7917 | |
| 7918 | '``llvm.log10.*``' Intrinsic |
| 7919 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 7920 | |
| 7921 | Syntax: |
| 7922 | """"""" |
| 7923 | |
| 7924 | This is an overloaded intrinsic. You can use ``llvm.log10`` on any |
| 7925 | floating point or vector of floating point type. Not all targets support |
| 7926 | all types however. |
| 7927 | |
| 7928 | :: |
| 7929 | |
| 7930 | declare float @llvm.log10.f32(float %Val) |
| 7931 | declare double @llvm.log10.f64(double %Val) |
| 7932 | declare x86_fp80 @llvm.log10.f80(x86_fp80 %Val) |
| 7933 | declare fp128 @llvm.log10.f128(fp128 %Val) |
| 7934 | declare ppc_fp128 @llvm.log10.ppcf128(ppc_fp128 %Val) |
| 7935 | |
| 7936 | Overview: |
| 7937 | """"""""" |
| 7938 | |
| 7939 | The '``llvm.log10.*``' intrinsics compute the base-10 logarithm of the operand. |
| 7940 | |
| 7941 | Arguments: |
| 7942 | """""""""" |
| 7943 | |
| 7944 | The argument and return value are floating point numbers of the same |
| 7945 | type. |
| 7946 | |
| 7947 | Semantics: |
| 7948 | """""""""" |
| 7949 | |
| 7950 | This function returns the same values as the libm ``log10`` functions |
| 7951 | would, and handles error conditions in the same way. |
| 7952 | |
| 7953 | '``llvm.log2.*``' Intrinsic |
| 7954 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 7955 | |
| 7956 | Syntax: |
| 7957 | """"""" |
| 7958 | |
| 7959 | This is an overloaded intrinsic. You can use ``llvm.log2`` on any |
| 7960 | floating point or vector of floating point type. Not all targets support |
| 7961 | all types however. |
| 7962 | |
| 7963 | :: |
| 7964 | |
| 7965 | declare float @llvm.log2.f32(float %Val) |
| 7966 | declare double @llvm.log2.f64(double %Val) |
| 7967 | declare x86_fp80 @llvm.log2.f80(x86_fp80 %Val) |
| 7968 | declare fp128 @llvm.log2.f128(fp128 %Val) |
| 7969 | declare ppc_fp128 @llvm.log2.ppcf128(ppc_fp128 %Val) |
| 7970 | |
| 7971 | Overview: |
| 7972 | """"""""" |
| 7973 | |
| 7974 | The '``llvm.log2.*``' intrinsics compute the base-2 logarithm of the operand. |
| 7975 | |
| 7976 | Arguments: |
| 7977 | """""""""" |
| 7978 | |
| 7979 | The argument and return value are floating point numbers of the same |
| 7980 | type. |
| 7981 | |
| 7982 | Semantics: |
| 7983 | """""""""" |
| 7984 | |
| 7985 | This function returns the same values as the libm ``log2`` functions |
| 7986 | would, and handles error conditions in the same way. |
| 7987 | |
| 7988 | '``llvm.fma.*``' Intrinsic |
| 7989 | ^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 7990 | |
| 7991 | Syntax: |
| 7992 | """"""" |
| 7993 | |
| 7994 | This is an overloaded intrinsic. You can use ``llvm.fma`` on any |
| 7995 | floating point or vector of floating point type. Not all targets support |
| 7996 | all types however. |
| 7997 | |
| 7998 | :: |
| 7999 | |
| 8000 | declare float @llvm.fma.f32(float %a, float %b, float %c) |
| 8001 | declare double @llvm.fma.f64(double %a, double %b, double %c) |
| 8002 | declare x86_fp80 @llvm.fma.f80(x86_fp80 %a, x86_fp80 %b, x86_fp80 %c) |
| 8003 | declare fp128 @llvm.fma.f128(fp128 %a, fp128 %b, fp128 %c) |
| 8004 | declare ppc_fp128 @llvm.fma.ppcf128(ppc_fp128 %a, ppc_fp128 %b, ppc_fp128 %c) |
| 8005 | |
| 8006 | Overview: |
| 8007 | """"""""" |
| 8008 | |
| 8009 | The '``llvm.fma.*``' intrinsics perform the fused multiply-add |
| 8010 | operation. |
| 8011 | |
| 8012 | Arguments: |
| 8013 | """""""""" |
| 8014 | |
| 8015 | The arguments and return value are floating point numbers of the same |
| 8016 | type. |
| 8017 | |
| 8018 | Semantics: |
| 8019 | """""""""" |
| 8020 | |
| 8021 | This function returns the same values as the libm ``fma`` functions |
Matt Arsenault | ee364ee | 2014-01-31 00:09:00 +0000 | [diff] [blame] | 8022 | would, and does not set errno. |
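A sketch of a single fused step computing ``a * x + y``. The function name
``@axpy`` is hypothetical.

```llvm
declare double @llvm.fma.f64(double, double, double)

define double @axpy(double %a, double %x, double %y) {
entry:
  ; Computes (%a * %x) + %y; like libm's fma, the result is rounded once.
  %r = call double @llvm.fma.f64(double %a, double %x, double %y)
  ret double %r
}
```

A separate ``fmul`` followed by ``fadd`` would round twice and may therefore
give a different result.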

'``llvm.fabs.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.fabs`` on any
floating point or vector of floating point type. Not all targets support
all types, however.

::

      declare float     @llvm.fabs.f32(float %Val)
      declare double    @llvm.fabs.f64(double %Val)
      declare x86_fp80  @llvm.fabs.f80(x86_fp80 %Val)
      declare fp128     @llvm.fabs.f128(fp128 %Val)
      declare ppc_fp128 @llvm.fabs.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.fabs.*``' intrinsics return the absolute value of the
operand.

Arguments:
""""""""""

The argument and return value are floating point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``fabs`` functions
would, and handles error conditions in the same way.
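
The sketch below shows one scalar and one vector use. The ``v4f32``
overload and the function names are assumptions made for illustration;
whether a given vector type is supported depends on the target.

.. code-block:: llvm

      declare float @llvm.fabs.f32(float)
      declare <4 x float> @llvm.fabs.v4f32(<4 x float>)

      define float @fabs_scalar(float %x) {
      entry:
        ; For %x = -2.5 the result is 2.5.
        %r = call float @llvm.fabs.f32(float %x)
        ret float %r
      }

      define <4 x float> @fabs_vector(<4 x float> %v) {
      entry:
        ; The vector overload is applied to each element independently.
        %r = call <4 x float> @llvm.fabs.v4f32(<4 x float> %v)
        ret <4 x float> %r
      }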

'``llvm.minnum.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.minnum`` on any
floating point or vector of floating point type. Not all targets support
all types, however.

::

      declare float     @llvm.minnum.f32(float %Val0, float %Val1)
      declare double    @llvm.minnum.f64(double %Val0, double %Val1)
      declare x86_fp80  @llvm.minnum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
      declare fp128     @llvm.minnum.f128(fp128 %Val0, fp128 %Val1)
      declare ppc_fp128 @llvm.minnum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)

Overview:
"""""""""

The '``llvm.minnum.*``' intrinsics return the minimum of the two
arguments.

Arguments:
""""""""""

The arguments and return value are floating point numbers of the same
type.

Semantics:
""""""""""

Follows the IEEE-754 semantics for minNum, which also match those of
libm's ``fmin``.

If either operand is a NaN, returns the other non-NaN operand. Returns
NaN only if both operands are NaN. If the operands compare equal,
returns a value that compares equal to both operands. This means that
``fmin(+/-0.0, +/-0.0)`` could return either -0.0 or 0.0.
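
The NaN rule can be sketched as follows. The function name is
hypothetical, and the constant ``0x7FF8000000000000`` is a quiet NaN
written in LLVM's hexadecimal floating point notation.

.. code-block:: llvm

      declare float @llvm.minnum.f32(float, float)

      define float @minnum_with_nan(float %x) {
      entry:
        ; One operand is a NaN, so the call returns the other operand:
        ; the result is %x whenever %x is not itself a NaN.
        %r = call float @llvm.minnum.f32(float %x, float 0x7FF8000000000000)
        ret float %r
      }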

'``llvm.maxnum.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.maxnum`` on any
floating point or vector of floating point type. Not all targets support
all types, however.

::

      declare float     @llvm.maxnum.f32(float %Val0, float %Val1)
      declare double    @llvm.maxnum.f64(double %Val0, double %Val1)
      declare x86_fp80  @llvm.maxnum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
      declare fp128     @llvm.maxnum.f128(fp128 %Val0, fp128 %Val1)
      declare ppc_fp128 @llvm.maxnum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)

Overview:
"""""""""

The '``llvm.maxnum.*``' intrinsics return the maximum of the two
arguments.

Arguments:
""""""""""

The arguments and return value are floating point numbers of the same
type.

Semantics:
""""""""""

Follows the IEEE-754 semantics for maxNum, which also match those of
libm's ``fmax``.

If either operand is a NaN, returns the other non-NaN operand. Returns
NaN only if both operands are NaN. If the operands compare equal,
returns a value that compares equal to both operands. This means that
``fmax(+/-0.0, +/-0.0)`` could return either -0.0 or 0.0.
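
One possible use, sketched here under the assumption that a lower bound
of zero is wanted, is clamping a value from below. The function name is
hypothetical.

.. code-block:: llvm

      declare double @llvm.maxnum.f64(double, double)

      define double @clamp_below_at_zero(double %x) {
      entry:
        ; Returns %x when %x is greater than 0.0, and 0.0 when %x is
        ; smaller than 0.0 or is a NaN.
        %r = call double @llvm.maxnum.f64(double %x, double 0.0)
        ret double %r
      }

Because a NaN operand is dropped in favour of the other operand, this
pattern never propagates a NaN input to its result.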

'``llvm.copysign.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.copysign`` on any
floating point or vector of floating point type. Not all targets support
all types, however.

::

      declare float     @llvm.copysign.f32(float %Mag, float %Sgn)
      declare double    @llvm.copysign.f64(double %Mag, double %Sgn)
      declare x86_fp80  @llvm.copysign.f80(x86_fp80 %Mag, x86_fp80 %Sgn)
      declare fp128     @llvm.copysign.f128(fp128 %Mag, fp128 %Sgn)
      declare ppc_fp128 @llvm.copysign.ppcf128(ppc_fp128 %Mag, ppc_fp128 %Sgn)

Overview:
"""""""""

The '``llvm.copysign.*``' intrinsics return a value with the magnitude of the
first operand and the sign of the second operand.

Arguments:
""""""""""

The arguments and return value are floating point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``copysign``
functions would, and handles error conditions in the same way.
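
A minimal sketch of the operand roles, using a hypothetical function
name and a constant magnitude of 3.0:

.. code-block:: llvm

      declare double @llvm.copysign.f64(double, double)

      define double @copysign_example(double %sgn) {
      entry:
        ; The magnitude comes from the first operand (3.0) and the sign
        ; from %sgn: the result is -3.0 if the sign bit of %sgn is set,
        ; and 3.0 otherwise.
        %r = call double @llvm.copysign.f64(double 3.0, double %sgn)
        ret double %r
      }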

'``llvm.floor.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.floor`` on any
floating point or vector of floating point type. Not all targets support
all types, however.

::

      declare float     @llvm.floor.f32(float %Val)
      declare double    @llvm.floor.f64(double %Val)
      declare x86_fp80  @llvm.floor.f80(x86_fp80 %Val)
      declare fp128     @llvm.floor.f128(fp128 %Val)
      declare ppc_fp128 @llvm.floor.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.floor.*``' intrinsics return the floor of the operand.

Arguments:
""""""""""

The argument and return value are floating point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``floor`` functions
would, and handles error conditions in the same way.

'``llvm.ceil.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.ceil`` on any
floating point or vector of floating point type. Not all targets support
all types, however.

::

      declare float     @llvm.ceil.f32(float %Val)
      declare double    @llvm.ceil.f64(double %Val)
      declare x86_fp80  @llvm.ceil.f80(x86_fp80 %Val)
      declare fp128     @llvm.ceil.f128(fp128 %Val)
      declare ppc_fp128 @llvm.ceil.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.ceil.*``' intrinsics return the ceiling of the operand.

Arguments:
""""""""""

The argument and return value are floating point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``ceil`` functions
would, and handles error conditions in the same way.

'``llvm.trunc.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.trunc`` on any
floating point or vector of floating point type. Not all targets support
all types, however.

::

      declare float     @llvm.trunc.f32(float %Val)
      declare double    @llvm.trunc.f64(double %Val)
      declare x86_fp80  @llvm.trunc.f80(x86_fp80 %Val)
      declare fp128     @llvm.trunc.f128(fp128 %Val)
      declare ppc_fp128 @llvm.trunc.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.trunc.*``' intrinsics return the operand rounded to the
nearest integer not larger in magnitude than the operand.

Arguments:
""""""""""

The argument and return value are floating point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``trunc`` functions
would, and handles error conditions in the same way.
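
The difference between ``llvm.floor``, ``llvm.ceil`` and ``llvm.trunc``
shows up on negative, non-integer inputs. The following sketch uses a
hypothetical function name and the constant -2.5; the comments give the
values the libm functions return for that input.

.. code-block:: llvm

      declare double @llvm.floor.f64(double)
      declare double @llvm.ceil.f64(double)
      declare double @llvm.trunc.f64(double)

      define void @directed_rounding_example() {
      entry:
        %down = call double @llvm.floor.f64(double -2.5)   ; -3.0
        %up   = call double @llvm.ceil.f64(double -2.5)    ; -2.0
        %zero = call double @llvm.trunc.f64(double -2.5)   ; -2.0
        ret void
      }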

'``llvm.rint.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.rint`` on any
floating point or vector of floating point type. Not all targets support
all types, however.

::

      declare float     @llvm.rint.f32(float %Val)
      declare double    @llvm.rint.f64(double %Val)
      declare x86_fp80  @llvm.rint.f80(x86_fp80 %Val)
      declare fp128     @llvm.rint.f128(fp128 %Val)
      declare ppc_fp128 @llvm.rint.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.rint.*``' intrinsics return the operand rounded to the
nearest integer. They may raise an inexact floating-point exception if the
operand is not an integer.

Arguments:
""""""""""

The argument and return value are floating point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``rint`` functions
would, and handles error conditions in the same way.

'``llvm.nearbyint.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.nearbyint`` on any
floating point or vector of floating point type. Not all targets support
all types, however.

::

      declare float     @llvm.nearbyint.f32(float %Val)
      declare double    @llvm.nearbyint.f64(double %Val)
      declare x86_fp80  @llvm.nearbyint.f80(x86_fp80 %Val)
      declare fp128     @llvm.nearbyint.f128(fp128 %Val)
      declare ppc_fp128 @llvm.nearbyint.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.nearbyint.*``' intrinsics return the operand rounded to the
nearest integer.

Arguments:
""""""""""

The argument and return value are floating point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``nearbyint``
functions would, and handles error conditions in the same way.

'``llvm.round.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.round`` on any
floating point or vector of floating point type. Not all targets support
all types, however.

::

      declare float     @llvm.round.f32(float %Val)
      declare double    @llvm.round.f64(double %Val)
      declare x86_fp80  @llvm.round.f80(x86_fp80 %Val)
      declare fp128     @llvm.round.f128(fp128 %Val)
      declare ppc_fp128 @llvm.round.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.round.*``' intrinsics return the operand rounded to the
nearest integer.

Arguments:
""""""""""

The argument and return value are floating point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``round``
functions would, and handles error conditions in the same way.
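
``llvm.rint``, ``llvm.nearbyint`` and ``llvm.round`` differ mainly in how
halfway cases are treated, as the corresponding libm functions do. The
sketch below assumes the default round-to-nearest-even rounding mode and
uses a hypothetical function name.

.. code-block:: llvm

      declare double @llvm.rint.f64(double)
      declare double @llvm.nearbyint.f64(double)
      declare double @llvm.round.f64(double)

      define void @nearest_rounding_example() {
      entry:
        ; rint and nearbyint follow the current rounding mode; under
        ; round-to-nearest-even, 2.5 rounds to 2.0.
        %a = call double @llvm.rint.f64(double 2.5)        ; 2.0
        %b = call double @llvm.nearbyint.f64(double 2.5)   ; 2.0
        ; round sends halfway cases away from zero.
        %c = call double @llvm.round.f64(double 2.5)       ; 3.0
        ret void
      }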

Bit Manipulation Intrinsics
---------------------------

LLVM provides intrinsics for a few important bit manipulation
operations. These allow efficient code generation for some algorithms.

'``llvm.bswap.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic function. You can use ``llvm.bswap`` on
any integer type whose size is an even number of bytes (i.e.
``BitWidth % 16 == 0``).

::

      declare i16 @llvm.bswap.i16(i16 <id>)
      declare i32 @llvm.bswap.i32(i32 <id>)
      declare i64 @llvm.bswap.i64(i64 <id>)

Overview:
"""""""""

The '``llvm.bswap``' family of intrinsics is used to byte swap integer
values with an even number of bytes (positive multiple of 16 bits).
These are useful for performing operations on data that is not in the
target's native byte order.

Semantics:
""""""""""

The ``llvm.bswap.i16`` intrinsic returns an i16 value that has the high
and low byte of the input i16 swapped. Similarly, the ``llvm.bswap.i32``
intrinsic returns an i32 value that has the four bytes of the input i32
swapped, so that if the input bytes are numbered 0, 1, 2, 3 then the
returned i32 will have its bytes in 3, 2, 1, 0 order. The
``llvm.bswap.i48``, ``llvm.bswap.i64`` and other intrinsics extend this
concept to additional even-byte lengths (6 bytes, 8 bytes and more,
respectively).
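
A worked instance of the 3, 2, 1, 0 byte order for ``i32``, using a
hypothetical function name and decimal constants whose hexadecimal forms
are given in the comment:

.. code-block:: llvm

      declare i32 @llvm.bswap.i32(i32)

      define i32 @bswap_example() {
      entry:
        ; 287454020 is 0x11223344. Reversing its four bytes gives
        ; 0x44332211, which is 1144201745.
        %r = call i32 @llvm.bswap.i32(i32 287454020)
        ret i32 %r
      }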

'``llvm.ctpop.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.ctpop`` on any integer
bit width, or on any vector with integer elements. Not all targets
support all bit widths or vector types, however.

::

      declare i8 @llvm.ctpop.i8(i8  <src>)
      declare i16 @llvm.ctpop.i16(i16 <src>)
      declare i32 @llvm.ctpop.i32(i32 <src>)
      declare i64 @llvm.ctpop.i64(i64 <src>)
      declare i256 @llvm.ctpop.i256(i256 <src>)
      declare <2 x i32> @llvm.ctpop.v2i32(<2 x i32> <src>)

Overview:
"""""""""

The '``llvm.ctpop``' family of intrinsics counts the number of bits set
in a value.

Arguments:
""""""""""

The only argument is the value to be counted. The argument may be of any
integer type, or a vector with integer elements. The return type must
match the argument type.

Semantics:
""""""""""

The '``llvm.ctpop``' intrinsic counts the 1 bits in a value, or within
each element of a vector.
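
The following sketch shows a scalar and a vector call with constant
inputs. The function names are hypothetical.

.. code-block:: llvm

      declare i32 @llvm.ctpop.i32(i32)
      declare <2 x i32> @llvm.ctpop.v2i32(<2 x i32>)

      define i32 @ctpop_scalar() {
      entry:
        ; 255 has its eight low bits set, so the result is 8.
        %r = call i32 @llvm.ctpop.i32(i32 255)
        ret i32 %r
      }

      define <2 x i32> @ctpop_vector() {
      entry:
        ; Each element is counted separately: 8 has one bit set and 3 has
        ; two, so the result is <i32 1, i32 2>.
        %r = call <2 x i32> @llvm.ctpop.v2i32(<2 x i32> <i32 8, i32 3>)
        ret <2 x i32> %r
      }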

'``llvm.ctlz.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.ctlz`` on any
integer bit width, or any vector whose elements are integers. Not all
targets support all bit widths or vector types, however.

::

      declare i8   @llvm.ctlz.i8  (i8   <src>, i1 <is_zero_undef>)
      declare i16  @llvm.ctlz.i16 (i16  <src>, i1 <is_zero_undef>)
      declare i32  @llvm.ctlz.i32 (i32  <src>, i1 <is_zero_undef>)
      declare i64  @llvm.ctlz.i64 (i64  <src>, i1 <is_zero_undef>)
      declare i256 @llvm.ctlz.i256(i256 <src>, i1 <is_zero_undef>)
      declare <2 x i32> @llvm.ctlz.v2i32(<2 x i32> <src>, i1 <is_zero_undef>)

Overview:
"""""""""

The '``llvm.ctlz``' family of intrinsic functions counts the number of
leading zeros in a variable.

Arguments:
""""""""""

The first argument is the value to be counted. This argument may be of
any integer type, or a vector with integer element type. The return
type must match the first argument type.

The second argument must be a constant and is a flag to indicate whether
the intrinsic should ensure that a zero as the first argument produces a
defined result. Historically some architectures did not provide a
defined result for zero values as efficiently, and many algorithms are
now predicated on avoiding zero-value inputs.

Semantics:
""""""""""

The '``llvm.ctlz``' intrinsic counts the leading (most significant)
zeros in a variable, or within each element of the vector. If
``src == 0`` then the result is the size in bits of the type of ``src``
if ``is_zero_undef == 0`` and ``undef`` otherwise. For example,
``llvm.ctlz(i32 2) = 30``.
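
The effect of the ``is_zero_undef`` flag can be sketched with two
wrappers. Both function names are hypothetical.

.. code-block:: llvm

      declare i32 @llvm.ctlz.i32(i32, i1)

      define i32 @ctlz_defined_at_zero(i32 %x) {
      entry:
        ; The flag is false, so %x == 0 yields 32, the bit width of i32.
        ; For %x == 2 the result is 30.
        %r = call i32 @llvm.ctlz.i32(i32 %x, i1 false)
        ret i32 %r
      }

      define i32 @ctlz_zero_is_undef(i32 %x) {
      entry:
        ; The flag is true, so the result is undef when %x == 0. Use this
        ; form only when %x is known to be non-zero.
        %r = call i32 @llvm.ctlz.i32(i32 %x, i1 true)
        ret i32 %r
      }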

'``llvm.cttz.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.cttz`` on any
integer bit width, or any vector of integer elements. Not all targets
support all bit widths or vector types, however.

::

      declare i8   @llvm.cttz.i8  (i8   <src>, i1 <is_zero_undef>)
      declare i16  @llvm.cttz.i16 (i16  <src>, i1 <is_zero_undef>)
      declare i32  @llvm.cttz.i32 (i32  <src>, i1 <is_zero_undef>)
      declare i64  @llvm.cttz.i64 (i64  <src>, i1 <is_zero_undef>)
      declare i256 @llvm.cttz.i256(i256 <src>, i1 <is_zero_undef>)
      declare <2 x i32> @llvm.cttz.v2i32(<2 x i32> <src>, i1 <is_zero_undef>)

Overview:
"""""""""

The '``llvm.cttz``' family of intrinsic functions counts the number of
trailing zeros.

Arguments:
""""""""""

The first argument is the value to be counted. This argument may be of
any integer type, or a vector with integer element type. The return
type must match the first argument type.

The second argument must be a constant and is a flag to indicate whether
the intrinsic should ensure that a zero as the first argument produces a
defined result. Historically some architectures did not provide a
defined result for zero values as efficiently, and many algorithms are
now predicated on avoiding zero-value inputs.

Semantics:
""""""""""

The '``llvm.cttz``' intrinsic counts the trailing (least significant)
zeros in a variable, or within each element of a vector. If ``src == 0``
then the result is the size in bits of the type of ``src`` if
``is_zero_undef == 0`` and ``undef`` otherwise. For example,
``llvm.cttz(2) = 1``.
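
A small worked case on ``i8``, with a hypothetical function name:

.. code-block:: llvm

      declare i8 @llvm.cttz.i8(i8, i1)

      define i8 @cttz_example() {
      entry:
        ; 40 is 00101000 in binary: the lowest set bit is bit 3, so there
        ; are three trailing zeros and the result is 3.
        %r = call i8 @llvm.cttz.i8(i8 40, i1 false)
        ret i8 %r
      }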

Arithmetic with Overflow Intrinsics
-----------------------------------

LLVM provides intrinsics for some arithmetic with overflow operations.

'``llvm.sadd.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.sadd.with.overflow``
on any integer bit width.

::

      declare {i16, i1} @llvm.sadd.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.sadd.with.overflow.i64(i64 %a, i64 %b)

Overview:
"""""""""

The '``llvm.sadd.with.overflow``' family of intrinsic functions perform
a signed addition of the two arguments, and indicate whether an overflow
occurred during the signed summation.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) and the first element of the result
structure may be of integer types of any bit width, but they must have
the same bit width. The second element of the result structure must be
of type ``i1``. ``%a`` and ``%b`` are the two values that will undergo
signed addition.

Semantics:
""""""""""

The '``llvm.sadd.with.overflow``' family of intrinsic functions perform
a signed addition of the two arguments. They return a structure whose
first element is the signed sum and whose second element is a bit
specifying whether the signed addition resulted in an overflow.

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
      %sum = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal
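
As a complete function, the same pattern might be written as in the
sketch below. The function name ``@add_or_zero`` and the choice to
return 0 on overflow are assumptions made for illustration.

.. code-block:: llvm

      declare {i32, i1} @llvm.sadd.with.overflow.i32(i32, i32)

      define i32 @add_or_zero(i32 %a, i32 %b) {
      entry:
        %res = call {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
        %sum = extractvalue {i32, i1} %res, 0
        %obit = extractvalue {i32, i1} %res, 1
        ; Select 0 when the overflow bit is set, and the sum otherwise.
        %r = select i1 %obit, i32 0, i32 %sum
        ret i32 %r
      }

Using ``select`` instead of a branch keeps the example in a single basic
block; the branching form shown above is equally valid.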

'``llvm.uadd.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.uadd.with.overflow``
on any integer bit width.

::

      declare {i16, i1} @llvm.uadd.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.uadd.with.overflow.i64(i64 %a, i64 %b)

Overview:
"""""""""

The '``llvm.uadd.with.overflow``' family of intrinsic functions perform
an unsigned addition of the two arguments, and indicate whether a carry
occurred during the unsigned summation.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) and the first element of the result
structure may be of integer types of any bit width, but they must have
the same bit width. The second element of the result structure must be
of type ``i1``. ``%a`` and ``%b`` are the two values that will undergo
unsigned addition.

Semantics:
""""""""""

The '``llvm.uadd.with.overflow``' family of intrinsic functions perform
an unsigned addition of the two arguments. They return a structure whose
first element is the sum and whose second element is a bit specifying
whether the unsigned addition resulted in a carry.

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b)
      %sum = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %carry, label %normal

'``llvm.ssub.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.ssub.with.overflow``
on any integer bit width.

::

      declare {i16, i1} @llvm.ssub.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.ssub.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.ssub.with.overflow.i64(i64 %a, i64 %b)

Overview:
"""""""""

The '``llvm.ssub.with.overflow``' family of intrinsic functions perform
a signed subtraction of the two arguments, and indicate whether an
overflow occurred during the signed subtraction.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) and the first element of the result
structure may be of integer types of any bit width, but they must have
the same bit width. The second element of the result structure must be
of type ``i1``. ``%a`` and ``%b`` are the two values that will undergo
signed subtraction.

Semantics:
""""""""""

The '``llvm.ssub.with.overflow``' family of intrinsic functions perform
a signed subtraction of the two arguments. They return a structure whose
first element is the difference and whose second element is a bit
specifying whether the signed subtraction resulted in an overflow.

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.ssub.with.overflow.i32(i32 %a, i32 %b)
      %diff = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal

'``llvm.usub.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.usub.with.overflow``
on any integer bit width.

::

      declare {i16, i1} @llvm.usub.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.usub.with.overflow.i64(i64 %a, i64 %b)

Overview:
"""""""""

The '``llvm.usub.with.overflow``' family of intrinsic functions perform
an unsigned subtraction of the two arguments, and indicate whether an
overflow occurred during the unsigned subtraction.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) and the first element of the result
structure may be of integer types of any bit width, but they must have
the same bit width. The second element of the result structure must be
of type ``i1``. ``%a`` and ``%b`` are the two values that will undergo
unsigned subtraction.

Semantics:
""""""""""

The '``llvm.usub.with.overflow``' family of intrinsic functions perform
an unsigned subtraction of the two arguments. They return a structure
whose first element is the difference and whose second element is a bit
specifying whether the unsigned subtraction resulted in an overflow.

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b)
      %diff = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal

'``llvm.smul.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.smul.with.overflow``
on any integer bit width.

::

      declare {i16, i1} @llvm.smul.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.smul.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.smul.with.overflow.i64(i64 %a, i64 %b)

Overview:
"""""""""

The '``llvm.smul.with.overflow``' family of intrinsic functions perform
a signed multiplication of the two arguments, and indicate whether an
overflow occurred during the signed multiplication.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) and the first element of the result
structure may be of integer types of any bit width, but they must have
the same bit width. The second element of the result structure must be
of type ``i1``. ``%a`` and ``%b`` are the two values that will undergo
signed multiplication.

Semantics:
""""""""""

The '``llvm.smul.with.overflow``' family of intrinsic functions perform
a signed multiplication of the two arguments. They return a structure
whose first element is the product and whose second element is a bit
specifying whether the signed multiplication resulted in an overflow.

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.smul.with.overflow.i32(i32 %a, i32 %b)
      %prod = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal

'``llvm.umul.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.umul.with.overflow``
on any integer bit width.

::

      declare {i16, i1} @llvm.umul.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.umul.with.overflow.i64(i64 %a, i64 %b)

Overview:
"""""""""

The '``llvm.umul.with.overflow``' family of intrinsic functions perform
an unsigned multiplication of the two arguments, and indicate whether an
overflow occurred during the unsigned multiplication.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) and the first element of the result
structure may be of integer types of any bit width, but they must have
the same bit width. The second element of the result structure must be
of type ``i1``. ``%a`` and ``%b`` are the two values that will undergo
unsigned multiplication.

Semantics:
""""""""""

The '``llvm.umul.with.overflow``' family of intrinsic functions perform
an unsigned multiplication of the two arguments. They return a structure
whose first element is the product and whose second element is a bit
specifying whether the unsigned multiplication resulted in an overflow.

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
      %prod = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal
| 8870 | |
Specialised Arithmetic Intrinsics
---------------------------------

'``llvm.fmuladd.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.fmuladd.f32(float %a, float %b, float %c)
      declare double @llvm.fmuladd.f64(double %a, double %b, double %c)

Overview:
"""""""""

The '``llvm.fmuladd.*``' intrinsic functions represent multiply-add
expressions that can be fused if the code generator determines that (a) the
target instruction set has support for a fused operation, and (b) the
fused operation is more efficient than the equivalent, separate pair of
``fmul`` and ``fadd`` instructions.

Arguments:
""""""""""

The '``llvm.fmuladd.*``' intrinsics each take three arguments: two
multiplicands, ``a`` and ``b``, and an addend ``c``.

Semantics:
""""""""""

The expression:

::

      %0 = call float @llvm.fmuladd.f32(float %a, float %b, float %c)

is equivalent to the expression ``a * b + c``, except that rounding will
not be performed between the multiplication and addition steps if the
code generator fuses the operations. Fusion is not guaranteed, even if
the target platform supports it. If a fused multiply-add is required, the
corresponding '``llvm.fma.*``' intrinsic function should be used
instead. Like '``llvm.fma.*``', this intrinsic never sets ``errno``.

Examples:
"""""""""

.. code-block:: llvm

      %r2 = call float @llvm.fmuladd.f32(float %a, float %b, float %c) ; yields float:r2 = (a * b) + c

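For comparison, here is a minimal sketch of the unfused sequence that such a
call may be lowered to when the code generator decides not to fuse. The
function name ``@muladd_unfused`` is hypothetical and used only for
illustration.

.. code-block:: llvm

      define float @muladd_unfused(float %a, float %b, float %c) {
      entry:
        %mul = fmul float %a, %b      ; the product is rounded to float here
        %r2 = fadd float %mul, %c     ; the sum is rounded a second time
        ret float %r2
      }
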
Half Precision Floating Point Intrinsics
----------------------------------------

For most target platforms, half precision floating point is a
storage-only format. This means that it is a dense encoding (in memory)
but does not support computation in the format.

Consequently, code must first load the half-precision floating point
value as an ``i16``, then convert it to ``float`` with
:ref:`llvm.convert.from.fp16 <int_convert_from_fp16>`. Computation can
then be performed on the ``float`` value (including extending to
``double`` etc). To store the value back to memory, it is first converted
to ``float`` if needed, then converted to ``i16`` with
:ref:`llvm.convert.to.fp16 <int_convert_to_fp16>`, and then stored as an
``i16`` value.

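The load, convert, compute, convert, store sequence described above can be
sketched as follows. The global ``@h`` and the function ``@double_half`` are
hypothetical names chosen for illustration.

.. code-block:: llvm

      @h = global i16 0, align 2      ; half value stored as raw i16 bits

      declare float @llvm.convert.from.fp16.f32(i16)
      declare i16 @llvm.convert.to.fp16.f32(float)

      define void @double_half() {
      entry:
        %bits = load i16* @h, align 2
        %f = call float @llvm.convert.from.fp16.f32(i16 %bits)
        %sum = fadd float %f, %f      ; the computation happens in float
        %res = call i16 @llvm.convert.to.fp16.f32(float %sum)
        store i16 %res, i16* @h, align 2
        ret void
      }
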
.. _int_convert_to_fp16:

'``llvm.convert.to.fp16``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i16 @llvm.convert.to.fp16.f32(float %a)
      declare i16 @llvm.convert.to.fp16.f64(double %a)

Overview:
"""""""""

The '``llvm.convert.to.fp16``' intrinsic function performs a conversion from a
conventional floating point type to half precision floating point format.

Arguments:
""""""""""

The intrinsic function takes a single argument: the value to be
converted.

Semantics:
""""""""""

The '``llvm.convert.to.fp16``' intrinsic function performs a conversion from a
conventional floating point format to half precision floating point format. The
return value is an ``i16`` which contains the converted number.

Examples:
"""""""""

.. code-block:: llvm

      %res = call i16 @llvm.convert.to.fp16.f32(float %a)
      store i16 %res, i16* @x, align 2

.. _int_convert_from_fp16:

'``llvm.convert.from.fp16``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.convert.from.fp16.f32(i16 %a)
      declare double @llvm.convert.from.fp16.f64(i16 %a)

Overview:
"""""""""

The '``llvm.convert.from.fp16``' intrinsic function performs a
conversion from half precision floating point format to a conventional
floating point type (``float`` or ``double``, depending on the overload).

Arguments:
""""""""""

The intrinsic function takes a single argument: the value to be
converted.

Semantics:
""""""""""

The '``llvm.convert.from.fp16``' intrinsic function performs a
conversion from half precision floating point format to the floating
point format of its return type. The input half-float value is
represented by an ``i16`` value.

Examples:
"""""""""

.. code-block:: llvm

      %a = load i16* @x, align 2
      %res = call float @llvm.convert.from.fp16.f32(i16 %a)

Debugger Intrinsics
-------------------

The LLVM debugger intrinsics (which all start with the ``llvm.dbg.``
prefix) are described in the `LLVM Source Level
Debugging <SourceLevelDebugging.html#format_common_intrinsics>`_
document.

Exception Handling Intrinsics
-----------------------------

The LLVM exception handling intrinsics (which all start with the
``llvm.eh.`` prefix) are described in the `LLVM Exception
Handling <ExceptionHandling.html#format_common_intrinsics>`_ document.

.. _int_trampoline:

Trampoline Intrinsics
---------------------

These intrinsics make it possible to excise one parameter, marked with
the :ref:`nest <nest>` attribute, from a function. The result is a
callable function pointer lacking the nest parameter; the caller does
not need to provide a value for it. Instead, the value to use is stored
in advance in a "trampoline", a block of memory usually allocated on the
stack, which also contains code to splice the nest value into the
argument list. This is used to implement the GCC nested function address
extension.

For example, if the function is ``i32 f(i8* nest %c, i32 %x, i32 %y)``
then the resulting function pointer has signature ``i32 (i32, i32)*``.
It can be created as follows:

.. code-block:: llvm

      %tramp = alloca [10 x i8], align 4 ; size and alignment only correct for X86
      %tramp1 = getelementptr [10 x i8]* %tramp, i32 0, i32 0
      call void @llvm.init.trampoline(i8* %tramp1, i8* bitcast (i32 (i8*, i32, i32)* @f to i8*), i8* %nval)
      %p = call i8* @llvm.adjust.trampoline(i8* %tramp1)
      %fp = bitcast i8* %p to i32 (i32, i32)*

The call ``%val = call i32 %fp(i32 %x, i32 %y)`` is then equivalent to
``%val = call i32 @f(i8* %nval, i32 %x, i32 %y)``.

.. _int_it:

'``llvm.init.trampoline``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.init.trampoline(i8* <tramp>, i8* <func>, i8* <nval>)

Overview:
"""""""""

This fills the memory pointed to by ``tramp`` with executable code,
turning it into a trampoline.

Arguments:
""""""""""

The ``llvm.init.trampoline`` intrinsic takes three arguments, all
pointers. The ``tramp`` argument must point to a sufficiently large and
sufficiently aligned block of memory; this memory is written to by the
intrinsic. Note that the size and the alignment are target-specific:
LLVM currently provides no portable way of determining them, so a
front-end that generates this intrinsic needs to have some
target-specific knowledge. The ``func`` argument must hold a function
bitcast to an ``i8*``. The ``nval`` argument is the value to be supplied
for the ``nest`` parameter of ``func``.

Semantics:
""""""""""

The block of memory pointed to by ``tramp`` is filled with
target-dependent code, turning it into a function. Then ``tramp`` needs to
be passed to :ref:`llvm.adjust.trampoline <int_at>` to get a pointer which
can be :ref:`bitcast (to a new function) and called <int_trampoline>`. The
new function's signature is the same as that of ``func`` with any arguments
marked with the ``nest`` attribute removed. At most one such ``nest``
argument is allowed, and it must be of pointer type. Calling the new
function is equivalent to calling ``func`` with the same argument list,
but with ``nval`` used for the missing ``nest`` argument. If, after
calling ``llvm.init.trampoline``, the memory pointed to by ``tramp`` is
modified, then the effect of any later call to the returned function
pointer is undefined.

.. _int_at:

'``llvm.adjust.trampoline``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.adjust.trampoline(i8* <tramp>)

Overview:
"""""""""

This performs any required machine-specific adjustment to the address of
a trampoline (passed as ``tramp``).

Arguments:
""""""""""

``tramp`` must point to a block of memory which already has trampoline
code filled in by a previous call to
:ref:`llvm.init.trampoline <int_it>`.

Semantics:
""""""""""

On some architectures the address of the code to be executed needs to be
different from the address where the trampoline is actually stored. This
intrinsic returns the executable address corresponding to ``tramp``
after performing the required machine-specific adjustments. The pointer
returned can then be :ref:`bitcast and executed <int_trampoline>`.

Memory Use Markers
------------------

This class of intrinsics provides information about the lifetime of
memory objects and ranges where variables are immutable.

.. _int_lifestart:

'``llvm.lifetime.start``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.lifetime.start(i64 <size>, i8* nocapture <ptr>)

Overview:
"""""""""

The '``llvm.lifetime.start``' intrinsic specifies the start of a memory
object's lifetime.

Arguments:
""""""""""

The first argument is a constant integer representing the size of the
object, or -1 if it is variable sized. The second argument is a pointer
to the object.

Semantics:
""""""""""

This intrinsic indicates that before this point in the code, the value
of the memory pointed to by ``ptr`` is dead. This means that it is known
to never be used and has an undefined value. A load from the pointer
that precedes this intrinsic can be replaced with ``'undef'``.

.. _int_lifeend:

'``llvm.lifetime.end``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.lifetime.end(i64 <size>, i8* nocapture <ptr>)

Overview:
"""""""""

The '``llvm.lifetime.end``' intrinsic specifies the end of a memory
object's lifetime.

Arguments:
""""""""""

The first argument is a constant integer representing the size of the
object, or -1 if it is variable sized. The second argument is a pointer
to the object.

Semantics:
""""""""""

This intrinsic indicates that after this point in the code, the value of
the memory pointed to by ``ptr`` is dead. This means that it is known to
never be used and has an undefined value. Any stores into the memory
object following this intrinsic may be removed as dead.

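A minimal sketch of how the two lifetime markers bracket the use of a stack
slot. The function ``@use_slot`` and the value names are hypothetical and
chosen for illustration.

.. code-block:: llvm

      declare void @llvm.lifetime.start(i64, i8* nocapture)
      declare void @llvm.lifetime.end(i64, i8* nocapture)

      define i32 @use_slot(i32 %v) {
      entry:
        %slot = alloca i32, align 4
        %p = bitcast i32* %slot to i8*
        call void @llvm.lifetime.start(i64 4, i8* %p)  ; contents are dead before this point
        store i32 %v, i32* %slot, align 4
        %r = load i32* %slot, align 4
        call void @llvm.lifetime.end(i64 4, i8* %p)    ; contents are dead after this point
        ret i32 %r
      }
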
'``llvm.invariant.start``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare {}* @llvm.invariant.start(i64 <size>, i8* nocapture <ptr>)

Overview:
"""""""""

The '``llvm.invariant.start``' intrinsic specifies that the contents of
a memory object will not change.

Arguments:
""""""""""

The first argument is a constant integer representing the size of the
object, or -1 if it is variable sized. The second argument is a pointer
to the object.

Semantics:
""""""""""

This intrinsic indicates that until an ``llvm.invariant.end`` that uses
the return value, the referenced memory location is constant and
unchanging.

'``llvm.invariant.end``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.invariant.end({}* <start>, i64 <size>, i8* nocapture <ptr>)

Overview:
"""""""""

The '``llvm.invariant.end``' intrinsic specifies that the contents of a
memory object are mutable.

Arguments:
""""""""""

The first argument is the value returned by the matching
``llvm.invariant.start`` intrinsic. The second argument is a constant
integer representing the size of the object, or -1 if it is variable
sized. The third argument is a pointer to the object.

Semantics:
""""""""""

This intrinsic indicates that the memory is mutable again.

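A minimal sketch of pairing ``llvm.invariant.start`` with
``llvm.invariant.end``. The functions ``@read_twice`` and ``@opaque`` are
hypothetical; ``@opaque`` stands for any call that could otherwise be assumed
to modify the object.

.. code-block:: llvm

      declare {}* @llvm.invariant.start(i64, i8* nocapture)
      declare void @llvm.invariant.end({}*, i64, i8* nocapture)
      declare void @opaque()

      define i32 @read_twice(i32* %obj) {
      entry:
        %p = bitcast i32* %obj to i8*
        %inv = call {}* @llvm.invariant.start(i64 4, i8* %p)
        %a = load i32* %obj, align 4
        call void @opaque()
        %b = load i32* %obj, align 4   ; the object is invariant here, so %b equals %a
        call void @llvm.invariant.end({}* %inv, i64 4, i8* %p)
        %sum = add i32 %a, %b
        ret i32 %sum
      }
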
General Intrinsics
------------------

This class of intrinsics is designed to be generic and has no specific
purpose.

'``llvm.var.annotation``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.var.annotation(i8* <val>, i8* <str>, i8* <str>, i32 <int>)

Overview:
"""""""""

The '``llvm.var.annotation``' intrinsic annotates a local variable with an
arbitrary string.

Arguments:
""""""""""

The first argument is a pointer to a value, the second is a pointer to a
global string, the third is a pointer to a global string which is the
source file name, and the last argument is the line number.

Semantics:
""""""""""

This intrinsic allows annotation of local variables with arbitrary
strings. This can be useful for special purpose optimizations that want
to look for these annotations. These have no other defined use; they are
ignored by code generation and optimization.

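A minimal sketch of annotating a local variable. The globals ``@.note`` and
``@.file``, their contents, the line number ``12``, and the function
``@annotated`` are all hypothetical values chosen for illustration.

.. code-block:: llvm

      @.note = private unnamed_addr constant [8 x i8] c"my_note\00", section "llvm.metadata"
      @.file = private unnamed_addr constant [7 x i8] c"file.c\00", section "llvm.metadata"

      declare void @llvm.var.annotation(i8*, i8*, i8*, i32)

      define void @annotated() {
      entry:
        %x = alloca i32, align 4
        %xp = bitcast i32* %x to i8*
        ; annotation string, source file name, line number
        call void @llvm.var.annotation(i8* %xp,
            i8* getelementptr inbounds ([8 x i8]* @.note, i32 0, i32 0),
            i8* getelementptr inbounds ([7 x i8]* @.file, i32 0, i32 0),
            i32 12)
        ret void
      }
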
'``llvm.ptr.annotation.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use '``llvm.ptr.annotation``' on a
pointer to an integer of any width. *NOTE*: you must specify an address space
for the pointer. The identifier for the default address space is the integer
'``0``'.

::

      declare i8* @llvm.ptr.annotation.p<address space>i8(i8* <val>, i8* <str>, i8* <str>, i32 <int>)
      declare i16* @llvm.ptr.annotation.p<address space>i16(i16* <val>, i8* <str>, i8* <str>, i32 <int>)
      declare i32* @llvm.ptr.annotation.p<address space>i32(i32* <val>, i8* <str>, i8* <str>, i32 <int>)
      declare i64* @llvm.ptr.annotation.p<address space>i64(i64* <val>, i8* <str>, i8* <str>, i32 <int>)
      declare i256* @llvm.ptr.annotation.p<address space>i256(i256* <val>, i8* <str>, i8* <str>, i32 <int>)

Overview:
"""""""""

The '``llvm.ptr.annotation``' intrinsic annotates a pointer to an integer
with an arbitrary string.

Arguments:
""""""""""

The first argument is a pointer to an integer value of arbitrary bitwidth
(result of some expression), the second is a pointer to a global string, the
third is a pointer to a global string which is the source file name, and the
last argument is the line number. It returns the value of the first argument.

Semantics:
""""""""""

This intrinsic allows annotation of a pointer to an integer with arbitrary
strings. This can be useful for special purpose optimizations that want to look
for these annotations. These have no other defined use; they are ignored by code
generation and optimization.

'``llvm.annotation.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use '``llvm.annotation``' on
any integer bit width.

::

      declare i8 @llvm.annotation.i8(i8 <val>, i8* <str>, i8* <str>, i32 <int>)
      declare i16 @llvm.annotation.i16(i16 <val>, i8* <str>, i8* <str>, i32 <int>)
      declare i32 @llvm.annotation.i32(i32 <val>, i8* <str>, i8* <str>, i32 <int>)
      declare i64 @llvm.annotation.i64(i64 <val>, i8* <str>, i8* <str>, i32 <int>)
      declare i256 @llvm.annotation.i256(i256 <val>, i8* <str>, i8* <str>, i32 <int>)

Overview:
"""""""""

The '``llvm.annotation``' intrinsic annotates an integer expression with an
arbitrary string.

Arguments:
""""""""""

The first argument is an integer value (result of some expression), the
second is a pointer to a global string, the third is a pointer to a
global string which is the source file name, and the last argument is
the line number. It returns the value of the first argument.

Semantics:
""""""""""

This intrinsic allows annotations to be put on arbitrary expressions
with arbitrary strings. This can be useful for special purpose
optimizations that want to look for these annotations. These have no
other defined use; they are ignored by code generation and optimization.

'``llvm.trap``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.trap() noreturn nounwind

Overview:
"""""""""

The '``llvm.trap``' intrinsic triggers a trap and does not return.

Arguments:
""""""""""

None.

Semantics:
""""""""""

This intrinsic is lowered to the target-dependent trap instruction. If
the target does not have a trap instruction, this intrinsic will be
lowered to a call of the ``abort()`` function.

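A minimal sketch of guarding an operation with ``llvm.trap``. The function
``@checked_div`` and its block names are hypothetical.

.. code-block:: llvm

      declare void @llvm.trap() noreturn nounwind

      define i32 @checked_div(i32 %n, i32 %d) {
      entry:
        %is_zero = icmp eq i32 %d, 0
        br i1 %is_zero, label %fail, label %ok

      fail:
        call void @llvm.trap()
        unreachable                 ; llvm.trap is declared noreturn

      ok:
        %q = udiv i32 %n, %d
        ret i32 %q
      }
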
'``llvm.debugtrap``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.debugtrap() nounwind

Overview:
"""""""""

The '``llvm.debugtrap``' intrinsic causes an execution trap that requests the
attention of a debugger.

Arguments:
""""""""""

None.

Semantics:
""""""""""

This intrinsic is lowered to code that causes an execution trap in order
to request the attention of a debugger.

'``llvm.stackprotector``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.stackprotector(i8* <guard>, i8** <slot>)

Overview:
"""""""""

The ``llvm.stackprotector`` intrinsic takes the ``guard`` and stores it
onto the stack at ``slot``. The stack slot is adjusted to ensure that it
is placed on the stack before local variables.

Arguments:
""""""""""

The ``llvm.stackprotector`` intrinsic requires two pointer arguments.
The first argument is the value loaded from the stack guard
``@__stack_chk_guard``. The second argument is an ``alloca`` that has
enough space to hold the value of the guard.

Semantics:
""""""""""

This intrinsic causes the prologue/epilogue inserter to force the position of
the ``AllocaInst`` stack slot to be before local variables on the stack. This
ensures that if a local variable on the stack is overwritten past its bounds,
the write destroys the value of the guard. When the function exits, the guard
on the stack is checked against the original guard by
``llvm.stackprotectorcheck``. If they are different, then
``llvm.stackprotectorcheck`` causes the program to abort by calling the
``__stack_chk_fail()`` function.

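A minimal sketch of the setup half of this scheme, assuming the guard is
exposed as an external global named ``@__stack_chk_guard`` as described above.
The function ``@protected`` is hypothetical, and the epilogue check is omitted.

.. code-block:: llvm

      @__stack_chk_guard = external global i8*

      declare void @llvm.stackprotector(i8*, i8**)

      define void @protected() {
      entry:
        %slot = alloca i8*                      ; stack slot that will hold the guard
        %guard = load i8** @__stack_chk_guard   ; current guard value
        call void @llvm.stackprotector(i8* %guard, i8** %slot)
        ; ... function body with local buffers would go here ...
        ret void
      }
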
'``llvm.stackprotectorcheck``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.stackprotectorcheck(i8** <guard>)

Overview:
"""""""""

The ``llvm.stackprotectorcheck`` intrinsic compares ``guard`` against an already
created stack protector and, if they are not equal, calls the
``__stack_chk_fail()`` function.

Arguments:
""""""""""

The ``llvm.stackprotectorcheck`` intrinsic requires one pointer argument, the
variable ``@__stack_chk_guard``.

Semantics:
""""""""""

This intrinsic is provided to perform the stack protector check by comparing
``guard`` with the stack slot created by ``llvm.stackprotector`` and, if the
values do not match, calling the ``__stack_chk_fail()`` function.

The reason to provide this as an IR level intrinsic instead of implementing it
via other IR operations is that in order to perform this operation at the IR
level without an intrinsic, one would need to create additional basic blocks to
handle the success/failure cases. This makes it difficult to stop the stack
protector check from disrupting sibling tail calls in codegen. With this
intrinsic, we are able to generate the stack protector basic blocks late in
codegen after the tail call decision has occurred.

'``llvm.objectsize``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.objectsize.i32(i8* <object>, i1 <min>)
      declare i64 @llvm.objectsize.i64(i8* <object>, i1 <min>)

Overview:
"""""""""

The ``llvm.objectsize`` intrinsic is designed to provide information to
the optimizers to determine at compile time whether a) an operation
(like ``memcpy``) will overflow a buffer that corresponds to an object, or
b) a runtime check for overflow isn't necessary. An object in this
context means an allocation of a specific class, structure, array, or
other object.

Arguments:
""""""""""

The ``llvm.objectsize`` intrinsic takes two arguments. The first
argument is a pointer to or into the ``object``. The second argument is
a boolean and determines whether ``llvm.objectsize`` returns 0 (if true)
or -1 (if false) when the object size is unknown. The second argument
only accepts constants.

Semantics:
""""""""""

The ``llvm.objectsize`` intrinsic is lowered to a constant representing
the size of the object concerned. If the size cannot be determined at
compile time, ``llvm.objectsize`` returns ``-1`` or ``0`` of type ``i32``
or ``i64`` (depending on the ``min`` argument).

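A minimal sketch of querying the size of a stack buffer. The function
``@buffer_size`` is hypothetical. Because ``min`` is ``false``, the call would
yield ``-1`` if the size could not be determined; here the ``alloca`` is
visible, so an optimizer is in a position to fold the call to the buffer size.

.. code-block:: llvm

      declare i64 @llvm.objectsize.i64(i8*, i1)

      define i64 @buffer_size() {
      entry:
        %buf = alloca [16 x i8], align 1
        %p = getelementptr [16 x i8]* %buf, i32 0, i32 0
        %size = call i64 @llvm.objectsize.i64(i8* %p, i1 false)
        ret i64 %size
      }
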
'``llvm.expect``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.expect`` on any
integer bit width.

::

      declare i1 @llvm.expect.i1(i1 <val>, i1 <expected_val>)
      declare i32 @llvm.expect.i32(i32 <val>, i32 <expected_val>)
      declare i64 @llvm.expect.i64(i64 <val>, i64 <expected_val>)

Overview:
"""""""""

The ``llvm.expect`` intrinsic provides information about the expected (most
probable) value of ``val``, which can be used by optimizers.

Arguments:
""""""""""

The ``llvm.expect`` intrinsic takes two arguments. The first argument is
a value. The second argument is the expected value; it must be a
constant, and variables are not allowed.

Semantics:
""""""""""

This intrinsic is lowered to ``val``.

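A minimal sketch of using ``llvm.expect`` to hint that a branch is unlikely.
The function ``@classify`` and its block names are hypothetical.

.. code-block:: llvm

      declare i1 @llvm.expect.i1(i1, i1)

      define i32 @classify(i32 %x) {
      entry:
        %is_zero = icmp eq i32 %x, 0
        ; the comparison is expected to be false most of the time
        %hint = call i1 @llvm.expect.i1(i1 %is_zero, i1 false)
        br i1 %hint, label %rare, label %common

      rare:
        ret i32 -1

      common:
        ret i32 %x
      }
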
Hal Finkel | 9304691 | 2014-07-25 21:13:35 +0000 | [diff] [blame] | 9589 | '``llvm.assume``' Intrinsic |
| 9590 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
| 9591 | |
| 9592 | Syntax: |
| 9593 | """"""" |
| 9594 | |
| 9595 | :: |
| 9596 | |
| 9597 | declare void @llvm.assume(i1 %cond) |
| 9598 | |
| 9599 | Overview: |
| 9600 | """"""""" |
| 9601 | |
| 9602 | The ``llvm.assume`` allows the optimizer to assume that the provided |
| 9603 | condition is true. This information can then be used in simplifying other parts |
| 9604 | of the code. |
| 9605 | |
| 9606 | Arguments: |
| 9607 | """""""""" |
| 9608 | |
| 9609 | The condition which the optimizer may assume is always true. |
| 9610 | |
| 9611 | Semantics: |
| 9612 | """""""""" |
| 9613 | |
| 9614 | The intrinsic allows the optimizer to assume that the provided condition is |
| 9615 | always true whenever the control flow reaches the intrinsic call. No code is |
| 9616 | generated for this intrinsic, and instructions that contribute only to the |
| 9617 | provided condition are not used for code generation. If the condition is |
| 9618 | violated during execution, the behavior is undefined. |
| 9619 | |
| 9620 | Please note that optimizer might limit the transformations performed on values |
| 9621 | used by the ``llvm.assume`` intrinsic in order to preserve the instructions |
| 9622 | only used to form the intrinsic's input argument. This might prove undesirable |
| 9623 | if the extra information provided by the ``llvm.assume`` intrinsic does cause |
| 9624 | sufficient overall improvement in code quality. For this reason, |
| 9625 | ``llvm.assume`` should not be used to document basic mathematical invariants |
| 9626 | that the optimizer can otherwise deduce or facts that are of little use to the |
| 9627 | optimizer. |
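
The following is a small sketch of the intended usage, not a statement of
what any particular pass does. The function ``@clamp`` is a hypothetical
name chosen for illustration. It asserts that its argument is positive
and then performs a comparison that the assumption makes redundant:

.. code-block:: llvm

      declare void @llvm.assume(i1)

      ; Hypothetical function, for illustration only.
      define i32 @clamp(i32 %x) {
      entry:
        %pos = icmp sgt i32 %x, 0
        ; Behavior is undefined if %x <= 0 when control reaches this call.
        call void @llvm.assume(i1 %pos)
        ; Given the assumption, an optimizer might fold %neg to false.
        %neg = icmp slt i32 %x, 0
        %r = select i1 %neg, i32 0, i32 %x
        ret i32 %r
      }

Here ``%pos`` exists only to feed the assumption, so no code needs to be
generated for it. If ``%neg`` is folded to ``false``, the function reduces
to returning ``%x``.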

'``llvm.donothing``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.donothing() nounwind readnone

Overview:
"""""""""

The ``llvm.donothing`` intrinsic doesn't perform any operation. It is one of
only two intrinsics (the other is ``llvm.experimental.patchpoint``) that can
be called with an invoke instruction.

Arguments:
""""""""""

None.

Semantics:
""""""""""

This intrinsic does nothing; it is removed by optimizers and ignored
by codegen.

Stack Map Intrinsics
--------------------

LLVM provides experimental intrinsics to support runtime patching
mechanisms commonly desired in dynamic language JITs. These intrinsics
are described in :doc:`StackMaps`.