==============================
LLVM Language Reference Manual
==============================

.. contents::
   :local:
   :depth: 4

Abstract
========

This document is a reference manual for the LLVM assembly language. LLVM
is a Static Single Assignment (SSA) based representation that provides
type safety, low-level operations, flexibility, and the capability of
representing 'all' high-level languages cleanly. It is the common code
representation used throughout all phases of the LLVM compilation
strategy.

Introduction
============

The LLVM code representation is designed to be used in three different
forms: as an in-memory compiler IR, as an on-disk bitcode representation
(suitable for fast loading by a Just-In-Time compiler), and as a human
readable assembly language representation. This allows LLVM to provide a
powerful intermediate representation for efficient compiler
transformations and analysis, while providing a natural means to debug
and visualize the transformations. The three different forms of LLVM are
all equivalent. This document describes the human readable
representation and notation.

The LLVM representation aims to be light-weight and low-level while
being expressive, typed, and extensible at the same time. It aims to be
a "universal IR" of sorts, by being at a low enough level that
high-level ideas may be cleanly mapped to it (similar to how
microprocessors are "universal IR's", allowing many source languages to
be mapped to them). By providing type information, LLVM can be used as
the target of optimizations: for example, through pointer analysis, it
can be proven that a C automatic variable is never accessed outside of
the current function, allowing it to be promoted to a simple SSA value
instead of a memory location.

.. _wellformed:

Well-Formedness
---------------

It is important to note that this document describes 'well formed' LLVM
assembly language. There is a difference between what the parser accepts
and what is considered 'well formed'. For example, the following
instruction is syntactically okay, but not well formed:

.. code-block:: llvm

    %x = add i32 1, %x

because the definition of ``%x`` does not dominate all of its uses. The
LLVM infrastructure provides a verification pass that may be used to
verify that an LLVM module is well formed. This pass is automatically
run by the parser after parsing input assembly and by the optimizer
before it outputs bitcode. The violations pointed out by the verifier
pass indicate bugs in transformation passes or input to the parser.
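
For contrast with the ill-formed instruction above, here is a minimal sketch
of a well-formed way for a value to depend on its own previous iteration. It
relies on a ``phi`` node, which this section does not otherwise discuss, and
the function name ``@count_to`` and all value names are hypothetical, chosen
only for illustration:

.. code-block:: llvm

    define i32 @count_to(i32 %n) {
    entry:
      br label %loop

    loop:
      ; %x is defined exactly once. The phi selects 0 when arriving from
      ; %entry and %x.next when arriving over the back edge from %loop.
      %x = phi i32 [ 0, %entry ], [ %x.next, %loop ]
      %x.next = add i32 %x, 1
      %done = icmp sge i32 %x.next, %n
      br i1 %done, label %exit, label %loop

    exit:
      ret i32 %x.next
    }

Every non-phi use here is dominated by the corresponding definition, and the
phi's use of ``%x.next`` is associated with the end of the predecessor block
``%loop``, where ``%x.next`` has already been defined.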

.. _identifiers:

Identifiers
===========

LLVM identifiers come in two basic types: global and local. Global
identifiers (functions, global variables) begin with the ``'@'``
character. Local identifiers (register names, types) begin with the
``'%'`` character. Additionally, there are three different formats for
identifiers, for different purposes:

#. Named values are represented as a string of characters with their
   prefix. For example, ``%foo``, ``@DivisionByZero``,
   ``%a.really.long.identifier``. The actual regular expression used is
   '``[%@][-a-zA-Z$._][-a-zA-Z$._0-9]*``'. Identifiers that require other
   characters in their names can be surrounded with quotes. Special
   characters may be escaped using ``"\xx"`` where ``xx`` is the ASCII
   code for the character in hexadecimal. In this way, any character can
   be used in a name value, even quotes themselves. The ``"\01"`` prefix
   can be used on global variables to suppress mangling.
#. Unnamed values are represented as an unsigned numeric value with
   their prefix. For example, ``%12``, ``@2``, ``%44``.
#. Constants, which are described in the section Constants_ below.

LLVM requires that values start with a prefix for two reasons: Compilers
don't need to worry about name clashes with reserved words, and the set
of reserved words may be expanded in the future without penalty.
Additionally, unnamed identifiers allow a compiler to quickly come up
with a temporary variable without having to avoid symbol table
conflicts.

Reserved words in LLVM are very similar to reserved words in other
languages. There are keywords for different opcodes ('``add``',
'``bitcast``', '``ret``', etc...), for primitive type names ('``void``',
'``i32``', etc...), and others. These reserved words cannot conflict
with variable names, because none of them start with a prefix character
(``'%'`` or ``'@'``).
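
As an illustration of the naming rules above, the following sketch uses
quoted names, a hexadecimal escape, and local names that happen to spell
reserved words. All names are hypothetical and exist only for this example:

.. code-block:: llvm

    @"a global with spaces" = global i32 1   ; quoted name
    @"quote\22inside" = global i32 2         ; \22 is the hex escape for '"'

    define i32 @sum() {
      %"first value" = load i32, i32* @"a global with spaces"
      %second = load i32, i32* @"quote\22inside"
      ; '%add' and '%ret' cannot clash with the 'add' and 'ret' keywords,
      ; because identifiers always carry a prefix character.
      %add = add i32 %"first value", %second
      %ret = add i32 %add, 0
      ret i32 %ret
    }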

Here is an example of LLVM code to multiply the integer variable
'``%X``' by 8:

The easy way:

.. code-block:: llvm

    %result = mul i32 %X, 8

After strength reduction:

.. code-block:: llvm

    %result = shl i32 %X, 3

And the hard way:

.. code-block:: llvm

    %0 = add i32 %X, %X           ; yields i32:%0
    %1 = add i32 %0, %0           ; yields i32:%1
    %result = add i32 %1, %1

This last way of multiplying ``%X`` by 8 illustrates several important
lexical features of LLVM:

#. Comments are delimited with a '``;``' and go until the end of line.
#. Unnamed temporaries are created when the result of a computation is
   not assigned to a named value.
#. Unnamed temporaries are numbered sequentially (using a per-function
   incrementing counter, starting with 0). Note that basic blocks and unnamed
   function parameters are included in this numbering. For example, if the
   entry basic block is not given a label name and all function parameters are
   named, then it will get number 0.

It also shows a convention that we follow in this document. When
demonstrating instructions, we will follow an instruction with a comment
that defines the type and name of value produced.
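
The numbering rule for unnamed values can be sketched with two small
functions. The function names are hypothetical, and the comments describe how
the per-function counter is expected to advance under the rules listed above:

.. code-block:: llvm

    ; All parameters are named, so the unlabeled entry block takes
    ; number 0 and the first unnamed temporary is %1.
    define i32 @named_param(i32 %x) {
      %1 = add i32 %x, %x           ; yields i32:%1
      ret i32 %1
    }

    ; The unnamed parameter takes number 0 and the unlabeled entry block
    ; takes number 1, so the first unnamed temporary is %2.
    define i32 @unnamed_param(i32) {
      %2 = add i32 %0, %0           ; yields i32:%2
      ret i32 %2
    }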

High Level Structure
====================

Module Structure
----------------

LLVM programs are composed of ``Module``'s, each of which is a
translation unit of the input programs. Each module consists of
functions, global variables, and symbol table entries. Modules may be
combined together with the LLVM linker, which merges function (and
global variable) definitions, resolves forward declarations, and merges
symbol table entries. Here is an example of the "hello world" module:

.. code-block:: llvm

    ; Declare the string constant as a global constant.
    @.str = private unnamed_addr constant [13 x i8] c"hello world\0A\00"

    ; External declaration of the puts function
    declare i32 @puts(i8* nocapture) nounwind

    ; Definition of main function
    define i32 @main() {   ; i32()*
      ; Convert [13 x i8]* to i8 *...
      %cast210 = getelementptr [13 x i8], [13 x i8]* @.str, i64 0, i64 0

      ; Call puts function to write out the string to stdout.
      call i32 @puts(i8* %cast210)
      ret i32 0
    }

    ; Named metadata
    !0 = !{i32 42, null, !"string"}
    !foo = !{!0}

This example is made up of a :ref:`global variable <globalvars>` named
"``.str``", an external declaration of the "``puts``" function, a
:ref:`function definition <functionstructure>` for "``main``" and
:ref:`named metadata <namedmetadatastructure>` "``foo``".

In general, a module is made up of a list of global values (where both
functions and global variables are global values). Global values are
represented by a pointer to a memory location (in this case, a pointer
to an array of char, and a pointer to a function), and have one of the
following :ref:`linkage types <linkage>`.

.. _linkage:

Linkage Types
-------------

All Global Variables and Functions have one of the following types of
linkage:

``private``
    Global values with "``private``" linkage are only directly
    accessible by objects in the current module. In particular, linking
    code into a module with a private global value may cause the
    private to be renamed as necessary to avoid collisions. Because the
    symbol is private to the module, all references can be updated. This
    doesn't show up in any symbol table in the object file.
``internal``
    Similar to private, but the value shows as a local symbol
    (``STB_LOCAL`` in the case of ELF) in the object file. This
    corresponds to the notion of the '``static``' keyword in C.
``available_externally``
    Globals with "``available_externally``" linkage are never emitted into
    the object file corresponding to the LLVM module. From the linker's
    perspective, an ``available_externally`` global is equivalent to
    an external declaration. They exist to allow inlining and other
    optimizations to take place given knowledge of the definition of the
    global, which is known to be somewhere outside the module. Globals
    with ``available_externally`` linkage are allowed to be discarded at
    will. This linkage type is only allowed on definitions, not
    declarations.
``linkonce``
    Globals with "``linkonce``" linkage are merged with other globals of
    the same name when linkage occurs. This can be used to implement
    some forms of inline functions, templates, or other code which must
    be generated in each translation unit that uses it, but where the
    body may be overridden with a more definitive definition later.
    Unreferenced ``linkonce`` globals are allowed to be discarded. Note
    that ``linkonce`` linkage does not actually allow the optimizer to
    inline the body of this function into callers because it doesn't
    know if this definition of the function is the definitive definition
    within the program or whether it will be overridden by a stronger
    definition. To enable inlining and other optimizations, use
    "``linkonce_odr``" linkage.
``weak``
    "``weak``" linkage has the same merging semantics as ``linkonce``
    linkage, except that unreferenced globals with ``weak`` linkage may
    not be discarded. This is used for globals that are declared "weak"
    in C source code.
``common``
    "``common``" linkage is most similar to "``weak``" linkage, but it
    is used for tentative definitions in C, such as "``int X;``" at
    global scope. Symbols with "``common``" linkage are merged in the
    same way as ``weak`` symbols, and they may not be deleted if
    unreferenced. ``common`` symbols may not have an explicit section,
    must have a zero initializer, and may not be marked
    ':ref:`constant <globalvars>`'. Functions and aliases may not have
    common linkage.

.. _linkage_appending:

``appending``
    "``appending``" linkage may only be applied to global variables of
    pointer to array type. When two global variables with appending
    linkage are linked together, the two global arrays are appended
    together. This is the LLVM, typesafe, equivalent of having the
    system linker append together "sections" with identical names when
    .o files are linked.

    Unfortunately this doesn't correspond to any feature in .o files, so it
    can only be used for variables like ``llvm.global_ctors`` which LLVM
    interprets specially.

``extern_weak``
    The semantics of this linkage follow the ELF object file model: the
    symbol is weak until linked; if not linked, the symbol becomes null
    instead of being an undefined reference.
``linkonce_odr``, ``weak_odr``
    Some languages allow differing globals to be merged, such as two
    functions with different semantics. Other languages, such as
    ``C++``, ensure that only equivalent globals are ever merged (the
    "one definition rule" --- "ODR"). Such languages can use the
    ``linkonce_odr`` and ``weak_odr`` linkage types to indicate that the
    global will only be merged with equivalent globals. These linkage
    types are otherwise the same as their non-``odr`` versions.
``external``
    If none of the above identifiers are used, the global is externally
    visible, meaning that it participates in linkage and can be used to
    resolve external symbol references.

It is illegal for a function *declaration* to have any linkage type
other than ``external`` or ``extern_weak``.
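
A few of the linkage types above can be sketched in a single module. This is
an illustrative fragment rather than a recommended layout, and every symbol
name in it is hypothetical:

.. code-block:: llvm

    @counter = internal global i32 0                     ; like C 'static'
    @table = private constant [2 x i32] [i32 1, i32 2]   ; no symbol table entry
    @tentative = common global i32 0                     ; C tentative definition
    @optional = extern_weak global i32                   ; null if never defined

    ; May be merged only with equivalent definitions from other modules.
    define linkonce_odr i32 @twice(i32 %x) {
      %r = shl i32 %x, 1
      ret i32 %r
    }

    ; The body is visible to the optimizer, but no code for it is emitted
    ; into the object file for this module.
    define available_externally i32 @inc(i32 %x) {
      %r = add i32 %x, 1
      ret i32 %r
    }

    ; Declarations may only be 'external' (the default) or 'extern_weak'.
    declare extern_weak void @hook()
    declare void @required()

Note that ``@tentative`` uses a zero initializer and no explicit section, as
``common`` linkage requires.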

.. _callingconv:

Calling Conventions
-------------------

LLVM :ref:`functions <functionstructure>`, :ref:`calls <i_call>` and
:ref:`invokes <i_invoke>` can all have an optional calling convention
specified for the call. The calling convention of any pair of dynamic
caller/callee must match, or the behavior of the program is undefined.
The following calling conventions are supported by LLVM, and more may be
added in the future:

"``ccc``" - The C calling convention
    This calling convention (the default if no other calling convention
    is specified) matches the target C calling conventions. This calling
    convention supports varargs function calls and tolerates some
    mismatch in the declared prototype and implemented declaration of
    the function (as does normal C).
"``fastcc``" - The fast calling convention
    This calling convention attempts to make calls as fast as possible
    (e.g. by passing things in registers). This calling convention
    allows the target to use whatever tricks it wants to produce fast
    code for the target, without having to conform to an externally
    specified ABI (Application Binary Interface). `Tail calls can only
    be optimized when this, the GHC or the HiPE convention is
    used. <CodeGenerator.html#id80>`_ This calling convention does not
    support varargs and requires the prototype of all callees to exactly
    match the prototype of the function definition.
"``coldcc``" - The cold calling convention
    This calling convention attempts to make code in the caller as
    efficient as possible under the assumption that the call is not
    commonly executed. As such, these calls often preserve all registers
    so that the call does not break any live ranges in the caller side.
    This calling convention does not support varargs and requires the
    prototype of all callees to exactly match the prototype of the
    function definition. Furthermore, the inliner doesn't consider such
    function calls for inlining.
"``cc 10``" - GHC convention
    This calling convention has been implemented specifically for use by
    the `Glasgow Haskell Compiler (GHC) <http://www.haskell.org/ghc>`_.
    It passes everything in registers, going to extremes to achieve this
    by disabling callee save registers. This calling convention should
    not be used lightly but only for specific situations such as an
    alternative to the *register pinning* performance technique often
    used when implementing functional programming languages. At the
    moment only X86 supports this convention and it has the following
    limitations:

    -  On *X86-32* only supports up to 4 bit type parameters. No
       floating point types are supported.
    -  On *X86-64* only supports up to 10 bit type parameters and 6
       floating point parameters.

    This calling convention supports `tail call
    optimization <CodeGenerator.html#id80>`_ but requires that both the
    caller and the callee use it.
"``cc 11``" - The HiPE calling convention
    This calling convention has been implemented specifically for use by
    the `High-Performance Erlang
    (HiPE) <http://www.it.uu.se/research/group/hipe/>`_ compiler, *the*
    native code compiler of the `Ericsson's Open Source Erlang/OTP
    system <http://www.erlang.org/download.shtml>`_. It uses more
    registers for argument passing than the ordinary C calling
    convention and defines no callee-saved registers. The calling
    convention properly supports `tail call
    optimization <CodeGenerator.html#id80>`_ but requires that both the
    caller and the callee use it. It uses a *register pinning*
    mechanism, similar to GHC's convention, for keeping frequently
    accessed runtime components pinned to specific hardware registers.
    At the moment only X86 supports this convention (both 32 and 64
    bit).
"``webkit_jscc``" - WebKit's JavaScript calling convention
    This calling convention has been implemented for `WebKit FTL JIT
    <https://trac.webkit.org/wiki/FTLJIT>`_. It passes arguments on the
    stack right to left (as cdecl does), and returns a value in the
    platform's customary return register.
"``anyregcc``" - Dynamic calling convention for code patching
    This is a special convention that supports patching an arbitrary code
    sequence in place of a call site. This convention forces the call
    arguments into registers but allows them to be dynamically
    allocated. This can currently only be used with calls to
    ``llvm.experimental.patchpoint`` because only this intrinsic records
    the location of its arguments in a side table. See :doc:`StackMaps`.
"``preserve_mostcc``" - The `PreserveMost` calling convention
    This calling convention attempts to make the code in the caller as
    unintrusive as possible. This convention behaves identically to the `C`
    calling convention on how arguments and return values are passed, but it
    uses a different set of caller/callee-saved registers. This alleviates the
    burden of saving and recovering a large register set before and after the
    call in the caller. If the arguments are passed in callee-saved registers,
    then they will be preserved by the callee across the call. This doesn't
    apply for values returned in callee-saved registers.

    - On X86-64 the callee preserves all general purpose registers, except for
      R11. R11 can be used as a scratch register. Floating-point registers
      (XMMs/YMMs) are not preserved and need to be saved by the caller.

    The idea behind this convention is to support calls to runtime functions
    that have a hot path and a cold path. The hot path is usually a small piece
    of code that doesn't use many registers. The cold path might need to call out to
    another function and therefore only needs to preserve the caller-saved
    registers, which haven't already been saved by the caller. The
    `PreserveMost` calling convention is very similar to the `cold` calling
    convention in terms of caller/callee-saved registers, but they are used for
    different types of function calls. `coldcc` is for function calls that are
    rarely executed, whereas `preserve_mostcc` function calls are intended to be
    on the hot path and definitely executed a lot. Furthermore, `preserve_mostcc`
    doesn't prevent the inliner from inlining the function call.

    This calling convention will be used by a future version of the ObjectiveC
    runtime and should therefore still be considered experimental at this time.
    Although this convention was created to optimize certain runtime calls to
    the ObjectiveC runtime, it is not limited to this runtime and might be used
    by other runtimes in the future too. The current implementation only
    supports X86-64, but the intention is to support more architectures in the
    future.
"``preserve_allcc``" - The `PreserveAll` calling convention
    This calling convention attempts to make the code in the caller even less
    intrusive than the `PreserveMost` calling convention. This calling
    convention also behaves identically to the `C` calling convention on how
    arguments and return values are passed, but it uses a different set of
    caller/callee-saved registers. This removes the burden of saving and
    recovering a large register set before and after the call in the caller. If
    the arguments are passed in callee-saved registers, then they will be
    preserved by the callee across the call. This doesn't apply for values
    returned in callee-saved registers.

    - On X86-64 the callee preserves all general purpose registers, except for
      R11. R11 can be used as a scratch register. Furthermore it also preserves
      all floating-point registers (XMMs/YMMs).

    The idea behind this convention is to support calls to runtime functions
    that don't need to call out to any other functions.

    This calling convention, like the `PreserveMost` calling convention, will be
    used by a future version of the ObjectiveC runtime and should be considered
    experimental at this time.
"``cxx_fast_tlscc``" - The `CXX_FAST_TLS` calling convention for access functions
    Clang generates an access function to access C++-style TLS. The access
    function generally has an entry block, an exit block and an initialization
    block that is run the first time. The entry and exit blocks can access
    a few TLS IR variables; each access will be lowered to a platform-specific
    sequence.

    This calling convention aims to minimize overhead in the caller by
    preserving as many registers as possible (all the registers that are
    preserved on the fast path, composed of the entry and exit blocks).

    This calling convention behaves identically to the `C` calling convention on
    how arguments and return values are passed, but it uses a different set of
    caller/callee-saved registers.

    Given that each platform has its own lowering sequence, hence its own set
    of preserved registers, we can't use the existing `PreserveMost`.

    - On X86-64 the callee preserves all general purpose registers, except for
      RDI and RAX.
"``swiftcc``" - This calling convention is used for the Swift language.
    - On X86-64 RCX and R8 are available for additional integer returns, and
      XMM2 and XMM3 are available for additional FP/vector returns.
    - On iOS platforms, we use the AAPCS-VFP calling convention.
"``cc <n>``" - Numbered convention
    Any calling convention may be specified by number, allowing
    target-specific calling conventions to be used. Target specific
    calling conventions start at 64.

More calling conventions can be added/defined on an as-needed basis, to
support Pascal conventions or any other well-known target-independent
convention.
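
The requirement that caller and callee agree on the convention can be
sketched as follows. The function names are hypothetical, and the fragment is
only meant to show where the convention keyword appears on definitions,
declarations and call sites:

.. code-block:: llvm

    define fastcc i32 @inc(i32 %x) {
      %r = add i32 %x, 1
      ret i32 %r
    }

    define coldcc void @report_error(i32 %code) {
      ret void
    }

    ; A numbered convention on a declaration: cc 10 is the GHC convention.
    declare cc 10 void @ghc_style(i32)

    define i32 @caller(i32 %x) {
      ; Each call site repeats the convention of its callee. Omitting it
      ; would select 'ccc', and the mismatch would be undefined behavior.
      %r = call fastcc i32 @inc(i32 %x)
      call coldcc void @report_error(i32 %r)
      call cc 10 void @ghc_style(i32 %r)
      ret i32 %r
    }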

.. _visibilitystyles:

Visibility Styles
-----------------

All Global Variables and Functions have one of the following visibility
styles:

"``default``" - Default style
    On targets that use the ELF object file format, default visibility
    means that the declaration is visible to other modules and, in
    shared libraries, means that the declared entity may be overridden.
    On Darwin, default visibility means that the declaration is visible
    to other modules. Default visibility corresponds to "external
    linkage" in the language.
"``hidden``" - Hidden style
    Two declarations of an object with hidden visibility refer to the
    same object if they are in the same shared object. Usually, hidden
    visibility indicates that the symbol will not be placed into the
    dynamic symbol table, so no other module (executable or shared
    library) can reference it directly.
"``protected``" - Protected style
    On ELF, protected visibility indicates that the symbol will be
    placed in the dynamic symbol table, but that references within the
    defining module will bind to the local symbol. That is, the symbol
    cannot be overridden by another module.

A symbol with ``internal`` or ``private`` linkage must have ``default``
visibility.
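
As a brief sketch of how these styles are written, the visibility keyword
follows the linkage (if any) and precedes ``global``, or the return type of a
function. The symbol names below are hypothetical:

.. code-block:: llvm

    @exported = global i32 0            ; no keyword: default visibility
    @impl_detail = hidden global i32 0  ; kept out of the dynamic symbol table

    ; Visible to other modules, but calls from within the defining module
    ; bind to this definition.
    define protected i32 @get() {
      %v = load i32, i32* @impl_detail
      ret i32 %v
    }

    declare hidden void @helper()

    ; Internal and private symbols may only use default visibility.
    @local = internal global i32 0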

.. _dllstorageclass:

DLL Storage Classes
-------------------

All Global Variables, Functions and Aliases can have one of the following
DLL storage classes:

``dllimport``
    "``dllimport``" causes the compiler to reference a function or variable via
    a global pointer to a pointer that is set up by the DLL exporting the
    symbol. On Microsoft Windows targets, the pointer name is formed by
    combining ``__imp_`` and the function or variable name.
``dllexport``
    "``dllexport``" causes the compiler to provide a global pointer to a pointer
    in a DLL, so that it can be referenced with the ``dllimport`` attribute. On
    Microsoft Windows targets, the pointer name is formed by combining
    ``__imp_`` and the function or variable name. Since this storage class
    exists for defining a DLL interface, the compiler, assembler and linker know
    it is externally referenced and must refrain from deleting the symbol.
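
A minimal sketch of both storage classes, using hypothetical symbol names, is
shown here. It assumes a Windows target; on other targets these storage
classes are not meaningful:

.. code-block:: llvm

    ; Symbols provided by some other DLL.
    @imported_var = external dllimport global i32
    declare dllimport i32 @imported_fn(i32)

    ; Symbols this module makes available to users of its DLL.
    @exported_var = dllexport global i32 7

    define dllexport i32 @exported_fn() {
      %v = load i32, i32* @imported_var
      %r = call i32 @imported_fn(i32 %v)
      ret i32 %r
    }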

.. _tls_model:

Thread Local Storage Models
---------------------------

A variable may be defined as ``thread_local``, which means that it will
not be shared by threads (each thread will have a separate copy of the
variable). Not all targets support thread-local variables. Optionally, a
TLS model may be specified:

``localdynamic``
    For variables that are only used within the current shared library.
``initialexec``
    For variables in modules that will not be loaded dynamically.
``localexec``
    For variables defined in the executable and only used within it.

If no explicit model is given, the "general dynamic" model is used.

The models correspond to the ELF TLS models; see `ELF Handling For
Thread-Local Storage <http://people.redhat.com/drepper/tls.pdf>`_ for
more information on the circumstances under which the different models may
be used. The target may choose a different TLS model if the specified
model is not supported, or if a better choice of model can be made.

A model can also be specified in an alias, but then it only governs how
the alias is accessed. It will not have any effect on the aliasee.

For platforms without linker support for the ELF TLS model, the
``-femulated-tls`` flag can be used to generate GCC compatible emulated TLS
code.
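
The TLS models above are written in parentheses after ``thread_local``. The
following sketch uses hypothetical variable names and simply shows each
spelling once:

.. code-block:: llvm

    @tls_gd = thread_local global i32 0                 ; general dynamic
    @tls_ld = thread_local(localdynamic) global i32 0
    @tls_ie = thread_local(initialexec) global i32 0
    @tls_le = thread_local(localexec) global i32 0

    define i32 @read_tls() {
      ; Each thread reads its own copy of @tls_le.
      %v = load i32, i32* @tls_le
      ret i32 %v
    }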

.. _namedtypes:

Structure Types
---------------

LLVM IR allows you to specify both "identified" and "literal" :ref:`structure
types <t_struct>`. Literal types are uniqued structurally, but identified types
are never uniqued. An :ref:`opaque structural type <t_opaque>` can also be used
to forward declare a type that is not yet available.

An example of an identified structure specification is:

.. code-block:: llvm

    %mytype = type { %mytype*, i32 }

Prior to the LLVM 3.0 release, identified types were structurally uniqued. Only
literal types are uniqued in recent versions of LLVM.
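
To make the distinction concrete, here is a small sketch that places
identified, literal and opaque structure types side by side. The type and
function names are hypothetical:

.. code-block:: llvm

    %pair.a = type { i32, i32 }     ; identified types with identical bodies
    %pair.b = type { i32, i32 }     ; remain two distinct types
    %handle = type opaque           ; forward declaration, body not available

    ; '{ i32, i32 }' written inline is a literal type. Every occurrence of
    ; the same literal type denotes the same type.
    define i32 @first({ i32, i32 } %p) {
      %v = extractvalue { i32, i32 } %p, 0
      ret i32 %v
    }

    ; An opaque type can still be used behind a pointer.
    declare void @use(%handle*)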
Sean Silvab084af42012-12-07 10:36:55 +0000548
Sanjoy Dasc6af5ea2016-07-28 23:43:38 +0000549.. _nointptrtype:
550
551Non-Integral Pointer Type
552-------------------------
553
554Note: non-integral pointer types are a work in progress, and they should be
555considered experimental at this time.
556
557LLVM IR optionally allows the frontend to denote pointers in certain address
558spaces as "non-integral" via the :ref:```datalayout``
559string<langref_datalayout>`. Non-integral pointer types represent pointers that
560have an *unspecified* bitwise representation; that is, the integral
561representation may be target dependent or unstable (not backed by a fixed
562integer).
563
564``inttoptr`` instructions converting integers to non-integral pointer types are
565ill-typed, and so are ``ptrtoint`` instructions converting values of
566non-integral pointer types to integers. Vector versions of said instructions
567are ill-typed as well.
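
As a rough sketch, and assuming the ``ni:<address space>`` datalayout
component is the spelling used to mark an address space as non-integral (this
section does not itself give the syntax), a module might look like the
following. The choice of address space 4 and the function name are arbitrary:

.. code-block:: llvm

    ; Assumption: "ni:4" marks address space 4 as non-integral.
    target datalayout = "e-ni:4"

    ; Loads and stores through such pointers are still allowed.
    define i8 @load_ni(i8 addrspace(4)* %p) {
      %v = load i8, i8 addrspace(4)* %p
      ret i8 %v
    }

    ; Under this datalayout, the following would be ill-typed:
    ;   %i = ptrtoint i8 addrspace(4)* %p to i64
    ;   %q = inttoptr i64 %i to i8 addrspace(4)*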
568
.. _globalvars:

Global Variables
----------------

Global variables define regions of memory allocated at compilation time
instead of run-time.

Global variable definitions must be initialized.

Global variables in other translation units can also be declared, in which
case they don't have an initializer.

Both global variable definitions and declarations may specify an explicit
section to be placed in and an optional explicit alignment.

A variable may be defined as a global ``constant``, which indicates that
the contents of the variable will **never** be modified (enabling better
optimization, allowing the global data to be placed in the read-only
section of an executable, etc.). Note that variables that need runtime
initialization cannot be marked ``constant`` as there is a store to the
variable.

LLVM explicitly allows *declarations* of global variables to be marked
constant, even if the final definition of the global is not. This
capability can be used to enable slightly better optimization of the
program, but requires the language definition to guarantee that
optimizations based on the 'constantness' are valid for the translation
units that do not include the definition.

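A minimal sketch of such a declaration follows. The names ``@table_size`` and
``@get_size`` are hypothetical; the point is only that the declaration carries
``constant`` and no initializer, while the definition lives in another
translation unit:

.. code-block:: llvm

    ; Declaration only: no initializer, marked constant in this module.
    @table_size = external constant i32

    define i32 @get_size() {
      ; The optimizer may treat the loaded value as never changing.
      %v = load i32, i32* @table_size
      ret i32 %v
    }
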
As SSA values, global variables define pointer values that are in scope
(i.e. they dominate) all basic blocks in the program. Global variables
always define a pointer to their "content" type because they describe a
region of memory, and all memory objects in LLVM are accessed through
pointers.

Global variables can be marked with ``unnamed_addr`` which indicates
that the address is not significant, only the content. Constants marked
like this can be merged with other constants if they have the same
initializer. Note that a constant with significant address *can* be
merged with an ``unnamed_addr`` constant, the result being a constant
whose address is significant.

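To make the merging rule concrete, here is a small sketch with three
hypothetical string constants that share one initializer. Whether a given
optimizer actually merges them is not guaranteed; the comments only state what
the rule above permits:

.. code-block:: llvm

    ; Both addresses are insignificant, so these two may be merged into one.
    @.str.a = private unnamed_addr constant [3 x i8] c"hi\00"
    @.str.b = private unnamed_addr constant [3 x i8] c"hi\00"

    ; The address of this one is significant. It may still be merged with
    ; the constants above, but the result keeps a significant address.
    @.str.c = private constant [3 x i8] c"hi\00"
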
If the ``local_unnamed_addr`` attribute is given, the address is known to
not be significant within the module.

A global variable may be declared to reside in a target-specific
numbered address space. For targets that support them, address spaces
may affect how optimizations are performed and/or what target
instructions are used to access the variable. The default address space
is zero. The address space qualifier must precede any other attributes.

LLVM allows an explicit section to be specified for globals. If the
target supports it, it will emit globals to the section specified.
Additionally, the global can be placed in a comdat if the target has the
necessary support.

By default, global initializers are optimized by assuming that global
variables defined within the module are not modified from their
initial values before the start of the global initializer. This is
true even for variables potentially accessible from outside the
module, including those with external linkage or appearing in
``@llvm.used`` or dllexported variables. This assumption may be suppressed
by marking the variable with ``externally_initialized``.

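The following sketch shows where the keyword goes. The global names are
hypothetical, and the second line also illustrates that an address space
qualifier, when present, comes before ``externally_initialized``:

.. code-block:: llvm

    ; Assumed to still hold 0 when global initializers start running.
    @plain = global i32 0

    ; May have been modified externally before initializers run, so the
    ; initial value must not be assumed.
    @patched = externally_initialized global i32 0
    @dev_flag = addrspace(1) externally_initialized global i32 0
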
An explicit alignment may be specified for a global, which must be a
power of 2. If not present, or if the alignment is set to zero, the
alignment of the global is set by the target to whatever it feels
convenient. If an explicit alignment is specified, the global is forced
to have exactly that alignment. Targets and optimizers are not allowed
to over-align the global if the global has an assigned section. In this
case, the extra alignment could be observable: for example, code could
assume that the globals are densely packed in their section and try to
iterate over them as an array; alignment padding would break this
iteration. The maximum alignment is ``1 << 29``.

Globals can also have a :ref:`DLL storage class <dllstorageclass>` and
an optional list of attached :ref:`metadata <metadata>`.

Variables and aliases can have a
:ref:`Thread Local Storage Model <tls_model>`.

Syntax::

    @<GlobalVarName> = [Linkage] [Visibility] [DLLStorageClass] [ThreadLocal]
                       [(unnamed_addr|local_unnamed_addr)] [AddrSpace]
                       [ExternallyInitialized]
                       <global | constant> <Type> [<InitializerConstant>]
                       [, section "name"] [, comdat [($name)]]
                       [, align <Alignment>] (, !name !N)*

For example, the following defines a global in a numbered address space
with an initializer, section, and alignment:

.. code-block:: llvm

    @G = addrspace(5) constant float 1.0, section "foo", align 4

The following example just declares a global variable:

.. code-block:: llvm

    @G = external global i32

The following example defines a thread-local global with the
``initialexec`` TLS model:

.. code-block:: llvm

    @G = thread_local(initialexec) global i32 0, align 4

.. _functionstructure:

Functions
---------

LLVM function definitions consist of the "``define``" keyword, an
optional :ref:`linkage type <linkage>`, an optional :ref:`visibility
style <visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`,
an optional :ref:`calling convention <callingconv>`,
an optional ``unnamed_addr`` or ``local_unnamed_addr`` attribute, a return
type, an optional :ref:`parameter attribute <paramattrs>` for the return type,
a function name, a (possibly empty) argument list (each with optional
:ref:`parameter attributes <paramattrs>`), optional :ref:`function attributes
<fnattrs>`, an optional section, an optional alignment,
an optional :ref:`comdat <langref_comdats>`,
an optional :ref:`garbage collector name <gc>`, an optional :ref:`prefix <prefixdata>`,
an optional :ref:`prologue <prologuedata>`,
an optional :ref:`personality <personalityfn>`,
an optional list of attached :ref:`metadata <metadata>`,
an opening curly brace, a list of basic blocks, and a closing curly brace.

LLVM function declarations consist of the "``declare``" keyword, an
optional :ref:`linkage type <linkage>`, an optional :ref:`visibility style
<visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`, an
optional :ref:`calling convention <callingconv>`, an optional ``unnamed_addr``
or ``local_unnamed_addr`` attribute, a return type, an optional :ref:`parameter
attribute <paramattrs>` for the return type, a function name, a possibly
empty list of arguments, an optional alignment, an optional :ref:`garbage
collector name <gc>`, an optional :ref:`prefix <prefixdata>`, and an optional
:ref:`prologue <prologuedata>`.

A function definition contains a list of basic blocks, forming the CFG (Control
Flow Graph) for the function. Each basic block may optionally start with a label
(giving the basic block a symbol table entry), contains a list of instructions,
and ends with a :ref:`terminator <terminators>` instruction (such as a branch or
function return). If an explicit label is not provided, a block is assigned an
implicit numbered label, using the next value from the same counter as used for
unnamed temporaries (:ref:`see above<identifiers>`). For example, if a function
entry block does not have an explicit label, it will be assigned label "%0",
then the first unnamed temporary in that block will be "%1", etc.

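The implicit numbering can be seen in a small sketch. The function name
``@inc`` is hypothetical, and the parameter is given a name so that it does not
consume a number from the shared counter:

.. code-block:: llvm

    define i32 @inc(i32 %x) {
      ; The entry block has no explicit label, so it is implicitly %0.
      ; The first unnamed temporary therefore has to be %1.
      %1 = add i32 %x, 1
      ret i32 %1
    }
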
The first basic block in a function is special in two ways: it is
immediately executed on entrance to the function, and it is not allowed
to have predecessor basic blocks (i.e. there cannot be any branches to
the entry block of a function). Because the block can have no
predecessors, it also cannot have any :ref:`PHI nodes <i_phi>`.

LLVM allows an explicit section to be specified for functions. If the
target supports it, it will emit functions to the section specified.
Additionally, the function can be placed in a COMDAT.

An explicit alignment may be specified for a function. If not present,
or if the alignment is set to zero, the alignment of the function is set
by the target to whatever it feels convenient. If an explicit alignment
is specified, the function is forced to have at least that much
alignment. All alignments must be a power of 2.

If the ``unnamed_addr`` attribute is given, the address is known to not
be significant and two identical functions can be merged.

If the ``local_unnamed_addr`` attribute is given, the address is known to
not be significant within the module.

Syntax::

    define [linkage] [visibility] [DLLStorageClass]
           [cconv] [ret attrs]
           <ResultType> @<FunctionName> ([argument list])
           [(unnamed_addr|local_unnamed_addr)] [fn Attrs] [section "name"]
           [comdat [($name)]] [align N] [gc] [prefix Constant]
           [prologue Constant] [personality Constant] (!name !N)* { ... }

The argument list is a comma-separated sequence of arguments where each
argument is of the following form:

Syntax::

    <type> [parameter Attrs] [name]

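As an illustration of how the optional pieces line up with the syntax above,
here is a sketch of one definition that uses several of them. The function
name, section name, and choice of attributes are assumptions made for the
example, not recommendations:

.. code-block:: llvm

    ; linkage, calling convention, return attribute, result type, name,
    ; argument list, unnamed_addr, function attributes, section, alignment.
    define internal fastcc zeroext i8 @low_byte(i32 %value) unnamed_addr nounwind readnone section ".text.helpers" align 16 {
    entry:
      %b = trunc i32 %value to i8
      ret i8 %b
    }
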

.. _langref_aliases:

Aliases
-------

Aliases, unlike functions or variables, don't create any new data. They
are just a new symbol and metadata for an existing position.

Aliases have a name and an aliasee that is either a global value or a
constant expression.

Aliases may have an optional :ref:`linkage type <linkage>`, an optional
:ref:`visibility style <visibility>`, an optional :ref:`DLL storage class
<dllstorageclass>` and an optional :ref:`tls model <tls_model>`.

Syntax::

    @<Name> = [Linkage] [Visibility] [DLLStorageClass] [ThreadLocal] [(unnamed_addr|local_unnamed_addr)] alias <AliaseeTy>, <AliaseeTy>* @<Aliasee>

The linkage must be one of ``private``, ``internal``, ``linkonce``, ``weak``,
``linkonce_odr``, ``weak_odr``, ``external``. Note that some system linkers
might not correctly handle dropping a weak symbol that is aliased.

Aliases that are not ``unnamed_addr`` are guaranteed to have the same address as
the aliasee expression. ``unnamed_addr`` ones are only guaranteed to point
to the same content.

If the ``local_unnamed_addr`` attribute is given, the address is known to
not be significant within the module.

Since aliases are only a second name, some restrictions apply, of which
some can only be checked when producing an object file:

* The expression defining the aliasee must be computable at assembly
  time. Since it is just a name, no relocations can be used.

* No alias in the expression can be weak as the possibility of the
  intermediate alias being overridden cannot be represented in an
  object file.

* No global value in the expression can be a declaration, since that
  would require a relocation, which is not possible.

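A short sketch of the syntax above, using hypothetical names. Both aliasees are
definitions in the same module, which is consistent with the restriction that
no global value in the aliasee expression may be a declaration:

.. code-block:: llvm

    @counter = global i32 0

    define void @impl() {
      ret void
    }

    ; A second name for the global variable.
    @counter_alias = alias i32, i32* @counter

    ; A weak second name for the function.
    @impl_alias = weak alias void (), void ()* @impl
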
.. _langref_ifunc:

IFuncs
------

IFuncs, like aliases, don't create any new data or functions. They are just a
new symbol that the dynamic linker resolves at runtime by calling a resolver
function.

IFuncs have a name and a resolver, which is a function called by the dynamic
linker that returns the address of another function associated with the name.

IFuncs may have an optional :ref:`linkage type <linkage>` and an optional
:ref:`visibility style <visibility>`.

Syntax::

    @<Name> = [Linkage] [Visibility] ifunc <IFuncTy>, <ResolverTy>* @<Resolver>

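The sketch below shows one plausible shape for an IFunc and its resolver. All
names are hypothetical, and the resolver type used here (a function taking no
arguments and returning a pointer to the implementation) is an assumption about
what a typical resolver looks like rather than a requirement stated above:

.. code-block:: llvm

    define i32 @foo_impl(i32 %x) {
      ret i32 %x
    }

    ; Called by the dynamic linker; returns the address of the function
    ; that the name @foo should be bound to.
    define internal i32 (i32)* @foo_resolver() {
      ret i32 (i32)* @foo_impl
    }

    @foo = ifunc i32 (i32), i32 (i32)* ()* @foo_resolver
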

.. _langref_comdats:

Comdats
-------

Comdat IR provides access to COFF and ELF object file COMDAT functionality.

Comdats have a name which represents the COMDAT key. All global objects that
specify this key will only end up in the final object file if the linker chooses
that key over some other key. Aliases are placed in the same COMDAT that their
aliasee computes to, if any.

Comdats have a selection kind to provide input on how the linker should
choose between keys in two different object files.

Syntax::

    $<Name> = comdat SelectionKind

The selection kind must be one of the following:

``any``
    The linker may choose any COMDAT key; the choice is arbitrary.
``exactmatch``
    The linker may choose any COMDAT key but the sections must contain the
    same data.
``largest``
    The linker will choose the section containing the largest COMDAT key.
``noduplicates``
    The linker requires that only one section with this COMDAT key exist.
``samesize``
    The linker may choose any COMDAT key but the sections must contain the
    same amount of data.

Note that the Mach-O platform doesn't support COMDATs and ELF only supports
``any`` as a selection kind.

Here is an example of a COMDAT group where a function will only be selected if
the COMDAT key's section is the largest:

.. code-block:: text

    $foo = comdat largest
    @foo = global i32 2, comdat($foo)

    define void @bar() comdat($foo) {
      ret void
    }

As syntactic sugar, the ``$name`` can be omitted if the name is the same as
the global name:

.. code-block:: text

    $foo = comdat any
    @foo = global i32 2, comdat


In a COFF object file, the first example will create a COMDAT section with
selection kind ``IMAGE_COMDAT_SELECT_LARGEST`` containing the contents of the
``@foo`` symbol and another COMDAT section with selection kind
``IMAGE_COMDAT_SELECT_ASSOCIATIVE`` which is associated with the first COMDAT
section and contains the contents of the ``@bar`` symbol.

There are some restrictions on the properties of the global object.
It, or an alias to it, must have the same name as the COMDAT group when
targeting COFF.
The contents and size of this object may be used during link-time to determine
which COMDAT groups get selected depending on the selection kind.
Because the name of the object must match the name of the COMDAT group, the
linkage of the global object must not be local; local symbols can get renamed
if a collision occurs in the symbol table.

The combined use of COMDATs and section attributes may yield surprising results.
For example:

.. code-block:: text

    $foo = comdat any
    $bar = comdat any
    @g1 = global i32 42, section "sec", comdat($foo)
    @g2 = global i32 42, section "sec", comdat($bar)

From the object file perspective, this requires the creation of two sections
with the same name. This is necessary because both globals belong to different
COMDAT groups and COMDATs, at the object file level, are represented by
sections.

Note that certain IR constructs like global variables and functions may
create COMDATs in the object file in addition to any which are specified using
COMDAT IR. This arises when the code generator is configured to emit globals
in individual sections (e.g. when ``-data-sections`` or ``-function-sections``
is supplied to ``llc``).

.. _namedmetadatastructure:

Named Metadata
--------------

Named metadata is a collection of metadata. :ref:`Metadata
nodes <metadata>` (but not metadata strings) are the only valid
operands for a named metadata.

#. Named metadata are represented as a string of characters with the
   metadata prefix. The rules for metadata names are the same as for
   identifiers, but quoted names are not allowed. ``"\xx"`` type escapes
   are still valid, which allows any character to be part of a name.

Syntax::

    ; Some unnamed metadata nodes, which are referenced by the named metadata.
    !0 = !{!"zero"}
    !1 = !{!"one"}
    !2 = !{!"two"}
    ; A named metadata.
    !name = !{!0, !1, !2}

.. _paramattrs:

Parameter Attributes
--------------------

The return type and each parameter of a function type may have a set of
*parameter attributes* associated with them. Parameter attributes are
used to communicate additional information about the result or
parameters of a function. Parameter attributes are considered to be part
of the function, not of the function type, so functions with different
parameter attributes can have the same function type.

Parameter attributes are simple keywords that follow the type specified.
If multiple parameter attributes are needed, they are space separated.
For example:

.. code-block:: llvm

    declare i32 @printf(i8* noalias nocapture, ...)
    declare i32 @atoi(i8 zeroext)
    declare signext i8 @returns_signed_char()

Note that attributes for the function result precede the return type, as
``signext`` does above, whereas function attributes (such as ``nounwind``
and ``readonly``) come immediately after the argument list.

Currently, only the following parameter attributes are defined:

``zeroext``
    This indicates to the code generator that the parameter or return
    value should be zero-extended to the extent required by the target's
    ABI by the caller (for a parameter) or the callee (for a return value).
``signext``
    This indicates to the code generator that the parameter or return
    value should be sign-extended to the extent required by the target's
    ABI (which is usually 32 bits) by the caller (for a parameter) or
    the callee (for a return value).
``inreg``
    This indicates that this parameter or return value should be treated
    in a special target-dependent fashion while emitting code for
    a function call or return (usually, by putting it in a register as
    opposed to memory, though some targets use it to distinguish between
    two different kinds of registers). Use of this attribute is
    target-specific.
``byval``
    This indicates that the pointer parameter should really be passed by
    value to the function. The attribute implies that a hidden copy of
    the pointee is made between the caller and the callee, so the callee
    is unable to modify the value in the caller. This attribute is only
    valid on LLVM pointer arguments. It is generally used to pass
    structs and arrays by value, but is also valid on pointers to
    scalars. The copy is considered to belong to the caller, not the
    callee (for example, ``readonly`` functions should not write to
    ``byval`` parameters). This is not a valid attribute for return
    values.

    The ``byval`` attribute also supports specifying an alignment with the
    ``align`` attribute. It indicates the alignment of the stack slot to
    form and the known alignment of the pointer specified to the call
    site. If the alignment is not specified, then the code generator
    makes a target-specific assumption.

.. _attr_inalloca:

``inalloca``

    The ``inalloca`` argument attribute allows the caller to take the
    address of outgoing stack arguments. An ``inalloca`` argument must
    be a pointer to stack memory produced by an ``alloca`` instruction.
    The alloca, or argument allocation, must also be tagged with the
    inalloca keyword. Only the last argument may have the ``inalloca``
    attribute, and that argument is guaranteed to be passed in memory.

    An argument allocation may be used by a call at most once because
    the call may deallocate it. The ``inalloca`` attribute cannot be
    used in conjunction with other attributes that affect argument
    storage, like ``inreg``, ``nest``, ``sret``, or ``byval``. The
    ``inalloca`` attribute also disables LLVM's implicit lowering of
    large aggregate return values, which means that frontend authors
    must lower them with ``sret`` pointers.

    When the call site is reached, the argument allocation must have
    been the most recent stack allocation that is still live, or the
    results are undefined. It is possible to allocate additional stack
    space after an argument allocation and before its call site, but it
    must be cleared off with :ref:`llvm.stackrestore
    <int_stackrestore>`.

    See :doc:`InAlloca` for more information on how to use this
    attribute.

``sret``
    This indicates that the pointer parameter specifies the address of a
    structure that is the return value of the function in the source
    program. This pointer must be guaranteed by the caller to be valid:
    loads and stores to the structure may be assumed by the callee
    not to trap and to be properly aligned. This may only be applied to
    the first parameter. This is not a valid attribute for return
    values.

``align <n>``
    This indicates that the pointer value may be assumed by the optimizer to
    have the specified alignment.

    Note that this attribute has additional semantics when combined with the
    ``byval`` attribute.

.. _noalias:

``noalias``
    This indicates that objects accessed via pointer values
    :ref:`based <pointeraliasing>` on the argument or return value are not also
    accessed, during the execution of the function, via pointer values not
    *based* on the argument or return value. The attribute on a return value
    also has additional semantics described below. The caller shares the
    responsibility with the callee for ensuring that these requirements are met.
    For further details, please see the discussion of the NoAlias response in
    :ref:`alias analysis <Must, May, or No>`.

    Note that this definition of ``noalias`` is intentionally similar
    to the definition of ``restrict`` in C99 for function arguments.

    For function return values, C99's ``restrict`` is not meaningful,
    while LLVM's ``noalias`` is. Furthermore, the semantics of the ``noalias``
    attribute on return values are stronger than the semantics of the attribute
    when used on function arguments. On function return values, the ``noalias``
    attribute indicates that the function acts like a system memory allocation
    function, returning a pointer to allocated storage disjoint from the
    storage for any other object accessible to the caller.

``nocapture``
    This indicates that the callee does not make any copies of the
    pointer that outlive the callee itself. This is not a valid
    attribute for return values. Addresses used in volatile operations
    are considered to be captured.

.. _nest:

``nest``
    This indicates that the pointer parameter can be excised using the
    :ref:`trampoline intrinsics <int_trampoline>`. This is not a valid
    attribute for return values and can only be applied to one parameter.

1081``returned``
Stephen Linfec5b0b2013-06-20 21:55:10 +00001082 This indicates that the function always returns the argument as its return
Hal Finkel3b66caa2016-07-10 21:52:39 +00001083 value. This is a hint to the optimizer and code generator used when
1084 generating the caller, allowing value propagation, tail call optimization,
1085 and omission of register saves and restores in some cases; it is not
1086 checked or enforced when generating the callee. The parameter and the
1087 function return type must be valid operands for the
1088 :ref:`bitcast instruction <i_bitcast>`. This is not a valid attribute for
1089 return values and can only be applied to one parameter.
Sean Silvab084af42012-12-07 10:36:55 +00001090
Nick Lewyckyd52b1522014-05-20 01:23:40 +00001091``nonnull``
1092 This indicates that the parameter or return pointer is not null. This
1093 attribute may only be applied to pointer typed parameters. This is not
1094 checked or enforced by LLVM, the caller must ensure that the pointer
Mehdi Amini4a121fa2015-03-14 22:04:06 +00001095 passed in is non-null, or the callee must ensure that the returned pointer
Nick Lewyckyd52b1522014-05-20 01:23:40 +00001096 is non-null.
1097
Hal Finkelb0407ba2014-07-18 15:51:28 +00001098``dereferenceable(<n>)``
1099 This indicates that the parameter or return pointer is dereferenceable. This
1100 attribute may only be applied to pointer typed parameters. A pointer that
1101 is dereferenceable can be loaded from speculatively without a risk of
1102 trapping. The number of bytes known to be dereferenceable must be provided
1103 in parentheses. It is legal for the number of bytes to be less than the
1104 size of the pointee type. The ``nonnull`` attribute does not imply
1105 dereferenceability (consider a pointer to one element past the end of an
1106 array), however ``dereferenceable(<n>)`` does imply ``nonnull`` in
1107 ``addrspace(0)`` (which is the default address space).
1108
Sanjoy Das31ea6d12015-04-16 20:29:50 +00001109``dereferenceable_or_null(<n>)``
1110 This indicates that the parameter or return value isn't both
1111 non-null and non-dereferenceable (up to ``<n>`` bytes) at the same
Sean Silvaa1190322015-08-06 22:56:48 +00001112 time. All non-null pointers tagged with
Sanjoy Das31ea6d12015-04-16 20:29:50 +00001113 ``dereferenceable_or_null(<n>)`` are ``dereferenceable(<n>)``.
1114 For address space 0 ``dereferenceable_or_null(<n>)`` implies that
1115 a pointer is exactly one of ``dereferenceable(<n>)`` or ``null``,
1116 and in other address spaces ``dereferenceable_or_null(<n>)``
1117 implies that a pointer is at least one of ``dereferenceable(<n>)``
1118 or ``null`` (i.e. it may be both ``null`` and
Sean Silvaa1190322015-08-06 22:56:48 +00001119 ``dereferenceable(<n>)``). This attribute may only be applied to
Sanjoy Das31ea6d12015-04-16 20:29:50 +00001120 pointer typed parameters.
1121
Manman Renf46262e2016-03-29 17:37:21 +00001122``swiftself``
1123 This indicates that the parameter is the self/context parameter. This is not
1124 a valid attribute for return values and can only be applied to one
1125 parameter.
1126
Manman Ren9bfd0d02016-04-01 21:41:15 +00001127``swifterror``
    This attribute exists to model and optimize Swift error handling. It
    can be applied to a parameter with pointer-to-pointer type or to a
    pointer-sized alloca. At the call site, the actual argument that corresponds
    to a ``swifterror`` parameter has to come from a ``swifterror`` alloca. A
    ``swifterror`` value (either the parameter or the alloca) can only be loaded
    from, stored to, or used as a ``swifterror`` argument. This is not a valid
    attribute for return values and can only be applied to one parameter.
1135
1136 These constraints allow the calling convention to optimize access to
1137 ``swifterror`` variables by associating them with a specific register at
1138 call boundaries rather than placing them in memory. Since this does change
1139 the calling convention, a function which uses the ``swifterror`` attribute
1140 on a parameter is not ABI-compatible with one which does not.
1141
1142 These constraints also allow LLVM to assume that a ``swifterror`` argument
1143 does not alias any other memory visible within a function and that a
1144 ``swifterror`` alloca passed as an argument does not escape.
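
    The usage pattern described above can be sketched as follows. The type
    ``%swift.error`` and the functions ``@may_fail`` and ``@caller`` are
    hypothetical names chosen for illustration only:

    .. code-block:: llvm

        %swift.error = type opaque

        declare void @may_fail(%swift.error** swifterror)

        define i1 @caller() {
        entry:
          ; The argument for a swifterror parameter comes from a swifterror alloca.
          %err = alloca swifterror %swift.error*
          store %swift.error* null, %swift.error** %err
          call void @may_fail(%swift.error** swifterror %err)
          ; The alloca is only loaded from, stored to, or passed as swifterror.
          %e = load %swift.error*, %swift.error** %err
          %failed = icmp ne %swift.error* %e, null
          ret i1 %failed
        }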
1145
Sean Silvab084af42012-12-07 10:36:55 +00001146.. _gc:
1147
Philip Reamesf80bbff2015-02-25 23:45:20 +00001148Garbage Collector Strategy Names
1149--------------------------------
Sean Silvab084af42012-12-07 10:36:55 +00001150
Philip Reamesf80bbff2015-02-25 23:45:20 +00001151Each function may specify a garbage collector strategy name, which is simply a
Sean Silvab084af42012-12-07 10:36:55 +00001152string:
1153
1154.. code-block:: llvm
1155
1156 define void @f() gc "name" { ... }
1157
The supported values of *name* include those :ref:`built in to LLVM
<builtin-gc-strategies>` and any provided by loaded plugins. Specifying a GC
strategy will cause the compiler to alter its output in order to support the
named garbage collection algorithm. Note that LLVM itself does not contain a
garbage collector; this functionality is restricted to generating machine code
which can interoperate with a collector provided externally.
Sean Silvab084af42012-12-07 10:36:55 +00001164
Peter Collingbourne3fa50f92013-09-16 01:08:15 +00001165.. _prefixdata:
1166
1167Prefix Data
1168-----------
1169
Peter Collingbourne51d2de72014-12-03 02:08:38 +00001170Prefix data is data associated with a function which the code
1171generator will emit immediately before the function's entrypoint.
1172The purpose of this feature is to allow frontends to associate
1173language-specific runtime metadata with specific functions and make it
1174available through the function pointer while still allowing the
1175function pointer to be called.
Peter Collingbourne3fa50f92013-09-16 01:08:15 +00001176
Peter Collingbourne51d2de72014-12-03 02:08:38 +00001177To access the data for a given function, a program may bitcast the
1178function pointer to a pointer to the constant's type and dereference
Sean Silvaa1190322015-08-06 22:56:48 +00001179index -1. This implies that the IR symbol points just past the end of
Peter Collingbourne51d2de72014-12-03 02:08:38 +00001180the prefix data. For instance, take the example of a function annotated
1181with a single ``i32``,
1182
1183.. code-block:: llvm
1184
1185 define void @f() prefix i32 123 { ... }
1186
1187The prefix data can be referenced as,
1188
1189.. code-block:: llvm
1190
    %0 = bitcast void ()* @f to i32*
    %a = getelementptr inbounds i32, i32* %0, i32 -1
    %b = load i32, i32* %a
Peter Collingbourne51d2de72014-12-03 02:08:38 +00001194
1195Prefix data is laid out as if it were an initializer for a global variable
Sean Silvaa1190322015-08-06 22:56:48 +00001196of the prefix data's type. The function will be placed such that the
Peter Collingbourne51d2de72014-12-03 02:08:38 +00001197beginning of the prefix data is aligned. This means that if the size
1198of the prefix data is not a multiple of the alignment size, the
1199function's entrypoint will not be aligned. If alignment of the
1200function's entrypoint is desired, padding must be added to the prefix
1201data.
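
As an illustrative sketch of such padding, assume a function alignment of 8
bytes and 4 bytes of useful prefix data (both values, and the type name
``%padded``, are assumptions chosen for this example). Placing the padding
first in a packed struct makes the prefix data 8 bytes long while keeping the
``i32`` immediately before the entrypoint, so it can still be read at index -1:

.. code-block:: llvm

    ; 4 bytes of padding followed by the 4 bytes of real prefix data.
    %padded = type <{ i32, i32 }>

    define void @f() align 8 prefix %padded <{ i32 0, i32 123 }> {
      ret void
    }

    define i32 @read_prefix() {
      %0 = bitcast void ()* @f to i32*
      ; Index -1 addresses the last i32 of the prefix data.
      %a = getelementptr inbounds i32, i32* %0, i32 -1
      %b = load i32, i32* %a
      ret i32 %b
    }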
1202
Sean Silvaa1190322015-08-06 22:56:48 +00001203A function may have prefix data but no body. This has similar semantics
Peter Collingbourne51d2de72014-12-03 02:08:38 +00001204to the ``available_externally`` linkage in that the data may be used by the
1205optimizers but will not be emitted in the object file.
1206
1207.. _prologuedata:
1208
1209Prologue Data
1210-------------
1211
1212The ``prologue`` attribute allows arbitrary code (encoded as bytes) to
1213be inserted prior to the function body. This can be used for enabling
1214function hot-patching and instrumentation.
1215
1216To maintain the semantics of ordinary function calls, the prologue data must
Sean Silvaa1190322015-08-06 22:56:48 +00001217have a particular format. Specifically, it must begin with a sequence of
Peter Collingbourne3fa50f92013-09-16 01:08:15 +00001218bytes which decode to a sequence of machine instructions, valid for the
1219module's target, which transfer control to the point immediately succeeding
Sean Silvaa1190322015-08-06 22:56:48 +00001220the prologue data, without performing any other visible action. This allows
Peter Collingbourne3fa50f92013-09-16 01:08:15 +00001221the inliner and other passes to reason about the semantics of the function
Sean Silvaa1190322015-08-06 22:56:48 +00001222definition without needing to reason about the prologue data. Obviously this
Peter Collingbourne51d2de72014-12-03 02:08:38 +00001223makes the format of the prologue data highly target dependent.
Peter Collingbourne3fa50f92013-09-16 01:08:15 +00001224
Peter Collingbourne51d2de72014-12-03 02:08:38 +00001225A trivial example of valid prologue data for the x86 architecture is ``i8 144``,
Peter Collingbourne3fa50f92013-09-16 01:08:15 +00001226which encodes the ``nop`` instruction:
1227
Renato Golin124f2592016-07-20 12:16:38 +00001228.. code-block:: text
Peter Collingbourne3fa50f92013-09-16 01:08:15 +00001229
Peter Collingbourne51d2de72014-12-03 02:08:38 +00001230 define void @f() prologue i8 144 { ... }
Peter Collingbourne3fa50f92013-09-16 01:08:15 +00001231
Peter Collingbourne51d2de72014-12-03 02:08:38 +00001232Generally prologue data can be formed by encoding a relative branch instruction
1233which skips the metadata, as in this example of valid prologue data for the
Peter Collingbourne3fa50f92013-09-16 01:08:15 +00001234x86_64 architecture, where the first two bytes encode ``jmp .+10``:
1235
Renato Golin124f2592016-07-20 12:16:38 +00001236.. code-block:: text
Peter Collingbourne3fa50f92013-09-16 01:08:15 +00001237
1238 %0 = type <{ i8, i8, i8* }>
1239
Peter Collingbourne51d2de72014-12-03 02:08:38 +00001240 define void @f() prologue %0 <{ i8 235, i8 8, i8* @md}> { ... }
Peter Collingbourne3fa50f92013-09-16 01:08:15 +00001241
Sean Silvaa1190322015-08-06 22:56:48 +00001242A function may have prologue data but no body. This has similar semantics
Peter Collingbourne3fa50f92013-09-16 01:08:15 +00001243to the ``available_externally`` linkage in that the data may be used by the
1244optimizers but will not be emitted in the object file.
1245
David Majnemer7fddecc2015-06-17 20:52:32 +00001246.. _personalityfn:
1247
1248Personality Function
David Majnemerc5ad8a92015-06-17 21:21:16 +00001249--------------------
David Majnemer7fddecc2015-06-17 20:52:32 +00001250
1251The ``personality`` attribute permits functions to specify what function
1252to use for exception handling.
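
A minimal sketch of a function that names a personality routine is shown
below. It uses the Itanium C++ ABI routine ``__gxx_personality_v0`` as an
example; the callee ``@may_throw`` is a hypothetical function:

.. code-block:: llvm

    declare i32 @__gxx_personality_v0(...)
    declare void @may_throw()

    define void @f() personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
    entry:
      invoke void @may_throw()
              to label %cont unwind label %lpad

    cont:
      ret void

    lpad:
      ; Run cleanups, then continue unwinding.
      %lp = landingpad { i8*, i32 }
              cleanup
      resume { i8*, i32 } %lp
    }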
1253
Bill Wendling63b88192013-02-06 06:52:58 +00001254.. _attrgrp:
1255
1256Attribute Groups
1257----------------
1258
1259Attribute groups are groups of attributes that are referenced by objects within
1260the IR. They are important for keeping ``.ll`` files readable, because a lot of
functions will use the same set of attributes. In the degenerate case of a
1262``.ll`` file that corresponds to a single ``.c`` file, the single attribute
1263group will capture the important command line flags used to build that file.
1264
1265An attribute group is a module-level object. To use an attribute group, an
1266object references the attribute group's ID (e.g. ``#37``). An object may refer
1267to more than one attribute group. In that situation, the attributes from the
1268different groups are merged.
1269
1270Here is an example of attribute groups for a function that should always be
1271inlined, has a stack alignment of 4, and which shouldn't use SSE instructions:
1272
1273.. code-block:: llvm
1274
1275 ; Target-independent attributes:
Eli Bendersky97ad9242013-04-18 16:11:44 +00001276 attributes #0 = { alwaysinline alignstack=4 }
Bill Wendling63b88192013-02-06 06:52:58 +00001277
1278 ; Target-dependent attributes:
Eli Bendersky97ad9242013-04-18 16:11:44 +00001279 attributes #1 = { "no-sse" }
Bill Wendling63b88192013-02-06 06:52:58 +00001280
1281 ; Function @f has attributes: alwaysinline, alignstack=4, and "no-sse".
1282 define void @f() #0 #1 { ... }
1283
Sean Silvab084af42012-12-07 10:36:55 +00001284.. _fnattrs:
1285
1286Function Attributes
1287-------------------
1288
1289Function attributes are set to communicate additional information about
1290a function. Function attributes are considered to be part of the
1291function, not of the function type, so functions with different function
1292attributes can have the same function type.
1293
1294Function attributes are simple keywords that follow the type specified.
1295If multiple attributes are needed, they are space separated. For
1296example:
1297
1298.. code-block:: llvm
1299
1300 define void @f() noinline { ... }
1301 define void @f() alwaysinline { ... }
1302 define void @f() alwaysinline optsize { ... }
1303 define void @f() optsize { ... }
1304
Sean Silvab084af42012-12-07 10:36:55 +00001305``alignstack(<n>)``
1306 This attribute indicates that, when emitting the prologue and
1307 epilogue, the backend should forcibly align the stack pointer.
1308 Specify the desired alignment, which must be a power of two, in
1309 parentheses.
George Burgess IV278199f2016-04-12 01:05:35 +00001310``allocsize(<EltSizeParam>[, <NumEltsParam>])``
1311 This attribute indicates that the annotated function will always return at
1312 least a given number of bytes (or null). Its arguments are zero-indexed
1313 parameter numbers; if one argument is provided, then it's assumed that at
1314 least ``CallSite.Args[EltSizeParam]`` bytes will be available at the
1315 returned pointer. If two are provided, then it's assumed that
1316 ``CallSite.Args[EltSizeParam] * CallSite.Args[NumEltsParam]`` bytes are
1317 available. The referenced parameters must be integer types. No assumptions
1318 are made about the contents of the returned block of memory.
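
    For illustration, the two forms might be used as follows. The allocator
    names ``@my_malloc`` and ``@my_calloc`` are hypothetical:

    .. code-block:: llvm

        ; One argument: parameter 0 is the number of bytes.
        declare i8* @my_malloc(i64) allocsize(0)
        ; Two arguments: parameter 0 times parameter 1 is the number of bytes.
        declare i8* @my_calloc(i64, i64) allocsize(0, 1)

        define i8* @g() {
          ; At least 4 * 10 = 40 bytes are assumed available at %p (or %p is null).
          %p = call i8* @my_calloc(i64 4, i64 10)
          ret i8* %p
        }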
Sean Silvab084af42012-12-07 10:36:55 +00001319``alwaysinline``
1320 This attribute indicates that the inliner should attempt to inline
1321 this function into callers whenever possible, ignoring any active
1322 inlining size threshold for this caller.
Michael Gottesman41748d72013-06-27 00:25:01 +00001323``builtin``
1324 This indicates that the callee function at a call site should be
1325 recognized as a built-in function, even though the function's declaration
Michael Gottesman3a6a9672013-07-02 21:32:56 +00001326 uses the ``nobuiltin`` attribute. This is only valid at call sites for
Richard Smith32dbdf62014-07-31 04:25:36 +00001327 direct calls to functions that are declared with the ``nobuiltin``
Michael Gottesman41748d72013-06-27 00:25:01 +00001328 attribute.
Michael Gottesman296adb82013-06-27 22:48:08 +00001329``cold``
1330 This attribute indicates that this function is rarely called. When
1331 computing edge weights, basic blocks post-dominated by a cold
1332 function call are also considered to be cold; and, thus, given low
1333 weight.
Owen Anderson85fa7d52015-05-26 23:48:40 +00001334``convergent``
Justin Lebard5fb6952016-02-09 23:03:17 +00001335 In some parallel execution models, there exist operations that cannot be
1336 made control-dependent on any additional values. We call such operations
Justin Lebar58535b12016-02-17 17:46:41 +00001337 ``convergent``, and mark them with this attribute.
Justin Lebard5fb6952016-02-09 23:03:17 +00001338
Justin Lebar58535b12016-02-17 17:46:41 +00001339 The ``convergent`` attribute may appear on functions or call/invoke
1340 instructions. When it appears on a function, it indicates that calls to
1341 this function should not be made control-dependent on additional values.
Justin Bognera4635372016-07-06 20:02:45 +00001342 For example, the intrinsic ``llvm.nvvm.barrier0`` is ``convergent``, so
Justin Lebard5fb6952016-02-09 23:03:17 +00001343 calls to this intrinsic cannot be made control-dependent on additional
Justin Lebar58535b12016-02-17 17:46:41 +00001344 values.
Justin Lebard5fb6952016-02-09 23:03:17 +00001345
Justin Lebar58535b12016-02-17 17:46:41 +00001346 When it appears on a call/invoke, the ``convergent`` attribute indicates
1347 that we should treat the call as though we're calling a convergent
1348 function. This is particularly useful on indirect calls; without this we
1349 may treat such calls as though the target is non-convergent.
1350
1351 The optimizer may remove the ``convergent`` attribute on functions when it
1352 can prove that the function does not execute any convergent operations.
1353 Similarly, the optimizer may remove ``convergent`` on calls/invokes when it
1354 can prove that the call/invoke cannot call a convergent function.
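
    A small sketch of both placements follows. The function ``@my_barrier``
    stands in for a barrier-like operation and is hypothetical, as is
    ``@kernel``:

    .. code-block:: llvm

        declare void @my_barrier() convergent

        define void @kernel(void ()* %fp) convergent {
          ; Direct call to a function declared convergent.
          call void @my_barrier()
          ; The target is unknown here, so the call site itself is marked.
          call void %fp() convergent
          ret void
        }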
Vaivaswatha Nagarajfb3f4902015-12-16 16:16:19 +00001355``inaccessiblememonly``
1356 This attribute indicates that the function may only access memory that
1357 is not accessible by the module being compiled. This is a weaker form
1358 of ``readnone``.
1359``inaccessiblemem_or_argmemonly``
1360 This attribute indicates that the function may only access memory that is
1361 either not accessible by the module being compiled, or is pointed to
    by its pointer arguments. This is a weaker form of ``argmemonly``.
Sean Silvab084af42012-12-07 10:36:55 +00001363``inlinehint``
1364 This attribute indicates that the source code contained a hint that
1365 inlining this function is desirable (such as the "inline" keyword in
1366 C/C++). It is just a hint; it imposes no requirements on the
1367 inliner.
Tom Roeder44cb65f2014-06-05 19:29:43 +00001368``jumptable``
1369 This attribute indicates that the function should be added to a
1370 jump-instruction table at code-generation time, and that all address-taken
1371 references to this function should be replaced with a reference to the
1372 appropriate jump-instruction-table function pointer. Note that this creates
1373 a new pointer for the original function, which means that code that depends
1374 on function-pointer identity can break. So, any function annotated with
1375 ``jumptable`` must also be ``unnamed_addr``.
Andrea Di Biagio9b5d23b2013-08-09 18:42:18 +00001376``minsize``
1377 This attribute suggests that optimization passes and code generator
1378 passes make choices that keep the code size of this function as small
Andrew Trickd4d1d9c2013-10-31 17:18:07 +00001379 as possible and perform optimizations that may sacrifice runtime
Andrea Di Biagio9b5d23b2013-08-09 18:42:18 +00001380 performance in order to minimize the size of the generated code.
Sean Silvab084af42012-12-07 10:36:55 +00001381``naked``
1382 This attribute disables prologue / epilogue emission for the
1383 function. This can have very system-specific consequences.
Eli Bendersky97ad9242013-04-18 16:11:44 +00001384``nobuiltin``
Michael Gottesman41748d72013-06-27 00:25:01 +00001385 This indicates that the callee function at a call site is not recognized as
1386 a built-in function. LLVM will retain the original call and not replace it
1387 with equivalent code based on the semantics of the built-in function, unless
1388 the call site uses the ``builtin`` attribute. This is valid at call sites
1389 and on function declarations and definitions.
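
    The interaction between ``nobuiltin`` and ``builtin`` can be sketched as
    follows (the enclosing function ``@f`` is hypothetical):

    .. code-block:: llvm

        ; The declaration opts out of built-in treatment by default.
        declare i8* @malloc(i64) nobuiltin

        define i8* @f() {
          ; Not recognized as a built-in; the call is retained as written.
          %a = call i8* @malloc(i64 8)
          ; This direct call opts back in to built-in recognition.
          %b = call i8* @malloc(i64 8) builtin
          ret i8* %b
        }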
Bill Wendlingbf902f12013-02-06 06:22:58 +00001390``noduplicate``
1391 This attribute indicates that calls to the function cannot be
1392 duplicated. A call to a ``noduplicate`` function may be moved
1393 within its parent function, but may not be duplicated within
1394 its parent function.
1395
1396 A function containing a ``noduplicate`` call may still
1397 be an inlining candidate, provided that the call is not
1398 duplicated by inlining. That implies that the function has
1399 internal linkage and only has one call site, so the original
1400 call is dead after inlining.
Sean Silvab084af42012-12-07 10:36:55 +00001401``noimplicitfloat``
    This attribute disables implicit floating point instructions.
1403``noinline``
1404 This attribute indicates that the inliner should never inline this
1405 function in any situation. This attribute may not be used together
1406 with the ``alwaysinline`` attribute.
Sean Silva1cbbcf12013-08-06 19:34:37 +00001407``nonlazybind``
1408 This attribute suppresses lazy symbol binding for the function. This
1409 may make calls to the function faster, at the cost of extra program
1410 startup time if the function is not called during program startup.
Sean Silvab084af42012-12-07 10:36:55 +00001411``noredzone``
1412 This attribute indicates that the code generator should not use a
1413 red zone, even if the target-specific ABI normally permits it.
1414``noreturn``
1415 This function attribute indicates that the function never returns
1416 normally. This produces undefined behavior at runtime if the
1417 function ever does dynamically return.
James Molloye6f87ca2015-11-06 10:32:53 +00001418``norecurse``
1419 This function attribute indicates that the function does not call itself
1420 either directly or indirectly down any possible call path. This produces
1421 undefined behavior at runtime if the function ever does recurse.
Sean Silvab084af42012-12-07 10:36:55 +00001422``nounwind``
    This function attribute indicates that the function never raises an
    exception. If the function does raise an exception, its runtime
    behavior is undefined. However, functions marked ``nounwind`` may still
    trap or generate asynchronous exceptions. Exception handling schemes
    that are recognized by LLVM to handle asynchronous exceptions, such
    as SEH, will still provide their implementation-defined semantics.
Andrea Di Biagio377496b2013-08-23 11:53:55 +00001429``optnone``
Paul Robinsona2550a62015-11-30 21:56:16 +00001430 This function attribute indicates that most optimization passes will skip
1431 this function, with the exception of interprocedural optimization passes.
1432 Code generation defaults to the "fast" instruction selector.
Andrea Di Biagio377496b2013-08-23 11:53:55 +00001433 This attribute cannot be used together with the ``alwaysinline``
1434 attribute; this attribute is also incompatible
1435 with the ``minsize`` attribute and the ``optsize`` attribute.
Andrew Trickd4d1d9c2013-10-31 17:18:07 +00001436
Paul Robinsondcbe35b2013-11-18 21:44:03 +00001437 This attribute requires the ``noinline`` attribute to be specified on
1438 the function as well, so the function is never inlined into any caller.
Andrea Di Biagio377496b2013-08-23 11:53:55 +00001439 Only functions with the ``alwaysinline`` attribute are valid
Paul Robinsondcbe35b2013-11-18 21:44:03 +00001440 candidates for inlining into the body of this function.
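
    For example, a function (here the hypothetical ``@debug_me``) that should
    be left unoptimized carries both attributes:

    .. code-block:: llvm

        ; optnone must be accompanied by noinline.
        define void @debug_me() noinline optnone {
          ret void
        }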
Sean Silvab084af42012-12-07 10:36:55 +00001441``optsize``
1442 This attribute suggests that optimization passes and code generator
1443 passes make choices that keep the code size of this function low,
Andrea Di Biagio9b5d23b2013-08-09 18:42:18 +00001444 and otherwise do optimizations specifically to reduce code size as
1445 long as they do not significantly impact runtime performance.
Sanjoy Dasc0441c22016-04-19 05:24:47 +00001446``"patchable-function"``
1447 This attribute tells the code generator that the code
1448 generated for this function needs to follow certain conventions that
1449 make it possible for a runtime function to patch over it later.
1450 The exact effect of this attribute depends on its string value,
Charles Davise9c32c72016-08-08 21:20:15 +00001451 for which there currently is one legal possibility:
Sanjoy Dasc0441c22016-04-19 05:24:47 +00001452
1453 * ``"prologue-short-redirect"`` - This style of patchable
1454 function is intended to support patching a function prologue to
1455 redirect control away from the function in a thread safe
1456 manner. It guarantees that the first instruction of the
1457 function will be large enough to accommodate a short jump
1458 instruction, and will be sufficiently aligned to allow being
1459 fully changed via an atomic compare-and-swap instruction.
       While the first requirement can be satisfied by inserting a large
       enough NOP, LLVM can and will try to re-purpose an existing
       instruction (i.e. one that would have to be emitted anyway) as
       the patchable instruction that is larger than a short jump.
1464
1465 ``"prologue-short-redirect"`` is currently only supported on
1466 x86-64.
1467
    This attribute by itself does not imply restrictions on
    inter-procedural optimizations. All of the semantic effects the
    patching may have must be separately conveyed via the linkage type.
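
    Because this is a string attribute, it is written as a key/value pair.
    A minimal sketch on a hypothetical function ``@f``:

    .. code-block:: llvm

        define void @f() "patchable-function"="prologue-short-redirect" {
          ret void
        }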
Sean Silvab084af42012-12-07 10:36:55 +00001471``readnone``
Nick Lewyckyc2ec0722013-07-06 00:29:58 +00001472 On a function, this attribute indicates that the function computes its
1473 result (or decides to unwind an exception) based strictly on its arguments,
Sean Silvab084af42012-12-07 10:36:55 +00001474 without dereferencing any pointer arguments or otherwise accessing
1475 any mutable state (e.g. memory, control registers, etc) visible to
1476 caller functions. It does not write through any pointer arguments
1477 (including ``byval`` arguments) and never changes any state visible
1478 to callers. This means that it cannot unwind exceptions by calling
1479 the ``C++`` exception throwing methods.
Andrew Trickd4d1d9c2013-10-31 17:18:07 +00001480
Nick Lewyckyc2ec0722013-07-06 00:29:58 +00001481 On an argument, this attribute indicates that the function does not
1482 dereference that pointer argument, even though it may read or write the
Nick Lewyckyefe31f22013-07-06 01:04:47 +00001483 memory that the pointer points to if accessed through other pointers.
Sean Silvab084af42012-12-07 10:36:55 +00001484``readonly``
Nick Lewyckyc2ec0722013-07-06 00:29:58 +00001485 On a function, this attribute indicates that the function does not write
1486 through any pointer arguments (including ``byval`` arguments) or otherwise
Sean Silvab084af42012-12-07 10:36:55 +00001487 modify any state (e.g. memory, control registers, etc) visible to
1488 caller functions. It may dereference pointer arguments and read
1489 state that may be set in the caller. A readonly function always
1490 returns the same value (or unwinds an exception identically) when
1491 called with the same set of arguments and global state. It cannot
1492 unwind an exception by calling the ``C++`` exception throwing
1493 methods.
Andrew Trickd4d1d9c2013-10-31 17:18:07 +00001494
Nick Lewyckyc2ec0722013-07-06 00:29:58 +00001495 On an argument, this attribute indicates that the function does not write
1496 through this pointer argument, even though it may write to the memory that
1497 the pointer points to.
Nicolai Haehnle84c9f992016-07-04 08:01:29 +00001498``writeonly``
1499 On a function, this attribute indicates that the function may write to but
1500 does not read from memory.
1501
1502 On an argument, this attribute indicates that the function may write to but
1503 does not read through this pointer argument (even though it may read from
1504 the memory that the pointer points to).
Igor Laevsky39d662f2015-07-11 10:30:36 +00001505``argmemonly``
    This attribute indicates that the only memory accesses inside the function
    are loads from and stores to objects pointed to by its pointer-typed
    arguments, with arbitrary offsets. In other words, all memory operations in
    the function can refer to memory only using pointers based on its function
    arguments.
    Note that ``argmemonly`` can be used together with the ``readonly``
    attribute in order to specify that the function reads only from its
    arguments.
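
    A minimal sketch of that combination, using a hypothetical function
    ``@first`` whose only memory access is a load through its argument:

    .. code-block:: llvm

        define i32 @first(i32* readonly %p) argmemonly readonly {
          ; The only memory access is a read through the pointer argument.
          %v = load i32, i32* %p
          ret i32 %v
        }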
Sean Silvab084af42012-12-07 10:36:55 +00001513``returns_twice``
1514 This attribute indicates that this function can return twice. The C
1515 ``setjmp`` is an example of such a function. The compiler disables
1516 some optimizations (like tail calls) in the caller of these
1517 functions.
Peter Collingbourne82437bf2015-06-15 21:07:11 +00001518``safestack``
1519 This attribute indicates that
1520 `SafeStack <http://clang.llvm.org/docs/SafeStack.html>`_
1521 protection is enabled for this function.
1522
1523 If a function that has a ``safestack`` attribute is inlined into a
1524 function that doesn't have a ``safestack`` attribute or which has an
1525 ``ssp``, ``sspstrong`` or ``sspreq`` attribute, then the resulting
1526 function will have a ``safestack`` attribute.
Kostya Serebryanycf880b92013-02-26 06:58:09 +00001527``sanitize_address``
1528 This attribute indicates that AddressSanitizer checks
1529 (dynamic address safety analysis) are enabled for this function.
1530``sanitize_memory``
1531 This attribute indicates that MemorySanitizer checks (dynamic detection
1532 of accesses to uninitialized memory) are enabled for this function.
1533``sanitize_thread``
1534 This attribute indicates that ThreadSanitizer checks
1535 (dynamic thread safety analysis) are enabled for this function.
Sean Silvab084af42012-12-07 10:36:55 +00001536``ssp``
1537 This attribute indicates that the function should emit a stack
Dmitri Gribenkoe8131122013-01-19 20:34:20 +00001538 smashing protector. It is in the form of a "canary" --- a random value
Sean Silvab084af42012-12-07 10:36:55 +00001539 placed on the stack before the local variables that's checked upon
1540 return from the function to see if it has been overwritten. A
1541 heuristic is used to determine if a function needs stack protectors
Bill Wendling7c8f96a2013-01-23 06:43:53 +00001542 or not. The heuristic used will enable protectors for functions with:
Dmitri Gribenko69b56472013-01-29 23:14:41 +00001543
Bill Wendling7c8f96a2013-01-23 06:43:53 +00001544 - Character arrays larger than ``ssp-buffer-size`` (default 8).
1545 - Aggregates containing character arrays larger than ``ssp-buffer-size``.
1546 - Calls to alloca() with variable sizes or constant sizes greater than
1547 ``ssp-buffer-size``.
Sean Silvab084af42012-12-07 10:36:55 +00001548
Josh Magee24c7f062014-02-01 01:36:16 +00001549 Variables that are identified as requiring a protector will be arranged
1550 on the stack such that they are adjacent to the stack protector guard.
1551
Sean Silvab084af42012-12-07 10:36:55 +00001552 If a function that has an ``ssp`` attribute is inlined into a
1553 function that doesn't have an ``ssp`` attribute, then the resulting
1554 function will have an ``ssp`` attribute.
1555``sspreq``
1556 This attribute indicates that the function should *always* emit a
1557 stack smashing protector. This overrides the ``ssp`` function
1558 attribute.
1559
Josh Magee24c7f062014-02-01 01:36:16 +00001560 Variables that are identified as requiring a protector will be arranged
1561 on the stack such that they are adjacent to the stack protector guard.
1562 The specific layout rules are:
1563
1564 #. Large arrays and structures containing large arrays
1565 (``>= ssp-buffer-size``) are closest to the stack protector.
1566 #. Small arrays and structures containing small arrays
1567 (``< ssp-buffer-size``) are 2nd closest to the protector.
1568 #. Variables that have had their address taken are 3rd closest to the
1569 protector.
1570
Sean Silvab084af42012-12-07 10:36:55 +00001571 If a function that has an ``sspreq`` attribute is inlined into a
1572 function that doesn't have an ``sspreq`` attribute or which has an
Bill Wendlingd154e2832013-01-23 06:41:41 +00001573 ``ssp`` or ``sspstrong`` attribute, then the resulting function will have
1574 an ``sspreq`` attribute.
1575``sspstrong``
1576 This attribute indicates that the function should emit a stack smashing
Bill Wendling7c8f96a2013-01-23 06:43:53 +00001577 protector. This attribute causes a strong heuristic to be used when
Sean Silvaa1190322015-08-06 22:56:48 +00001578 determining if a function needs stack protectors. The strong heuristic
Bill Wendling7c8f96a2013-01-23 06:43:53 +00001579 will enable protectors for functions with:
Dmitri Gribenko69b56472013-01-29 23:14:41 +00001580
Bill Wendling7c8f96a2013-01-23 06:43:53 +00001581 - Arrays of any size and type
1582 - Aggregates containing an array of any size and type.
1583 - Calls to alloca().
1584 - Local variables that have had their address taken.
1585
Josh Magee24c7f062014-02-01 01:36:16 +00001586 Variables that are identified as requiring a protector will be arranged
1587 on the stack such that they are adjacent to the stack protector guard.
1588 The specific layout rules are:
1589
1590 #. Large arrays and structures containing large arrays
1591 (``>= ssp-buffer-size``) are closest to the stack protector.
1592 #. Small arrays and structures containing small arrays
1593 (``< ssp-buffer-size``) are 2nd closest to the protector.
1594 #. Variables that have had their address taken are 3rd closest to the
1595 protector.
1596
Bill Wendling7c8f96a2013-01-23 06:43:53 +00001597 This overrides the ``ssp`` function attribute.
Bill Wendlingd154e2832013-01-23 06:41:41 +00001598
1599 If a function that has an ``sspstrong`` attribute is inlined into a
1600 function that doesn't have an ``sspstrong`` attribute, then the
1601 resulting function will have an ``sspstrong`` attribute.
Reid Kleckner5a2ab2b2015-03-04 00:08:56 +00001602``"thunk"``
1603 This attribute indicates that the function will delegate to some other
1604 function with a tail call. The prototype of a thunk should not be used for
1605 optimization purposes. The caller is expected to cast the thunk prototype to
1606 match the thunk target prototype.
Sean Silvab084af42012-12-07 10:36:55 +00001607``uwtable``
    This attribute indicates that the ABI being targeted requires that
    an unwind table entry be produced for this function even if we can
    show that no exceptions pass through it. This is normally the case for
    the ELF x86-64 ABI, but it can be disabled for some compilation
    units.
Sean Silvab084af42012-12-07 10:36:55 +00001613
Sanjoy Dasb513a9f2015-09-24 23:34:52 +00001614
1615.. _opbundles:
1616
1617Operand Bundles
1618---------------
1619
1620Note: operand bundles are a work in progress, and they should be
1621considered experimental at this time.
1622
1623Operand bundles are tagged sets of SSA values that can be associated
with certain LLVM instructions (currently only ``call``\ s and
``invoke``\ s). In a way they are like metadata, but dropping them is
Sanjoy Dasb513a9f2015-09-24 23:34:52 +00001626incorrect and will change program semantics.
1627
1628Syntax::
David Majnemer34cacb42015-10-22 01:46:38 +00001629
Sanjoy Das9f3c1252015-11-21 09:12:07 +00001630 operand bundle set ::= '[' operand bundle (, operand bundle )* ']'
Sanjoy Dasb513a9f2015-09-24 23:34:52 +00001631 operand bundle ::= tag '(' [ bundle operand ] (, bundle operand )* ')'
1632 bundle operand ::= SSA value
1633 tag ::= string constant
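
As a sketch of this syntax, the call below carries an operand bundle set with
two bundles, one of which has no operands. The tags ``"tag-a"`` and
``"tag-b"`` and the functions ``@f`` and ``@g`` are hypothetical:

.. code-block:: llvm

    declare void @g(i32)

    define void @f(i32 %a, i8* %b) {
      ; The bundle set follows the argument list of the call.
      call void @g(i32 %a) [ "tag-a"(i32 %a, i8* %b), "tag-b"() ]
      ret void
    }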
1634
1635Operand bundles are **not** part of a function's signature, and a
1636given function may be called from multiple places with different kinds
1637of operand bundles. This reflects the fact that the operand bundles
1638are conceptually a part of the ``call`` (or ``invoke``), not the
1639callee being dispatched to.
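
As a minimal sketch of the syntax above, the following function calls the
same callee once with an operand bundle and once without. The names
``@callee`` and ``@caller`` and the tag ``"example-tag"`` are hypothetical
and chosen only for illustration; LLVM attaches no special meaning to this
tag.

.. code-block:: llvm

    declare void @callee(i32)

    define void @caller(i32 %x, i8* %p) {
      ; Same callee, two call sites: one with a bundle, one without.
      ; The bundle belongs to the call, not to @callee's signature.
      call void @callee(i32 %x) [ "example-tag"(i32 %x, i8* %p) ]
      call void @callee(i32 %x)
      ret void
    }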
1640
1641Operand bundles are a generic mechanism intended to support
1642runtime-introspection-like functionality for managed languages. While
1643the exact semantics of an operand bundle depend on the bundle tag,
1644there are certain limitations to how much the presence of an operand
1645bundle can influence the semantics of a program. These restrictions
1646are described as the semantics of an "unknown" operand bundle. As
1647long as the behavior of an operand bundle is describable within these
1648restrictions, LLVM does not need to have special knowledge of the
1649operand bundle to not miscompile programs containing it.
1650
David Majnemer34cacb42015-10-22 01:46:38 +00001651- The bundle operands for an unknown operand bundle escape in unknown
1652 ways before control is transferred to the callee or invokee.
1653- Calls and invokes with operand bundles have unknown read / write
1654 effect on the heap on entry and exit (even if the call target is
Sylvestre Ledru84666a12016-02-14 20:16:22 +00001655 ``readnone`` or ``readonly``), unless they're overridden with
Sanjoy Das98a341b2015-10-22 03:12:22 +00001656 callsite specific attributes.
1657- An operand bundle at a call site cannot change the implementation
1658 of the called function. Inter-procedural optimizations work as
1659 usual as long as they take into account the first two properties.
Sanjoy Dasb513a9f2015-09-24 23:34:52 +00001660
Sanjoy Dascdafd842015-11-11 21:38:02 +00001661More specific types of operand bundles are described below.
1662
Sanjoy Dasb51325d2016-03-11 19:08:34 +00001663.. _deopt_opbundles:
1664
Sanjoy Dascdafd842015-11-11 21:38:02 +00001665Deoptimization Operand Bundles
1666^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1667
Sanjoy Das9f3c1252015-11-21 09:12:07 +00001668Deoptimization operand bundles are characterized by the ``"deopt"``
Sanjoy Dascdafd842015-11-11 21:38:02 +00001669operand bundle tag. These operand bundles represent an alternate
1670"safe" continuation for the call site they're attached to, and can be
1671used by a suitable runtime to deoptimize the compiled frame at the
Sanjoy Das9f3c1252015-11-21 09:12:07 +00001672specified call site. There can be at most one ``"deopt"`` operand
bundle attached to a call site. The exact details of deoptimization are
out of scope for the language reference, but it usually involves
rewriting a compiled frame into a set of interpreted frames.
Sanjoy Dascdafd842015-11-11 21:38:02 +00001676
1677From the compiler's perspective, deoptimization operand bundles make
1678the call sites they're attached to at least ``readonly``. They read
1679through all of their pointer typed operands (even if they're not
1680otherwise escaped) and the entire visible heap. Deoptimization
1681operand bundles do not capture their operands except during
1682deoptimization, in which case control will not be returned to the
1683compiled frame.
1684
Sanjoy Das2d161452015-11-18 06:23:38 +00001685The inliner knows how to inline through calls that have deoptimization
1686operand bundles. Just like inlining through a normal call site
1687involves composing the normal and exceptional continuations, inlining
1688through a call site with a deoptimization operand bundle needs to
1689appropriately compose the "safe" deoptimization continuation. The
1690inliner does this by prepending the parent's deoptimization
1691continuation to every deoptimization continuation in the inlined body.
1692E.g. inlining ``@f`` into ``@g`` in the following example
1693
.. code-block:: llvm

    define void @f() {
      call void @x()  ;; no deopt state
      call void @y() [ "deopt"(i32 10) ]
      call void @y() [ "deopt"(i32 10), "unknown"(i8* null) ]
      ret void
    }

    define void @g() {
      call void @f() [ "deopt"(i32 20) ]
      ret void
    }

will result in

.. code-block:: llvm

    define void @g() {
      call void @x()  ;; still no deopt state
      call void @y() [ "deopt"(i32 20, i32 10) ]
      call void @y() [ "deopt"(i32 20, i32 10), "unknown"(i8* null) ]
      ret void
    }
1718
1719It is the frontend's responsibility to structure or encode the
1720deoptimization state in a way that syntactically prepending the
1721caller's deoptimization state to the callee's deoptimization state is
1722semantically equivalent to composing the caller's deoptimization
1723continuation after the callee's deoptimization continuation.
1724
Joseph Tremoulete28885e2016-01-10 04:28:38 +00001725.. _ob_funclet:
1726
David Majnemer3bb88c02015-12-15 21:27:27 +00001727Funclet Operand Bundles
1728^^^^^^^^^^^^^^^^^^^^^^^
1729
1730Funclet operand bundles are characterized by the ``"funclet"``
1731operand bundle tag. These operand bundles indicate that a call site
1732is within a particular funclet. There can be at most one
1733``"funclet"`` operand bundle attached to a call site and it must have
1734exactly one bundle operand.
1735
Joseph Tremoulete28885e2016-01-10 04:28:38 +00001736If any funclet EH pads have been "entered" but not "exited" (per the
1737`description in the EH doc\ <ExceptionHandling.html#wineh-constraints>`_),
1738it is undefined behavior to execute a ``call`` or ``invoke`` which:
1739
1740* does not have a ``"funclet"`` bundle and is not a ``call`` to a nounwind
1741 intrinsic, or
1742* has a ``"funclet"`` bundle whose operand is not the most-recently-entered
1743 not-yet-exited funclet EH pad.
1744
1745Similarly, if no funclet EH pads have been entered-but-not-yet-exited,
1746executing a ``call`` or ``invoke`` with a ``"funclet"`` bundle is undefined behavior.
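
The rules above can be illustrated with a small sketch. The function names
``@f``, ``@may_throw`` and ``@cleanup`` are hypothetical, and the example
assumes the MSVC C++ personality routine ``@__CxxFrameHandler3``. The
``invoke`` in ``entry`` executes before any funclet pad has been entered, so
it carries no ``"funclet"`` bundle; the ``call`` inside the cleanup pad names
the pad it executes in.

.. code-block:: llvm

    declare void @may_throw()
    declare void @cleanup()
    declare i32 @__CxxFrameHandler3(...)

    define void @f() personality i32 (...)* @__CxxFrameHandler3 {
    entry:
      ; No funclet EH pad has been entered yet: no "funclet" bundle here.
      invoke void @may_throw()
              to label %exit unwind label %pad

    pad:
      %cp = cleanuppad within none []
      ; %cp is the most-recently-entered, not-yet-exited funclet EH pad.
      call void @cleanup() [ "funclet"(token %cp) ]
      cleanupret from %cp unwind to caller

    exit:
      ret void
    }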
1747
Sanjoy Dasa34ce952016-01-20 19:50:25 +00001748GC Transition Operand Bundles
1749^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1750
1751GC transition operand bundles are characterized by the
1752``"gc-transition"`` operand bundle tag. These operand bundles mark a
call as a transition from a function with one GC strategy to a
1754function with a different GC strategy. If coordinating the transition
1755between GC strategies requires additional code generation at the call
1756site, these bundles may contain any values that are needed by the
1757generated code. For more details, see :ref:`GC Transitions
1758<gc_transition_args>`.
1759
Sean Silvab084af42012-12-07 10:36:55 +00001760.. _moduleasm:
1761
1762Module-Level Inline Assembly
1763----------------------------
1764
Modules may contain "module-level inline asm" blocks, which correspond
to the GCC "file scope inline asm" blocks. These blocks are internally
concatenated by LLVM and treated as a single unit, but may be separated
in the ``.ll`` file if desired. The syntax is very simple:

.. code-block:: llvm

    module asm "inline asm code goes here"
    module asm "more can go here"
1774
The strings can contain any character by escaping non-printable
characters. The escape sequence used is simply "\\xx" where "xx" is the
two-digit hex code of the character.
1778
James Y Knightbc832ed2015-07-08 18:08:36 +00001779Note that the assembly string *must* be parseable by LLVM's integrated assembler
1780(unless it is disabled), even when emitting a ``.s`` file.
Sean Silvab084af42012-12-07 10:36:55 +00001781
Eli Benderskyfdc529a2013-06-07 19:40:08 +00001782.. _langref_datalayout:
1783
Sean Silvab084af42012-12-07 10:36:55 +00001784Data Layout
1785-----------
1786
1787A module may specify a target specific data layout string that specifies
1788how data is to be laid out in memory. The syntax for the data layout is
1789simply:
1790
.. code-block:: llvm

    target datalayout = "layout specification"
1794
1795The *layout specification* consists of a list of specifications
1796separated by the minus sign character ('-'). Each specification starts
1797with a letter and may include other information after the letter to
1798define some aspect of the data layout. The specifications accepted are
1799as follows:
1800
1801``E``
1802 Specifies that the target lays out data in big-endian form. That is,
1803 the bits with the most significance have the lowest address
1804 location.
1805``e``
1806 Specifies that the target lays out data in little-endian form. That
1807 is, the bits with the least significance have the lowest address
1808 location.
1809``S<size>``
1810 Specifies the natural alignment of the stack in bits. Alignment
1811 promotion of stack variables is limited to the natural stack
1812 alignment to avoid dynamic stack realignment. The stack alignment
1813 must be a multiple of 8-bits. If omitted, the natural stack
1814 alignment defaults to "unspecified", which does not prevent any
1815 alignment promotions.
1816``p[n]:<size>:<abi>:<pref>``
1817 This specifies the *size* of a pointer and its ``<abi>`` and
1818 ``<pref>``\erred alignments for address space ``n``. All sizes are in
Sean Silva706fba52015-08-06 22:56:24 +00001819 bits. The address space, ``n``, is optional, and if not specified,
Sean Silvaa1190322015-08-06 22:56:48 +00001820 denotes the default address space 0. The value of ``n`` must be
Rafael Espindolaabdd7262014-01-06 21:40:24 +00001821 in the range [1,2^23).
Sean Silvab084af42012-12-07 10:36:55 +00001822``i<size>:<abi>:<pref>``
1823 This specifies the alignment for an integer type of a given bit
1824 ``<size>``. The value of ``<size>`` must be in the range [1,2^23).
1825``v<size>:<abi>:<pref>``
1826 This specifies the alignment for a vector type of a given bit
1827 ``<size>``.
1828``f<size>:<abi>:<pref>``
1829 This specifies the alignment for a floating point type of a given bit
1830 ``<size>``. Only values of ``<size>`` that are supported by the target
1831 will work. 32 (float) and 64 (double) are supported on all targets; 80
1832 or 128 (different flavors of long double) are also supported on some
1833 targets.
Rafael Espindolaabdd7262014-01-06 21:40:24 +00001834``a:<abi>:<pref>``
1835 This specifies the alignment for an object of aggregate type.
``m:<mangling>``
    If present, specifies that LLVM names are mangled in the output. The
    options are:

    * ``e``: ELF mangling: Private symbols get a ``.L`` prefix.
    * ``m``: Mips mangling: Private symbols get a ``$`` prefix.
    * ``o``: Mach-O mangling: Private symbols get an ``L`` prefix. Other
      symbols get a ``_`` prefix.
    * ``w``: Windows COFF prefix: Similar to Mach-O, but stdcall and fastcall
      functions also get a suffix based on the frame size.
    * ``x``: Windows x86 COFF prefix: Similar to Windows COFF, but use a ``_``
      prefix for ``__cdecl`` functions.
Sean Silvab084af42012-12-07 10:36:55 +00001848``n<size1>:<size2>:<size3>...``
1849 This specifies a set of native integer widths for the target CPU in
1850 bits. For example, it might contain ``n32`` for 32-bit PowerPC,
1851 ``n32:64`` for PowerPC 64, or ``n8:16:32:64`` for X86-64. Elements of
1852 this set are considered to support most general arithmetic operations
1853 efficiently.
Sanjoy Dasc6af5ea2016-07-28 23:43:38 +00001854``ni:<address space0>:<address space1>:<address space2>...``
1855 This specifies pointer types with the specified address spaces
1856 as :ref:`Non-Integral Pointer Type <nointptrtype>` s. The ``0``
1857 address space cannot be specified as non-integral.
Sean Silvab084af42012-12-07 10:36:55 +00001858
Rafael Espindolaabdd7262014-01-06 21:40:24 +00001859On every specification that takes a ``<abi>:<pref>``, specifying the
1860``<pref>`` alignment is optional. If omitted, the preceding ``:``
1861should be omitted too and ``<pref>`` will be equal to ``<abi>``.
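
To make the notation concrete, here is a sketch of a complete layout string
of the kind commonly seen for x86-64 ELF targets. It is shown only as an
illustration of how the specifications combine; the string a given code
generator expects may differ.

.. code-block:: llvm

    ; e            little-endian
    ; m:e          ELF mangling
    ; i64:64       i64 has 64-bit ABI alignment; <pref> is omitted, so it
    ;              is also 64 bits
    ; f80:128      80-bit floating point values are 128-bit aligned
    ; n8:16:32:64  native integer widths are 8, 16, 32 and 64 bits
    ; S128         natural stack alignment is 128 bits
    target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"

Anything not mentioned in the string, such as the pointer size, falls back
to the default specifications.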
1862
Sean Silvab084af42012-12-07 10:36:55 +00001863When constructing the data layout for a given target, LLVM starts with a
1864default set of specifications which are then (possibly) overridden by
1865the specifications in the ``datalayout`` keyword. The default
1866specifications are given in this list:
1867
1868- ``E`` - big endian
Matt Arsenault24b49c42013-07-31 17:49:08 +00001869- ``p:64:64:64`` - 64-bit pointers with 64-bit alignment.
1870- ``p[n]:64:64:64`` - Other address spaces are assumed to be the
1871 same as the default address space.
Patrik Hagglunda832ab12013-01-30 09:02:06 +00001872- ``S0`` - natural stack alignment is unspecified
Sean Silvab084af42012-12-07 10:36:55 +00001873- ``i1:8:8`` - i1 is 8-bit (byte) aligned
1874- ``i8:8:8`` - i8 is 8-bit (byte) aligned
1875- ``i16:16:16`` - i16 is 16-bit aligned
1876- ``i32:32:32`` - i32 is 32-bit aligned
1877- ``i64:32:64`` - i64 has ABI alignment of 32-bits but preferred
1878 alignment of 64-bits
Patrik Hagglunda832ab12013-01-30 09:02:06 +00001879- ``f16:16:16`` - half is 16-bit aligned
Sean Silvab084af42012-12-07 10:36:55 +00001880- ``f32:32:32`` - float is 32-bit aligned
1881- ``f64:64:64`` - double is 64-bit aligned
Patrik Hagglunda832ab12013-01-30 09:02:06 +00001882- ``f128:128:128`` - quad is 128-bit aligned
Sean Silvab084af42012-12-07 10:36:55 +00001883- ``v64:64:64`` - 64-bit vector is 64-bit aligned
1884- ``v128:128:128`` - 128-bit vector is 128-bit aligned
Rafael Espindolae8f4d582013-12-12 17:21:51 +00001885- ``a:0:64`` - aggregates are 64-bit aligned
Sean Silvab084af42012-12-07 10:36:55 +00001886
1887When LLVM is determining the alignment for a given type, it uses the
1888following rules:
1889
1890#. If the type sought is an exact match for one of the specifications,
1891 that specification is used.
1892#. If no match is found, and the type sought is an integer type, then
1893 the smallest integer type that is larger than the bitwidth of the
1894 sought type is used. If none of the specifications are larger than
1895 the bitwidth then the largest integer type is used. For example,
1896 given the default specifications above, the i7 type will use the
1897 alignment of i8 (next largest) while both i65 and i256 will use the
1898 alignment of i64 (largest specified).
1899#. If no match is found, and the type sought is a vector type, then the
1900 largest vector type that is smaller than the sought vector type will
1901 be used as a fall back. This happens because <128 x double> can be
1902 implemented in terms of 64 <2 x double>, for example.
1903
1904The function of the data layout string may not be what you expect.
1905Notably, this is not a specification from the frontend of what alignment
1906the code generator should use.
1907
1908Instead, if specified, the target data layout is required to match what
1909the ultimate *code generator* expects. This string is used by the
1910mid-level optimizers to improve code, and this only works if it matches
Mehdi Amini4a121fa2015-03-14 22:04:06 +00001911what the ultimate code generator uses. There is no way to generate IR
1912that does not embed this target-specific detail into the IR. If you
1913don't specify the string, the default specifications will be used to
1914generate a Data Layout and the optimization phases will operate
1915accordingly and introduce target specificity into the IR with respect to
1916these default specifications.
Sean Silvab084af42012-12-07 10:36:55 +00001917
Bill Wendling5cc90842013-10-18 23:41:25 +00001918.. _langref_triple:
1919
1920Target Triple
1921-------------
1922
1923A module may specify a target triple string that describes the target
1924host. The syntax for the target triple is simply:
1925
.. code-block:: llvm

    target triple = "x86_64-apple-macosx10.7.0"

The *target triple* string consists of a series of identifiers delimited
by the minus sign character ('-'). The canonical forms are:

::

    ARCHITECTURE-VENDOR-OPERATING_SYSTEM
    ARCHITECTURE-VENDOR-OPERATING_SYSTEM-ENVIRONMENT
1937
1938This information is passed along to the backend so that it generates
1939code for the proper architecture. It's possible to override this on the
1940command line with the ``-mtriple`` command line option.
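
For comparison with the three-component example above, the following sketch
shows a triple in the four-component form. The particular triple is only an
illustration.

.. code-block:: llvm

    ; ARCHITECTURE     = x86_64
    ; VENDOR           = unknown
    ; OPERATING_SYSTEM = linux
    ; ENVIRONMENT      = gnu
    target triple = "x86_64-unknown-linux-gnu"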
1941
Sean Silvab084af42012-12-07 10:36:55 +00001942.. _pointeraliasing:
1943
1944Pointer Aliasing Rules
1945----------------------
1946
1947Any memory access must be done through a pointer value associated with
1948an address range of the memory access, otherwise the behavior is
1949undefined. Pointer values are associated with address ranges according
1950to the following rules:
1951
1952- A pointer value is associated with the addresses associated with any
1953 value it is *based* on.
1954- An address of a global variable is associated with the address range
1955 of the variable's storage.
1956- The result value of an allocation instruction is associated with the
1957 address range of the allocated storage.
1958- A null pointer in the default address-space is associated with no
1959 address.
1960- An integer constant other than zero or a pointer value returned from
1961 a function not defined within LLVM may be associated with address
1962 ranges allocated through mechanisms other than those provided by
1963 LLVM. Such ranges shall not overlap with any ranges of addresses
1964 allocated by mechanisms provided by LLVM.
1965
1966A pointer value is *based* on another pointer value according to the
1967following rules:
1968
1969- A pointer value formed from a ``getelementptr`` operation is *based*
David Blaikie16a97eb2015-03-04 22:02:58 +00001970 on the first value operand of the ``getelementptr``.
Sean Silvab084af42012-12-07 10:36:55 +00001971- The result value of a ``bitcast`` is *based* on the operand of the
1972 ``bitcast``.
1973- A pointer value formed by an ``inttoptr`` is *based* on all pointer
1974 values that contribute (directly or indirectly) to the computation of
1975 the pointer's value.
1976- The "*based* on" relationship is transitive.
1977
1978Note that this definition of *"based"* is intentionally similar to the
1979definition of *"based"* in C99, though it is slightly weaker.
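
The following sketch traces the *based* on relationship through a short
chain of instructions. The names ``@g`` and ``@read_third`` are hypothetical.
Because the relationship is transitive, ``%r`` is based on ``@g`` and is
therefore associated with the address range of that variable's storage, so
the ``load`` is a valid access.

.. code-block:: llvm

    @g = global [4 x i32] zeroinitializer

    define i32 @read_third() {
      ; %p is based on @g (first value operand of the getelementptr).
      %p = getelementptr [4 x i32], [4 x i32]* @g, i64 0, i64 2
      ; %q is based on %p, and %r is based on %q (bitcast operands).
      %q = bitcast i32* %p to i8*
      %r = bitcast i8* %q to i32*
      ; By transitivity %r is based on @g, and the accessed bytes lie
      ; within @g's storage.
      %v = load i32, i32* %r
      ret i32 %v
    }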
1980
1981LLVM IR does not associate types with memory. The result type of a
1982``load`` merely indicates the size and alignment of the memory from
1983which to load, as well as the interpretation of the value. The first
1984operand type of a ``store`` similarly only indicates the size and
1985alignment of the store.
1986
1987Consequently, type-based alias analysis, aka TBAA, aka
1988``-fstrict-aliasing``, is not applicable to general unadorned LLVM IR.
1989:ref:`Metadata <metadata>` may be used to encode additional information
1990which specialized optimization passes may use to implement type-based
1991alias analysis.
1992
1993.. _volatile:
1994
1995Volatile Memory Accesses
1996------------------------
1997
1998Certain memory accesses, such as :ref:`load <i_load>`'s,
1999:ref:`store <i_store>`'s, and :ref:`llvm.memcpy <int_memcpy>`'s may be
2000marked ``volatile``. The optimizers must not change the number of
2001volatile operations or change their order of execution relative to other
2002volatile operations. The optimizers *may* change the order of volatile
2003operations relative to non-volatile operations. This is not Java's
2004"volatile" and has no cross-thread synchronization behavior.
2005
Andrew Trick89fc5a62013-01-30 21:19:35 +00002006IR-level volatile loads and stores cannot safely be optimized into
2007llvm.memcpy or llvm.memmove intrinsics even when those intrinsics are
2008flagged volatile. Likewise, the backend should never split or merge
2009target-legal volatile load/store instructions.
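
A minimal sketch of what these rules mean in practice, using a hypothetical
external global ``@mmio`` and a hypothetical function ``@poll``: the two
volatile loads may not be merged into one, and neither may be moved past
the volatile store.

.. code-block:: llvm

    @mmio = external global i32

    define i32 @poll() {
      ; Two volatile loads of the same address: both must be performed,
      ; in this order, even though a non-volatile pair could be combined.
      %a = load volatile i32, i32* @mmio
      %b = load volatile i32, i32* @mmio
      ; The volatile store must stay after both volatile loads.
      store volatile i32 0, i32* @mmio
      %sum = add i32 %a, %b
      ret i32 %sum
    }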
2010
.. admonition:: Rationale

   Platforms may rely on volatile loads and stores of natively supported
   data width to be executed as a single instruction. For example, in C
   this holds for an l-value of volatile primitive type with native
   hardware support, but not necessarily for aggregate types. The
   frontend upholds these expectations, which are intentionally
   unspecified in the IR. The rules above ensure that IR transformations
   do not violate the frontend's contract with the language.
2020
Sean Silvab084af42012-12-07 10:36:55 +00002021.. _memmodel:
2022
2023Memory Model for Concurrent Operations
2024--------------------------------------
2025
2026The LLVM IR does not define any way to start parallel threads of
2027execution or to register signal handlers. Nonetheless, there are
2028platform-specific ways to create them, and we define LLVM IR's behavior
2029in their presence. This model is inspired by the C++0x memory model.
2030
2031For a more informal introduction to this model, see the :doc:`Atomics`.
2032
2033We define a *happens-before* partial order as the least partial order
2034that
2035
2036- Is a superset of single-thread program order, and
2037- When a *synchronizes-with* ``b``, includes an edge from ``a`` to
2038 ``b``. *Synchronizes-with* pairs are introduced by platform-specific
2039 techniques, like pthread locks, thread creation, thread joining,
2040 etc., and by atomic instructions. (See also :ref:`Atomic Memory Ordering
2041 Constraints <ordering>`).
2042
2043Note that program order does not introduce *happens-before* edges
2044between a thread and signals executing inside that thread.
2045
2046Every (defined) read operation (load instructions, memcpy, atomic
2047loads/read-modify-writes, etc.) R reads a series of bytes written by
2048(defined) write operations (store instructions, atomic
2049stores/read-modify-writes, memcpy, etc.). For the purposes of this
2050section, initialized globals are considered to have a write of the
2051initializer which is atomic and happens before any other read or write
2052of the memory in question. For each byte of a read R, R\ :sub:`byte`
2053may see any write to the same byte, except:
2054
2055- If write\ :sub:`1` happens before write\ :sub:`2`, and
2056 write\ :sub:`2` happens before R\ :sub:`byte`, then
2057 R\ :sub:`byte` does not see write\ :sub:`1`.
2058- If R\ :sub:`byte` happens before write\ :sub:`3`, then
2059 R\ :sub:`byte` does not see write\ :sub:`3`.
2060
2061Given that definition, R\ :sub:`byte` is defined as follows:
2062
2063- If R is volatile, the result is target-dependent. (Volatile is
2064 supposed to give guarantees which can support ``sig_atomic_t`` in
Richard Smith32dbdf62014-07-31 04:25:36 +00002065 C/C++, and may be used for accesses to addresses that do not behave
Sean Silvab084af42012-12-07 10:36:55 +00002066 like normal memory. It does not generally provide cross-thread
2067 synchronization.)
2068- Otherwise, if there is no write to the same byte that happens before
2069 R\ :sub:`byte`, R\ :sub:`byte` returns ``undef`` for that byte.
2070- Otherwise, if R\ :sub:`byte` may see exactly one write,
2071 R\ :sub:`byte` returns the value written by that write.
2072- Otherwise, if R is atomic, and all the writes R\ :sub:`byte` may
2073 see are atomic, it chooses one of the values written. See the :ref:`Atomic
2074 Memory Ordering Constraints <ordering>` section for additional
2075 constraints on how the choice is made.
2076- Otherwise R\ :sub:`byte` returns ``undef``.
2077
2078R returns the value composed of the series of bytes it read. This
2079implies that some bytes within the value may be ``undef`` **without**
2080the entire value being ``undef``. Note that this only defines the
2081semantics of the operation; it doesn't mean that targets will emit more
2082than one instruction to read the series of bytes.
2083
2084Note that in cases where none of the atomic intrinsics are used, this
2085model places only one restriction on IR transformations on top of what
2086is required for single-threaded execution: introducing a store to a byte
2087which might not otherwise be stored is not allowed in general.
2088(Specifically, in the case where another thread might write to and read
2089from an address, introducing a store can change a load that may see
2090exactly one write into a load that may see multiple writes.)
2091
2092.. _ordering:
2093
2094Atomic Memory Ordering Constraints
2095----------------------------------
2096
2097Atomic instructions (:ref:`cmpxchg <i_cmpxchg>`,
2098:ref:`atomicrmw <i_atomicrmw>`, :ref:`fence <i_fence>`,
2099:ref:`atomic load <i_load>`, and :ref:`atomic store <i_store>`) take
Tim Northovere94a5182014-03-11 10:48:52 +00002100ordering parameters that determine which other atomic instructions on
Sean Silvab084af42012-12-07 10:36:55 +00002101the same address they *synchronize with*. These semantics are borrowed
2102from Java and C++0x, but are somewhat more colloquial. If these
2103descriptions aren't precise enough, check those specs (see spec
2104references in the :doc:`atomics guide <Atomics>`).
2105:ref:`fence <i_fence>` instructions treat these orderings somewhat
2106differently since they don't take an address. See that instruction's
2107documentation for details.
2108
2109For a simpler introduction to the ordering constraints, see the
2110:doc:`Atomics`.
2111
2112``unordered``
2113 The set of values that can be read is governed by the happens-before
2114 partial order. A value cannot be read unless some operation wrote
2115 it. This is intended to provide a guarantee strong enough to model
2116 Java's non-volatile shared variables. This ordering cannot be
2117 specified for read-modify-write operations; it is not strong enough
2118 to make them atomic in any interesting way.
2119``monotonic``
2120 In addition to the guarantees of ``unordered``, there is a single
2121 total order for modifications by ``monotonic`` operations on each
2122 address. All modification orders must be compatible with the
2123 happens-before order. There is no guarantee that the modification
2124 orders can be combined to a global total order for the whole program
2125 (and this often will not be possible). The read in an atomic
2126 read-modify-write operation (:ref:`cmpxchg <i_cmpxchg>` and
2127 :ref:`atomicrmw <i_atomicrmw>`) reads the value in the modification
2128 order immediately before the value it writes. If one atomic read
2129 happens before another atomic read of the same address, the later
2130 read must see the same value or a later value in the address's
2131 modification order. This disallows reordering of ``monotonic`` (or
2132 stronger) operations on the same address. If an address is written
2133 ``monotonic``-ally by one thread, and other threads ``monotonic``-ally
2134 read that address repeatedly, the other threads must eventually see
2135 the write. This corresponds to the C++0x/C1x
2136 ``memory_order_relaxed``.
2137``acquire``
2138 In addition to the guarantees of ``monotonic``, a
2139 *synchronizes-with* edge may be formed with a ``release`` operation.
2140 This is intended to model C++'s ``memory_order_acquire``.
2141``release``
2142 In addition to the guarantees of ``monotonic``, if this operation
2143 writes a value which is subsequently read by an ``acquire``
2144 operation, it *synchronizes-with* that operation. (This isn't a
2145 complete description; see the C++0x definition of a release
2146 sequence.) This corresponds to the C++0x/C1x
2147 ``memory_order_release``.
2148``acq_rel`` (acquire+release)
2149 Acts as both an ``acquire`` and ``release`` operation on its
2150 address. This corresponds to the C++0x/C1x ``memory_order_acq_rel``.
2151``seq_cst`` (sequentially consistent)
2152 In addition to the guarantees of ``acq_rel`` (``acquire`` for an
Richard Smith32dbdf62014-07-31 04:25:36 +00002153 operation that only reads, ``release`` for an operation that only
Sean Silvab084af42012-12-07 10:36:55 +00002154 writes), there is a global total order on all
2155 sequentially-consistent operations on all addresses, which is
2156 consistent with the *happens-before* partial order and with the
2157 modification orders of all the affected addresses. Each
2158 sequentially-consistent read sees the last preceding write to the
2159 same address in this global order. This corresponds to the C++0x/C1x
2160 ``memory_order_seq_cst`` and Java volatile.
2161
2162.. _singlethread:
2163
2164If an atomic operation is marked ``singlethread``, it only *synchronizes
2165with* or participates in modification and seq\_cst total orderings with
2166other operations running in the same thread (for example, in signal
2167handlers).
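
As an illustration of how ``release`` and ``acquire`` pair up, here is a
small message-passing sketch. The globals ``@data`` and ``@flag`` and the
functions ``@producer`` and ``@consumer`` are hypothetical, and the two
functions are assumed to run on different threads. Once the ``acquire`` load
reads the value written by the ``release`` store, the two operations
*synchronize with* each other, so the plain store to ``@data`` happens before
the plain load of ``@data``.

.. code-block:: llvm

    @data = global i32 0
    @flag = global i32 0

    define void @producer() {
      ; Ordinary (non-atomic) write of the payload.
      store i32 42, i32* @data
      ; Publish: release store to the flag.
      store atomic i32 1, i32* @flag release, align 4
      ret void
    }

    define i32 @consumer() {
    entry:
      br label %spin

    spin:
      ; Acquire load; loops until it observes the release store.
      %f = load atomic i32, i32* @flag acquire, align 4
      %ready = icmp eq i32 %f, 1
      br i1 %ready, label %done, label %spin

    done:
      ; Ordered after the store to @data via the synchronizes-with edge.
      %v = load i32, i32* @data
      ret i32 %v
    }

If both atomic operations used ``monotonic`` instead, no *synchronizes-with*
edge would be formed and the final load would race with the store to
``@data``.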
2168
2169.. _fastmath:
2170
2171Fast-Math Flags
2172---------------
2173
LLVM IR floating-point binary ops (:ref:`fadd <i_fadd>`,
:ref:`fsub <i_fsub>`, :ref:`fmul <i_fmul>`, :ref:`fdiv <i_fdiv>`,
:ref:`frem <i_frem>`, :ref:`fcmp <i_fcmp>`) have the following flags that can
be set to enable otherwise unsafe floating-point transformations:
Sean Silvab084af42012-12-07 10:36:55 +00002178
2179``nnan``
2180 No NaNs - Allow optimizations to assume the arguments and result are not
2181 NaN. Such optimizations are required to retain defined behavior over
2182 NaNs, but the value of the result is undefined.
2183
2184``ninf``
2185 No Infs - Allow optimizations to assume the arguments and result are not
2186 +/-Inf. Such optimizations are required to retain defined behavior over
2187 +/-Inf, but the value of the result is undefined.
2188
2189``nsz``
2190 No Signed Zeros - Allow optimizations to treat the sign of a zero
2191 argument or result as insignificant.
2192
2193``arcp``
2194 Allow Reciprocal - Allow optimizations to use the reciprocal of an
2195 argument rather than perform division.
2196
2197``fast``
2198 Fast - Allow algebraically equivalent transformations that may
2199 dramatically change results in floating point (e.g. reassociate). This
2200 flag implies all the others.
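
The flags are written between the opcode and the type. The following sketch
(the function name ``@fm`` is hypothetical) shows several flags on one
instruction, a single flag, and ``fast``:

.. code-block:: llvm

    define float @fm(float %a, float %b) {
      ; The optimizer may assume no NaNs and no infinities here.
      %s = fadd nnan ninf float %a, %b
      ; The division may be rewritten as a multiply by the reciprocal.
      %q = fdiv arcp float %s, %b
      ; 'fast' implies nnan, ninf, nsz and arcp, and allows reassociation.
      %r = fmul fast float %q, %a
      ret float %r
    }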
2201
Duncan P. N. Exon Smith0a448fb2014-08-19 21:30:15 +00002202.. _uselistorder:
2203
2204Use-list Order Directives
2205-------------------------
2206
2207Use-list directives encode the in-memory order of each use-list, allowing the
Sean Silvaa1190322015-08-06 22:56:48 +00002208order to be recreated. ``<order-indexes>`` is a comma-separated list of
2209indexes that are assigned to the referenced value's uses. The referenced
Duncan P. N. Exon Smith0a448fb2014-08-19 21:30:15 +00002210value's use-list is immediately sorted by these indexes.
2211
Sean Silvaa1190322015-08-06 22:56:48 +00002212Use-list directives may appear at function scope or global scope. They are not
2213instructions, and have no effect on the semantics of the IR. When they're at
Duncan P. N. Exon Smith0a448fb2014-08-19 21:30:15 +00002214function scope, they must appear after the terminator of the final basic block.
2215
2216If basic blocks have their address taken via ``blockaddress()`` expressions,
2217``uselistorder_bb`` can be used to reorder their use-lists from outside their
2218function's scope.
2219
:Syntax:

::

    uselistorder <ty> <value>, { <order-indexes> }
    uselistorder_bb @function, %block, { <order-indexes> }

:Examples:

::

    define void @foo(i32 %arg1, i32 %arg2) {
    entry:
      ; ... instructions ...
    bb:
      ; ... instructions ...

      ; At function scope.
      uselistorder i32 %arg1, { 1, 0, 2 }
      uselistorder label %bb, { 1, 0 }
    }

    ; At global scope.
    uselistorder i32* @global, { 1, 2, 0 }
    uselistorder i32 7, { 1, 0 }
    uselistorder i32 (i32) @bar, { 1, 0 }
    uselistorder_bb @foo, %bb, { 5, 1, 3, 2, 0, 4 }
2247
.. _source_filename:

Source Filename
---------------

The *source filename* string is set to the original module identifier,
which will be the name of the compiled source file when compiling from
source through the clang front end, for example. It is then preserved through
the IR and bitcode.

This is currently necessary to generate a consistent unique global
identifier for local functions used in profile data, which prepends the
source file name to the local function name.

The syntax for the source file name is simply:

.. code-block:: text

    source_filename = "/path/to/source.c"

.. _typesystem:

Type System
===========

The LLVM type system is one of the most important features of the
intermediate representation. Being typed enables a number of
optimizations to be performed on the intermediate representation
directly, without having to do extra analyses on the side before the
transformation. A strong type system makes it easier to read the
generated code and enables novel analyses and transformations that are
not feasible to perform on normal three address code representations.

.. _t_void:

Void Type
---------

:Overview:

The void type does not represent any value and has no size.

:Syntax:

::

    void

.. _t_function:

Function Type
-------------

:Overview:

The function type can be thought of as a function signature. It consists of a
return type and a list of formal parameter types. The return type of a function
type is a void type or first class type --- except for :ref:`label <t_label>`
and :ref:`metadata <t_metadata>` types.

:Syntax:

::

    <returntype> (<parameter list>)

...where '``<parameter list>``' is a comma-separated list of type
specifiers. Optionally, the parameter list may include a type ``...``, which
indicates that the function takes a variable number of arguments. Variable
argument functions can access their arguments with the :ref:`variable argument
handling intrinsic <int_varargs>` functions. '``<returntype>``' is any type
except :ref:`label <t_label>` and :ref:`metadata <t_metadata>`.

:Examples:

.. list-table::

   * - ``i32 (i32)``
     - Function taking an ``i32``, returning an ``i32``.

   * - ``float (i16, i32 *) *``
     - :ref:`Pointer <t_pointer>` to a function that takes an ``i16`` and a
       :ref:`pointer <t_pointer>` to ``i32``, returning ``float``.

   * - ``i32 (i8*, ...)``
     - A vararg function that takes at least one :ref:`pointer <t_pointer>`
       to ``i8`` (char in C), which returns an integer. This is the signature
       for ``printf`` in LLVM.

   * - ``{i32, i32} (i32)``
     - A function taking an ``i32``, returning a :ref:`structure <t_struct>`
       containing two ``i32`` values.

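As a minimal sketch of how a variadic function type appears in practice,
the module below declares ``printf`` with the type ``i32 (i8*, ...)`` and
calls it. The names ``@.fmt`` and ``@main`` are illustrative assumptions,
and the typed-pointer syntax matches the rest of this document.

.. code-block:: llvm

    @.fmt = private unnamed_addr constant [4 x i8] c"%d\0A\00"

    declare i32 @printf(i8*, ...)

    define i32 @main() {
    entry:
      %fmt = getelementptr inbounds [4 x i8], [4 x i8]* @.fmt, i64 0, i64 0
      ; The callee's function type is spelled out because it is variadic.
      %n = call i32 (i8*, ...) @printf(i8* %fmt, i32 42)
      ret i32 0
    }
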
.. _t_firstclass:

First Class Types
-----------------

The :ref:`first class <t_firstclass>` types are perhaps the most important.
Values of these types are the only ones which can be produced by
instructions.

.. _t_single_value:

Single Value Types
^^^^^^^^^^^^^^^^^^

These are the types that are valid in registers from CodeGen's perspective.

.. _t_integer:

Integer Type
""""""""""""

:Overview:

The integer type is a very simple type that specifies an arbitrary bit
width for the integer type desired. Any bit width from 1 bit to
2\ :sup:`23`\ -1 (about 8 million) can be specified.

:Syntax:

::

    iN

The number of bits the integer will occupy is specified by the ``N``
value.

:Examples:

.. list-table::

   * - ``i1``
     - a single-bit integer.

   * - ``i32``
     - a 32-bit integer.

   * - ``i1942652``
     - a really big integer of over 1 million bits.

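Because ``N`` is arbitrary, widths that do not correspond to a machine
register are also legal. The following is a small sketch (the function
name ``@widen_sum`` is hypothetical) that adds two 7-bit integers and
widens the result:

.. code-block:: llvm

    define i32 @widen_sum(i7 %a, i7 %b) {
    entry:
      %s = add i7 %a, %b           ; 7-bit addition
      %wide = zext i7 %s to i32    ; zero-extend the 7-bit result to 32 bits
      ret i32 %wide
    }
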
.. _t_floating:

Floating Point Types
""""""""""""""""""""

.. list-table::
   :header-rows: 1

   * - Type
     - Description

   * - ``half``
     - 16-bit floating point value

   * - ``float``
     - 32-bit floating point value

   * - ``double``
     - 64-bit floating point value

   * - ``fp128``
     - 128-bit floating point value (112-bit mantissa)

   * - ``x86_fp80``
     - 80-bit floating point value (X87)

   * - ``ppc_fp128``
     - 128-bit floating point value (two 64-bits)

X86_mmx Type
""""""""""""

:Overview:

The x86_mmx type represents a value held in an MMX register on an x86
machine. The operations allowed on it are quite limited: parameters and
return values, load and store, and bitcast. User-specified MMX
instructions are represented as intrinsic or asm calls with arguments
and/or results of this type. There are no arrays, vectors or constants
of this type.

:Syntax:

::

    x86_mmx

.. _t_pointer:

Pointer Type
""""""""""""

:Overview:

The pointer type is used to specify memory locations. Pointers are
commonly used to reference objects in memory.

Pointer types may have an optional address space attribute defining the
numbered address space where the pointed-to object resides. The default
address space is number zero. The semantics of non-zero address spaces
are target-specific.

Note that LLVM does not permit pointers to void (``void*``) nor does it
permit pointers to labels (``label*``). Use ``i8*`` instead.

:Syntax:

::

    <type> *

:Examples:

.. list-table::

   * - ``[4 x i32]*``
     - A :ref:`pointer <t_pointer>` to :ref:`array <t_array>` of four ``i32``
       values.

   * - ``i32 (i32*) *``
     - A :ref:`pointer <t_pointer>` to a :ref:`function <t_function>` that
       takes an ``i32*``, returning an ``i32``.

   * - ``i32 addrspace(5)*``
     - A :ref:`pointer <t_pointer>` to an ``i32`` value that resides in
       address space #5.

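The sketch below shows pointers in the default address space and in a
numbered address space used side by side. The function name
``@read_both`` is hypothetical, and what address space 5 actually means
depends on the target.

.. code-block:: llvm

    define i32 @read_both(i32* %p, i32 addrspace(5)* %q) {
    entry:
      %a = load i32, i32* %p                 ; address space 0 (default)
      %b = load i32, i32 addrspace(5)* %q    ; address space 5
      %sum = add i32 %a, %b
      ret i32 %sum
    }
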
.. _t_vector:

Vector Type
"""""""""""

:Overview:

A vector type is a simple derived type that represents a vector of
elements. Vector types are used when multiple primitive data are
operated on in parallel using a single instruction (SIMD). A vector type
requires a size (number of elements) and an underlying primitive data
type. Vector types are considered :ref:`first class <t_firstclass>`.

:Syntax:

::

    < <# elements> x <elementtype> >

The number of elements is a constant integer value larger than 0;
elementtype may be any integer, floating point or pointer type. Vectors
of size zero are not allowed.

:Examples:

.. list-table::

   * - ``<4 x i32>``
     - Vector of 4 32-bit integer values.

   * - ``<8 x float>``
     - Vector of 8 32-bit floating-point values.

   * - ``<2 x i64>``
     - Vector of 2 64-bit integer values.

   * - ``<4 x i64*>``
     - Vector of 4 pointers to 64-bit integer values.

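Ordinary arithmetic instructions accept vector operands and operate
element-wise. A minimal sketch, with the hypothetical function name
``@vadd``:

.. code-block:: llvm

    define <4 x i32> @vadd(<4 x i32> %a, <4 x i32> %b) {
    entry:
      ; One instruction adds all four lanes.
      %sum = add <4 x i32> %a, %b
      ret <4 x i32> %sum
    }
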
.. _t_label:

Label Type
^^^^^^^^^^

:Overview:

The label type represents code labels.

:Syntax:

::

    label

.. _t_token:

Token Type
^^^^^^^^^^

:Overview:

The token type is used when a value is associated with an instruction
but all uses of the value must not attempt to introspect or obscure it.
As such, it is not appropriate to have a :ref:`phi <i_phi>` or
:ref:`select <i_select>` of type token.

:Syntax:

::

    token

.. _t_metadata:

Metadata Type
^^^^^^^^^^^^^

:Overview:

The metadata type represents embedded metadata. No derived types may be
created from metadata except for :ref:`function <t_function>` arguments.

:Syntax:

::

    metadata

.. _t_aggregate:

Aggregate Types
^^^^^^^^^^^^^^^

Aggregate Types are a subset of derived types that can contain multiple
member types. :ref:`Arrays <t_array>` and :ref:`structs <t_struct>` are
aggregate types. :ref:`Vectors <t_vector>` are not considered to be
aggregate types.

.. _t_array:

Array Type
""""""""""

:Overview:

The array type is a very simple derived type that arranges elements
sequentially in memory. The array type requires a size (number of
elements) and an underlying data type.

:Syntax:

::

    [<# elements> x <elementtype>]

The number of elements is a constant integer value; ``elementtype`` may
be any type with a size.

:Examples:

.. list-table::

   * - ``[40 x i32]``
     - Array of 40 32-bit integer values.

   * - ``[41 x i32]``
     - Array of 41 32-bit integer values.

   * - ``[4 x i8]``
     - Array of 4 8-bit integer values.

Here are some examples of multidimensional arrays:

.. list-table::

   * - ``[3 x [4 x i32]]``
     - 3x4 array of 32-bit integer values.

   * - ``[12 x [10 x float]]``
     - 12x10 array of single precision floating point values.

   * - ``[2 x [3 x [4 x i16]]]``
     - 2x3x4 array of 16-bit integer values.

There is no restriction on indexing beyond the end of the array implied
by a static type (though there are restrictions on indexing beyond the
bounds of an allocated object in some cases). This means that
single-dimension 'variable sized array' addressing can be implemented in
LLVM with a zero length array type. An implementation of 'pascal style
arrays' in LLVM could use the type "``{ i32, [0 x float]}``", for
example.

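The 'pascal style array' idea can be sketched as follows. The type name
``%pascal_array`` and the function ``@get`` are assumptions made for
illustration; the caller is assumed to have allocated enough storage
behind the zero length array.

.. code-block:: llvm

    %pascal_array = type { i32, [0 x float] }

    define float @get(%pascal_array* %arr, i64 %i) {
    entry:
      ; Field 1 is the zero length array; index %i steps past its static bound.
      %elt = getelementptr %pascal_array, %pascal_array* %arr, i64 0, i32 1, i64 %i
      %val = load float, float* %elt
      ret float %val
    }
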
.. _t_struct:

Structure Type
""""""""""""""

:Overview:

The structure type is used to represent a collection of data members
together in memory. The elements of a structure may be any type that has
a size.

Structures in memory are accessed using '``load``' and '``store``' by
getting a pointer to a field with the '``getelementptr``' instruction.
Structures in registers are accessed using the '``extractvalue``' and
'``insertvalue``' instructions.

Structures may optionally be "packed" structures, which indicate that
the alignment of the struct is one byte, and that there is no padding
between the elements. In non-packed structs, padding between field types
is inserted as defined by the DataLayout string in the module, which is
required to match what the underlying code generator expects.

Structures can either be "literal" or "identified". A literal structure
is defined inline with other types (e.g. ``{i32, i32}*``) whereas
identified types are always defined at the top level with a name.
Literal types are uniqued by their contents and can never be recursive
or opaque since there is no way to write one. Identified types can be
recursive, can be opaque, and are never uniqued.

:Syntax:

::

    %T1 = type { <type list> }     ; Identified normal struct type
    %T2 = type <{ <type list> }>   ; Identified packed struct type

:Examples:

.. list-table::

   * - ``{ i32, i32, i32 }``
     - A triple of three ``i32`` values

   * - ``{ float, i32 (i32) * }``
     - A pair, where the first element is a ``float`` and the second element
       is a :ref:`pointer <t_pointer>` to a :ref:`function <t_function>` that
       takes an ``i32``, returning an ``i32``.

   * - ``<{ i8, i32 }>``
     - A packed struct known to be 5 bytes in size.

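The two access styles described above can be sketched side by side. The
type names ``%pair`` and ``%node`` and both function names are
hypothetical.

.. code-block:: llvm

    %pair = type { i32, float }
    %node = type { i32, %node* }    ; identified types may be recursive

    ; Structure in memory: compute the field address, then load.
    define float @second_in_memory(%pair* %p) {
    entry:
      %field = getelementptr inbounds %pair, %pair* %p, i32 0, i32 1
      %v = load float, float* %field
      ret float %v
    }

    ; Structure in a register: insertvalue yields an updated copy.
    define %pair @set_first(%pair %agg, i32 %x) {
    entry:
      %updated = insertvalue %pair %agg, i32 %x, 0
      ret %pair %updated
    }
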
.. _t_opaque:

Opaque Structure Types
""""""""""""""""""""""

:Overview:

Opaque structure types are used to represent named structure types that
do not have a body specified. This corresponds (for example) to the C
notion of a forward declared structure.

:Syntax:

::

    %X = type opaque
    %52 = type opaque

:Examples:

.. list-table::

   * - ``opaque``
     - An opaque type.

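A typical use, sketched here with hypothetical names, is to pass around
pointers to a type whose layout is defined in another module, much like
a forward declared ``struct`` in C:

.. code-block:: llvm

    %handle = type opaque

    ; Only pointers to %handle are used; its body is never needed here.
    declare %handle* @open_handle()
    declare void @close_handle(%handle*)
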
.. _constants:

Constants
=========

LLVM has several different basic types of constants. This section
describes them all and their syntax.

Simple Constants
----------------

**Boolean constants**
    The two strings '``true``' and '``false``' are both valid constants
    of the ``i1`` type.
**Integer constants**
    Standard integers (such as '4') are constants of the
    :ref:`integer <t_integer>` type. Negative numbers may be used with
    integer types.
**Floating point constants**
    Floating point constants use standard decimal notation (e.g.
    123.421), exponential notation (e.g. 1.23421e+2), or a more precise
    hexadecimal notation (see below). The assembler requires the exact
    decimal value of a floating-point constant. For example, the
    assembler accepts 1.25 but rejects 1.3 because 1.3 is a repeating
    decimal in binary. Floating point constants must have a :ref:`floating
    point <t_floating>` type.
**Null pointer constants**
    The identifier '``null``' is recognized as a null pointer constant
    and must be of :ref:`pointer type <t_pointer>`.
**Token constants**
    The identifier '``none``' is recognized as an empty token constant
    and must be of :ref:`token type <t_token>`.

The one non-intuitive notation for constants is the hexadecimal form of
floating point constants. For example, the form
'``double 0x432ff973cafa8000``' is equivalent to (but harder to read
than) '``double 4.5e+15``'. The only time hexadecimal floating point
constants are required (and the only time that they are generated by the
disassembler) is when a floating point constant must be emitted but it
cannot be represented as a decimal floating point number in a reasonable
number of digits. For example, NaN's, infinities, and other special
values are represented in their IEEE hexadecimal format so that assembly
and disassembly do not cause any bits to change in the constants.

When using the hexadecimal form, constants of types half, float, and
double are represented using the 16-digit form shown above (which
matches the IEEE 754 representation for double); half and float values
must, however, be exactly representable as IEEE 754 half and single
precision, respectively. Hexadecimal format is always used for long
double, and there are three forms of long double. The 80-bit format used
by x86 is represented as ``0xK`` followed by 20 hexadecimal digits. The
128-bit format used by PowerPC (two adjacent doubles) is represented by
``0xM`` followed by 32 hexadecimal digits. The IEEE 128-bit format is
represented by ``0xL`` followed by 32 hexadecimal digits. Long doubles
will only work if they match the long double format on your target.
The IEEE 16-bit format (half precision) is represented by ``0xH``
followed by 4 hexadecimal digits. All hexadecimal formats are big-endian
(sign bit at the left).

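The notations above can be compared in a few global definitions. This is
only a sketch, and the global names are hypothetical.

.. code-block:: llvm

    @dec  = global double 4.5e+15
    @hex  = global double 0x432ff973cafa8000   ; same value as @dec
    ; A float constant in the 16-digit form: +infinity, which is exactly
    ; representable in single precision.
    @finf = global float 0x7FF0000000000000
    ; The 0xH form carries 4 hexadecimal digits: 1.0 in half precision.
    @hone = global half 0xH3C00
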
There are no constants of type x86_mmx.

.. _complexconstants:

Complex Constants
-----------------

Complex constants are a (potentially recursive) combination of simple
constants and smaller complex constants.

**Structure constants**
    Structure constants are represented with notation similar to
    structure type definitions (a comma separated list of elements,
    surrounded by braces (``{}``)). For example:
    "``{ i32 4, float 17.0, i32* @G }``", where "``@G``" is declared as
    "``@G = external global i32``". Structure constants must have
    :ref:`structure type <t_struct>`, and the number and types of elements
    must match those specified by the type.
**Array constants**
    Array constants are represented with notation similar to array type
    definitions (a comma separated list of elements, surrounded by
    square brackets (``[]``)). For example:
    "``[ i32 42, i32 11, i32 74 ]``". Array constants must have
    :ref:`array type <t_array>`, and the number and types of elements must
    match those specified by the type. As a special case, character array
    constants may also be represented as a double-quoted string using the ``c``
    prefix. For example: "``c"Hello World\0A\00"``".
**Vector constants**
    Vector constants are represented with notation similar to vector
    type definitions (a comma separated list of elements, surrounded by
    less-than/greater-than's (``<>``)). For example:
    "``< i32 42, i32 11, i32 74, i32 100 >``". Vector constants
    must have :ref:`vector type <t_vector>`, and the number and types of
    elements must match those specified by the type.
**Zero initialization**
    The string '``zeroinitializer``' can be used to zero initialize a
    value of *any* type, including scalar and
    :ref:`aggregate <t_aggregate>` types. This is often used to avoid
    having to print large zero initializers (e.g. for large arrays) and
    is always exactly equivalent to using explicit zero initializers.
**Metadata node**
    A metadata node is a constant tuple without types. For example:
    "``!{!0, !{!2, !0}, !"test"}``". Metadata can reference constant values,
    for example: "``!{!0, i32 0, i8* @global, i64 (i64)* @function, !"str"}``".
    Unlike other typed constants that are meant to be interpreted as part of
    the instruction stream, metadata is a place to attach additional
    information such as debug info.

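As a sketch, the structure, array, vector and zero initializer forms
described above can be used as global initializers. The global names
other than ``@G`` are hypothetical.

.. code-block:: llvm

    @G = external global i32

    @s     = global { i32, float, i32* } { i32 4, float 17.0, i32* @G }
    @arr   = global [3 x i32] [ i32 42, i32 11, i32 74 ]
    ; 11 characters, then a newline (\0A) and a NUL (\00): 13 bytes.
    @str   = constant [13 x i8] c"Hello World\0A\00"
    @vec   = global <4 x i32> < i32 42, i32 11, i32 74, i32 100 >
    @zeros = global [1024 x i32] zeroinitializer
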
Global Variable and Function Addresses
--------------------------------------

The addresses of :ref:`global variables <globalvars>` and
:ref:`functions <functionstructure>` are always implicitly valid
(link-time) constants. These constants are explicitly referenced when
the :ref:`identifier for the global <identifiers>` is used and always have
:ref:`pointer <t_pointer>` type. For example, the following is a legal LLVM
file:

.. code-block:: llvm

    @X = global i32 17
    @Y = global i32 42
    @Z = global [2 x i32*] [ i32* @X, i32* @Y ]

.. _undefvalues:

Undefined Values
----------------

The string '``undef``' can be used anywhere a constant is expected, and
indicates that the user of the value may receive an unspecified
bit-pattern. Undefined values may be of any type (other than '``label``'
or '``void``') and be used anywhere a constant is permitted.

Undefined values are useful because they indicate to the compiler that
the program is well defined no matter what value is used. This gives the
compiler more freedom to optimize. Here are some examples of
(potentially surprising) transformations that are valid (in pseudo IR):

.. code-block:: llvm

      %A = add %X, undef
      %B = sub %X, undef
      %C = xor %X, undef
    Safe:
      %A = undef
      %B = undef
      %C = undef

This is safe because all of the output bits are affected by the undef
bits. Any output bit can have a zero or one depending on the input bits.

.. code-block:: llvm

      %A = or %X, undef
      %B = and %X, undef
    Safe:
      %A = -1
      %B = 0
    Unsafe:
      %A = undef
      %B = undef

These logical operations have bits that are not always affected by the
input. For example, if ``%X`` has a zero bit, then the output of the
'``and``' operation will always be a zero for that bit, no matter what
the corresponding bit from the '``undef``' is. As such, it is unsafe to
optimize or assume that the result of the '``and``' is '``undef``'.
However, it is safe to assume that all bits of the '``undef``' could be
0, and optimize the '``and``' to 0. Likewise, it is safe to assume that
all the bits of the '``undef``' operand to the '``or``' could be set,
allowing the '``or``' to be folded to -1.

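For reference, the pseudo IR above omits types. A fully typed sketch of
the '``and``' and '``or``' cases might look like this (the function
names are hypothetical):

.. code-block:: llvm

    define i32 @and_with_undef(i32 %x) {
    entry:
      %b = and i32 %x, undef    ; may be folded to 0, but not to undef
      ret i32 %b
    }

    define i32 @or_with_undef(i32 %x) {
    entry:
      %a = or i32 %x, undef     ; may be folded to -1, but not to undef
      ret i32 %a
    }
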
.. code-block:: llvm

      %A = select undef, %X, %Y
      %B = select undef, 42, %Y
      %C = select %X, %Y, undef
    Safe:
      %A = %X     (or %Y)
      %B = 42     (or %Y)
      %C = %Y
    Unsafe:
      %A = undef
      %B = undef
      %C = undef

This set of examples shows that undefined '``select``' (and conditional
branch) conditions can go *either way*, but they have to come from one
of the two operands. In the ``%A`` example, if ``%X`` and ``%Y`` were
both known to have a clear low bit, then ``%A`` would have to have a
cleared low bit. However, in the ``%C`` example, the optimizer is
allowed to assume that the '``undef``' operand could be the same as
``%Y``, allowing the whole '``select``' to be eliminated.

.. code-block:: text

      %A = xor undef, undef

      %B = undef
      %C = xor %B, %B

      %D = undef
      %E = icmp slt %D, 4
      %F = icmp sge %D, 4

    Safe:
      %A = undef
      %B = undef
      %C = undef
      %D = undef
      %E = undef
      %F = undef

This example points out that two '``undef``' operands are not
necessarily the same. This can be surprising to people who assume that
"``X^X``" is always zero, even if ``X`` is undefined (the behavior also
matches C semantics). This isn't true for a number of reasons, but the
short answer is that an '``undef``' "variable" can arbitrarily change
its value over its "live range". This is true because the variable
doesn't actually *have a live range*. Instead, the value is logically
read from arbitrary registers that happen to be around when needed, so
the value is not necessarily consistent over time. In fact, ``%A`` and
``%C`` need to have the same semantics or the core LLVM "replace all
uses with" concept would not hold.

.. code-block:: llvm

      %A = fdiv undef, %X
      %B = fdiv %X, undef
    Safe:
      %A = undef
    b: unreachable

These examples show the crucial difference between an *undefined value*
and *undefined behavior*. An undefined value (like '``undef``') is
allowed to have an arbitrary bit-pattern. This means that the ``%A``
operation can be constant folded to '``undef``', because the '``undef``'
could be an SNaN, and ``fdiv`` is not (currently) defined on SNaN's.
However, in the second example, we can make a more aggressive
assumption: because the ``undef`` is allowed to be an arbitrary value,
we are allowed to assume that it could be zero. Since a divide by zero
has *undefined behavior*, we are allowed to assume that the operation
does not execute at all. This allows us to delete the divide and all
code after it. Because the undefined operation "can't happen", the
optimizer can assume that it occurs in dead code.

.. code-block:: text

      a: store undef -> %X
      b: store %X -> undef
    Safe:
      a: <deleted>
      b: unreachable
2933
These examples reiterate the ``fdiv`` example: a store *of* an undefined
value can be assumed to not have any effect; we can assume that the
value is overwritten with bits that happen to match what was already
there. However, a store *to* an undefined location could clobber
arbitrary memory; therefore, it has undefined behavior.
2939
2940.. _poisonvalues:
2941
2942Poison Values
2943-------------
2944
Poison values are similar to :ref:`undef values <undefvalues>`; however,
they also represent the fact that an instruction or constant expression
that cannot evoke side effects has nevertheless detected a condition
that results in undefined behavior.

There is currently no way of representing a poison value in the IR; they
only exist when produced by operations such as :ref:`add <i_add>` with
the ``nsw`` flag.
2953
2954Poison value behavior is defined in terms of value *dependence*:
2955
2956- Values other than :ref:`phi <i_phi>` nodes depend on their operands.
2957- :ref:`Phi <i_phi>` nodes depend on the operand corresponding to
2958 their dynamic predecessor basic block.
2959- Function arguments depend on the corresponding actual argument values
2960 in the dynamic callers of their functions.
2961- :ref:`Call <i_call>` instructions depend on the :ref:`ret <i_ret>`
2962 instructions that dynamically transfer control back to them.
2963- :ref:`Invoke <i_invoke>` instructions depend on the
2964 :ref:`ret <i_ret>`, :ref:`resume <i_resume>`, or exception-throwing
2965 call instructions that dynamically transfer control back to them.
2966- Non-volatile loads and stores depend on the most recent stores to all
2967 of the referenced memory addresses, following the order in the IR
2968 (including loads and stores implied by intrinsics such as
2969 :ref:`@llvm.memcpy <int_memcpy>`.)
2970- An instruction with externally visible side effects depends on the
2971 most recent preceding instruction with externally visible side
2972 effects, following the order in the IR. (This includes :ref:`volatile
2973 operations <volatile>`.)
2974- An instruction *control-depends* on a :ref:`terminator
2975 instruction <terminators>` if the terminator instruction has
2976 multiple successors and the instruction is always executed when
2977 control transfers to one of the successors, and may not be executed
2978 when control is transferred to another.
2979- Additionally, an instruction also *control-depends* on a terminator
2980 instruction if the set of instructions it otherwise depends on would
2981 be different if the terminator had transferred control to a different
2982 successor.
2983- Dependence is transitive.
2984
Poison values have the same behavior as :ref:`undef values <undefvalues>`,
with the additional effect that any instruction that has a *dependence*
on a poison value has undefined behavior.
2988
2989Here are some examples:
2990
.. code-block:: llvm

    entry:
      %poison = sub nuw i32 0, 1           ; Results in a poison value.
      %still_poison = and i32 %poison, 0   ; 0, but also poison.
      %poison_yet_again = getelementptr i32, i32* @h, i32 %still_poison
      store i32 0, i32* %poison_yet_again  ; memory at @h[0] is poisoned

      store i32 %poison, i32* @g           ; Poison value stored to memory.
      %poison2 = load i32, i32* @g         ; Poison value loaded back from memory.

      store volatile i32 %poison, i32* @g  ; External observation; undefined behavior.

      %narrowaddr = bitcast i32* @g to i16*
      %wideaddr = bitcast i32* @g to i64*
      %poison3 = load i16, i16* %narrowaddr ; Returns a poison value.
      %poison4 = load i64, i64* %wideaddr  ; Returns a poison value.

      %cmp = icmp slt i32 %poison, 0       ; Returns a poison value.
      br i1 %cmp, label %true, label %end  ; Branch to either destination.

    true:
      store volatile i32 0, i32* @g        ; This is control-dependent on %cmp, so
                                           ; it has undefined behavior.
      br label %end

    end:
      %p = phi i32 [ 0, %entry ], [ 1, %true ]
                                           ; Both edges into this PHI are
                                           ; control-dependent on %cmp, so this
                                           ; always results in a poison value.

      store volatile i32 0, i32* @g        ; This would depend on the store in %true
                                           ; if %cmp is true, or the store in %entry
                                           ; otherwise, so this is undefined behavior.

      br i1 %cmp, label %second_true, label %second_end
                                           ; The same branch again, but this time the
                                           ; true block doesn't have side effects.

    second_true:
      ; No side effects!
      ret void

    second_end:
      store volatile i32 0, i32* @g        ; This time, the instruction always depends
                                           ; on the store in %end. Also, it is
                                           ; control-equivalent to %end, so this is
                                           ; well-defined (ignoring earlier undefined
                                           ; behavior in this example).
3041
3042.. _blockaddress:
3043
3044Addresses of Basic Blocks
3045-------------------------
3046
3047``blockaddress(@function, %block)``
3048
3049The '``blockaddress``' constant computes the address of the specified
3050basic block in the specified function, and always has an ``i8*`` type.
3051Taking the address of the entry block is illegal.
3052
This value only has defined behavior when used as an operand to the
':ref:`indirectbr <i_indirectbr>`' instruction, or for comparisons
against null. Pointer equality tests between label addresses result in
undefined behavior (though, again, comparison against null is ok, and
no label is equal to the null pointer). This may be passed around as an
opaque pointer sized value as long as the bits are not inspected. This
allows ``ptrtoint`` and arithmetic to be performed on these values so
long as the original value is reconstituted before the ``indirectbr``
instruction.
3062
3063Finally, some targets may provide defined semantics when using the value
3064as the operand to an inline assembly, but that is target specific.
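
As a minimal sketch of the intended use, a block address can be selected as
an ordinary ``i8*`` value and then consumed by ``indirectbr``. The function
name ``@dispatch`` and the block labels are hypothetical, and the example
assumes the typed-pointer syntax used elsewhere in this document:

.. code-block:: llvm

    define i32 @dispatch(i1 %c) {
    entry:
      ; Pick one of two block addresses; both have type i8*.
      %target = select i1 %c, i8* blockaddress(@dispatch, %one),
                              i8* blockaddress(@dispatch, %two)
      ; Every block that %target may refer to is listed as a destination.
      indirectbr i8* %target, [label %one, label %two]

    one:
      ret i32 1

    two:
      ret i32 2
    }

Neither ``%one`` nor ``%two`` is the entry block, and the addresses are never
compared with each other, so the sketch stays within the rules above.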
3065
.. _constantexprs:

Constant Expressions
--------------------
3070
3071Constant expressions are used to allow expressions involving other
3072constants to be used as constants. Constant expressions may be of any
3073:ref:`first class <t_firstclass>` type and may involve any LLVM operation
3074that does not have side effects (e.g. load and call are not supported).
3075The following is the syntax for constant expressions:
3076
3077``trunc (CST to TYPE)``
3078 Truncate a constant to another type. The bit size of CST must be
3079 larger than the bit size of TYPE. Both types must be integers.
3080``zext (CST to TYPE)``
3081 Zero extend a constant to another type. The bit size of CST must be
3082 smaller than the bit size of TYPE. Both types must be integers.
3083``sext (CST to TYPE)``
3084 Sign extend a constant to another type. The bit size of CST must be
3085 smaller than the bit size of TYPE. Both types must be integers.
3086``fptrunc (CST to TYPE)``
3087 Truncate a floating point constant to another floating point type.
3088 The size of CST must be larger than the size of TYPE. Both types
3089 must be floating point.
``fpext (CST to TYPE)``
    Floating point extend a constant to another type. The size of CST
    must be smaller than or equal to the size of TYPE. Both types must be
    floating point.
3094``fptoui (CST to TYPE)``
3095 Convert a floating point constant to the corresponding unsigned
3096 integer constant. TYPE must be a scalar or vector integer type. CST
3097 must be of scalar or vector floating point type. Both CST and TYPE
3098 must be scalars, or vectors of the same number of elements. If the
3099 value won't fit in the integer type, the results are undefined.
3100``fptosi (CST to TYPE)``
3101 Convert a floating point constant to the corresponding signed
3102 integer constant. TYPE must be a scalar or vector integer type. CST
3103 must be of scalar or vector floating point type. Both CST and TYPE
3104 must be scalars, or vectors of the same number of elements. If the
3105 value won't fit in the integer type, the results are undefined.
3106``uitofp (CST to TYPE)``
3107 Convert an unsigned integer constant to the corresponding floating
3108 point constant. TYPE must be a scalar or vector floating point type.
3109 CST must be of scalar or vector integer type. Both CST and TYPE must
3110 be scalars, or vectors of the same number of elements. If the value
3111 won't fit in the floating point type, the results are undefined.
3112``sitofp (CST to TYPE)``
3113 Convert a signed integer constant to the corresponding floating
3114 point constant. TYPE must be a scalar or vector floating point type.
3115 CST must be of scalar or vector integer type. Both CST and TYPE must
3116 be scalars, or vectors of the same number of elements. If the value
3117 won't fit in the floating point type, the results are undefined.
``ptrtoint (CST to TYPE)``
    Convert a pointer typed constant to the corresponding integer
    constant. ``TYPE`` must be an integer type. ``CST`` must be of
    pointer type. The ``CST`` value is zero extended, truncated, or
    unchanged to make it fit in ``TYPE``.
3123``inttoptr (CST to TYPE)``
3124 Convert an integer constant to a pointer constant. TYPE must be a
3125 pointer type. CST must be of integer type. The CST value is zero
3126 extended, truncated, or unchanged to make it fit in a pointer size.
3127 This one is *really* dangerous!
3128``bitcast (CST to TYPE)``
3129 Convert a constant, CST, to another TYPE. The constraints of the
3130 operands are the same as those for the :ref:`bitcast
3131 instruction <i_bitcast>`.
``addrspacecast (CST to TYPE)``
    Convert a constant pointer or constant vector of pointers, CST, to another
    TYPE in a different address space. The constraints of the operands are the
    same as those for the :ref:`addrspacecast instruction <i_addrspacecast>`.
``getelementptr (TY, CSTPTR, IDX0, IDX1, ...)``, ``getelementptr inbounds (TY, CSTPTR, IDX0, IDX1, ...)``
    Perform the :ref:`getelementptr operation <i_getelementptr>` on
    constants. As with the :ref:`getelementptr <i_getelementptr>`
    instruction, the index list may have zero or more indexes, which are
    required to make sense for the type of "pointer to TY".
``select (COND, VAL1, VAL2)``
    Perform the :ref:`select operation <i_select>` on constants.
3143``icmp COND (VAL1, VAL2)``
3144 Performs the :ref:`icmp operation <i_icmp>` on constants.
3145``fcmp COND (VAL1, VAL2)``
3146 Performs the :ref:`fcmp operation <i_fcmp>` on constants.
3147``extractelement (VAL, IDX)``
3148 Perform the :ref:`extractelement operation <i_extractelement>` on
3149 constants.
3150``insertelement (VAL, ELT, IDX)``
3151 Perform the :ref:`insertelement operation <i_insertelement>` on
3152 constants.
3153``shufflevector (VEC1, VEC2, IDXMASK)``
3154 Perform the :ref:`shufflevector operation <i_shufflevector>` on
3155 constants.
3156``extractvalue (VAL, IDX0, IDX1, ...)``
3157 Perform the :ref:`extractvalue operation <i_extractvalue>` on
3158 constants. The index list is interpreted in a similar manner as
3159 indices in a ':ref:`getelementptr <i_getelementptr>`' operation. At
3160 least one index value must be specified.
3161``insertvalue (VAL, ELT, IDX0, IDX1, ...)``
3162 Perform the :ref:`insertvalue operation <i_insertvalue>` on constants.
3163 The index list is interpreted in a similar manner as indices in a
3164 ':ref:`getelementptr <i_getelementptr>`' operation. At least one index
3165 value must be specified.
``OPCODE (LHS, RHS)``
    Perform the specified operation on the LHS and RHS constants. OPCODE
    may be any of the :ref:`binary <binaryops>` or :ref:`bitwise
    binary <bitwiseops>` operations. The constraints on operands are
    the same as those for the corresponding instruction (e.g. no bitwise
    operations on floating point values are allowed).
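
To illustrate a few of the forms listed above, the following sketch uses
constant expressions as global initializers. The global names are
hypothetical, and the typed-pointer syntax follows the other examples in this
document:

.. code-block:: llvm

    @array = global [4 x i32] [i32 1, i32 2, i32 3, i32 4]

    ; Address of the third element, computed by a constant getelementptr.
    @third = global i32* getelementptr inbounds ([4 x i32], [4 x i32]* @array, i64 0, i64 2)

    ; The address of @array converted to an integer constant.
    @addr = global i64 ptrtoint ([4 x i32]* @array to i64)

    ; A binary OPCODE applied to two integer constants.
    @sum = global i32 add (i32 1, i32 2)

In each case the expression appears where a plain constant would otherwise be
required, which is the purpose of constant expressions.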
3172
3173Other Values
3174============
3175
.. _inlineasmexprs:

Inline Assembler Expressions
----------------------------

LLVM supports inline assembler expressions (as opposed to :ref:`Module-Level
Inline Assembly <moduleasm>`) through the use of a special value. This value
represents the inline assembler as a template string (containing the
instructions to emit), a list of operand constraints (stored as a string), a
flag that indicates whether or not the inline asm expression has side effects,
and a flag indicating whether the function containing the asm needs to align its
stack conservatively.
3188
3189The template string supports argument substitution of the operands using "``$``"
3190followed by a number, to indicate substitution of the given register/memory
3191location, as specified by the constraint string. "``${NUM:MODIFIER}``" may also
3192be used, where ``MODIFIER`` is a target-specific annotation for how to print the
3193operand (See :ref:`inline-asm-modifiers`).
3194
3195A literal "``$``" may be included by using "``$$``" in the template. To include
3196other special characters into the output, the usual "``\XX``" escapes may be
3197used, just as in other strings. Note that after template substitution, the
3198resulting assembly string is parsed by LLVM's integrated assembler unless it is
3199disabled -- even when emitting a ``.s`` file -- and thus must contain assembly
3200syntax known to LLVM.
3201
3202LLVM's support for inline asm is modeled closely on the requirements of Clang's
3203GCC-compatible inline-asm support. Thus, the feature-set and the constraint and
3204modifier codes listed here are similar or identical to those in GCC's inline asm
3205support. However, to be clear, the syntax of the template and constraint strings
3206described here is *not* the same as the syntax accepted by GCC and Clang, and,
3207while most constraint letters are passed through as-is by Clang, some get
3208translated to other codes when converting from the C source to the LLVM
3209assembly.
3210
An example inline assembler expression is:

.. code-block:: llvm

    i32 (i32) asm "bswap $0", "=r,r"
3216
3217Inline assembler expressions may **only** be used as the callee operand
3218of a :ref:`call <i_call>` or an :ref:`invoke <i_invoke>` instruction.
3219Thus, typically we have:
3220
3221.. code-block:: llvm
3222
3223 %X = call i32 asm "bswap $0", "=r,r"(i32 %Y)
3224
3225Inline asms with side effects not visible in the constraint list must be
3226marked as having side effects. This is done through the use of the
3227'``sideeffect``' keyword, like so:
3228
3229.. code-block:: llvm
3230
3231 call void asm sideeffect "eieio", ""()
3232
3233In some cases inline asms will contain code that will not work unless
3234the stack is aligned in some way, such as calls or SSE instructions on
3235x86, yet will not contain code that does that alignment within the asm.
3236The compiler should make conservative assumptions about what the asm
3237might contain and should generate its usual stack alignment code in the
3238prologue if the '``alignstack``' keyword is present:
3239
3240.. code-block:: llvm
3241
3242 call void asm alignstack "eieio", ""()
3243
3244Inline asms also support using non-standard assembly dialects. The
3245assumed dialect is ATT. When the '``inteldialect``' keyword is present,
3246the inline asm is using the Intel dialect. Currently, ATT and Intel are
3247the only supported dialects. An example is:
3248
3249.. code-block:: llvm
3250
3251 call void asm inteldialect "eieio", ""()
3252
3253If multiple keywords appear the '``sideeffect``' keyword must come
3254first, the '``alignstack``' keyword second and the '``inteldialect``'
3255keyword last.
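
For instance, a call that uses all three keywords would be written in the
following order. The ``nop`` template is only a placeholder chosen for this
sketch:

.. code-block:: llvm

    ; sideeffect first, alignstack second, inteldialect last.
    call void asm sideeffect alignstack inteldialect "nop", ""()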
3256
James Y Knightbc832ed2015-07-08 18:08:36 +00003257Inline Asm Constraint String
3258^^^^^^^^^^^^^^^^^^^^^^^^^^^^
3259
3260The constraint list is a comma-separated string, each element containing one or
3261more constraint codes.
3262
3263For each element in the constraint list an appropriate register or memory
3264operand will be chosen, and it will be made available to assembly template
3265string expansion as ``$0`` for the first constraint in the list, ``$1`` for the
3266second, etc.
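
As a small sketch of this numbering, assuming an x86 target with the default
AT&T dialect and hypothetical values ``%x`` and ``%y``:

.. code-block:: llvm

    ; "=r" is the first constraint, so the output register is $0.
    ; "r" is the second constraint, so the input register is $1.
    %y = call i32 asm "movl $1, $0", "=r,r"(i32 %x)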
3267
3268There are three different types of constraints, which are distinguished by a
3269prefix symbol in front of the constraint code: Output, Input, and Clobber. The
3270constraints must always be given in that order: outputs first, then inputs, then
3271clobbers. They cannot be intermingled.
3272
3273There are also three different categories of constraint codes:
3274
3275- Register constraint. This is either a register class, or a fixed physical
3276 register. This kind of constraint will allocate a register, and if necessary,
3277 bitcast the argument or result to the appropriate type.
3278- Memory constraint. This kind of constraint is for use with an instruction
3279 taking a memory operand. Different constraints allow for different addressing
3280 modes used by the target.
3281- Immediate value constraint. This kind of constraint is for an integer or other
3282 immediate value which can be rendered directly into an instruction. The
3283 various target-specific constraints allow the selection of a value in the
3284 proper range for the instruction you wish to use it with.
3285
3286Output constraints
3287""""""""""""""""""
3288
Output constraints are specified by an "``=``" prefix (e.g. "``=r``"). This
indicates that the assembly will write to this operand, and the operand will
then be made available as a return value of the ``asm`` expression. Output
constraints do not consume an argument from the call instruction (except for
indirect outputs, which are described below).
3294
Normally, it is expected that no output locations are written to by the assembly
expression until *all* of the inputs have been read. As such, LLVM may assign
the same register to an output and an input. If this is not safe (e.g. if the
assembly contains two instructions, where the first writes to one output, and
the second reads an input and writes to a second output), then the "``&``"
modifier must be used (e.g. "``=&r``") to specify that the output is an
"early-clobber" output. Marking an output as "early-clobber" ensures that LLVM
will not use the same register for any inputs (other than an input tied to this
output).
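
The two-instruction situation described above might look like the following
sketch, which assumes an x86 target with the AT&T dialect. The value names
are hypothetical:

.. code-block:: llvm

    ; The first instruction writes $0 before the second one reads $3, so $0
    ; is marked early-clobber to keep it out of the input registers. $1 is
    ; written only after every input has been read, so plain "=r" suffices.
    %res = call { i32, i32 } asm "movl $2, $0\0A\09movl $3, $1", "=&r,=r,r,r"(i32 %a, i32 %b)

Because there are two outputs, the ``asm`` expression returns a two-element
structure.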
3304
3305Input constraints
3306"""""""""""""""""
3307
3308Input constraints do not have a prefix -- just the constraint codes. Each input
3309constraint will consume one argument from the call instruction. It is not
3310permitted for the asm to write to any input register or memory location (unless
3311that input is tied to an output). Note also that multiple inputs may all be
3312assigned to the same register, if LLVM can determine that they necessarily all
3313contain the same value.
3314
3315Instead of providing a Constraint Code, input constraints may also "tie"
3316themselves to an output constraint, by providing an integer as the constraint
3317string. Tied inputs still consume an argument from the call instruction, and
3318take up a position in the asm template numbering as is usual -- they will simply
3319be constrained to always use the same register as the output they've been tied
3320to. For example, a constraint string of "``=r,0``" says to assign a register for
3321output, and use that register as an input as well (it being the 0'th
3322constraint).
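
A minimal sketch of a tied input, again assuming an x86 target with the AT&T
dialect and hypothetical value names:

.. code-block:: llvm

    ; $0 is the output register. The constraint "0" ties the first argument
    ; (%a) to that same register; it still occupies position $1 in the
    ; template numbering. %b is therefore $2.
    %sum = call i32 asm "addl $2, $0", "=r,0,r,~{flags}"(i32 %a, i32 %b)

The ``~{flags}`` clobber is included because ``addl`` modifies the condition
flags.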
3323
3324It is permitted to tie an input to an "early-clobber" output. In that case, no
3325*other* input may share the same register as the input tied to the early-clobber
3326(even when the other input has the same value).
3327
3328You may only tie an input to an output which has a register constraint, not a
3329memory constraint. Only a single input may be tied to an output.
3330
3331There is also an "interesting" feature which deserves a bit of explanation: if a
3332register class constraint allocates a register which is too small for the value
3333type operand provided as input, the input value will be split into multiple
3334registers, and all of them passed to the inline asm.
3335
3336However, this feature is often not as useful as you might think.
3337
Firstly, the registers are *not* guaranteed to be consecutive. So, on those
architectures that have instructions which operate on multiple consecutive
registers, this is not an appropriate way to support them. (e.g. the 32-bit
SparcV8 has a 64-bit load instruction which takes a single 32-bit register. The
hardware then loads into both the named register, and the next register. This
feature of inline asm would not be useful to support that.)
3344
A few of the targets provide a template string modifier allowing explicit access
to the second register of a two-register operand (e.g. MIPS ``L``, ``M``, and
``D``). On such an architecture, you can actually access the second allocated
register (yet, still, not any subsequent ones). But, in that case, you're still
probably better off simply splitting the value into two separate operands, for
clarity. (e.g. see the description of the ``A`` constraint on X86, which,
despite existing only for use with this feature, is not really a good idea to
use.)
3353
3354Indirect inputs and outputs
3355"""""""""""""""""""""""""""
3356
3357Indirect output or input constraints can be specified by the "``*``" modifier
3358(which goes after the "``=``" in case of an output). This indicates that the asm
3359will write to or read from the contents of an *address* provided as an input
3360argument. (Note that in this way, indirect outputs act more like an *input* than
3361an output: just like an input, they consume an argument of the call expression,
3362rather than producing a return value. An indirect output constraint is an
3363"output" only in that the asm is expected to write to the contents of the input
3364memory location, instead of just read from it).
3365
This is most typically used for a memory constraint, e.g. "``=*m``", to pass the
address of a variable as a value.
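
As a sketch of an indirect memory output, assuming an x86 target with the
AT&T dialect, the typed-pointer syntax used elsewhere in this document, and
hypothetical values ``%ptr`` and ``%val``:

.. code-block:: llvm

    ; "=*m" consumes the first argument (%ptr) and makes the memory it points
    ; to available as $0. The asm writes %val ($1) into that location, and the
    ; call itself produces no return value.
    call void asm sideeffect "movl $1, $0", "=*m,r"(i32* %ptr, i32 %val)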
3368
It is also possible to use an indirect *register* constraint, but only on output
(e.g. "``=*r``"). This will cause LLVM to allocate a register for an output
value normally, and then, separately emit a store to the address provided as
input, after the provided inline asm. (It's not clear what value this
functionality provides, compared to writing the store explicitly after the asm
statement, and it can only produce worse code, since it bypasses many
optimization passes. Its use is not recommended.)
3376
3377
3378Clobber constraints
3379"""""""""""""""""""
3380
3381A clobber constraint is indicated by a "``~``" prefix. A clobber does not
3382consume an input operand, nor generate an output. Clobbers cannot use any of the
3383general constraint code letters -- they may use only explicit register
3384constraints, e.g. "``~{eax}``". The one exception is that a clobber string of
3385"``~{memory}``" indicates that the assembly writes to arbitrary undeclared
3386memory locations -- not only the memory pointed to by a declared indirect
3387output.
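
A common use of the memory clobber is an empty asm that emits no instructions
of its own. The following is a sketch of that idiom:

.. code-block:: llvm

    ; No outputs, no inputs, and an empty template; the only constraint is
    ; the memory clobber, so the asm is treated as writing arbitrary memory.
    call void asm sideeffect "", "~{memory}"()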
3388
3389
Constraint Codes
""""""""""""""""

After a potential prefix comes the constraint code, or codes.
3393
3394A Constraint Code is either a single letter (e.g. "``r``"), a "``^``" character
3395followed by two letters (e.g. "``^wc``"), or "``{``" register-name "``}``"
3396(e.g. "``{eax}``").
3397
3398The one and two letter constraint codes are typically chosen to be the same as
3399GCC's constraint codes.
3400
A single constraint may include one or more constraint codes in it, leaving
it up to LLVM to choose which one to use. This is included mainly for
compatibility with the translation of GCC inline asm coming from clang.
3404
3405There are two ways to specify alternatives, and either or both may be used in an
3406inline asm constraint list:
3407
1) Append the codes to each other, making a constraint code set. E.g. "``im``"
   or "``{eax}m``". This means "choose any of the options in the set". The
   choice of constraint is made independently for each constraint in the
   constraint list.

2) Use "``|``" between constraint code sets, creating alternatives. Every
   constraint in the constraint list must have the same number of alternative
   sets. With this syntax, the same alternative in *all* of the items in the
   constraint list will be chosen together.
3417
Putting those together, you might have a two operand constraint string like
``"rm|r,ri|rm"``. This indicates that if operand 0 is ``r`` or ``m``, then
operand 1 may be one of ``r`` or ``i``. If operand 0 is ``r``, then operand 1
may be one of ``r`` or ``m``. But, operands 0 and 1 cannot both be of type
``m``.
3422
However, the use of either of the alternatives features is *NOT* recommended, as
LLVM is not able to make an intelligent choice about which one to use. (At the
point it currently needs to choose, not enough information is available to do so
in a smart way.) Thus, it simply tries to make a choice that's most likely to
compile, not one that yields optimal performance. (e.g., given "``rm``", it'll
always choose to use memory, not registers). And, if given multiple registers,
or multiple register classes, it will simply choose the first one. (In fact, it
doesn't currently even ensure explicitly specified physical registers are
unique, so specifying multiple physical registers as alternatives, like
``{r11}{r12},{r11}{r12}``, will assign r11 to both operands, not at all what was
intended.)
3434
3435Supported Constraint Code List
3436""""""""""""""""""""""""""""""
3437
3438The constraint codes are, in general, expected to behave the same way they do in
3439GCC. LLVM's support is often implemented on an 'as-needed' basis, to support C
3440inline asm code which was supported by GCC. A mismatch in behavior between LLVM
3441and GCC likely indicates a bug in LLVM.
3442
3443Some constraint codes are typically supported by all targets:
3444
3445- ``r``: A register in the target's general purpose register class.
3446- ``m``: A memory address operand. It is target-specific what addressing modes
3447 are supported, typical examples are register, or register + register offset,
3448 or register + immediate offset (of some target-specific size).
3449- ``i``: An integer constant (of target-specific width). Allows either a simple
3450 immediate, or a relocatable value.
3451- ``n``: An integer constant -- *not* including relocatable values.
3452- ``s``: An integer constant, but allowing *only* relocatable values.
3453- ``X``: Allows an operand of any kind, no constraint whatsoever. Typically
3454 useful to pass a label for an asm branch or call.
3455
3456 .. FIXME: but that surely isn't actually okay to jump out of an asm
3457 block without telling llvm about the control transfer???)
3458
3459- ``{register-name}``: Requires exactly the named physical register.
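
As a sketch of the ``{register-name}`` form, assuming an x86 target with the
AT&T dialect and hypothetical values ``%x`` and ``%v``:

.. code-block:: llvm

    ; The input must be placed in ECX and the output is read back from EAX.
    %v = call i32 asm "movl $1, $0", "={eax},{ecx}"(i32 %x)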
3460
3461Other constraints are target-specific:
3462
3463AArch64:
3464
3465- ``z``: An immediate integer 0. Outputs ``WZR`` or ``XZR``, as appropriate.
3466- ``I``: An immediate integer valid for an ``ADD`` or ``SUB`` instruction,
3467 i.e. 0 to 4095 with optional shift by 12.
3468- ``J``: An immediate integer that, when negated, is valid for an ``ADD`` or
3469 ``SUB`` instruction, i.e. -1 to -4095 with optional left shift by 12.
3470- ``K``: An immediate integer that is valid for the 'bitmask immediate 32' of a
3471 logical instruction like ``AND``, ``EOR``, or ``ORR`` with a 32-bit register.
3472- ``L``: An immediate integer that is valid for the 'bitmask immediate 64' of a
3473 logical instruction like ``AND``, ``EOR``, or ``ORR`` with a 64-bit register.
- ``M``: An immediate integer for use with the ``MOV`` assembly alias on a
  32-bit register. This is a superset of ``K``: in addition to the bitmask
  immediate, also allows immediate integers which can be loaded with a single
  ``MOVZ`` or ``MOVN`` instruction.
3478- ``N``: An immediate integer for use with the ``MOV`` assembly alias on a
3479 64-bit register. This is a superset of ``L``.
3480- ``Q``: Memory address operand must be in a single register (no
3481 offsets). (However, LLVM currently does this for the ``m`` constraint as
3482 well.)
3483- ``r``: A 32 or 64-bit integer register (W* or X*).
3484- ``w``: A 32, 64, or 128-bit floating-point/SIMD register.
3485- ``x``: A lower 128-bit floating-point/SIMD register (``V0`` to ``V15``).
3486
3487AMDGPU:
3488
3489- ``r``: A 32 or 64-bit integer register.
3490- ``[0-9]v``: The 32-bit VGPR register, number 0-9.
3491- ``[0-9]s``: The 32-bit SGPR register, number 0-9.
3492
3493
3494All ARM modes:
3495
3496- ``Q``, ``Um``, ``Un``, ``Uq``, ``Us``, ``Ut``, ``Uv``, ``Uy``: Memory address
3497 operand. Treated the same as operand ``m``, at the moment.
3498
3499ARM and ARM's Thumb2 mode:
3500
3501- ``j``: An immediate integer between 0 and 65535 (valid for ``MOVW``)
3502- ``I``: An immediate integer valid for a data-processing instruction.
3503- ``J``: An immediate integer between -4095 and 4095.
3504- ``K``: An immediate integer whose bitwise inverse is valid for a
3505 data-processing instruction. (Can be used with template modifier "``B``" to
3506 print the inverted value).
3507- ``L``: An immediate integer whose negation is valid for a data-processing
3508 instruction. (Can be used with template modifier "``n``" to print the negated
3509 value).
- ``M``: A power of two or an integer between 0 and 32.
3511- ``N``: Invalid immediate constraint.
3512- ``O``: Invalid immediate constraint.
3513- ``r``: A general-purpose 32-bit integer register (``r0-r15``).
3514- ``l``: In Thumb2 mode, low 32-bit GPR registers (``r0-r7``). In ARM mode, same
3515 as ``r``.
3516- ``h``: In Thumb2 mode, a high 32-bit GPR register (``r8-r15``). In ARM mode,
3517 invalid.
3518- ``w``: A 32, 64, or 128-bit floating-point/SIMD register: ``s0-s31``,
3519 ``d0-d31``, or ``q0-q15``.
3520- ``x``: A 32, 64, or 128-bit floating-point/SIMD register: ``s0-s15``,
3521 ``d0-d7``, or ``q0-q3``.
3522- ``t``: A floating-point/SIMD register, only supports 32-bit values:
3523 ``s0-s31``.
3524
ARM's Thumb1 mode:

- ``I``: An immediate integer between 0 and 255.
- ``J``: An immediate integer between -255 and -1.
- ``K``: An immediate integer between 0 and 255, with optional left-shift by
  some amount.
- ``L``: An immediate integer between -7 and 7.
- ``M``: An immediate integer which is a multiple of 4 between 0 and 1020.
- ``N``: An immediate integer between 0 and 31.
- ``O``: An immediate integer which is a multiple of 4 between -508 and 508.
- ``r``: A low 32-bit GPR register (``r0-r7``).
- ``l``: A low 32-bit GPR register (``r0-r7``).
- ``h``: A high GPR register (``r8-r15``).
- ``w``: A 32, 64, or 128-bit floating-point/SIMD register: ``s0-s31``,
  ``d0-d31``, or ``q0-q15``.
- ``x``: A 32, 64, or 128-bit floating-point/SIMD register: ``s0-s15``,
  ``d0-d7``, or ``q0-q3``.
- ``t``: A floating-point/SIMD register, only supports 32-bit values:
  ``s0-s31``.


Hexagon:

- ``o``, ``v``: A memory address operand, treated the same as constraint ``m``,
  at the moment.
- ``r``: A 32 or 64-bit register.

MSP430:

- ``r``: An 8 or 16-bit register.

MIPS:

- ``I``: An immediate signed 16-bit integer.
- ``J``: An immediate integer zero.
- ``K``: An immediate unsigned 16-bit integer.
- ``L``: An immediate 32-bit integer, where the lower 16 bits are 0.
- ``N``: An immediate integer between -65535 and -1.
- ``O``: An immediate signed 15-bit integer.
- ``P``: An immediate integer between 1 and 65535.
- ``m``: A memory address operand. In MIPS-SE mode, allows a base address
  register plus 16-bit immediate offset. In MIPS mode, just a base register.
- ``R``: A memory address operand. In MIPS-SE mode, allows a base address
  register plus a 9-bit signed offset. In MIPS mode, the same as constraint
  ``m``.
- ``ZC``: A memory address operand, suitable for use in a ``pref``, ``ll``, or
  ``sc`` instruction on the given subtarget (details vary).
- ``r``, ``d``, ``y``: A 32 or 64-bit GPR register.
- ``f``: A 32 or 64-bit FPU register (``F0-F31``), or a 128-bit MSA register
  (``W0-W31``). In the case of MSA registers, it is recommended to use the ``w``
  argument modifier for compatibility with GCC.
- ``c``: A 32-bit or 64-bit GPR register suitable for indirect jump (always
  ``25``).
- ``l``: The ``lo`` register, 32 or 64-bit.
- ``x``: Invalid.

NVPTX:

- ``b``: A 1-bit integer register.
- ``c`` or ``h``: A 16-bit integer register.
- ``r``: A 32-bit integer register.
- ``l`` or ``N``: A 64-bit integer register.
- ``f``: A 32-bit float register.
- ``d``: A 64-bit float register.


PowerPC:

- ``I``: An immediate signed 16-bit integer.
- ``J``: An immediate unsigned 16-bit integer, shifted left 16 bits.
- ``K``: An immediate unsigned 16-bit integer.
- ``L``: An immediate signed 16-bit integer, shifted left 16 bits.
- ``M``: An immediate integer greater than 31.
- ``N``: An immediate integer that is an exact power of 2.
- ``O``: The immediate integer constant 0.
- ``P``: An immediate integer constant whose negation is a signed 16-bit
  constant.
- ``es``, ``o``, ``Q``, ``Z``, ``Zy``: A memory address operand, currently
  treated the same as ``m``.
- ``r``: A 32 or 64-bit integer register.
- ``b``: A 32 or 64-bit integer register, excluding ``R0`` (that is:
  ``R1-R31``).
- ``f``: A 32 or 64-bit float register (``F0-F31``), or when QPX is enabled, a
  128 or 256-bit QPX register (``Q0-Q31``; aliases the ``F`` registers).
- ``v``: For ``4 x f32`` or ``4 x f64`` types, when QPX is enabled, a
  128 or 256-bit QPX register (``Q0-Q31``), otherwise a 128-bit
  altivec vector register (``V0-V31``).

  .. FIXME: is this a bug that v accepts QPX registers? I think this
     is supposed to only use the altivec vector registers?

- ``y``: Condition register (``CR0-CR7``).
- ``wc``: An individual CR bit in a CR register.
- ``wa``, ``wd``, ``wf``: Any 128-bit VSX vector register, from the full VSX
  register set (overlapping both the floating-point and vector register files).
- ``ws``: A 32 or 64-bit floating point register, from the full VSX register
  set.

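As a rough illustration of why the ``b`` constraint exists: in ``addi``, a
base-register field naming ``R0`` denotes the literal value zero rather than
the register's contents, so the input must be kept out of ``R0``. The sketch
below assumes a PowerPC target; the function name ``@add4`` is hypothetical.

.. code-block:: llvm

    define i32 @add4(i32 %x) {
      ; "b" keeps %x out of R0, so addi reads the register and not constant 0.
      %r = call i32 asm "addi $0, $1, 4", "=r,b"(i32 %x)
      ret i32 %r
    }
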
Sparc:

- ``I``: An immediate 13-bit signed integer.
- ``r``: A 32-bit integer register.

SystemZ:

- ``I``: An immediate unsigned 8-bit integer.
- ``J``: An immediate unsigned 12-bit integer.
- ``K``: An immediate signed 16-bit integer.
- ``L``: An immediate signed 20-bit integer.
- ``M``: An immediate integer 0x7fffffff.
- ``Q``: A memory address operand with a base address and a 12-bit immediate
  unsigned displacement.
- ``R``: A memory address operand with a base address, a 12-bit immediate
  unsigned displacement, and an index register.
- ``S``: A memory address operand with a base address and a 20-bit immediate
  signed displacement.
- ``T``: A memory address operand with a base address, a 20-bit immediate
  signed displacement, and an index register.
- ``r`` or ``d``: A 32, 64, or 128-bit integer register.
- ``a``: A 32, 64, or 128-bit integer address register (excludes R0, which in an
  address context evaluates as zero).
- ``h``: A 32-bit value in the high part of a 64-bit data register
  (LLVM-specific).
- ``f``: A 32, 64, or 128-bit floating point register.

X86:

- ``I``: An immediate integer between 0 and 31.
- ``J``: An immediate integer between 0 and 63.
- ``K``: An immediate signed 8-bit integer.
- ``L``: An immediate integer, 0xff or 0xffff or (in 64-bit mode only)
  0xffffffff.
- ``M``: An immediate integer between 0 and 3.
- ``N``: An immediate unsigned 8-bit integer.
- ``O``: An immediate integer between 0 and 127.
- ``e``: An immediate 32-bit signed integer.
- ``Z``: An immediate 32-bit unsigned integer.
- ``o``, ``v``: Treated the same as ``m``, at the moment.
- ``q``: An 8, 16, 32, or 64-bit register which can be accessed as an 8-bit
  ``l`` integer register. On X86-32, this is the ``a``, ``b``, ``c``, and ``d``
  registers, and on X86-64, it is all of the integer registers.
- ``Q``: An 8, 16, 32, or 64-bit register which can be accessed as an 8-bit
  ``h`` integer register. This is the ``a``, ``b``, ``c``, and ``d`` registers.
- ``r`` or ``l``: An 8, 16, 32, or 64-bit integer register.
- ``R``: An 8, 16, 32, or 64-bit "legacy" integer register -- one which has
  existed since i386, and can be accessed without the REX prefix.
- ``f``: A 32, 64, or 80-bit '387 FPU stack pseudo-register.
- ``y``: A 64-bit MMX register, if MMX is enabled.
- ``x``: If SSE is enabled: a 32 or 64-bit scalar operand, or 128-bit vector
  operand in an SSE register. If AVX is also enabled, can also be a 256-bit
  vector operand in an AVX register. If AVX-512 is also enabled, can also be a
  512-bit vector operand in an AVX512 register. Otherwise, an error.
- ``Y``: The same as ``x``, if *SSE2* is enabled, otherwise an error.
- ``A``: Special case: allocates EAX first, then EDX, for a single operand (in
  32-bit mode, a 64-bit integer operand will get split into two registers). It
  is not recommended to use this constraint, as in 64-bit mode, the 64-bit
  operand will get allocated only to RAX -- if two 32-bit operands are needed,
  you're better off splitting it yourself, before passing it to the asm
  statement.

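The immediate constraints restrict which constants may be passed at the call
site. A minimal sketch, assuming an X86 target: the ``I`` constraint admits
the rotate count 5 because it lies between 0 and 31, and the ``0`` constraint
ties the second input to the output register. The function name ``@rotl5`` is
hypothetical, and the clobber list mirrors what a C front-end would typically
emit (an assumption, not a requirement).

.. code-block:: llvm

    define i32 @rotl5(i32 %x) {
      ; $0 is the output register (tied to %x); $1 is the immediate count.
      %r = call i32 asm "roll $1, $0", "=r,I,0,~{dirflag},~{fpsr},~{flags}"(i32 5, i32 %x)
      ret i32 %r
    }
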
XCore:

- ``r``: A 32-bit integer register.


.. _inline-asm-modifiers:

Asm template argument modifiers
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In the asm template string, modifiers can be used on the operand reference, like
"``${0:n}``".

The modifiers are, in general, expected to behave the same way they do in
GCC. LLVM's support is often implemented on an 'as-needed' basis, to support C
inline asm code which was supported by GCC. A mismatch in behavior between LLVM
and GCC likely indicates a bug in LLVM.

Target-independent:

- ``c``: Print an immediate integer constant unadorned, without
  the target-specific immediate punctuation (e.g. no ``$`` prefix).
- ``n``: Negate and print immediate integer constant unadorned, without the
  target-specific immediate punctuation (e.g. no ``$`` prefix).
- ``l``: Print as an unadorned label, without the target-specific label
  punctuation (e.g. no ``$`` prefix).

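A small sketch of the ``c`` modifier, assuming an X86 ELF target where a bare
operand reference would print the constant with a ``$`` prefix. Assembler
directives such as ``.long`` expect a plain number, so the operand is printed
through ``${0:c}``. The function name ``@emit_word`` and the choice of section
are illustrative only.

.. code-block:: llvm

    define void @emit_word() {
      ; "i" accepts an immediate; ${0:c} prints it as 42, without punctuation.
      call void asm sideeffect ".pushsection .data\0A.long ${0:c}\0A.popsection", "i"(i32 42)
      ret void
    }
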
AArch64:

- ``w``: Print a GPR register with a ``w*`` name instead of ``x*`` name. E.g.,
  instead of ``x30``, print ``w30``.
- ``x``: Print a GPR register with a ``x*`` name. (this is the default, anyhow).
- ``b``, ``h``, ``s``, ``d``, ``q``: Print a floating-point/SIMD register with a
  ``b*``, ``h*``, ``s*``, ``d*``, or ``q*`` name, rather than the default of
  ``v*``.

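For example, a 32-bit add on AArch64 needs the ``w*`` register names, since an
unmodified operand reference prints the ``x*`` name. This is a sketch with a
hypothetical function name ``@add32``:

.. code-block:: llvm

    define i32 @add32(i32 %a, i32 %b) {
      ; Each operand is printed as w<N> so the 32-bit form of add is assembled.
      %sum = call i32 asm "add ${0:w}, ${1:w}, ${2:w}", "=r,r,r"(i32 %a, i32 %b)
      ret i32 %sum
    }
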
AMDGPU:

- ``r``: No effect.

ARM:

- ``a``: Print an operand as an address (with ``[`` and ``]`` surrounding a
  register).
- ``P``: No effect.
- ``q``: No effect.
- ``y``: Print a VFP single-precision register as an indexed double (e.g. print
  as ``d4[1]`` instead of ``s9``).
- ``B``: Bitwise invert and print an immediate integer constant without ``#``
  prefix.
- ``L``: Print the low 16-bits of an immediate integer constant.
- ``M``: Print as a register set suitable for ldm/stm. Also prints *all*
  register operands subsequent to the specified one (!), so use carefully.
- ``Q``: Print the low-order register of a register-pair, or the low-order
  register of a two-register operand.
- ``R``: Print the high-order register of a register-pair, or the high-order
  register of a two-register operand.
- ``H``: Print the second register of a register-pair. (On a big-endian system,
  ``H`` is equivalent to ``Q``, and on little-endian system, ``H`` is equivalent
  to ``R``.)

  .. FIXME: H doesn't currently support printing the second register
     of a two-register operand.

- ``e``: Print the low doubleword register of a NEON quad register.
- ``f``: Print the high doubleword register of a NEON quad register.
- ``m``: Print the base register of a memory operand without the ``[`` and ``]``
  adornment.

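The ``Q`` and ``R`` modifiers are typically used together to operate on the two
halves of a 64-bit value held in a register pair. The following is a sketch
for a 32-bit ARM target, with a hypothetical function name ``@add64``; the
early-clobber output (``=&r``) is used because the low half of the result is
written before the high halves of the inputs are read.

.. code-block:: llvm

    define i64 @add64(i64 %a, i64 %b) {
      ; adds on the low-order registers, then adc on the high-order registers.
      %sum = call i64 asm "adds ${0:Q}, ${1:Q}, ${2:Q}\0Aadc ${0:R}, ${1:R}, ${2:R}", "=&r,r,r,~{cc}"(i64 %a, i64 %b)
      ret i64 %sum
    }
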
Hexagon:

- ``L``: Print the second register of a two-register operand. Requires that it
  has been allocated consecutively to the first.

  .. FIXME: why is it restricted to consecutive ones? And there's
     nothing that ensures that happens, is there?

- ``I``: Print the letter 'i' if the operand is an integer constant, otherwise
  nothing. Used to print 'addi' vs 'add' instructions.

MSP430:

No additional modifiers.

MIPS:

- ``X``: Print an immediate integer as hexadecimal.
- ``x``: Print the low 16 bits of an immediate integer as hexadecimal.
- ``d``: Print an immediate integer as decimal.
- ``m``: Subtract one and print an immediate integer as decimal.
- ``z``: Print $0 if an immediate zero, otherwise print normally.
- ``L``: Print the low-order register of a two-register operand, or prints the
  address of the low-order word of a double-word memory operand.

  .. FIXME: L seems to be missing memory operand support.

- ``M``: Print the high-order register of a two-register operand, or prints the
  address of the high-order word of a double-word memory operand.

  .. FIXME: M seems to be missing memory operand support.

- ``D``: Print the second register of a two-register operand, or prints the
  second word of a double-word memory operand. (On a big-endian system, ``D`` is
  equivalent to ``L``, and on little-endian system, ``D`` is equivalent to
  ``M``.)
- ``w``: No effect. Provided for compatibility with GCC which requires this
  modifier in order to print MSA registers (``W0-W31``) with the ``f``
  constraint.

NVPTX:

- ``r``: No effect.

PowerPC:

- ``L``: Print the second register of a two-register operand. Requires that it
  has been allocated consecutively to the first.

  .. FIXME: why is it restricted to consecutive ones? And there's
     nothing that ensures that happens, is there?

- ``I``: Print the letter 'i' if the operand is an integer constant, otherwise
  nothing. Used to print 'addi' vs 'add' instructions.
- ``y``: For a memory operand, prints formatter for a two-register X-form
  instruction. (Currently always prints ``r0,OPERAND``).
- ``U``: Prints 'u' if the memory operand is an update form, and nothing
  otherwise. (NOTE: LLVM does not support update form, so this will currently
  always print nothing)
- ``X``: Prints 'x' if the memory operand is an indexed form. (NOTE: LLVM does
  not support indexed form, so this will currently always print nothing)

Sparc:

- ``r``: No effect.

SystemZ:

SystemZ implements only ``n``, and does *not* support any of the other
target-independent modifiers.

X86:

- ``c``: Print an unadorned integer or symbol name. (The latter is
  target-specific behavior for this typically target-independent modifier).
- ``A``: Print a register name with a '``*``' before it.
- ``b``: Print an 8-bit register name (e.g. ``al``); do nothing on a memory
  operand.
- ``h``: Print the upper 8-bit register name (e.g. ``ah``); do nothing on a
  memory operand.
- ``w``: Print the 16-bit register name (e.g. ``ax``); do nothing on a memory
  operand.
- ``k``: Print the 32-bit register name (e.g. ``eax``); do nothing on a memory
  operand.
- ``q``: Print the 64-bit register name (e.g. ``rax``), if 64-bit registers are
  available, otherwise the 32-bit register name; do nothing on a memory operand.
- ``n``: Negate and print an unadorned integer, or, for operands other than an
  immediate integer (e.g. a relocatable symbol expression), print a '-' before
  the operand. (The behavior for relocatable symbol expressions is a
  target-specific behavior for this typically target-independent modifier)
- ``H``: Print a memory reference with additional offset +8.
- ``P``: Print a memory reference or operand for use as the argument of a call
  instruction. (E.g. omit ``(rip)``, even though it's PC-relative.)

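The register-size modifiers are usually paired with a constraint that
guarantees the requested sub-register exists. In this sketch (X86 target
assumed, function name ``@low_byte`` hypothetical), the ``q`` constraint picks
a register with an addressable low byte and the ``b`` modifier prints that
8-bit name, so the template zero-extends the low byte of ``%x``:

.. code-block:: llvm

    define i32 @low_byte(i32 %x) {
      ; With %x in eax, ${1:b} prints the 8-bit name al.
      %r = call i32 asm "movzbl ${1:b}, $0", "=r,q"(i32 %x)
      ret i32 %r
    }
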
XCore:

No additional modifiers.


Inline Asm Metadata
^^^^^^^^^^^^^^^^^^^

A call instruction that wraps an inline asm node may have a
"``!srcloc``" MDNode attached to it that contains a list of constant
integers. If present, the code generator will use the integer as the
location cookie value when reporting errors through the ``LLVMContext``
error reporting mechanisms. This allows a front-end to correlate backend
errors that occur with inline asm back to the source code that produced
it. For example:

.. code-block:: llvm

    call void asm sideeffect "something bad", ""(), !srcloc !42
    ...
    !42 = !{ i32 1234567 }

It is up to the front-end to make sense of the magic numbers it places
in the IR. If the MDNode contains multiple constants, the code generator
will use the one that corresponds to the line of the asm that the error
occurs on.

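As a sketch of the multi-constant case, the asm string below has three lines
and the ``!srcloc`` node carries one cookie per line. Under the rule described
above, an error on the second line would be expected to be reported with the
second cookie. The cookie values are arbitrary placeholders.

.. code-block:: llvm

    call void asm sideeffect "nop\0Asomething bad\0Anop", ""(), !srcloc !42
    ...
    !42 = !{ i32 1234567, i32 1234568, i32 1234569 }
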
.. _metadata:

Metadata
========

LLVM IR allows metadata to be attached to instructions in the program
that can convey extra information about the code to the optimizers and
code generator. One example application of metadata is source-level
debug information. There are two metadata primitives: strings and nodes.

Metadata does not have a type, and is not a value. If referenced from a
``call`` instruction, it uses the ``metadata`` type.

All metadata are identified in syntax by an exclamation point ('``!``').

.. _metadata-string:

Metadata Nodes and Metadata Strings
-----------------------------------

A metadata string is a string surrounded by double quotes. It can
contain any character by escaping non-printable characters with
"``\xx``" where "``xx``" is the two digit hex code. For example:
"``!"test\00"``".

Metadata nodes are represented with notation similar to structure
constants (a comma separated list of elements, surrounded by braces and
preceded by an exclamation point). Metadata nodes can have any values as
their operands. For example:

.. code-block:: llvm

    !{ !"test\00", i32 10}

Metadata nodes that aren't uniqued use the ``distinct`` keyword. For example:

.. code-block:: text

    !0 = distinct !{!"test\00", i32 10}

``distinct`` nodes are useful when nodes shouldn't be merged based on their
content. They can also occur when transformations cause uniquing collisions
when metadata operands change.

A :ref:`named metadata <namedmetadatastructure>` is a collection of
metadata nodes, which can be looked up in the module symbol table. For
example:

.. code-block:: llvm

    !foo = !{!4, !3}

Metadata can be used as function arguments. Here the ``llvm.dbg.value``
function is called with two metadata arguments:

.. code-block:: llvm

    call void @llvm.dbg.value(metadata !24, i64 0, metadata !25)

Metadata can be attached to an instruction. Here metadata ``!21`` is attached
to the ``add`` instruction using the ``!dbg`` identifier:

.. code-block:: llvm

    %indvar.next = add i64 %indvar, 1, !dbg !21

Metadata can also be attached to a function definition. Here metadata ``!22``
is attached to the ``foo`` function using the ``!dbg`` identifier:

.. code-block:: llvm

    define void @foo() !dbg !22 {
      ret void
    }

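Putting these pieces together, the following sketch shows a function-level
attachment, an instruction-level attachment, a ``distinct`` node, and a named
metadata that collects both nodes. The attachment kind ``!note``, the named
metadata ``!example.notes``, and the function ``@next`` are made-up names used
only for illustration; they carry no meaning to the optimizers.

.. code-block:: text

    define i64 @next(i64 %indvar) !note !0 {
      %indvar.next = add i64 %indvar, 1, !note !1
      ret i64 %indvar.next
    }

    ; Named metadata: operands must be nodes, not bare strings.
    !example.notes = !{!0, !1}
    !0 = !{!"function-level note"}
    !1 = distinct !{!"instruction-level note", i32 10}
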
More information about specific metadata nodes recognized by the
optimizers and code generator is found below.

.. _specialized-metadata:

Specialized Metadata Nodes
^^^^^^^^^^^^^^^^^^^^^^^^^^

Specialized metadata nodes are custom data structures in metadata (as opposed
to generic tuples). Their fields are labelled, and can be specified in any
order.

These aren't inherently debug info centric, but currently all the specialized
metadata nodes are related to debug info.

.. _DICompileUnit:

DICompileUnit
"""""""""""""

``DICompileUnit`` nodes represent a compile unit. The ``enums:``,
``retainedTypes:``, ``subprograms:``, ``globals:``, ``imports:`` and ``macros:``
fields are tuples containing the debug info to be emitted along with the compile
unit, regardless of code optimizations (some nodes are only emitted if there are
references to them from instructions).

.. code-block:: text

    !0 = !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang",
                        isOptimized: true, flags: "-O2", runtimeVersion: 2,
                        splitDebugFilename: "abc.debug", emissionKind: FullDebug,
                        enums: !2, retainedTypes: !3, subprograms: !4,
                        globals: !5, imports: !6, macros: !7, dwoId: 0x0abcd)

Compile unit descriptors provide the root scope for objects declared in a
specific compilation unit. File descriptors are defined using this scope.
These descriptors are collected by a named metadata ``!llvm.dbg.cu``. They
keep track of subprograms, global variables, type information, and imported
entities (declarations and namespaces).

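A minimal sketch of how a compile unit is registered in ``!llvm.dbg.cu``.
The file name, directory, and empty ``enums:`` tuple are placeholders. The
compile unit is written as a ``distinct`` node and a "Debug Info Version"
module flag is included; both reflect what IR producers of this era are
assumed to emit rather than anything stated in this section.

.. code-block:: text

    !llvm.dbg.cu = !{!0}
    !llvm.module.flags = !{!3}

    !0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1,
                                 producer: "clang", isOptimized: false,
                                 runtimeVersion: 0, emissionKind: FullDebug,
                                 enums: !2)
    !1 = !DIFile(filename: "a.c", directory: "/tmp")
    !2 = !{}
    !3 = !{i32 2, !"Debug Info Version", i32 3}
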
.. _DIFile:

DIFile
""""""

``DIFile`` nodes represent files. The ``filename:`` can include slashes.

.. code-block:: llvm

    !0 = !DIFile(filename: "path/to/file", directory: "/path/to/dir")

Files are sometimes used in ``scope:`` fields, and are the only valid target
for ``file:`` fields.

.. _DIBasicType:

DIBasicType
"""""""""""

``DIBasicType`` nodes represent primitive types, such as ``int``, ``bool`` and
``float``. ``tag:`` defaults to ``DW_TAG_base_type``.

.. code-block:: text

    !0 = !DIBasicType(name: "unsigned char", size: 8, align: 8,
                      encoding: DW_ATE_unsigned_char)
    !1 = !DIBasicType(tag: DW_TAG_unspecified_type, name: "decltype(nullptr)")

The ``encoding:`` describes the details of the type. Usually it's one of the
following:

.. code-block:: text

    DW_ATE_address       = 1
    DW_ATE_boolean       = 2
    DW_ATE_float         = 4
    DW_ATE_signed        = 5
    DW_ATE_signed_char   = 6
    DW_ATE_unsigned      = 7
    DW_ATE_unsigned_char = 8

.. _DISubroutineType:

DISubroutineType
""""""""""""""""

``DISubroutineType`` nodes represent subroutine types. Their ``types:`` field
refers to a tuple; the first operand is the return type, while the rest are the
types of the formal arguments in order. If the first operand is ``null``, that
represents a function with no return value (such as ``void foo() {}`` in C++).

.. code-block:: text

    !0 = !DIBasicType(name: "int", size: 32, align: 32, encoding: DW_ATE_signed)
    !1 = !DIBasicType(name: "char", size: 8, align: 8,
                      encoding: DW_ATE_signed_char)
    !2 = !DISubroutineType(types: !{null, !0, !1}) ; void (int, char)

.. _DIDerivedType:

DIDerivedType
"""""""""""""

``DIDerivedType`` nodes represent types derived from other types, such as
qualified types.

.. code-block:: text

    !0 = !DIBasicType(name: "unsigned char", size: 8, align: 8,
                      encoding: DW_ATE_unsigned_char)
    !1 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !0, size: 32,
                        align: 32)

The following ``tag:`` values are valid:

.. code-block:: text

    DW_TAG_member             = 13
    DW_TAG_pointer_type       = 15
    DW_TAG_reference_type     = 16
    DW_TAG_typedef            = 22
    DW_TAG_inheritance        = 28
    DW_TAG_ptr_to_member_type = 31
    DW_TAG_const_type         = 38
    DW_TAG_friend             = 42
    DW_TAG_volatile_type      = 53
    DW_TAG_restrict_type      = 55

.. _DIDerivedTypeMember:

``DW_TAG_member`` is used to define a member of a :ref:`composite type
<DICompositeType>`. The type of the member is the ``baseType:``. The
``offset:`` is the member's bit offset. If the composite type has an ODR
``identifier:`` and does not set ``flags: DIFlagFwdDecl``, then the member is
uniqued based only on its ``name:`` and ``scope:``.

``DW_TAG_inheritance`` and ``DW_TAG_friend`` are used in the ``elements:``
field of :ref:`composite types <DICompositeType>` to describe parents and
friends.

``DW_TAG_typedef`` is used to provide a name for the ``baseType:``.

``DW_TAG_pointer_type``, ``DW_TAG_reference_type``, ``DW_TAG_const_type``,
``DW_TAG_volatile_type`` and ``DW_TAG_restrict_type`` are used to qualify the
``baseType:``.

Note that the ``void *`` type is expressed as a type derived from NULL.

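The tags above compose by chaining ``baseType:`` references. As a sketch,
the nodes below describe ``void *`` (a pointer with a ``null`` base type),
``const char *``, and a typedef of the latter. The typedef name ``cstr``, the
file, and the 64-bit pointer size are assumptions made for the example.

.. code-block:: text

    !0 = !DIFile(filename: "a.c", directory: "/tmp")
    !1 = !DIBasicType(name: "char", size: 8, align: 8,
                      encoding: DW_ATE_signed_char)
    ; void *
    !2 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: null, size: 64,
                        align: 64)
    ; const char
    !3 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !1)
    ; const char *
    !4 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !3, size: 64,
                        align: 64)
    ; typedef const char *cstr;
    !5 = !DIDerivedType(tag: DW_TAG_typedef, name: "cstr", file: !0, line: 1,
                        baseType: !4)
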
.. _DICompositeType:

DICompositeType
"""""""""""""""

``DICompositeType`` nodes represent types composed of other types, like
structures and unions. ``elements:`` points to a tuple of the composed types.

If the source language supports ODR, the ``identifier:`` field gives the unique
identifier used for type merging between modules. When specified,
:ref:`subprogram declarations <DISubprogramDeclaration>` and :ref:`member
derived types <DIDerivedTypeMember>` that reference the ODR-type in their
``scope:`` change uniquing rules.

For a given ``identifier:``, there should only be a single composite type that
does not have ``flags: DIFlagFwdDecl`` set. LLVM tools that link modules
together will unique such definitions at parse time via the ``identifier:``
field, even if the nodes are ``distinct``.

.. code-block:: text

    !0 = !DIEnumerator(name: "SixKind", value: 6)
    !1 = !DIEnumerator(name: "SevenKind", value: 7)
    !2 = !DIEnumerator(name: "NegEightKind", value: -8)
    !3 = !DICompositeType(tag: DW_TAG_enumeration_type, name: "Enum", file: !12,
                          line: 2, size: 32, align: 32, identifier: "_M4Enum",
                          elements: !{!0, !1, !2})

Duncan P. N. Exon Smithd937cd92015-03-17 23:41:05 +00004125The following ``tag:`` values are valid:
4126
Renato Golin124f2592016-07-20 12:16:38 +00004127.. code-block:: text
Duncan P. N. Exon Smithd937cd92015-03-17 23:41:05 +00004128
4129 DW_TAG_array_type = 1
4130 DW_TAG_class_type = 2
4131 DW_TAG_enumeration_type = 4
4132 DW_TAG_structure_type = 19
4133 DW_TAG_union_type = 23
Duncan P. N. Exon Smithd937cd92015-03-17 23:41:05 +00004134
For ``DW_TAG_array_type``, the ``elements:`` should be :ref:`subrange
descriptors <DISubrange>`, each representing the range of subscripts at that
level of indexing. The ``DIFlagVector`` flag to ``flags:`` indicates that an
array type is a native packed vector.

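As a sketch of how these pieces fit together, the following metadata describes
a hypothetical C array ``int a[3][5]`` and a hypothetical four-element
``float`` vector. The node numbering, the names, and the sizes (which assume a
32-bit ``int`` and ``float``) are illustrative assumptions, not output from a
particular front end.

.. code-block:: text

    !0 = !DIBasicType(name: "int", size: 32, align: 32, encoding: DW_ATE_signed)
    !1 = !DIBasicType(name: "float", size: 32, align: 32, encoding: DW_ATE_float)

    ; int a[3][5]: one subrange per level of indexing.
    !2 = !DISubrange(count: 3)
    !3 = !DISubrange(count: 5)
    !4 = !DICompositeType(tag: DW_TAG_array_type, baseType: !0, size: 480,
                          align: 32, elements: !{!2, !3})

    ; A native vector of four floats: an array type with DIFlagVector set.
    !5 = !DISubrange(count: 4)
    !6 = !DICompositeType(tag: DW_TAG_array_type, baseType: !1, size: 128,
                          align: 128, flags: DIFlagVector, elements: !{!5})
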
For ``DW_TAG_enumeration_type``, the ``elements:`` should be :ref:`enumerator
descriptors <DIEnumerator>`, each representing the definition of an enumeration
value for the set. All enumeration type descriptors are collected in the
``enums:`` field of the :ref:`compile unit <DICompileUnit>`.

For ``DW_TAG_structure_type``, ``DW_TAG_class_type``, and
``DW_TAG_union_type``, the ``elements:`` should be :ref:`derived types
<DIDerivedType>` with ``tag: DW_TAG_member``, ``tag: DW_TAG_inheritance``, or
``tag: DW_TAG_friend``; or :ref:`subprograms <DISubprogram>` with
``isDefinition: false``.

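For illustration, a structure with two ``DW_TAG_member`` elements might be
described as follows. This is a minimal sketch for a hypothetical C type
``struct Point { int x; int y; };``; the file name, line numbers, and sizes
(in bits, assuming a 32-bit ``int``) are assumptions made for the example.

.. code-block:: text

    !0 = !DIFile(filename: "point.c", directory: "/tmp")
    !1 = !DIBasicType(name: "int", size: 32, align: 32, encoding: DW_ATE_signed)
    !2 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "Point",
                                   file: !0, line: 1, size: 64, align: 32,
                                   elements: !{!3, !4})
    ; Each member names the structure as its scope: and gives its bit offset.
    !3 = !DIDerivedType(tag: DW_TAG_member, name: "x", scope: !2, file: !0,
                        line: 2, baseType: !1, size: 32, align: 32)
    !4 = !DIDerivedType(tag: DW_TAG_member, name: "y", scope: !2, file: !0,
                        line: 3, baseType: !1, size: 32, align: 32, offset: 32)
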
.. _DISubrange:

DISubrange
""""""""""

``DISubrange`` nodes are the elements for ``DW_TAG_array_type`` variants of
:ref:`DICompositeType`. ``count: -1`` indicates an empty array.

.. code-block:: llvm

    !0 = !DISubrange(count: 5, lowerBound: 0) ; array counting from 0
    !1 = !DISubrange(count: 5, lowerBound: 1) ; array counting from 1
    !2 = !DISubrange(count: -1) ; empty array.

.. _DIEnumerator:

DIEnumerator
""""""""""""

``DIEnumerator`` nodes are the elements for ``DW_TAG_enumeration_type``
variants of :ref:`DICompositeType`.

.. code-block:: llvm

    !0 = !DIEnumerator(name: "SixKind", value: 6)
    !1 = !DIEnumerator(name: "SevenKind", value: 7)
    !2 = !DIEnumerator(name: "NegEightKind", value: -8)

DITemplateTypeParameter
"""""""""""""""""""""""

``DITemplateTypeParameter`` nodes represent type parameters to generic source
language constructs. They are used (optionally) in :ref:`DICompositeType` and
:ref:`DISubprogram` ``templateParams:`` fields.

.. code-block:: llvm

    !0 = !DITemplateTypeParameter(name: "Ty", type: !1)

DITemplateValueParameter
""""""""""""""""""""""""

``DITemplateValueParameter`` nodes represent value parameters to generic source
language constructs. ``tag:`` defaults to ``DW_TAG_template_value_parameter``,
but if specified can also be set to ``DW_TAG_GNU_template_template_param`` or
``DW_TAG_GNU_template_param_pack``. They are used (optionally) in
:ref:`DICompositeType` and :ref:`DISubprogram` ``templateParams:`` fields.

.. code-block:: llvm

    !0 = !DITemplateValueParameter(name: "Ty", type: !1, value: i32 7)

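The sketch below shows how both kinds of template parameter might appear
together in a ``templateParams:`` field. It assumes a hypothetical C++
instantiation ``Buffer<int, 4>``; the type name, sizes, and the ``!4`` file
reference are illustrative and not taken from real compiler output.

.. code-block:: text

    !0 = !DIBasicType(name: "int", size: 32, align: 32, encoding: DW_ATE_signed)
    !1 = !DITemplateTypeParameter(name: "T", type: !0)
    !2 = !DITemplateValueParameter(name: "N", type: !0, value: i32 4)
    ; Hypothetical instantiation Buffer<int, 4>; !4 is assumed to be a DIFile.
    !3 = !DICompositeType(tag: DW_TAG_structure_type, name: "Buffer<int, 4>",
                          file: !4, line: 1, size: 128, align: 32,
                          templateParams: !{!1, !2},
                          identifier: "_ZTS6BufferIiLi4EE")
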
DINamespace
"""""""""""

``DINamespace`` nodes represent namespaces in the source language.

.. code-block:: llvm

    !0 = !DINamespace(name: "myawesomeproject", scope: !1, file: !2, line: 7)

DIGlobalVariable
""""""""""""""""

``DIGlobalVariable`` nodes represent global variables in the source language.

.. code-block:: llvm

    !0 = !DIGlobalVariable(name: "foo", linkageName: "foo", scope: !1,
                           file: !2, line: 7, type: !3, isLocal: true,
                           isDefinition: false, variable: i32* @foo,
                           declaration: !4)

All global variables should be referenced by the ``globals:`` field of a
:ref:`compile unit <DICompileUnit>`.

.. _DISubprogram:

DISubprogram
""""""""""""

``DISubprogram`` nodes represent functions from the source language. A
``DISubprogram`` may be attached to a function definition using ``!dbg``
metadata. The ``variables:`` field points at :ref:`variables <DILocalVariable>`
that must be retained, even if their IR counterparts are optimized out of
the IR. The ``type:`` field must point at a :ref:`DISubroutineType`.

.. _DISubprogramDeclaration:

When ``isDefinition: false``, subprograms describe a declaration in the type
tree as opposed to a definition of a function. If the scope is a composite
type with an ODR ``identifier:`` that does not set ``flags: DIFlagFwdDecl``,
then the subprogram declaration is uniqued based only on its ``linkageName:``
and ``scope:``.

.. code-block:: text

    define void @_Z3foov() !dbg !0 {
      ...
    }

    !0 = distinct !DISubprogram(name: "foo", linkageName: "_Z3foov", scope: !1,
                                file: !2, line: 7, type: !3, isLocal: true,
                                isDefinition: true, scopeLine: 8,
                                containingType: !4,
                                virtuality: DW_VIRTUALITY_pure_virtual,
                                virtualIndex: 10, flags: DIFlagPrototyped,
                                isOptimized: true, templateParams: !5,
                                declaration: !6, variables: !7)

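To illustrate the declaration case, the following sketch shows a member
function declaration that lives in the ``elements:`` of an ODR composite type.
It assumes a hypothetical C++ class ``C`` with a member function ``f()``; the
``!1`` file and ``!3`` subroutine type references are placeholders assumed to
be defined elsewhere.

.. code-block:: text

    !0 = !DICompositeType(tag: DW_TAG_class_type, name: "C", file: !1, line: 1,
                          size: 8, align: 8, identifier: "_ZTS1C",
                          elements: !{!2})
    ; Declaration of C::f(). Because its scope !0 has an ODR identifier: and is
    ; not a forward declaration, it is uniqued by linkageName: and scope:.
    !2 = !DISubprogram(name: "f", linkageName: "_ZN1C1fEv", scope: !0,
                       file: !1, line: 2, type: !3, isLocal: false,
                       isDefinition: false, scopeLine: 2,
                       flags: DIFlagPrototyped, isOptimized: false)
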
.. _DILexicalBlock:

DILexicalBlock
""""""""""""""

``DILexicalBlock`` nodes describe nested blocks within a :ref:`subprogram
<DISubprogram>`. The line and column numbers are used to distinguish
two lexical blocks at the same depth. They are valid targets for ``scope:``
fields.

.. code-block:: text

    !0 = distinct !DILexicalBlock(scope: !1, file: !2, line: 7, column: 35)

Usually lexical blocks are ``distinct`` to prevent node merging based on
operands.

.. _DILexicalBlockFile:

DILexicalBlockFile
""""""""""""""""""

``DILexicalBlockFile`` nodes are used to discriminate between sections of a
:ref:`lexical block <DILexicalBlock>`. The ``file:`` field can be changed to
indicate textual inclusion, or the ``discriminator:`` field can be used to
discriminate between control flow within a single block in the source language.

.. code-block:: llvm

    !0 = !DILexicalBlock(scope: !3, file: !4, line: 7, column: 35)
    !1 = !DILexicalBlockFile(scope: !0, file: !4, discriminator: 0)
    !2 = !DILexicalBlockFile(scope: !0, file: !4, discriminator: 1)

.. _DILocation:

DILocation
""""""""""

``DILocation`` nodes represent source debug locations. The ``scope:`` field is
mandatory, and points at a :ref:`DILexicalBlockFile`, a
:ref:`DILexicalBlock`, or a :ref:`DISubprogram`.

.. code-block:: llvm

    !0 = !DILocation(line: 2900, column: 42, scope: !1, inlinedAt: !2)

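A location is typically attached to an instruction through ``!dbg`` metadata.
In the minimal sketch below, the value names and line numbers are
illustrative, and ``!1`` is assumed to be a :ref:`DISubprogram` defined
elsewhere.

.. code-block:: llvm

    %sum = add i32 %a, %b, !dbg !0
    ...
    !0 = !DILocation(line: 3, column: 12, scope: !1)
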
.. _DILocalVariable:

DILocalVariable
"""""""""""""""

``DILocalVariable`` nodes represent local variables in the source language. If
the ``arg:`` field is set to non-zero, then this variable is a subprogram
parameter, and it will be included in the ``variables:`` field of its
:ref:`DISubprogram`.

.. code-block:: text

    !0 = !DILocalVariable(name: "this", arg: 1, scope: !3, file: !2, line: 7,
                          type: !3, flags: DIFlagArtificial)
    !1 = !DILocalVariable(name: "x", arg: 2, scope: !4, file: !2, line: 7,
                          type: !3)
    !2 = !DILocalVariable(name: "y", scope: !5, file: !2, line: 7, type: !3)

DIExpression
""""""""""""

``DIExpression`` nodes represent DWARF expression sequences. They are used in
:ref:`debug intrinsics <dbg_intrinsics>` (such as ``llvm.dbg.declare``) to
describe how the referenced LLVM variable relates to the source language
variable.

The currently supported vocabulary is limited:

- ``DW_OP_deref`` dereferences the working expression.
- ``DW_OP_plus, 93`` adds ``93`` to the working expression.
- ``DW_OP_bit_piece, 16, 8`` specifies the offset and size (``16`` and ``8``
  here, respectively) of the variable piece from the working expression.

.. code-block:: text

    !0 = !DIExpression(DW_OP_deref)
    !1 = !DIExpression(DW_OP_plus, 3)
    !2 = !DIExpression(DW_OP_bit_piece, 3, 7)
    !3 = !DIExpression(DW_OP_deref, DW_OP_plus, 3, DW_OP_bit_piece, 3, 7)

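As a sketch of how an expression is passed to a debug intrinsic, the calls
below hand a ``DIExpression`` to ``llvm.dbg.declare`` as its third argument.
The value names, the ``!10`` and ``!12`` variable nodes, the ``!11`` location,
and the 8-byte offset are assumptions made for illustration.

.. code-block:: text

    ; %x.addr holds the address of the variable described by !10, so the
    ; expression is empty.
    call void @llvm.dbg.declare(metadata i32* %x.addr, metadata !10,
                                metadata !DIExpression()), !dbg !11

    ; The variable described by !12 is assumed to live 8 bytes past %frame.
    call void @llvm.dbg.declare(metadata i8* %frame, metadata !12,
                                metadata !DIExpression(DW_OP_plus, 8)), !dbg !11
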
DIObjCProperty
""""""""""""""

``DIObjCProperty`` nodes represent Objective-C property nodes.

.. code-block:: llvm

    !3 = !DIObjCProperty(name: "foo", file: !1, line: 7, setter: "setFoo",
                         getter: "getFoo", attributes: 7, type: !2)

DIImportedEntity
""""""""""""""""

``DIImportedEntity`` nodes represent entities (such as modules) imported into a
compile unit.

.. code-block:: text

    !2 = !DIImportedEntity(tag: DW_TAG_imported_module, name: "foo", scope: !0,
                           entity: !1, line: 7)

DIMacro
"""""""

``DIMacro`` nodes represent the definition or undefinition of a macro
identifier. The ``name:`` field is the macro identifier, followed by macro
parameters when defining a function-like macro, and the ``value:`` field is the
token-string used to expand the macro identifier.

.. code-block:: text

    !2 = !DIMacro(macinfo: DW_MACINFO_define, line: 7, name: "foo(x)",
                  value: "((x) + 1)")
    !3 = !DIMacro(macinfo: DW_MACINFO_undef, line: 30, name: "foo")

DIMacroFile
"""""""""""

``DIMacroFile`` nodes represent inclusion of source files.
The ``nodes:`` field is a list of ``DIMacro`` and ``DIMacroFile`` nodes that
appear in the included source file.

.. code-block:: text

    !2 = !DIMacroFile(macinfo: DW_MACINFO_start_file, line: 7, file: !2,
                      nodes: !3)

'``tbaa``' Metadata
^^^^^^^^^^^^^^^^^^^

In LLVM IR, memory does not have types, so LLVM's own type system is not
suitable for doing TBAA. Instead, metadata is added to the IR to
describe a type system of a higher level language. This can be used to
implement typical C/C++ TBAA, but it can also be used to implement
custom alias analysis behavior for other languages.

The current metadata format is very simple. TBAA metadata nodes have up
to three fields, e.g.:

.. code-block:: llvm

    !0 = !{ !"an example type tree" }
    !1 = !{ !"int", !0 }
    !2 = !{ !"float", !0 }
    !3 = !{ !"const float", !2, i64 1 }

The first field is an identity field. It can be any value, usually a
metadata string, which uniquely identifies the type. The most important
name in the tree is the name of the root node. Two trees with different
root node names are entirely disjoint, even if they have leaves with
common names.

The second field identifies the type's parent node in the tree, or is
null or omitted for a root node. A type is considered to alias all of
its descendants and all of its ancestors in the tree. Also, a type is
considered to alias all types in other trees, so that bitcode produced
from multiple front-ends is handled conservatively.

If the third field is present, it's an integer which if equal to 1
indicates that the type is "constant" (meaning
``pointsToConstantMemory`` should return true; see `other useful
AliasAnalysis methods <AliasAnalysis.html#OtherItfs>`_).

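The rules above can be illustrated by attaching the nodes from the example
tree to memory accesses through ``!tbaa``. The pointer names below are
hypothetical; the aliasing conclusions in the comments follow only from the
ancestor and descendant rule stated above.

.. code-block:: llvm

    store i32 1, i32* %pi, align 4, !tbaa !1          ; "int"
    %f = load float, float* %pf, align 4, !tbaa !2    ; "float"
    %g = load float, float* %pg, align 4, !tbaa !3    ; "const float"

    ; !1 and !2 are siblings, so the store and the load of %f do not alias.
    ; !3 is a descendant of !2, so the loads of %f and %g may alias.
    ; !1 is neither an ancestor nor a descendant of !3, so the store and the
    ; load of %g do not alias.
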
'``tbaa.struct``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^

The :ref:`llvm.memcpy <int_memcpy>` intrinsic is often used to implement
aggregate assignment operations in C and similar languages. However, it
is defined to copy a contiguous region of memory, which is more than
strictly necessary for aggregate types which contain holes due to
padding. Also, it doesn't contain any TBAA information about the fields
of the aggregate.

``!tbaa.struct`` metadata can describe which memory subregions in a
memcpy are padding and what the TBAA tags of the struct are.

The current metadata format is very simple. ``!tbaa.struct`` metadata
nodes are a list of operands which are in conceptual groups of three.
For each group of three, the first operand gives the offset of a
field in bytes, the second gives its size in bytes, and the third gives
its tbaa tag. For example:

.. code-block:: llvm

    !4 = !{ i64 0, i64 4, !1, i64 8, i64 4, !2 }

This describes a struct with two fields. The first is at offset 0 bytes
with size 4 bytes, and has tbaa tag !1. The second is at offset 8 bytes
and has size 4 bytes and has tbaa tag !2.

Note that the fields need not be contiguous. In this example, there is a
4 byte gap between the two fields. This gap represents padding which
does not carry useful data and need not be preserved.

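A minimal sketch of how this node might be attached to a copy of such a
12-byte aggregate is shown below. The pointer names are hypothetical, and the
call assumes the ``llvm.memcpy`` form with explicit alignment and volatility
arguments and the typed-pointer syntax used elsewhere in this document.

.. code-block:: llvm

    call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dst, i8* %src, i64 12, i32 4,
                                         i1 false), !tbaa.struct !4
    ...
    ; Bytes 0-3 carry tag !1 and bytes 8-11 carry tag !2; bytes 4-7 are padding.
    !4 = !{ i64 0, i64 4, !1, i64 8, i64 4, !2 }
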
'``noalias``' and '``alias.scope``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``noalias`` and ``alias.scope`` metadata provide the ability to specify generic
noalias memory-access sets. This means that some collection of memory access
instructions (loads, stores, memory-accessing calls, etc.) that carry
``noalias`` metadata can be specified not to alias with some other
collection of memory access instructions that carry ``alias.scope`` metadata.
Each type of metadata specifies a list of scopes where each scope has an id and
a domain.

When evaluating an aliasing query, if for some domain, the set
of scopes with that domain in one instruction's ``alias.scope`` list is a
subset of (or equal to) the set of scopes for that domain in another
instruction's ``noalias`` list, then the two memory accesses are assumed not to
alias.

Because scopes in one domain don't affect scopes in other domains, separate
domains can be used to compose multiple independent noalias sets. This is
used, for example, during inlining. As the noalias function parameters are
turned into noalias scope metadata, a new domain is used every time the
function is inlined.

The metadata identifying each domain is itself a list containing one or two
entries. The first entry is the name of the domain. Note that if the name is a
string then it can be combined across functions and translation units. A
self-reference can be used to create globally unique domain names. A
descriptive string may optionally be provided as a second list entry.

The metadata identifying each scope is also itself a list containing two or
three entries. The first entry is the name of the scope. Note that if the name
is a string then it can be combined across functions and translation units. A
self-reference can be used to create globally unique scope names. A metadata
reference to the scope's domain is the second entry. A descriptive string may
optionally be provided as a third list entry.

For example,

.. code-block:: llvm

    ; Two scope domains:
    !0 = !{!0}
    !1 = !{!1}

    ; Some scopes in these domains:
    !2 = !{!2, !0}
    !3 = !{!3, !0}
    !4 = !{!4, !1}

    ; Some scope lists:
    !5 = !{!4} ; A list containing only scope !4
    !6 = !{!4, !3, !2}
    !7 = !{!3}

    ; These two instructions don't alias:
    %0 = load float, float* %c, align 4, !alias.scope !5
    store float %0, float* %arrayidx.i, align 4, !noalias !5

    ; These two instructions also don't alias (for domain !1, the set of scopes
    ; in the !alias.scope equals that in the !noalias list):
    %2 = load float, float* %c, align 4, !alias.scope !5
    store float %2, float* %arrayidx.i2, align 4, !noalias !6

    ; These two instructions may alias (for domain !0, the set of scopes in
    ; the !noalias list is not a superset of, or equal to, the scopes in the
    ; !alias.scope list):
    %2 = load float, float* %c, align 4, !alias.scope !6
    store float %0, float* %arrayidx.i, align 4, !noalias !7

'``fpmath``' Metadata
^^^^^^^^^^^^^^^^^^^^^

``fpmath`` metadata may be attached to any instruction of floating point
type. It can be used to express the maximum acceptable error in the
result of that instruction, in ULPs, thus potentially allowing the
compiler to use a more efficient but less accurate method of computing
it. ULP is defined as follows:

   If ``x`` is a real number that lies between two finite consecutive
   floating-point numbers ``a`` and ``b``, without being equal to one
   of them, then ``ulp(x) = |b - a|``, otherwise ``ulp(x)`` is the
   distance between the two non-equal finite floating-point numbers
   nearest ``x``. Moreover, ``ulp(NaN)`` is ``NaN``.

The metadata node shall consist of a single positive float type number
representing the maximum relative error, for example:

.. code-block:: llvm

    !0 = !{ float 2.5 } ; maximum acceptable inaccuracy is 2.5 ULPs

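Such a node is attached to an individual instruction. In the sketch below, the
operand names are hypothetical and the division is only one example of a
floating-point instruction that could carry the hint.

.. code-block:: llvm

    %q = fdiv float %x, %y, !fpmath !0
    ...
    !0 = !{ float 2.5 }
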
.. _range-metadata:

'``range``' Metadata
^^^^^^^^^^^^^^^^^^^^

``range`` metadata may be attached only to ``load``, ``call`` and ``invoke`` of
integer types. It expresses the possible ranges the loaded value or the value
returned by the called function at this call site is in. The ranges are
represented with a flattened list of integers. The loaded value or the value
returned is known to be in the union of the ranges defined by each consecutive
pair. Each pair has the following properties:

- The type must match the type loaded by the instruction.
- The pair ``a,b`` represents the range ``[a,b)``.
- Both ``a`` and ``b`` are constants.
- The range is allowed to wrap.
- The range should not represent the full or empty set. That is,
  ``a!=b``.

In addition, the pairs must be in signed order of the lower bound and
they must be non-contiguous.

Examples:

.. code-block:: llvm

    %a = load i8, i8* %x, align 1, !range !0 ; Can only be 0 or 1
    %b = load i8, i8* %y, align 1, !range !1 ; Can only be 255 (-1), 0 or 1
    %c = call i8 @foo(),       !range !2 ; Can only be 0, 1, 3, 4 or 5
    %d = invoke i8 @bar() to label %cont
           unwind label %lpad, !range !3 ; Can only be -2, -1, 3, 4 or 5
    ...
    !0 = !{ i8 0, i8 2 }
    !1 = !{ i8 255, i8 2 }
    !2 = !{ i8 0, i8 2, i8 3, i8 6 }
    !3 = !{ i8 -2, i8 0, i8 3, i8 6 }

'``unpredictable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``unpredictable`` metadata may be attached to any branch or switch
instruction. It can be used to express the unpredictability of control
flow. Similar to the ``llvm.expect`` intrinsic, it may be used to alter
optimizations related to compare and branch instructions. The metadata
is treated as a boolean value; if it exists, it signals that the branch
or switch that it is attached to is completely unpredictable.

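Because only the presence of the metadata matters, an empty node is enough. In
this minimal sketch, the condition and label names are hypothetical.

.. code-block:: llvm

    br i1 %cond, label %then, label %else, !unpredictable !0
    ...
    !0 = !{}
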
'``llvm.loop``'
^^^^^^^^^^^^^^^

It is sometimes useful to attach information to loop constructs. Currently,
loop metadata is implemented as metadata attached to the branch instruction
in the loop latch block. This type of metadata refers to a metadata node that
is guaranteed to be separate for each loop. The loop identifier metadata is
specified with the name ``llvm.loop``.

The loop identifier metadata is implemented using a metadata node that refers
to itself to avoid merging it with any other identifier metadata, e.g.,
during module linkage or function inlining. That is, each loop should refer
to its own identification metadata even if the loops reside in separate
functions. The following example contains loop identifier metadata for two
separate loop constructs:

.. code-block:: llvm

    !0 = !{!0}
    !1 = !{!1}

The loop identifier metadata can be used to specify additional
per-loop metadata. Any operands after the first operand can be treated
as user-defined metadata. For example, the ``llvm.loop.unroll.count``
metadata suggests an unroll factor to the loop unroller:

.. code-block:: llvm

      br i1 %exitcond, label %._crit_edge, label %.lr.ph, !llvm.loop !0
    ...
    !0 = !{!0, !1}
    !1 = !{!"llvm.loop.unroll.count", i32 4}

'``llvm.loop.vectorize``' and '``llvm.loop.interleave``'
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Metadata prefixed with ``llvm.loop.vectorize`` or ``llvm.loop.interleave`` are
used to control per-loop vectorization and interleaving parameters such as
vectorization width and interleave count. These metadata should be used in
conjunction with ``llvm.loop`` loop identification metadata. The
``llvm.loop.vectorize`` and ``llvm.loop.interleave`` metadata are only
optimization hints and the optimizer will only interleave and vectorize loops if
it believes it is safe to do so. The ``llvm.mem.parallel_loop_access`` metadata,
which contains information about loop-carried memory dependencies, can be
helpful in determining the safety of these transformations.

Mark Heffernan9d20e422014-07-21 23:11:03 +00004645'``llvm.loop.interleave.count``' Metadata
Mark Heffernan893752a2014-07-18 19:24:51 +00004646^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
4647
Mark Heffernan9d20e422014-07-21 23:11:03 +00004648This metadata suggests an interleave count to the loop interleaver.
4649The first operand is the string ``llvm.loop.interleave.count`` and the
Mark Heffernan893752a2014-07-18 19:24:51 +00004650second operand is an integer specifying the interleave count. For
4651example:
4652
4653.. code-block:: llvm
4654
Duncan P. N. Exon Smithbe7ea192014-12-15 19:07:53 +00004655 !0 = !{!"llvm.loop.interleave.count", i32 4}
Mark Heffernan893752a2014-07-18 19:24:51 +00004656
Mark Heffernan9d20e422014-07-21 23:11:03 +00004657Note that setting ``llvm.loop.interleave.count`` to 1 disables interleaving
Sean Silvaa1190322015-08-06 22:56:48 +00004658multiple iterations of the loop. If ``llvm.loop.interleave.count`` is set to 0
Mark Heffernan9d20e422014-07-21 23:11:03 +00004659then the interleave count will be determined automatically.
4660
4661'``llvm.loop.vectorize.enable``' Metadata
Dan Liew9a1829d2014-07-22 14:59:38 +00004662^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Mark Heffernan9d20e422014-07-21 23:11:03 +00004663
4664This metadata selectively enables or disables vectorization for the loop. The
4665first operand is the string ``llvm.loop.vectorize.enable`` and the second operand
Sean Silvaa1190322015-08-06 22:56:48 +00004666is a bit. If the bit operand value is 1 vectorization is enabled. A value of
Mark Heffernan9d20e422014-07-21 23:11:03 +000046670 disables vectorization:
4668
4669.. code-block:: llvm
4670
Duncan P. N. Exon Smithbe7ea192014-12-15 19:07:53 +00004671 !0 = !{!"llvm.loop.vectorize.enable", i1 0}
4672 !1 = !{!"llvm.loop.vectorize.enable", i1 1}
Mark Heffernan893752a2014-07-18 19:24:51 +00004673
4674'``llvm.loop.vectorize.width``' Metadata
4675^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
4676
4677This metadata sets the target width of the vectorizer. The first
4678operand is the string ``llvm.loop.vectorize.width`` and the second
4679operand is an integer specifying the width. For example:
4680
4681.. code-block:: llvm
4682
Duncan P. N. Exon Smithbe7ea192014-12-15 19:07:53 +00004683 !0 = !{!"llvm.loop.vectorize.width", i32 4}
Mark Heffernan893752a2014-07-18 19:24:51 +00004684
4685Note that setting ``llvm.loop.vectorize.width`` to 1 disables
Sean Silvaa1190322015-08-06 22:56:48 +00004686vectorization of the loop. If ``llvm.loop.vectorize.width`` is set to
Mark Heffernan893752a2014-07-18 19:24:51 +000046870 or if the loop does not have this metadata the width will be
4688determined automatically.
4689
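As a sketch of how the hints above combine, the following fragment attaches
vectorization and interleaving hints to a single loop through its ``llvm.loop``
identifier. The block labels, the ``%exitcond`` value, and the chosen width and
count are illustrative assumptions, not recommended settings:

.. code-block:: llvm

    for.body:
      ...
      br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
    ...
    ; The first operand of !0 refers to !0 itself, making it a loop identifier.
    !0 = !{!0, !1, !2, !3}
    !1 = !{!"llvm.loop.vectorize.enable", i1 1}
    !2 = !{!"llvm.loop.vectorize.width", i32 4}
    !3 = !{!"llvm.loop.interleave.count", i32 2}

Because these are only hints, the optimizer may still decline to vectorize or
interleave the loop.
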
'``llvm.loop.unroll``'
^^^^^^^^^^^^^^^^^^^^^^

Metadata prefixed with ``llvm.loop.unroll`` are loop unrolling
optimization hints such as the unroll factor. ``llvm.loop.unroll``
metadata should be used in conjunction with ``llvm.loop`` loop
identification metadata. The ``llvm.loop.unroll`` metadata are only
optimization hints and the unrolling will only be performed if the
optimizer believes it is safe to do so.

'``llvm.loop.unroll.count``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata suggests an unroll factor to the loop unroller. The
first operand is the string ``llvm.loop.unroll.count`` and the second
operand is a positive integer specifying the unroll factor. For
example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll.count", i32 4}

If the trip count of the loop is less than the unroll count the loop
will be partially unrolled.

'``llvm.loop.unroll.disable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata disables loop unrolling. The metadata has a single operand
which is the string ``llvm.loop.unroll.disable``. For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll.disable"}

'``llvm.loop.unroll.runtime.disable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata disables runtime loop unrolling. The metadata has a single
operand which is the string ``llvm.loop.unroll.runtime.disable``. For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll.runtime.disable"}

'``llvm.loop.unroll.enable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata suggests that the loop should be fully unrolled if the trip count
is known at compile time and partially unrolled if the trip count is not known
at compile time. The metadata has a single operand which is the string
``llvm.loop.unroll.enable``. For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll.enable"}

'``llvm.loop.unroll.full``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata suggests that the loop should be unrolled fully. The
metadata has a single operand which is the string ``llvm.loop.unroll.full``.
For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll.full"}

'``llvm.loop.licm_versioning.disable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata indicates that the loop should not be versioned for the purpose
of enabling loop-invariant code motion (LICM). The metadata has a single operand
which is the string ``llvm.loop.licm_versioning.disable``. For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.licm_versioning.disable"}

'``llvm.loop.distribute.enable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Loop distribution allows splitting a loop into multiple loops. Currently,
this is only performed if the entire loop cannot be vectorized due to unsafe
memory dependencies. The transformation will attempt to isolate the unsafe
dependencies into their own loop.

This metadata can be used to selectively enable or disable distribution of the
loop. The first operand is the string ``llvm.loop.distribute.enable`` and the
second operand is a bit. If the bit operand value is 1, distribution is
enabled. A value of 0 disables distribution:

.. code-block:: llvm

    !0 = !{!"llvm.loop.distribute.enable", i1 0}
    !1 = !{!"llvm.loop.distribute.enable", i1 1}

This metadata should be used in conjunction with ``llvm.loop`` loop
identification metadata.

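As a minimal sketch of that pairing, the fragment below requests distribution
for one loop by listing the hint as an operand of the loop identifier. The
block labels and ``%exitcond`` are placeholders chosen for illustration:

.. code-block:: llvm

    for.body:
      ...
      br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
    ...
    !0 = !{!0, !1}                                  ; loop identifier
    !1 = !{!"llvm.loop.distribute.enable", i1 1}    ; request distribution
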
'``llvm.mem``'
^^^^^^^^^^^^^^^

Metadata types used to annotate memory accesses with information helpful
for optimizations are prefixed with ``llvm.mem``.

'``llvm.mem.parallel_loop_access``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The ``llvm.mem.parallel_loop_access`` metadata refers to a loop identifier,
or metadata containing a list of loop identifiers for nested loops.
The metadata is attached to memory accessing instructions and denotes that
no loop-carried memory dependence exists between it and other instructions denoted
with the same loop identifier. The metadata on memory reads also implies that
if-conversion (i.e. speculative execution within a loop iteration) is safe.

Precisely, given two instructions ``m1`` and ``m2`` that both have the
``llvm.mem.parallel_loop_access`` metadata, with ``L1`` and ``L2`` being the
set of loops associated with that metadata, respectively, then there is no
loop-carried dependence between ``m1`` and ``m2`` for loops in both ``L1`` and
``L2``.

As a special case, if all memory accessing instructions in a loop have
``llvm.mem.parallel_loop_access`` metadata that refers to that loop, then the
loop has no loop-carried memory dependences and is considered to be a parallel
loop.

Note that if not all memory access instructions have such metadata referring to
the loop, then the loop is not considered trivially parallel. Additional
memory dependence analysis is required to make that determination. As a
fail-safe mechanism, this causes loops that were originally parallel to be
considered sequential (if optimization passes that are unaware of the parallel
semantics insert new memory instructions into the loop body).

Example of a loop that is considered parallel due to its correct use of
both ``llvm.loop`` and ``llvm.mem.parallel_loop_access``
metadata types that refer to the same loop identifier metadata.

.. code-block:: llvm

    for.body:
      ...
      %val0 = load i32, i32* %arrayidx, !llvm.mem.parallel_loop_access !0
      ...
      store i32 %val0, i32* %arrayidx1, !llvm.mem.parallel_loop_access !0
      ...
      br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0

    for.end:
    ...
    !0 = !{!0}

It is also possible to have nested parallel loops. In that case the
memory accesses refer to a list of loop identifier metadata nodes instead of
the loop identifier metadata node directly:

.. code-block:: llvm

    outer.for.body:
      ...
      %val1 = load i32, i32* %arrayidx3, !llvm.mem.parallel_loop_access !2
      ...
      br label %inner.for.body

    inner.for.body:
      ...
      %val0 = load i32, i32* %arrayidx1, !llvm.mem.parallel_loop_access !0
      ...
      store i32 %val0, i32* %arrayidx2, !llvm.mem.parallel_loop_access !0
      ...
      br i1 %exitcond, label %inner.for.end, label %inner.for.body, !llvm.loop !1

    inner.for.end:
      ...
      store i32 %val1, i32* %arrayidx4, !llvm.mem.parallel_loop_access !2
      ...
      br i1 %exitcond, label %outer.for.end, label %outer.for.body, !llvm.loop !2

    outer.for.end:                                          ; preds = %for.body
    ...
    !0 = !{!1, !2} ; a list of loop identifiers
    !1 = !{!1} ; an identifier for the inner loop
    !2 = !{!2} ; an identifier for the outer loop

'``invariant.group``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The ``invariant.group`` metadata may be attached to ``load``/``store`` instructions.
The existence of the ``invariant.group`` metadata on the instruction tells
the optimizer that every ``load`` and ``store`` to the same pointer operand
within the same invariant group can be assumed to load or store the same
value (but see the ``llvm.invariant.group.barrier`` intrinsic which affects
when two pointers are considered the same).

Examples:

.. code-block:: llvm

    @unknownPtr = external global i8
    ...
    %ptr = alloca i8
    store i8 42, i8* %ptr, !invariant.group !0
    call void @foo(i8* %ptr)

    %a = load i8, i8* %ptr, !invariant.group !0 ; Can assume that value under %ptr didn't change
    call void @foo(i8* %ptr)
    %b = load i8, i8* %ptr, !invariant.group !1 ; Can't assume anything, because group changed

    %newPtr = call i8* @getPointer(i8* %ptr)
    %c = load i8, i8* %newPtr, !invariant.group !0 ; Can't assume anything, because we only have information about %ptr

    %unknownValue = load i8, i8* @unknownPtr
    store i8 %unknownValue, i8* %ptr, !invariant.group !0 ; Can assume that %unknownValue == 42

    call void @foo(i8* %ptr)
    %newPtr2 = call i8* @llvm.invariant.group.barrier(i8* %ptr)
    %d = load i8, i8* %newPtr2, !invariant.group !0 ; Can't step through invariant.group.barrier to get value of %ptr

    ...
    declare void @foo(i8*)
    declare i8* @getPointer(i8*)
    declare i8* @llvm.invariant.group.barrier(i8*)

    !0 = !{!"magic ptr"}
    !1 = !{!"other ptr"}

'``type``' Metadata
^^^^^^^^^^^^^^^^^^^

See :doc:`TypeMetadata`.

Module Flags Metadata
=====================

Information about the module as a whole is difficult to convey to LLVM's
subsystems. The LLVM IR isn't sufficient to transmit this information.
The ``llvm.module.flags`` named metadata exists in order to facilitate
this. These flags are in the form of key / value pairs --- much like a
dictionary --- making it easy for any subsystem that cares about a flag to
look it up.

The ``llvm.module.flags`` metadata contains a list of metadata triplets.
Each triplet has the following form:

- The first element is a *behavior* flag, which specifies the behavior
  when two (or more) modules are merged together and the linker encounters
  two (or more) metadata with the same ID. The supported behaviors are
  described below.
- The second element is a metadata string that is a unique ID for the
  metadata. Each module may only have one flag entry for each unique ID (not
  including entries with the **Require** behavior).
- The third element is the value of the flag.

When two (or more) modules are merged together, the resulting
``llvm.module.flags`` metadata is the union of the modules' flags. That is, for
each unique metadata ID string, there will be exactly one entry in the merged
module's ``llvm.module.flags`` metadata table, and the value for that entry will
be determined by the merge behavior flag, as described below. The only exception
is that entries with the **Require** behavior are always preserved.

The following behaviors are supported:

.. list-table::
   :header-rows: 1
   :widths: 10 90

   * - Value
     - Behavior

   * - 1
     - **Error**
           Emits an error if two values disagree, otherwise the resulting value
           is that of the operands.

   * - 2
     - **Warning**
           Emits a warning if two values disagree. The result value will be the
           operand for the flag from the first module being linked.

   * - 3
     - **Require**
           Adds a requirement that another module flag be present and have a
           specified value after linking is performed. The value must be a
           metadata pair, where the first element of the pair is the ID of the
           module flag to be restricted, and the second element of the pair is
           the value the module flag should be restricted to. This behavior can
           be used to restrict the allowable results (via triggering of an
           error) of linking IDs with the **Override** behavior.

   * - 4
     - **Override**
           Uses the specified value, regardless of the behavior or value of the
           other module. If both modules specify **Override**, but the values
           differ, an error will be emitted.

   * - 5
     - **Append**
           Appends the two values, which are required to be metadata nodes.

   * - 6
     - **AppendUnique**
           Appends the two values, which are required to be metadata
           nodes. However, duplicate entries in the second list are dropped
           during the append operation.

It is an error for a particular unique flag ID to have multiple behaviors,
except in the case of **Require** (which adds restrictions on another metadata
value) or **Override**.

An example of module flags:

.. code-block:: llvm

    !0 = !{ i32 1, !"foo", i32 1 }
    !1 = !{ i32 4, !"bar", i32 37 }
    !2 = !{ i32 2, !"qux", i32 42 }
    !3 = !{ i32 3, !"qux",
      !{
        !"foo", i32 1
      }
    }
    !llvm.module.flags = !{ !0, !1, !2, !3 }

- Metadata ``!0`` has the ID ``!"foo"`` and the value '1'. The behavior
  if two or more ``!"foo"`` flags are seen is to emit an error if their
  values are not equal.

- Metadata ``!1`` has the ID ``!"bar"`` and the value '37'. The
  behavior if two or more ``!"bar"`` flags are seen is to use the value
  '37'.

- Metadata ``!2`` has the ID ``!"qux"`` and the value '42'. The
  behavior if two or more ``!"qux"`` flags are seen is to emit a
  warning if their values are not equal.

- Metadata ``!3`` has the ID ``!"qux"`` and the value:

  ::

       !{ !"foo", i32 1 }

  The behavior is to emit an error if the ``llvm.module.flags`` does not
  contain a flag with the ID ``!"foo"`` that has the value '1' after linking is
  performed.

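To illustrate how the merge behaviors interact, consider a hypothetical link of
two modules whose flags disagree. The module names and the specific values are
assumptions made for this sketch only:

.. code-block:: llvm

    ; Module A (assumed to be linked first)
    !0 = !{ i32 2, !"qux", i32 42 }     ; Warning
    !1 = !{ i32 1, !"bar", i32 1 }      ; Error
    !llvm.module.flags = !{ !0, !1 }

    ; Module B
    !0 = !{ i32 2, !"qux", i32 7 }      ; Warning, value differs from A
    !1 = !{ i32 4, !"bar", i32 37 }     ; Override
    !llvm.module.flags = !{ !0, !1 }

Following the behavior table, the merge would be expected to emit a warning for
``!"qux"`` and keep the value '42' from the first module, while ``!"bar"`` takes
the value '37' from the **Override** entry regardless of the value in module A.
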
Objective-C Garbage Collection Module Flags Metadata
----------------------------------------------------

On the Mach-O platform, Objective-C stores metadata about garbage
collection in a special section called "image info". The metadata
consists of a version number and a bitmask specifying what types of
garbage collection are supported (if any) by the file. If two or more
modules are linked together their garbage collection metadata needs to
be merged rather than appended together.

The Objective-C garbage collection module flags metadata consists of the
following key-value pairs:

.. list-table::
   :header-rows: 1
   :widths: 30 70

   * - Key
     - Value

   * - ``Objective-C Version``
     - **[Required]** --- The Objective-C ABI version. Valid values are 1 and 2.

   * - ``Objective-C Image Info Version``
     - **[Required]** --- The version of the image info section. Currently
       always 0.

   * - ``Objective-C Image Info Section``
     - **[Required]** --- The section to place the metadata. Valid values are
       ``"__OBJC, __image_info, regular"`` for Objective-C ABI version 1, and
       ``"__DATA,__objc_imageinfo, regular, no_dead_strip"`` for
       Objective-C ABI version 2.

   * - ``Objective-C Garbage Collection``
     - **[Required]** --- Specifies whether garbage collection is supported or
       not. Valid values are 0, for no garbage collection, and 2, for garbage
       collection supported.

   * - ``Objective-C GC Only``
     - **[Optional]** --- Specifies that only garbage collection is supported.
       If present, its value must be 6. This flag requires that the
       ``Objective-C Garbage Collection`` flag have the value 2.

Some important flag interactions:

- If a module with ``Objective-C Garbage Collection`` set to 0 is
  merged with a module with ``Objective-C Garbage Collection`` set to
  2, then the resulting module has the
  ``Objective-C Garbage Collection`` flag set to 0.
- A module with ``Objective-C Garbage Collection`` set to 0 cannot be
  merged with a module with ``Objective-C GC Only`` set to 6.

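As a rough sketch, a module built for Objective-C ABI version 2 without garbage
collection might carry the four required keys as shown below. The keys and
values come from the table above; the behavior values in the first operand of
each triplet are assumptions chosen for illustration, since this section does
not prescribe them:

.. code-block:: llvm

    ; Behavior operands (i32 1 and i32 4) are illustrative assumptions.
    !0 = !{ i32 1, !"Objective-C Version", i32 2 }
    !1 = !{ i32 1, !"Objective-C Image Info Version", i32 0 }
    !2 = !{ i32 1, !"Objective-C Image Info Section",
            !"__DATA,__objc_imageinfo, regular, no_dead_strip" }
    !3 = !{ i32 4, !"Objective-C Garbage Collection", i32 0 }
    !llvm.module.flags = !{ !0, !1, !2, !3 }
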
Automatic Linker Flags Module Flags Metadata
--------------------------------------------

Some targets support embedding flags to the linker inside individual object
files. Typically this is used in conjunction with language extensions which
allow source files to explicitly declare the libraries they depend on, and have
these automatically be transmitted to the linker via object files.

These flags are encoded in the IR using metadata in the module flags section,
using the ``Linker Options`` key. The merge behavior for this flag is required
to be ``AppendUnique``, and the value for the key is expected to be a metadata
node which should be a list of other metadata nodes, each of which should be a
list of metadata strings defining linker options.

For example, the following metadata section specifies two separate sets of
linker options, presumably to link against ``libz`` and the ``Cocoa``
framework::

    !0 = !{ i32 6, !"Linker Options",
       !{
          !{ !"-lz" },
          !{ !"-framework", !"Cocoa" } } }
    !llvm.module.flags = !{ !0 }

The metadata encoding as lists of lists of options, as opposed to a collapsed
list of options, is chosen so that the IR encoding can use multiple option
strings to specify e.g., a single library, while still having that specifier be
preserved as an atomic element that can be recognized by a target specific
assembly writer or object file emitter.

Each individual option is required to be either a valid option for the target's
linker, or an option that is reserved by the target specific assembly writer or
object file emitter. No other aspect of these options is defined by the IR.

C type width Module Flags Metadata
----------------------------------

The ARM backend emits a section into each generated object file describing the
options that it was compiled with (in a compiler-independent way) to prevent
linking incompatible objects, and to allow automatic library selection. Some
of these options are not visible at the IR level, namely ``wchar_t`` width and
enum width.

To pass this information to the backend, these options are encoded in module
flags metadata, using the following key-value pairs:

.. list-table::
   :header-rows: 1
   :widths: 30 70

   * - Key
     - Value

   * - short_wchar
     - * 0 --- sizeof(wchar_t) == 4
       * 1 --- sizeof(wchar_t) == 2

   * - short_enum
     - * 0 --- Enums are at least as large as an ``int``.
       * 1 --- Enums are stored in the smallest integer type which can
         represent all of its values.

For example, the following metadata section specifies that the module was
compiled with a ``wchar_t`` width of 4 bytes, and the underlying type of an
enum is the smallest type which can represent all of its values::

    !llvm.module.flags = !{!0, !1}
    !0 = !{i32 1, !"short_wchar", i32 0}
    !1 = !{i32 1, !"short_enum", i32 1}

.. _intrinsicglobalvariables:

Intrinsic Global Variables
==========================

LLVM has a number of "magic" global variables that contain data that
affect code generation or other IR semantics. These are documented here.
All globals of this sort should have a section specified as
"``llvm.metadata``". This section and all globals that start with
"``llvm.``" are reserved for use by LLVM.

.. _gv_llvmused:

The '``llvm.used``' Global Variable
-----------------------------------

The ``@llvm.used`` global is an array which has
:ref:`appending linkage <linkage_appending>`. This array contains a list of
pointers to named global variables, functions and aliases which may optionally
have a pointer cast formed of bitcast or getelementptr. For example, a legal
use of it is:

.. code-block:: llvm

    @X = global i8 4
    @Y = global i32 123

    @llvm.used = appending global [2 x i8*] [
       i8* @X,
       i8* bitcast (i32* @Y to i8*)
    ], section "llvm.metadata"

If a symbol appears in the ``@llvm.used`` list, then the compiler, assembler,
and linker are required to treat the symbol as if there is a reference to the
symbol that they cannot see (which is why the symbols have to be named). For
example, if a variable has internal linkage and no references other than that
from the ``@llvm.used`` list, it cannot be deleted. This is commonly used to
represent references from inline asms and other things the compiler cannot
"see", and corresponds to "``__attribute__((used))``" in GNU C.

On some targets, the code generator must emit a directive to the
assembler or object file to prevent the assembler and linker from
removing or altering the symbol.

.. _gv_llvmcompilerused:

The '``llvm.compiler.used``' Global Variable
--------------------------------------------

The ``@llvm.compiler.used`` directive is the same as the ``@llvm.used``
directive, except that it only prevents the compiler from touching the
symbol. On targets that support it, this allows an intelligent linker to
optimize references to the symbol without being impeded as it would be
by ``@llvm.used``.

This construct should only be used in rare circumstances, and should not
be exposed to source languages.

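The declaration follows the same shape as ``@llvm.used``. A minimal sketch, in
which the global ``@Z`` is a hypothetical symbol introduced only for this
example:

.. code-block:: llvm

    ; @Z has internal linkage and no other references, so the compiler
    ; could otherwise delete it.
    @Z = internal global i32 0

    @llvm.compiler.used = appending global [1 x i8*] [
       i8* bitcast (i32* @Z to i8*)
    ], section "llvm.metadata"
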
.. _gv_llvmglobalctors:

The '``llvm.global_ctors``' Global Variable
-------------------------------------------

.. code-block:: llvm

    %0 = type { i32, void ()*, i8* }
    @llvm.global_ctors = appending global [1 x %0] [%0 { i32 65535, void ()* @ctor, i8* @data }]

The ``@llvm.global_ctors`` array contains a list of constructor
functions, priorities, and an optional associated global or function.
The functions referenced by this array will be called in ascending order
of priority (i.e. lowest first) when the module is loaded. The order of
functions with the same priority is not defined.

If the third field is present, non-null, and points to a global variable
or function, the initializer function will only run if the associated
data from the current module is not discarded.

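As a sketch of the ordering rule, the following module registers two
constructors with different priorities and no associated data. The function
names ``@init_first`` and ``@init_last`` and the priority 100 are hypothetical,
chosen only to show that the lower priority value runs first:

.. code-block:: llvm

    %0 = type { i32, void ()*, i8* }

    ; @init_first (priority 100) is called before @init_last (priority 65535).
    @llvm.global_ctors = appending global [2 x %0] [
      %0 { i32 100, void ()* @init_first, i8* null },
      %0 { i32 65535, void ()* @init_last, i8* null }
    ]

    define internal void @init_first() {
      ret void
    }

    define internal void @init_last() {
      ret void
    }

With a null third field, neither constructor is tied to the survival of any
other global in the module.
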
.. _llvmglobaldtors:

The '``llvm.global_dtors``' Global Variable
-------------------------------------------

.. code-block:: llvm

    %0 = type { i32, void ()*, i8* }
    @llvm.global_dtors = appending global [1 x %0] [%0 { i32 65535, void ()* @dtor, i8* @data }]

The ``@llvm.global_dtors`` array contains a list of destructor
functions, priorities, and an optional associated global or function.
The functions referenced by this array will be called in descending
order of priority (i.e. highest first) when the module is unloaded. The
order of functions with the same priority is not defined.

If the third field is present, non-null, and points to a global variable
or function, the destructor function will only run if the associated
data from the current module is not discarded.

5256Instruction Reference
5257=====================
5258
5259The LLVM instruction set consists of several different classifications
5260of instructions: :ref:`terminator instructions <terminators>`, :ref:`binary
5261instructions <binaryops>`, :ref:`bitwise binary
5262instructions <bitwiseops>`, :ref:`memory instructions <memoryops>`, and
5263:ref:`other instructions <otherops>`.
5264
5265.. _terminators:
5266
5267Terminator Instructions
5268-----------------------
5269
5270As mentioned :ref:`previously <functionstructure>`, every basic block in a
5271program ends with a "Terminator" instruction, which indicates which
5272block should be executed after the current block is finished. These
5273terminator instructions typically yield a '``void``' value: they produce
5274control flow, not values (the one exception being the
5275':ref:`invoke <i_invoke>`' instruction).
5276
The terminator instructions are: ':ref:`ret <i_ret>`',
':ref:`br <i_br>`', ':ref:`switch <i_switch>`',
':ref:`indirectbr <i_indirectbr>`', ':ref:`invoke <i_invoke>`',
':ref:`resume <i_resume>`', ':ref:`catchswitch <i_catchswitch>`',
':ref:`catchret <i_catchret>`',
':ref:`cleanupret <i_cleanupret>`',
and ':ref:`unreachable <i_unreachable>`'.
Sean Silvab084af42012-12-07 10:36:55 +00005284
5285.. _i_ret:
5286
5287'``ret``' Instruction
5288^^^^^^^^^^^^^^^^^^^^^
5289
5290Syntax:
5291"""""""
5292
5293::
5294
5295 ret <type> <value> ; Return a value from a non-void function
5296 ret void ; Return from void function
5297
5298Overview:
5299"""""""""
5300
5301The '``ret``' instruction is used to return control flow (and optionally
5302a value) from a function back to the caller.
5303
There are two forms of the '``ret``' instruction: one that returns a
value and then transfers control flow, and one that only transfers
control flow.
5307
5308Arguments:
5309""""""""""
5310
5311The '``ret``' instruction optionally accepts a single argument, the
5312return value. The type of the return value must be a ':ref:`first
5313class <t_firstclass>`' type.
5314
A function is not :ref:`well formed <wellformed>` if it has a non-void
return type and contains a '``ret``' instruction with no return value or
a return value with a type that does not match the function's return
type, or if it has a void return type and contains a '``ret``'
instruction with a return value.
5320
5321Semantics:
5322""""""""""
5323
5324When the '``ret``' instruction is executed, control flow returns back to
5325the calling function's context. If the caller is a
5326":ref:`call <i_call>`" instruction, execution continues at the
5327instruction after the call. If the caller was an
5328":ref:`invoke <i_invoke>`" instruction, execution continues at the
5329beginning of the "normal" destination block. If the instruction returns
5330a value, that value shall set the call or invoke instruction's return
5331value.
5332
5333Example:
5334""""""""
5335
5336.. code-block:: llvm
5337
5338 ret i32 5 ; Return an integer value of 5
5339 ret void ; Return from a void function
5340 ret { i32, i8 } { i32 4, i8 2 } ; Return a struct of values 4 and 2
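
The semantics described above can be sketched with a caller and a callee.
The function names ``@callee`` and ``@caller`` are hypothetical. The value
given to ``ret`` in the callee becomes the result of the ``call`` in the
caller, and execution continues at the instruction after the call.

.. code-block:: llvm

   define i32 @callee() {
   entry:
     ret i32 42                   ; returns control and the value 42
   }

   define i32 @caller() {
   entry:
     %r = call i32 @callee()      ; %r is set by the callee's ret
     %sum = add i32 %r, 1         ; execution resumes here after the call
     ret i32 %sum
   }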
5341
5342.. _i_br:
5343
5344'``br``' Instruction
5345^^^^^^^^^^^^^^^^^^^^
5346
5347Syntax:
5348"""""""
5349
5350::
5351
5352 br i1 <cond>, label <iftrue>, label <iffalse>
5353 br label <dest> ; Unconditional branch
5354
5355Overview:
5356"""""""""
5357
5358The '``br``' instruction is used to cause control flow to transfer to a
5359different basic block in the current function. There are two forms of
5360this instruction, corresponding to a conditional branch and an
5361unconditional branch.
5362
5363Arguments:
5364""""""""""
5365
5366The conditional branch form of the '``br``' instruction takes a single
5367'``i1``' value and two '``label``' values. The unconditional form of the
5368'``br``' instruction takes a single '``label``' value as a target.
5369
5370Semantics:
5371""""""""""
5372
Upon execution of a conditional '``br``' instruction, the '``i1``'
argument is evaluated. If the value is ``true``, control flows to the
'``iftrue``' ``label`` argument. If the value is ``false``, control flows
to the '``iffalse``' ``label`` argument. An unconditional '``br``'
always transfers control to its '``dest``' ``label`` argument.
5377
5378Example:
5379""""""""
5380
5381.. code-block:: llvm
5382
5383 Test:
5384 %cond = icmp eq i32 %a, %b
5385 br i1 %cond, label %IfEqual, label %IfUnequal
5386 IfEqual:
5387 ret i32 1
5388 IfUnequal:
5389 ret i32 0
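
Both forms often appear together in loops. The following is a minimal
sketch, assuming a hypothetical function ``@sum_to`` whose argument ``%n``
is at least 1. It enters the loop with an unconditional branch and leaves
it with a conditional one.

.. code-block:: llvm

   define i32 @sum_to(i32 %n) {
   entry:
     br label %loop                               ; unconditional branch

   loop:
     %i = phi i32 [ 0, %entry ], [ %i.next, %loop ]
     %acc = phi i32 [ 0, %entry ], [ %acc.next, %loop ]
     %acc.next = add i32 %acc, %i
     %i.next = add i32 %i, 1
     %done = icmp eq i32 %i.next, %n
     br i1 %done, label %exit, label %loop        ; conditional branch

   exit:
     ret i32 %acc.next                            ; sum of 0 .. %n - 1
   }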
5390
5391.. _i_switch:
5392
5393'``switch``' Instruction
5394^^^^^^^^^^^^^^^^^^^^^^^^
5395
5396Syntax:
5397"""""""
5398
5399::
5400
5401 switch <intty> <value>, label <defaultdest> [ <intty> <val>, label <dest> ... ]
5402
5403Overview:
5404"""""""""
5405
5406The '``switch``' instruction is used to transfer control flow to one of
5407several different places. It is a generalization of the '``br``'
5408instruction, allowing a branch to occur to one of many possible
5409destinations.
5410
5411Arguments:
5412""""""""""
5413
5414The '``switch``' instruction uses three parameters: an integer
5415comparison value '``value``', a default '``label``' destination, and an
5416array of pairs of comparison value constants and '``label``'s. The table
5417is not allowed to contain duplicate constant entries.
5418
5419Semantics:
5420""""""""""
5421
5422The ``switch`` instruction specifies a table of values and destinations.
5423When the '``switch``' instruction is executed, this table is searched
5424for the given value. If the value is found, control flow is transferred
5425to the corresponding destination; otherwise, control flow is transferred
5426to the default destination.
5427
5428Implementation:
5429"""""""""""""""
5430
5431Depending on properties of the target machine and the particular
5432``switch`` instruction, this instruction may be code generated in
5433different ways. For example, it could be generated as a series of
5434chained conditional branches or with a lookup table.
5435
5436Example:
5437""""""""
5438
5439.. code-block:: llvm
5440
5441 ; Emulate a conditional br instruction
5442 %Val = zext i1 %value to i32
5443 switch i32 %Val, label %truedest [ i32 0, label %falsedest ]
5444
5445 ; Emulate an unconditional br instruction
5446 switch i32 0, label %dest [ ]
5447
5448 ; Implement a jump table:
5449 switch i32 %val, label %otherwise [ i32 0, label %onzero
5450 i32 1, label %onone
5451 i32 2, label %ontwo ]
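
For context, a ``switch`` inside a complete function might look like the
sketch below. The function ``@classify`` and its label names are
hypothetical. Any value of ``%x`` not listed in the table goes to the
default destination.

.. code-block:: llvm

   define i32 @classify(i32 %x) {
   entry:
     switch i32 %x, label %other [ i32 0, label %zero
                                   i32 1, label %one ]

   zero:                          ; reached when %x == 0
     ret i32 10

   one:                           ; reached when %x == 1
     ret i32 11

   other:                         ; default destination
     ret i32 -1
   }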
5452
5453.. _i_indirectbr:
5454
5455'``indirectbr``' Instruction
5456^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5457
5458Syntax:
5459"""""""
5460
5461::
5462
5463 indirectbr <somety>* <address>, [ label <dest1>, label <dest2>, ... ]
5464
5465Overview:
5466"""""""""
5467
The '``indirectbr``' instruction implements an indirect branch to a
label within the current function, whose address is specified by
"``address``". The address must be derived from a
:ref:`blockaddress <blockaddress>` constant.
5472
5473Arguments:
5474""""""""""
5475
5476The '``address``' argument is the address of the label to jump to. The
5477rest of the arguments indicate the full set of possible destinations
5478that the address may point to. Blocks are allowed to occur multiple
5479times in the destination list, though this isn't particularly useful.
5480
5481This destination list is required so that dataflow analysis has an
5482accurate understanding of the CFG.
5483
5484Semantics:
5485""""""""""
5486
5487Control transfers to the block specified in the address argument. All
5488possible destination blocks must be listed in the label list, otherwise
5489this instruction has undefined behavior. This implies that jumps to
5490labels defined in other functions have undefined behavior as well.
5491
5492Implementation:
5493"""""""""""""""
5494
5495This is typically implemented with a jump through a register.
5496
5497Example:
5498""""""""
5499
5500.. code-block:: llvm
5501
5502 indirectbr i8* %Addr, [ label %bb1, label %bb2, label %bb3 ]
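
A sketch of how the address operand can be derived from ``blockaddress``
constants follows. The names ``@targets`` and ``@dispatch`` are
hypothetical, and the example assumes ``%idx`` is 0 or 1; any other index
would load an address outside the table, which is undefined behavior.

.. code-block:: llvm

   @targets = constant [2 x i8*] [ i8* blockaddress(@dispatch, %bb1),
                                   i8* blockaddress(@dispatch, %bb2) ]

   define i32 @dispatch(i32 %idx) {
   entry:
     %slot = getelementptr inbounds [2 x i8*], [2 x i8*]* @targets, i32 0, i32 %idx
     %addr = load i8*, i8** %slot
     ; Every block the address may point to is listed as a destination.
     indirectbr i8* %addr, [ label %bb1, label %bb2 ]

   bb1:
     ret i32 1

   bb2:
     ret i32 2
   }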
5503
5504.. _i_invoke:
5505
5506'``invoke``' Instruction
5507^^^^^^^^^^^^^^^^^^^^^^^^
5508
5509Syntax:
5510"""""""
5511
5512::
5513
   <result> = invoke [cconv] [ret attrs] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs]
                 [operand bundles] to label <normal label> unwind label <exception label>

5517Overview:
5518"""""""""
5519
5520The '``invoke``' instruction causes control to transfer to a specified
5521function, with the possibility of control flow transfer to either the
5522'``normal``' label or the '``exception``' label. If the callee function
5523returns with the "``ret``" instruction, control flow will return to the
5524"normal" label. If the callee (or any indirect callees) returns via the
5525":ref:`resume <i_resume>`" instruction or other exception handling
5526mechanism, control is interrupted and continued at the dynamically
5527nearest "exception" label.
5528
The '``exception``' label is a `landing
pad <ExceptionHandling.html#overview>`_ for the exception. As such, the
'``exception``' label is required to have the
":ref:`landingpad <i_landingpad>`" instruction, which contains the
information about the behavior of the program after unwinding happens,
as its first non-PHI instruction. The restrictions on the
"``landingpad``" instruction tightly couple it to the "``invoke``"
instruction, so that the important information contained within the
"``landingpad``" instruction can't be lost through normal code motion.
5538
5539Arguments:
5540""""""""""
5541
5542This instruction requires several arguments:
5543
#. The optional "cconv" marker indicates which :ref:`calling
   convention <callingconv>` the call should use. If none is
   specified, the call defaults to using C calling conventions.
#. The optional :ref:`Parameter Attributes <paramattrs>` list for return
   values. Only '``zeroext``', '``signext``', and '``inreg``' attributes
   are valid here.
#. '``ty``': the type of the call instruction itself, which is also the
   type of the return value. Functions that return no value are marked
   ``void``.
#. '``fnty``': shall be the signature of the function being invoked. The
   argument types must match the types implied by this signature. This
   type can be omitted if the function is not varargs.
#. '``fnptrval``': An LLVM value containing a pointer to a function to
   be invoked. In most cases, this is a direct function invocation, but
   indirect invokes, which call through an arbitrary pointer-to-function
   value, are equally possible.
#. '``function args``': argument list whose types match the function
   signature argument types and parameter attributes. All arguments must
   be of :ref:`first class <t_firstclass>` type. If the function signature
   indicates the function accepts a variable number of arguments, the
   extra arguments can be specified.
#. '``normal label``': the label reached when the called function
   executes a '``ret``' instruction.
#. '``exception label``': the label reached when a callee returns via
   the :ref:`resume <i_resume>` instruction or other exception handling
   mechanism.
#. The optional :ref:`function attributes <fnattrs>` list. Only
   '``noreturn``', '``nounwind``', '``readonly``' and '``readnone``'
   attributes are valid here.
#. The optional :ref:`operand bundles <opbundles>` list.

5575Semantics:
5576""""""""""
5577
5578This instruction is designed to operate as a standard '``call``'
5579instruction in most regards. The primary difference is that it
5580establishes an association with a label, which is used by the runtime
5581library to unwind the stack.
5582
5583This instruction is used in languages with destructors to ensure that
5584proper cleanup is performed in the case of either a ``longjmp`` or a
5585thrown exception. Additionally, this is important for implementation of
5586'``catch``' clauses in high-level languages that support them.
5587
5588For the purposes of the SSA form, the definition of the value returned
5589by the '``invoke``' instruction is deemed to occur on the edge from the
5590current block to the "normal" label. If the callee unwinds then no
5591return value is available.
5592
5593Example:
5594""""""""
5595
5596.. code-block:: llvm
5597
   %retval = invoke i32 @Test(i32 15) to label %Continue
               unwind label %TestCleanup              ; i32:retval set
   %retval = invoke coldcc i32 %Testfnptr(i32 15) to label %Continue
               unwind label %TestCleanup              ; i32:retval set
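
To show how the two labels fit together, here is a minimal sketch of a
function that runs cleanup code on both the normal and the unwinding path.
The functions ``@may_throw``, ``@release`` and ``@guarded`` are
hypothetical, and the Itanium C++ personality routine
``@__gxx_personality_v0`` is assumed only for illustration.

.. code-block:: llvm

   declare void @may_throw()
   declare void @release()
   declare i32 @__gxx_personality_v0(...)

   define void @guarded() personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
   entry:
     invoke void @may_throw()
             to label %normal unwind label %lpad

   normal:                        ; reached when @may_throw returns with ret
     call void @release()
     ret void

   lpad:                          ; reached when @may_throw unwinds
     %exn = landingpad { i8*, i32 }
             cleanup
     call void @release()
     resume { i8*, i32 } %exn     ; continue propagating the exception
   }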
Sean Silvab084af42012-12-07 10:36:55 +00005602
5603.. _i_resume:
5604
5605'``resume``' Instruction
5606^^^^^^^^^^^^^^^^^^^^^^^^
5607
5608Syntax:
5609"""""""
5610
5611::
5612
5613 resume <type> <value>
5614
5615Overview:
5616"""""""""
5617
5618The '``resume``' instruction is a terminator instruction that has no
5619successors.
5620
5621Arguments:
5622""""""""""
5623
5624The '``resume``' instruction requires one argument, which must have the
5625same type as the result of any '``landingpad``' instruction in the same
5626function.
5627
5628Semantics:
5629""""""""""
5630
5631The '``resume``' instruction resumes propagation of an existing
5632(in-flight) exception whose unwinding was interrupted with a
5633:ref:`landingpad <i_landingpad>` instruction.
5634
5635Example:
5636""""""""
5637
5638.. code-block:: llvm
5639
5640 resume { i8*, i32 } %exn
5641
.. _i_catchswitch:

'``catchswitch``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

5647Syntax:
5648"""""""
5649
5650::
5651
5652 <resultval> = catchswitch within <parent> [ label <handler1>, label <handler2>, ... ] unwind to caller
5653 <resultval> = catchswitch within <parent> [ label <handler1>, label <handler2>, ... ] unwind label <default>
5654
5655Overview:
5656"""""""""
5657
5658The '``catchswitch``' instruction is used by `LLVM's exception handling system
5659<ExceptionHandling.html#overview>`_ to describe the set of possible catch handlers
5660that may be executed by the :ref:`EH personality routine <personalityfn>`.
5661
5662Arguments:
5663""""""""""
5664
5665The ``parent`` argument is the token of the funclet that contains the
5666``catchswitch`` instruction. If the ``catchswitch`` is not inside a funclet,
5667this operand may be the token ``none``.
5668
The ``default`` argument is the label of another basic block beginning with
either a ``cleanuppad`` or ``catchswitch`` instruction. This unwind destination
must be a legal target with respect to the ``parent`` links, as described in
the `exception handling documentation\ <ExceptionHandling.html#wineh-constraints>`_.

The ``handlers`` are a nonempty list of successor blocks that each begin with a
:ref:`catchpad <i_catchpad>` instruction.
5676
5677Semantics:
5678""""""""""
5679
5680Executing this instruction transfers control to one of the successors in
5681``handlers``, if appropriate, or continues to unwind via the unwind label if
5682present.
5683
5684The ``catchswitch`` is both a terminator and a "pad" instruction, meaning that
5685it must be both the first non-phi instruction and last instruction in the basic
5686block. Therefore, it must be the only non-phi instruction in the block.
5687
5688Example:
5689""""""""
5690
.. code-block:: text

   dispatch1:
     %cs1 = catchswitch within none [label %handler0, label %handler1] unwind to caller
   dispatch2:
     %cs2 = catchswitch within %parenthandler [label %handler0] unwind label %cleanup
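
For context, the sketch below places a ``catchswitch`` in a complete
function together with the ``catchpad`` and ``catchret`` instructions it
cooperates with. The functions ``@may_throw`` and ``@catch_all`` are
hypothetical. The personality routine ``@__CxxFrameHandler3`` and the
``catchpad`` arguments are assumptions modeled on a catch-all handler for
the MSVC C++ personality; other personalities expect different arguments.

.. code-block:: text

   declare void @may_throw()
   declare i32 @__CxxFrameHandler3(...)

   define void @catch_all() personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
   entry:
     invoke void @may_throw()
             to label %done unwind label %dispatch

   dispatch:                      ; catchswitch is the only non-phi instruction
     %cs = catchswitch within none [label %handler] unwind to caller

   handler:                       ; each handler begins with a catchpad
     %cp = catchpad within %cs [i8* null, i32 64, i8* null]
     catchret from %cp to label %done

   done:
     ret void
   }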
5697
.. _i_catchret:
5699
5700'``catchret``' Instruction
5701^^^^^^^^^^^^^^^^^^^^^^^^^^
5702
5703Syntax:
5704"""""""
5705
5706::
5707
   catchret from <token> to label <normal>
David Majnemer654e1302015-07-31 17:58:14 +00005709
5710Overview:
5711"""""""""
5712
5713The '``catchret``' instruction is a terminator instruction that has a
5714single successor.
5715
5716
5717Arguments:
5718""""""""""
5719
The first argument to a '``catchret``' indicates which ``catchpad`` it
exits. It must be a :ref:`catchpad <i_catchpad>`.
The second argument to a '``catchret``' specifies where control will
transfer to next.
David Majnemer654e1302015-07-31 17:58:14 +00005724
5725Semantics:
5726""""""""""
5727
The '``catchret``' instruction ends an existing (in-flight) exception whose
unwinding was interrupted with a :ref:`catchpad <i_catchpad>` instruction. The
:ref:`personality function <personalityfn>` gets a chance to execute arbitrary
code to, for example, destroy the active exception. Control then transfers to
``normal``.

The ``token`` argument must be a token produced by a ``catchpad`` instruction.
If the specified ``catchpad`` is not the most-recently-entered not-yet-exited
funclet pad (as described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
the ``catchret``'s behavior is undefined.
David Majnemer654e1302015-07-31 17:58:14 +00005738
5739Example:
5740""""""""
5741
.. code-block:: text

   catchret from %catch to label %continue

.. _i_cleanupret:
5747
5748'``cleanupret``' Instruction
5749^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5750
5751Syntax:
5752"""""""
5753
5754::
5755
   cleanupret from <value> unwind label <continue>
   cleanupret from <value> unwind to caller
David Majnemer654e1302015-07-31 17:58:14 +00005758
5759Overview:
5760"""""""""
5761
5762The '``cleanupret``' instruction is a terminator instruction that has
5763an optional successor.
5764
5765
5766Arguments:
5767""""""""""
5768
The '``cleanupret``' instruction requires one argument, which indicates
which ``cleanuppad`` it exits, and must be a :ref:`cleanuppad <i_cleanuppad>`.
If the specified ``cleanuppad`` is not the most-recently-entered not-yet-exited
funclet pad (as described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
the ``cleanupret``'s behavior is undefined.
5774
5775The '``cleanupret``' instruction also has an optional successor, ``continue``,
5776which must be the label of another basic block beginning with either a
5777``cleanuppad`` or ``catchswitch`` instruction. This unwind destination must
5778be a legal target with respect to the ``parent`` links, as described in the
5779`exception handling documentation\ <ExceptionHandling.html#wineh-constraints>`_.
David Majnemer654e1302015-07-31 17:58:14 +00005780
5781Semantics:
5782""""""""""
5783
5784The '``cleanupret``' instruction indicates to the
5785:ref:`personality function <personalityfn>` that one
5786:ref:`cleanuppad <i_cleanuppad>` it transferred control to has ended.
5787It transfers control to ``continue`` or unwinds out of the function.
Joseph Tremoulet9ce71f72015-09-03 09:09:43 +00005788
David Majnemer654e1302015-07-31 17:58:14 +00005789Example:
5790""""""""
5791
.. code-block:: text

   cleanupret from %cleanup unwind to caller
   cleanupret from %cleanup unwind label %continue
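
A minimal sketch of a cleanup funclet that ends with ``cleanupret`` is
shown below. The functions ``@may_throw``, ``@release`` and
``@with_cleanup`` are hypothetical, and the ``@__CxxFrameHandler3``
personality is assumed only for illustration. The token produced by the
``cleanuppad`` is the value that ``cleanupret`` exits.

.. code-block:: text

   declare void @may_throw()
   declare void @release()
   declare i32 @__CxxFrameHandler3(...)

   define void @with_cleanup() personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
   entry:
     invoke void @may_throw()
             to label %done unwind label %cleanup

   cleanup:
     %cp = cleanuppad within none []
     call void @release() [ "funclet"(token %cp) ]
     cleanupret from %cp unwind to caller    ; keep unwinding out of the function

   done:
     call void @release()
     ret void
   }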
David Majnemer654e1302015-07-31 17:58:14 +00005796
.. _i_unreachable:
5798
5799'``unreachable``' Instruction
5800^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
5801
5802Syntax:
5803"""""""
5804
5805::
5806
5807 unreachable
5808
5809Overview:
5810"""""""""
5811
5812The '``unreachable``' instruction has no defined semantics. This
5813instruction is used to inform the optimizer that a particular portion of
5814the code is not reachable. This can be used to indicate that the code
5815after a no-return function cannot be reached, and other facts.
5816
5817Semantics:
5818""""""""""
5819
5820The '``unreachable``' instruction has no defined semantics.
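
A common use is to terminate a block that ends in a call to a function that
never returns. The sketch below assumes a hypothetical ``@checked_udiv``
function and an ``@abort`` declaration marked ``noreturn``.

.. code-block:: llvm

   declare void @abort() noreturn

   define i32 @checked_udiv(i32 %a, i32 %b) {
   entry:
     %is_zero = icmp eq i32 %b, 0
     br i1 %is_zero, label %fail, label %ok

   fail:
     call void @abort()
     unreachable                  ; control never continues past the call

   ok:
     %q = udiv i32 %a, %b         ; %b is known to be non-zero here
     ret i32 %q
   }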
5821
5822.. _binaryops:
5823
5824Binary Operations
5825-----------------
5826
5827Binary operators are used to do most of the computation in a program.
5828They require two operands of the same type, execute an operation on
5829them, and produce a single value. The operands might represent multiple
5830data, as is the case with the :ref:`vector <t_vector>` data type. The
5831result value has the same type as its operands.
5832
5833There are several different binary operators:
5834
5835.. _i_add:
5836
5837'``add``' Instruction
5838^^^^^^^^^^^^^^^^^^^^^
5839
5840Syntax:
5841"""""""
5842
5843::
5844
   <result> = add <ty> <op1>, <op2>          ; yields ty:result
   <result> = add nuw <ty> <op1>, <op2>      ; yields ty:result
   <result> = add nsw <ty> <op1>, <op2>      ; yields ty:result
   <result> = add nuw nsw <ty> <op1>, <op2>  ; yields ty:result
Sean Silvab084af42012-12-07 10:36:55 +00005849
5850Overview:
5851"""""""""
5852
5853The '``add``' instruction returns the sum of its two operands.
5854
5855Arguments:
5856""""""""""
5857
5858The two arguments to the '``add``' instruction must be
5859:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
5860arguments must have identical types.
5861
5862Semantics:
5863""""""""""
5864
5865The value produced is the integer sum of the two operands.
5866
5867If the sum has unsigned overflow, the result returned is the
5868mathematical result modulo 2\ :sup:`n`\ , where n is the bit width of
5869the result.
5870
5871Because LLVM integers use a two's complement representation, this
5872instruction is appropriate for both signed and unsigned integers.
5873
5874``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
5875respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
5876result value of the ``add`` is a :ref:`poison value <poisonvalues>` if
5877unsigned and/or signed overflow, respectively, occurs.
5878
5879Example:
5880""""""""
5881
.. code-block:: text

   <result> = add i32 4, %var          ; yields i32:result = 4 + %var
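
The wrapping and poison rules can be illustrated with constant ``i8``
operands. This is a sketch only; the value names are hypothetical and the
lines are meant to appear inside a function body.

.. code-block:: llvm

   %a = add i8 255, 1             ; wraps modulo 2^8, yields i8 0
   %b = add nuw i8 255, 1         ; unsigned overflow, yields a poison value
   %c = add nsw i8 127, 1         ; signed overflow, yields a poison value
   %d = add nsw i8 -1, 1          ; no signed overflow, yields i8 0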
Sean Silvab084af42012-12-07 10:36:55 +00005885
5886.. _i_fadd:
5887
5888'``fadd``' Instruction
5889^^^^^^^^^^^^^^^^^^^^^^
5890
5891Syntax:
5892"""""""
5893
5894::
5895
   <result> = fadd [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result
Sean Silvab084af42012-12-07 10:36:55 +00005897
5898Overview:
5899"""""""""
5900
5901The '``fadd``' instruction returns the sum of its two operands.
5902
5903Arguments:
5904""""""""""
5905
5906The two arguments to the '``fadd``' instruction must be :ref:`floating
5907point <t_floating>` or :ref:`vector <t_vector>` of floating point values.
5908Both arguments must have identical types.
5909
5910Semantics:
5911""""""""""
5912
The value produced is the floating point sum of the two operands. This
instruction can also take any number of :ref:`fast-math flags <fastmath>`,
which are optimization hints to enable otherwise unsafe floating point
optimizations.
5917
5918Example:
5919""""""""
5920
.. code-block:: text

   <result> = fadd float 4.0, %var          ; yields float:result = 4.0 + %var
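
As a sketch of how fast-math flags are written, the lines below add the
same two operands with no flags, with two individual flags, and with
``fast``. The value names are hypothetical and the lines are meant to
appear inside a function body.

.. code-block:: llvm

   %x = fadd float %a, %b               ; no flags, no extra assumptions
   %y = fadd nnan ninf float %a, %b     ; optimizer may assume no NaN or infinity
   %z = fadd fast float %a, %b          ; all fast-math flags enabled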
Sean Silvab084af42012-12-07 10:36:55 +00005924
5925'``sub``' Instruction
5926^^^^^^^^^^^^^^^^^^^^^
5927
5928Syntax:
5929"""""""
5930
5931::
5932
   <result> = sub <ty> <op1>, <op2>          ; yields ty:result
   <result> = sub nuw <ty> <op1>, <op2>      ; yields ty:result
   <result> = sub nsw <ty> <op1>, <op2>      ; yields ty:result
   <result> = sub nuw nsw <ty> <op1>, <op2>  ; yields ty:result
Sean Silvab084af42012-12-07 10:36:55 +00005937
5938Overview:
5939"""""""""
5940
5941The '``sub``' instruction returns the difference of its two operands.
5942
5943Note that the '``sub``' instruction is used to represent the '``neg``'
5944instruction present in most other intermediate representations.
5945
5946Arguments:
5947""""""""""
5948
5949The two arguments to the '``sub``' instruction must be
5950:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
5951arguments must have identical types.
5952
5953Semantics:
5954""""""""""
5955
5956The value produced is the integer difference of the two operands.
5957
5958If the difference has unsigned overflow, the result returned is the
5959mathematical result modulo 2\ :sup:`n`\ , where n is the bit width of
5960the result.
5961
5962Because LLVM integers use a two's complement representation, this
5963instruction is appropriate for both signed and unsigned integers.
5964
5965``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
5966respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
5967result value of the ``sub`` is a :ref:`poison value <poisonvalues>` if
5968unsigned and/or signed overflow, respectively, occurs.
5969
5970Example:
5971""""""""
5972
.. code-block:: text

   <result> = sub i32 4, %var          ; yields i32:result = 4 - %var
   <result> = sub i32 0, %var          ; yields i32:result = -%var
Sean Silvab084af42012-12-07 10:36:55 +00005977
5978.. _i_fsub:
5979
5980'``fsub``' Instruction
5981^^^^^^^^^^^^^^^^^^^^^^
5982
5983Syntax:
5984"""""""
5985
5986::
5987
   <result> = fsub [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result
Sean Silvab084af42012-12-07 10:36:55 +00005989
5990Overview:
5991"""""""""
5992
5993The '``fsub``' instruction returns the difference of its two operands.
5994
5995Note that the '``fsub``' instruction is used to represent the '``fneg``'
5996instruction present in most other intermediate representations.
5997
5998Arguments:
5999""""""""""
6000
6001The two arguments to the '``fsub``' instruction must be :ref:`floating
6002point <t_floating>` or :ref:`vector <t_vector>` of floating point values.
6003Both arguments must have identical types.
6004
6005Semantics:
6006""""""""""
6007
The value produced is the floating point difference of the two operands.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating point optimizations.
6012
6013Example:
6014""""""""
6015
.. code-block:: text

   <result> = fsub float 4.0, %var           ; yields float:result = 4.0 - %var
   <result> = fsub float -0.0, %var          ; yields float:result = -%var
Sean Silvab084af42012-12-07 10:36:55 +00006020
6021'``mul``' Instruction
6022^^^^^^^^^^^^^^^^^^^^^
6023
6024Syntax:
6025"""""""
6026
6027::
6028
   <result> = mul <ty> <op1>, <op2>          ; yields ty:result
   <result> = mul nuw <ty> <op1>, <op2>      ; yields ty:result
   <result> = mul nsw <ty> <op1>, <op2>      ; yields ty:result
   <result> = mul nuw nsw <ty> <op1>, <op2>  ; yields ty:result
Sean Silvab084af42012-12-07 10:36:55 +00006033
6034Overview:
6035"""""""""
6036
6037The '``mul``' instruction returns the product of its two operands.
6038
6039Arguments:
6040""""""""""
6041
6042The two arguments to the '``mul``' instruction must be
6043:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
6044arguments must have identical types.
6045
6046Semantics:
6047""""""""""
6048
6049The value produced is the integer product of the two operands.
6050
6051If the result of the multiplication has unsigned overflow, the result
6052returned is the mathematical result modulo 2\ :sup:`n`\ , where n is the
6053bit width of the result.
6054
6055Because LLVM integers use a two's complement representation, and the
6056result is the same width as the operands, this instruction returns the
6057correct result for both signed and unsigned integers. If a full product
6058(e.g. ``i32`` * ``i32`` -> ``i64``) is needed, the operands should be
6059sign-extended or zero-extended as appropriate to the width of the full
6060product.
6061
6062``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
6063respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
6064result value of the ``mul`` is a :ref:`poison value <poisonvalues>` if
6065unsigned and/or signed overflow, respectively, occurs.
6066
6067Example:
6068""""""""
6069
.. code-block:: text

   <result> = mul i32 4, %var          ; yields i32:result = 4 * %var
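
The full-product case mentioned in the semantics can be sketched as
follows, assuming ``%a`` and ``%b`` are ``i32`` values already defined in
the enclosing function. The value names are hypothetical. Because both
operands are extended to ``i64`` first, neither multiplication can
overflow.

.. code-block:: llvm

   ; Signed 32 x 32 -> 64 bit product
   %a.s = sext i32 %a to i64
   %b.s = sext i32 %b to i64
   %prod.s = mul nsw i64 %a.s, %b.s

   ; Unsigned 32 x 32 -> 64 bit product
   %a.u = zext i32 %a to i64
   %b.u = zext i32 %b to i64
   %prod.u = mul nuw i64 %a.u, %b.u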
Sean Silvab084af42012-12-07 10:36:55 +00006073
6074.. _i_fmul:
6075
6076'``fmul``' Instruction
6077^^^^^^^^^^^^^^^^^^^^^^
6078
6079Syntax:
6080"""""""
6081
6082::
6083
   <result> = fmul [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result
Sean Silvab084af42012-12-07 10:36:55 +00006085
6086Overview:
6087"""""""""
6088
6089The '``fmul``' instruction returns the product of its two operands.
6090
6091Arguments:
6092""""""""""
6093
6094The two arguments to the '``fmul``' instruction must be :ref:`floating
6095point <t_floating>` or :ref:`vector <t_vector>` of floating point values.
6096Both arguments must have identical types.
6097
6098Semantics:
6099""""""""""
6100
The value produced is the floating point product of the two operands.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating point optimizations.
6105
6106Example:
6107""""""""
6108
.. code-block:: text

   <result> = fmul float 4.0, %var          ; yields float:result = 4.0 * %var
Sean Silvab084af42012-12-07 10:36:55 +00006112
6113'``udiv``' Instruction
6114^^^^^^^^^^^^^^^^^^^^^^
6115
6116Syntax:
6117"""""""
6118
6119::
6120
   <result> = udiv <ty> <op1>, <op2>         ; yields ty:result
   <result> = udiv exact <ty> <op1>, <op2>   ; yields ty:result
Sean Silvab084af42012-12-07 10:36:55 +00006123
6124Overview:
6125"""""""""
6126
6127The '``udiv``' instruction returns the quotient of its two operands.
6128
6129Arguments:
6130""""""""""
6131
6132The two arguments to the '``udiv``' instruction must be
6133:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
6134arguments must have identical types.
6135
6136Semantics:
6137""""""""""
6138
6139The value produced is the unsigned integer quotient of the two operands.
6140
6141Note that unsigned integer division and signed integer division are
6142distinct operations; for signed integer division, use '``sdiv``'.
6143
6144Division by zero leads to undefined behavior.
6145
6146If the ``exact`` keyword is present, the result value of the ``udiv`` is
6147a :ref:`poison value <poisonvalues>` if %op1 is not a multiple of %op2 (as
6148such, "((a udiv exact b) mul b) == a").
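
The effect of ``exact`` can be sketched as follows. The function name
``@udiv_sketch`` and the value names are illustrative only; the divisor 4
is an arbitrary non-zero constant chosen for the example:

.. code-block:: llvm

      define i32 @udiv_sketch(i32 %a) {
        %q  = udiv i32 %a, 4          ; unsigned quotient, e.g. 10 / 4 = 2
        %qe = udiv exact i32 %a, 4    ; poison unless %a is a multiple of 4
        %r  = mul i32 %qe, 4          ; equals %a whenever %qe is not poison
        ret i32 %r
      }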

Example:
""""""""

.. code-block:: text

      <result> = udiv i32 4, %var          ; yields i32:result = 4 / %var

'``sdiv``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = sdiv <ty> <op1>, <op2>         ; yields ty:result
      <result> = sdiv exact <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``sdiv``' instruction returns the quotient of its two operands.

Arguments:
""""""""""

The two arguments to the '``sdiv``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The value produced is the signed integer quotient of the two operands
rounded towards zero.

Note that signed integer division and unsigned integer division are
distinct operations; for unsigned integer division, use '``udiv``'.

Division by zero leads to undefined behavior. Overflow also leads to
undefined behavior; this is a rare case, but can occur, for example, by
doing a 32-bit division of -2147483648 by -1.

If the ``exact`` keyword is present, the result value of the ``sdiv`` is
a :ref:`poison value <poisonvalues>` if the result would be rounded.
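
A short sketch of the rounding rule, using constant operands so the
results can be read off directly (the function and value names are
illustrative):

.. code-block:: llvm

      define i32 @sdiv_sketch() {
        %a = sdiv i32 -7, 2           ; -3: the quotient is rounded towards zero
        %b = sdiv i32 7, -2           ; -3
        %c = sdiv exact i32 -8, 2     ; -4: no rounding occurs, so not poison
        %d = add i32 %a, %b           ; -6
        %e = add i32 %d, %c           ; -10
        ret i32 %e
      }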

Example:
""""""""

.. code-block:: text

      <result> = sdiv i32 4, %var          ; yields i32:result = 4 / %var

.. _i_fdiv:

'``fdiv``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = fdiv [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``fdiv``' instruction returns the quotient of its two operands.

Arguments:
""""""""""

The two arguments to the '``fdiv``' instruction must be :ref:`floating
point <t_floating>` or :ref:`vector <t_vector>` of floating point values.
Both arguments must have identical types.

Semantics:
""""""""""

The value produced is the floating point quotient of the two operands.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating point optimizations.

Example:
""""""""

.. code-block:: text

      <result> = fdiv float 4.0, %var          ; yields float:result = 4.0 / %var

'``urem``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = urem <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``urem``' instruction returns the remainder from the unsigned
division of its two arguments.

Arguments:
""""""""""

The two arguments to the '``urem``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

This instruction returns the unsigned integer *remainder* of a division.
This instruction always performs an unsigned division to get the
remainder.

Note that unsigned integer remainder and signed integer remainder are
distinct operations; for signed integer remainder, use '``srem``'.

Taking the remainder of a division by zero leads to undefined behavior.

Example:
""""""""

.. code-block:: text

      <result> = urem i32 4, %var          ; yields i32:result = 4 % %var

'``srem``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = srem <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``srem``' instruction returns the remainder from the signed
division of its two operands. This instruction can also take
:ref:`vector <t_vector>` versions of the values in which case the elements
must be integers.

Arguments:
""""""""""

The two arguments to the '``srem``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

This instruction returns the *remainder* of a division (where the result
is either zero or has the same sign as the dividend, ``op1``), not the
*modulo* operator (where the result is either zero or has the same sign
as the divisor, ``op2``) of a value. For more information about the
difference, see `The Math
Forum <http://mathforum.org/dr.math/problems/anne.4.28.99.html>`_. For a
table of how this is implemented in various languages, please see
`Wikipedia: modulo
operation <http://en.wikipedia.org/wiki/Modulo_operation>`_.

Note that signed integer remainder and unsigned integer remainder are
distinct operations; for unsigned integer remainder, use '``urem``'.

Taking the remainder of a division by zero leads to undefined behavior.
Overflow also leads to undefined behavior; this is a rare case, but can
occur, for example, by taking the remainder of a 32-bit division of
-2147483648 by -1. (The remainder doesn't actually overflow, but this
rule lets ``srem`` be implemented using instructions that return both the
result of the division and the remainder.)
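
The difference between remainder and modulo can be made concrete with a
small sketch. The function name ``@floor_mod_sketch`` and the value names
are illustrative; the sequence is one possible way to derive a floored
modulo from ``srem``, not something LLVM requires. It assumes ``%b`` is
non-zero and that the division does not overflow:

.. code-block:: llvm

      define i32 @floor_mod_sketch(i32 %a, i32 %b) {
        %r   = srem i32 %a, %b                     ; zero, or same sign as the dividend %a
        %x   = xor i32 %r, %b                      ; sign bit set iff %r and %b differ in sign
        %neg = icmp slt i32 %x, 0
        %nz  = icmp ne i32 %r, 0
        %fix = and i1 %neg, %nz                    ; non-zero remainder with the wrong sign
        %rb  = add i32 %r, %b
        %m   = select i1 %fix, i32 %rb, i32 %r     ; zero, or same sign as the divisor %b
        ret i32 %m
      }

For example, with ``%a`` equal to -7 and ``%b`` equal to 3, the ``srem``
yields -1 while the function returns 2.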

Example:
""""""""

.. code-block:: text

      <result> = srem i32 4, %var          ; yields i32:result = 4 % %var

.. _i_frem:

'``frem``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = frem [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``frem``' instruction returns the remainder from the division of
its two operands.

Arguments:
""""""""""

The two arguments to the '``frem``' instruction must be :ref:`floating
point <t_floating>` or :ref:`vector <t_vector>` of floating point values.
Both arguments must have identical types.

Semantics:
""""""""""

This instruction returns the *remainder* of a division. The remainder
has the same sign as the dividend. This instruction can also take any
number of :ref:`fast-math flags <fastmath>`, which are optimization hints
to enable otherwise unsafe floating point optimizations.
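
A brief sketch of the sign rule with constant operands (the function and
value names are illustrative):

.. code-block:: llvm

      define float @frem_sketch() {
        %r1 = frem float -7.0, 2.0    ; -1.0: same sign as the dividend
        %r2 = frem float 7.0, -2.0    ; 1.0: same sign as the dividend
        %s  = fadd float %r1, %r2     ; 0.0
        ret float %s
      }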

Example:
""""""""

.. code-block:: text

      <result> = frem float 4.0, %var          ; yields float:result = 4.0 % %var

.. _bitwiseops:

Bitwise Binary Operations
-------------------------

Bitwise binary operators are used to do various forms of bit-twiddling
in a program. They are generally very efficient instructions and can
commonly be strength reduced from other instructions. They require two
operands of the same type, execute an operation on them, and produce a
single value. The resulting value is the same type as its operands.

'``shl``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = shl <ty> <op1>, <op2>           ; yields ty:result
      <result> = shl nuw <ty> <op1>, <op2>       ; yields ty:result
      <result> = shl nsw <ty> <op1>, <op2>       ; yields ty:result
      <result> = shl nuw nsw <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``shl``' instruction returns the first operand shifted to the left
a specified number of bits.

Arguments:
""""""""""

Both arguments to the '``shl``' instruction must be the same
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
'``op2``' is treated as an unsigned value.

Semantics:
""""""""""

The value produced is ``op1`` \* 2\ :sup:`op2` mod 2\ :sup:`n`,
where ``n`` is the width of the result. If ``op2`` is (statically or
dynamically) equal to or larger than the number of bits in
``op1``, the result is undefined. If the arguments are vectors, each
vector element of ``op1`` is shifted by the corresponding shift amount
in ``op2``.

If the ``nuw`` keyword is present, then the shift produces a :ref:`poison
value <poisonvalues>` if it shifts out any non-zero bits. If the
``nsw`` keyword is present, then the shift produces a :ref:`poison
value <poisonvalues>` if it shifts out any bits that disagree with the
resultant sign bit. As such, ``nuw``/``nsw`` have the same semantics as
they would if the shift were expressed as a ``mul`` instruction with the
same ``nuw``/``nsw`` bits, as in ``(mul %op1, (shl 1, %op2))``.
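
The wrapping flags can be illustrated on ``i8``, where the bit patterns
are easy to follow. This is a sketch with illustrative names; the
constants were chosen so that each case is hit exactly:

.. code-block:: llvm

      define i8 @shl_sketch() {
        %a = shl i8 3, 2              ; 12
        %b = shl i8 -128, 1           ; 0: the product is reduced mod 2^8
        %c = shl nuw i8 64, 1         ; -128 (0x80): only a zero bit is shifted out
        %d = shl nsw i8 64, 1         ; poison: the shifted-out 0 disagrees with the new sign bit 1
        %e = shl nuw i8 -128, 1       ; poison: a non-zero bit is shifted out
        ret i8 %a
      }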

Example:
""""""""

.. code-block:: text

      <result> = shl i32 4, %var   ; yields i32: 4 << %var
      <result> = shl i32 4, 2      ; yields i32: 16
      <result> = shl i32 1, 10     ; yields i32: 1024
      <result> = shl i32 1, 32     ; undefined
      <result> = shl <2 x i32> < i32 1, i32 1>, < i32 1, i32 2>   ; yields: result=<2 x i32> < i32 2, i32 4>

'``lshr``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = lshr <ty> <op1>, <op2>         ; yields ty:result
      <result> = lshr exact <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``lshr``' instruction (logical shift right) returns the first
operand shifted to the right a specified number of bits with zero fill.

Arguments:
""""""""""

Both arguments to the '``lshr``' instruction must be the same
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
'``op2``' is treated as an unsigned value.

Semantics:
""""""""""

This instruction always performs a logical shift right operation. The
most significant bits of the result will be filled with zero bits after
the shift. If ``op2`` is (statically or dynamically) equal to or larger
than the number of bits in ``op1``, the result is undefined. If the
arguments are vectors, each vector element of ``op1`` is shifted by the
corresponding shift amount in ``op2``.

If the ``exact`` keyword is present, the result value of the ``lshr`` is
a :ref:`poison value <poisonvalues>` if any of the bits shifted out are
non-zero.

Example:
""""""""

.. code-block:: text

      <result> = lshr i32 4, 1   ; yields i32:result = 2
      <result> = lshr i32 4, 2   ; yields i32:result = 1
      <result> = lshr i8  4, 3   ; yields i8:result = 0
      <result> = lshr i8 -2, 1   ; yields i8:result = 0x7F
      <result> = lshr i32 1, 32  ; undefined
      <result> = lshr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 2>   ; yields: result=<2 x i32> < i32 0x7FFFFFFF, i32 1>

'``ashr``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = ashr <ty> <op1>, <op2>         ; yields ty:result
      <result> = ashr exact <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``ashr``' instruction (arithmetic shift right) returns the first
operand shifted to the right a specified number of bits with sign
extension.

Arguments:
""""""""""

Both arguments to the '``ashr``' instruction must be the same
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
'``op2``' is treated as an unsigned value.

Semantics:
""""""""""

This instruction always performs an arithmetic shift right operation.
The most significant bits of the result will be filled with the sign bit
of ``op1``. If ``op2`` is (statically or dynamically) equal to or larger
than the number of bits in ``op1``, the result is undefined. If the
arguments are vectors, each vector element of ``op1`` is shifted by the
corresponding shift amount in ``op2``.

If the ``exact`` keyword is present, the result value of the ``ashr`` is
a :ref:`poison value <poisonvalues>` if any of the bits shifted out are
non-zero.
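
To contrast the two right shifts and the ``exact`` keyword, here is a
sketch on ``i8`` (illustrative names; -16 is the bit pattern ``0xF0`` and
-15 is ``0xF1``):

.. code-block:: llvm

      define i8 @shr_sketch() {
        %l  = lshr i8 -16, 2          ; 60 (0x3C): zero bits fill the top
        %a  = ashr i8 -16, 2          ; -4 (0xFC): copies of the sign bit fill the top
        %le = lshr exact i8 -16, 4    ; 15 (0x0F): only zero bits are shifted out
        %ae = ashr exact i8 -15, 1    ; poison: a non-zero bit is shifted out
        ret i8 %a
      }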

Example:
""""""""

.. code-block:: text

      <result> = ashr i32 4, 1   ; yields i32:result = 2
      <result> = ashr i32 4, 2   ; yields i32:result = 1
      <result> = ashr i8  4, 3   ; yields i8:result = 0
      <result> = ashr i8 -2, 1   ; yields i8:result = -1
      <result> = ashr i32 1, 32  ; undefined
      <result> = ashr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 3>   ; yields: result=<2 x i32> < i32 -1, i32 0>

'``and``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = and <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``and``' instruction returns the bitwise logical and of its two
operands.

Arguments:
""""""""""

The two arguments to the '``and``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The truth table used for the '``and``' instruction is:

+-----+-----+-----+
| In0 | In1 | Out |
+-----+-----+-----+
|   0 |   0 |   0 |
+-----+-----+-----+
|   0 |   1 |   0 |
+-----+-----+-----+
|   1 |   0 |   0 |
+-----+-----+-----+
|   1 |   1 |   1 |
+-----+-----+-----+

Example:
""""""""

.. code-block:: text

      <result> = and i32 4, %var         ; yields i32:result = 4 & %var
      <result> = and i32 15, 40          ; yields i32:result = 8
      <result> = and i32 4, 8            ; yields i32:result = 0

'``or``' Instruction
^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = or <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``or``' instruction returns the bitwise logical inclusive or of its
two operands.

Arguments:
""""""""""

The two arguments to the '``or``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The truth table used for the '``or``' instruction is:

+-----+-----+-----+
| In0 | In1 | Out |
+-----+-----+-----+
|   0 |   0 |   0 |
+-----+-----+-----+
|   0 |   1 |   1 |
+-----+-----+-----+
|   1 |   0 |   1 |
+-----+-----+-----+
|   1 |   1 |   1 |
+-----+-----+-----+

Example:
""""""""

::

      <result> = or i32 4, %var         ; yields i32:result = 4 | %var
      <result> = or i32 15, 40          ; yields i32:result = 47
      <result> = or i32 4, 8            ; yields i32:result = 12

'``xor``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = xor <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``xor``' instruction returns the bitwise logical exclusive or of
its two operands. The ``xor`` is used to implement the "one's
complement" operation, which is the "~" operator in C.

Arguments:
""""""""""

The two arguments to the '``xor``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The truth table used for the '``xor``' instruction is:

+-----+-----+-----+
| In0 | In1 | Out |
+-----+-----+-----+
|   0 |   0 |   0 |
+-----+-----+-----+
|   0 |   1 |   1 |
+-----+-----+-----+
|   1 |   0 |   1 |
+-----+-----+-----+
|   1 |   1 |   0 |
+-----+-----+-----+

Example:
""""""""

.. code-block:: text

      <result> = xor i32 4, %var         ; yields i32:result = 4 ^ %var
      <result> = xor i32 15, 40          ; yields i32:result = 39
      <result> = xor i32 4, 8            ; yields i32:result = 12
      <result> = xor i32 %V, -1          ; yields i32:result = ~%V

Vector Operations
-----------------

LLVM supports several instructions to represent vector operations in a
target-independent manner. These instructions cover the element-access
and vector-specific operations needed to process vectors effectively.
While LLVM does directly support these vector operations, many
sophisticated algorithms will want to use target-specific intrinsics to
take full advantage of a specific target.

.. _i_extractelement:

'``extractelement``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = extractelement <n x <ty>> <val>, <ty2> <idx>  ; yields <ty>

Overview:
"""""""""

The '``extractelement``' instruction extracts a single scalar element
from a vector at a specified index.

Arguments:
""""""""""

The first operand of an '``extractelement``' instruction is a value of
:ref:`vector <t_vector>` type. The second operand is an index indicating
the position from which to extract the element. The index may be a
variable of any integer type.

Semantics:
""""""""""

The result is a scalar of the same type as the element type of ``val``.
Its value is the value at position ``idx`` of ``val``. If ``idx``
exceeds the length of ``val``, the results are undefined.

Example:
""""""""

.. code-block:: text

      <result> = extractelement <4 x i32> %vec, i32 0    ; yields i32

.. _i_insertelement:

'``insertelement``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = insertelement <n x <ty>> <val>, <ty> <elt>, <ty2> <idx>    ; yields <n x <ty>>

Overview:
"""""""""

The '``insertelement``' instruction inserts a scalar element into a
vector at a specified index.

Arguments:
""""""""""

The first operand of an '``insertelement``' instruction is a value of
:ref:`vector <t_vector>` type. The second operand is a scalar value whose
type must equal the element type of the first operand. The third operand
is an index indicating the position at which to insert the value. The
index may be a variable of any integer type.

Semantics:
""""""""""

The result is a vector of the same type as ``val``. Its element values
are those of ``val`` except at position ``idx``, where it gets the value
``elt``. If ``idx`` exceeds the length of ``val``, the results are
undefined.
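
Because each ``insertelement`` yields a new vector value, vectors are
typically built up one element at a time. The following sketch, with
illustrative function and value names, combines '``insertelement``' with
'``extractelement``':

.. code-block:: llvm

      define i32 @vector_sketch(i32 %x) {
        %v0 = insertelement <4 x i32> undef, i32 %x, i32 0    ; <%x, undef, undef, undef>
        %v1 = insertelement <4 x i32> %v0, i32 7, i32 3       ; <%x, undef, undef, 7>
        %e  = extractelement <4 x i32> %v1, i32 3             ; 7
        ret i32 %e
      }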

Example:
""""""""

.. code-block:: text

      <result> = insertelement <4 x i32> %vec, i32 1, i32 0    ; yields <4 x i32>

.. _i_shufflevector:

'``shufflevector``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <m x i32> <mask>    ; yields <m x <ty>>

Overview:
"""""""""

The '``shufflevector``' instruction constructs a permutation of elements
from two input vectors, returning a vector with the same element type as
the inputs and the same length as the shuffle mask.

Arguments:
""""""""""

The first two operands of a '``shufflevector``' instruction are vectors
with the same type. The third argument is a shuffle mask whose element
type is always ``i32``. The result of the instruction is a vector whose
length is the same as the shuffle mask and whose element type is the
same as the element type of the first two operands.

The shuffle mask operand is required to be a constant vector with either
constant integer or ``undef`` values.

Semantics:
""""""""""

The elements of the two input vectors are numbered from left to right
across both of the vectors. The shuffle mask operand specifies, for each
element of the result vector, which element of the two input vectors the
result element gets. The element selector may be ``undef`` (meaning "don't
care") and the second operand may be ``undef`` if performing a shuffle from
only one vector.
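
The numbering scheme can be sketched with a few common shuffles. The
function and value names are illustrative, and the comments use ``a0``
through ``a3`` and ``b0`` through ``b3`` as shorthand for the elements of
the two inputs:

.. code-block:: llvm

      define <4 x i32> @shuffle_sketch(<4 x i32> %a, <4 x i32> %b) {
        ; Elements of %a are numbered 0-3; elements of %b are numbered 4-7.
        %rev   = shufflevector <4 x i32> %a, <4 x i32> undef,
                               <4 x i32> <i32 3, i32 2, i32 1, i32 0>   ; <a3, a2, a1, a0>
        %splat = shufflevector <4 x i32> %a, <4 x i32> undef,
                               <4 x i32> zeroinitializer                ; <a0, a0, a0, a0>
        %mix   = shufflevector <4 x i32> %a, <4 x i32> %b,
                               <4 x i32> <i32 0, i32 4, i32 1, i32 5>   ; <a0, b0, a1, b1>
        ret <4 x i32> %mix
      }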

Example:
""""""""

.. code-block:: text

      <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
                              <4 x i32> <i32 0, i32 4, i32 1, i32 5>  ; yields <4 x i32>
      <result> = shufflevector <4 x i32> %v1, <4 x i32> undef,
                              <4 x i32> <i32 0, i32 1, i32 2, i32 3>  ; yields <4 x i32> - Identity shuffle.
      <result> = shufflevector <8 x i32> %v1, <8 x i32> undef,
                              <4 x i32> <i32 0, i32 1, i32 2, i32 3>  ; yields <4 x i32>
      <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
                              <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7 >  ; yields <8 x i32>

Aggregate Operations
--------------------

LLVM supports several instructions for working with
:ref:`aggregate <t_aggregate>` values.

.. _i_extractvalue:

'``extractvalue``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = extractvalue <aggregate type> <val>, <idx>{, <idx>}*

Overview:
"""""""""

The '``extractvalue``' instruction extracts the value of a member field
from an :ref:`aggregate <t_aggregate>` value.

Arguments:
""""""""""

The first operand of an '``extractvalue``' instruction is a value of
:ref:`struct <t_struct>` or :ref:`array <t_array>` type. The other operands are
constant indices to specify which value to extract in a similar manner
as indices in a '``getelementptr``' instruction.

The major differences to ``getelementptr`` indexing are:

-  Since the value being indexed is not a pointer, the first index is
   omitted and assumed to be zero.
-  At least one index must be specified.
-  Not only struct indices but also array indices must be in bounds.
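
These indexing rules can be sketched on a nested aggregate. The function
name, value names, and the aggregate type below are illustrative:

.. code-block:: llvm

      define float @extractvalue_sketch({i32, [2 x float]} %agg) {
        %arr = extractvalue {i32, [2 x float]} %agg, 1       ; yields [2 x float]
        %f   = extractvalue {i32, [2 x float]} %agg, 1, 0    ; yields float: element 0 of field 1
        ret float %f
      }

Note that there is no leading zero index, unlike the equivalent
``getelementptr`` on a pointer to the same type.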

Semantics:
""""""""""

The result is the value at the position in the aggregate specified by
the index operands.

Example:
""""""""

.. code-block:: text

      <result> = extractvalue {i32, float} %agg, 0    ; yields i32

.. _i_insertvalue:

'``insertvalue``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = insertvalue <aggregate type> <val>, <ty> <elt>, <idx>{, <idx>}*    ; yields <aggregate type>

Overview:
"""""""""

The '``insertvalue``' instruction inserts a value into a member field in
an :ref:`aggregate <t_aggregate>` value.

Arguments:
""""""""""

The first operand of an '``insertvalue``' instruction is a value of
:ref:`struct <t_struct>` or :ref:`array <t_array>` type. The second operand is
a first-class value to insert. The following operands are constant
indices indicating the position at which to insert the value in a
similar manner as indices in an '``extractvalue``' instruction. The value
to insert must have the same type as the value identified by the
indices.

Semantics:
""""""""""

The result is an aggregate of the same type as ``val``. Its value is
that of ``val`` except that the value at the position specified by the
indices is that of ``elt``.

Example:
""""""""

.. code-block:: llvm

      %agg1 = insertvalue {i32, float} undef, i32 1, 0              ; yields {i32 1, float undef}
      %agg2 = insertvalue {i32, float} %agg1, float %val, 1         ; yields {i32 1, float %val}
      %agg3 = insertvalue {i32, {float}} undef, float %val, 1, 0    ; yields {i32 undef, {float %val}}

.. _memoryops:

Memory Access and Addressing Operations
---------------------------------------

A key design point of an SSA-based representation is how it represents
memory. In LLVM, no memory locations are in SSA form, which makes things
very simple. This section describes how to read, write, and allocate
memory in LLVM.

.. _i_alloca:

'``alloca``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = alloca [inalloca] <type> [, <ty> <NumElements>] [, align <alignment>]     ; yields type*:result

Overview:
"""""""""

The '``alloca``' instruction allocates memory on the stack frame of the
currently executing function, to be automatically released when this
function returns to its caller. The object is always allocated in the
generic address space (address space zero).

Arguments:
""""""""""

The '``alloca``' instruction allocates ``sizeof(<type>)*NumElements``
bytes of memory on the runtime stack, returning a pointer of the
appropriate type to the program. If "NumElements" is specified, it is
the number of elements allocated; otherwise, "NumElements" defaults
to one. If a constant alignment is specified, the result of the
allocation is guaranteed to be aligned to at least that boundary. The
alignment may not be greater than ``1 << 29``. If not specified, or if
zero, the target can choose to align the allocation on any convenient
boundary compatible with the type.

'``type``' may be any sized type.

Semantics:
""""""""""

Memory is allocated; a pointer is returned. The operation is undefined
if there is insufficient stack space for the allocation. '``alloca``'d
memory is automatically released when the function returns. The
'``alloca``' instruction is commonly used to represent automatic
variables that must have an address available. When the function returns
(either with the ``ret`` or ``resume`` instructions), the memory is
reclaimed. Allocating zero bytes is legal, but the result is undefined.
The order in which memory is allocated (i.e., which way the stack grows)
is not specified.
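
A minimal sketch of the typical use, an addressable local variable that
is written and then read back, might look as follows. The function and
value names are illustrative, and the alignments are assumptions chosen
for the example rather than requirements:

.. code-block:: llvm

      define i32 @alloca_sketch(i32 %x) {
        %slot = alloca i32, align 4             ; one i32 on the stack
        %buf  = alloca i32, i32 4, align 16     ; room for four i32 values
        store i32 %x, i32* %slot, align 4
        %v = load i32, i32* %slot, align 4      ; reads back %x
        ret i32 %v                              ; %slot and %buf are released on return
      }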
6996
6997Example:
6998""""""""
6999
7000.. code-block:: llvm
7001
Tim Northover675a0962014-06-13 14:24:23 +00007002 %ptr = alloca i32 ; yields i32*:ptr
7003 %ptr = alloca i32, i32 4 ; yields i32*:ptr
7004 %ptr = alloca i32, i32 4, align 1024 ; yields i32*:ptr
7005 %ptr = alloca i32, align 1024 ; yields i32*:ptr
Sean Silvab084af42012-12-07 10:36:55 +00007006
7007.. _i_load:
7008
7009'``load``' Instruction
7010^^^^^^^^^^^^^^^^^^^^^^
7011
7012Syntax:
7013"""""""
7014
7015::
7016
Artur Pilipenkob4d00902015-09-28 17:41:08 +00007017 <result> = load [volatile] <ty>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<index>][, !invariant.load !<index>][, !invariant.group !<index>][, !nonnull !<index>][, !dereferenceable !<deref_bytes_node>][, !dereferenceable_or_null !<deref_bytes_node>][, !align !<align_node>]
Matt Arsenaultd5b9a362016-04-12 14:41:03 +00007018 <result> = load atomic [volatile] <ty>, <ty>* <pointer> [singlethread] <ordering>, align <alignment> [, !invariant.group !<index>]
Sean Silvab084af42012-12-07 10:36:55 +00007019 !<index> = !{ i32 1 }
Artur Pilipenko253d71e2015-09-18 12:07:10 +00007020 !<deref_bytes_node> = !{i64 <dereferenceable_bytes>}
Artur Pilipenkob4d00902015-09-28 17:41:08 +00007021 !<align_node> = !{ i64 <value_alignment> }
Sean Silvab084af42012-12-07 10:36:55 +00007022
Overview:
"""""""""

The '``load``' instruction is used to read from memory.

Arguments:
""""""""""

The argument to the ``load`` instruction specifies the memory address from which
to load. The type specified must be a :ref:`first class <t_firstclass>` type of
known size (i.e. not containing an :ref:`opaque structural type <t_opaque>`). If
the ``load`` is marked as ``volatile``, then the optimizer is not allowed to
modify the number or order of execution of this ``load`` with other
:ref:`volatile operations <volatile>`.

If the ``load`` is marked as ``atomic``, it takes an extra :ref:`ordering
<ordering>` and optional ``singlethread`` argument. The ``release`` and
``acq_rel`` orderings are not valid on ``load`` instructions. Atomic loads
produce :ref:`defined <memmodel>` results when they may see multiple atomic
stores. The type of the pointee must be an integer, pointer, or floating-point
type whose bit width is a power of two greater than or equal to eight and less
than or equal to a target-specific size limit. ``align`` must be explicitly
specified on atomic loads, and the load has undefined behavior if the alignment
is not set to a value which is at least the size in bytes of the
pointee. ``!nontemporal`` does not have any defined semantics for atomic loads.
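
A few atomic loads that follow the rules above are sketched here. ``%p`` is
assumed to be an ``i32*`` that is at least 4-byte aligned; it is not defined
anywhere in this document.

.. code-block:: llvm

   ; The ordering is mandatory and align must be spelled out explicitly.
   %a = load atomic i32, i32* %p unordered, align 4
   %b = load atomic i32, i32* %p acquire, align 4
   ; volatile and singlethread may be combined with any permitted ordering.
   %c = load atomic volatile i32, i32* %p singlethread seq_cst, align 4
   ; Not valid on a load: release, acq_rel.
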

The optional constant ``align`` argument specifies the alignment of the
operation (that is, the alignment of the memory address). A value of 0
or an omitted ``align`` argument means that the operation has the ABI
alignment for the target. It is the responsibility of the code emitter
to ensure that the alignment information is correct. Overestimating the
alignment results in undefined behavior. Underestimating the alignment
may produce less efficient code. An alignment of 1 is always safe. The
maximum possible alignment is ``1 << 29``. An alignment value higher
than the size of the loaded type implies that memory up to the alignment
value in bytes can be safely loaded without trapping in the default
address space. Access to the high bytes can interfere with debugging
tools, so they should not be accessed if the function has the
``sanitize_thread`` or ``sanitize_address`` attributes.
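
The effect of the ``align`` argument can be sketched as follows, where ``%p``
is a hypothetical ``i32*`` whose real alignment is not known to the reader of
the IR:

.. code-block:: llvm

   %a = load i32, i32* %p              ; no align: ABI alignment of i32 is assumed
   %b = load i32, i32* %p, align 1     ; always safe, possibly slower code
   %c = load i32, i32* %p, align 16    ; undefined behavior unless %p really is
                                       ; 16-byte aligned
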

The optional ``!nontemporal`` metadata must reference a single
metadata name ``<index>`` corresponding to a metadata node with one
``i32`` entry of value 1. The existence of the ``!nontemporal``
metadata on the instruction tells the optimizer and code generator
that this load is not expected to be reused in the cache. The code
generator may select special instructions to save cache bandwidth, such
as the ``MOVNT`` instruction on x86.
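
A minimal sketch of a non-temporal load, assuming a hypothetical pointer
``%p`` and the metadata name ``!0``:

.. code-block:: llvm

   ; Hint that the loaded data is not expected to be reused in the cache.
   %v = load i32, i32* %p, align 4, !nontemporal !0

   ; Module-level metadata node with a single i32 entry of value 1.
   !0 = !{ i32 1 }
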

The optional ``!invariant.load`` metadata must reference a single
metadata name ``<index>`` corresponding to a metadata node with no
entries. The existence of the ``!invariant.load`` metadata on the
instruction tells the optimizer and code generator that the address
operand to this load points to memory which can be assumed unchanged.
Being invariant does not imply that a location is dereferenceable,
but it does imply that once the location is known dereferenceable
its value is henceforth unchanging.

The optional ``!invariant.group`` metadata must reference a single metadata name
``<index>`` corresponding to a metadata node. See ``invariant.group`` metadata.

The optional ``!nonnull`` metadata must reference a single
metadata name ``<index>`` corresponding to a metadata node with no
entries. The existence of the ``!nonnull`` metadata on the
instruction tells the optimizer that the value loaded is known to
never be null. This is analogous to the ``nonnull`` attribute
on parameters and return values. This metadata can only be applied
to loads of a pointer type.

The optional ``!dereferenceable`` metadata must reference a single metadata
name ``<deref_bytes_node>`` corresponding to a metadata node with one ``i64``
entry. The existence of the ``!dereferenceable`` metadata on the instruction
tells the optimizer that the value loaded is known to be dereferenceable.
The number of bytes known to be dereferenceable is specified by the integer
value in the metadata node. This is analogous to the ``dereferenceable``
attribute on parameters and return values. This metadata can only be applied
to loads of a pointer type.

The optional ``!dereferenceable_or_null`` metadata must reference a single
metadata name ``<deref_bytes_node>`` corresponding to a metadata node with one
``i64`` entry. The existence of the ``!dereferenceable_or_null`` metadata on the
instruction tells the optimizer that the value loaded is known to be either
dereferenceable or null.
The number of bytes known to be dereferenceable is specified by the integer
value in the metadata node. This is analogous to the ``dereferenceable_or_null``
attribute on parameters and return values. This metadata can only be applied
to loads of a pointer type.

The optional ``!align`` metadata must reference a single metadata name
``<align_node>`` corresponding to a metadata node with one ``i64`` entry.
The existence of the ``!align`` metadata on the instruction tells the
optimizer that the value loaded is known to be aligned to a boundary specified
by the integer value in the metadata node. The alignment must be a power of 2.
This is analogous to the ``align`` attribute on parameters and return values.
This metadata can only be applied to loads of a pointer type.
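
The pointer-related metadata can be combined on a single load. The sketch
below assumes a hypothetical ``i8**`` named ``%pp``; the metadata names
``!0`` to ``!2`` and the byte counts are arbitrary choices for illustration.

.. code-block:: llvm

   ; The loaded pointer %q is asserted to be non-null, dereferenceable for
   ; 16 bytes, and aligned to an 8-byte boundary.
   %q = load i8*, i8** %pp, align 8, !nonnull !0, !dereferenceable !1, !align !2

   !0 = !{}            ; !nonnull takes a node with no entries
   !1 = !{ i64 16 }    ; number of dereferenceable bytes
   !2 = !{ i64 8 }     ; alignment of the loaded pointer, a power of 2

Note that ``align 8`` describes the address being loaded from, while
``!align !2`` describes the pointer value that the load produces.
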

Semantics:
""""""""""

The location of memory pointed to is loaded. If the value being loaded
is of scalar type then the number of bytes read does not exceed the
minimum number of bytes needed to hold all bits of the type. For
example, loading an ``i24`` reads at most three bytes. When loading a
value of a type like ``i20`` with a size that is not an integral number
of bytes, the result is undefined if the value was not originally
written using a store of the same type.

Examples:
"""""""""

.. code-block:: llvm

      %ptr = alloca i32                               ; yields i32*:ptr
      store i32 3, i32* %ptr                          ; yields void
      %val = load i32, i32* %ptr                      ; yields i32:val = i32 3

.. _i_store:

'``store``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      store [volatile] <ty> <value>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<index>][, !invariant.group !<index>]        ; yields void
      store atomic [volatile] <ty> <value>, <ty>* <pointer> [singlethread] <ordering>, align <alignment> [, !invariant.group !<index>] ; yields void

Overview:
"""""""""

The '``store``' instruction is used to write to memory.

Arguments:
""""""""""

There are two arguments to the ``store`` instruction: a value to store and an
address at which to store it. The type of the ``<pointer>`` operand must be a
pointer to the :ref:`first class <t_firstclass>` type of the ``<value>``
operand. If the ``store`` is marked as ``volatile``, then the optimizer is not
allowed to modify the number or order of execution of this ``store`` with other
:ref:`volatile operations <volatile>`. Only values of :ref:`first class
<t_firstclass>` types of known size (i.e. not containing an :ref:`opaque
structural type <t_opaque>`) can be stored.

If the ``store`` is marked as ``atomic``, it takes an extra :ref:`ordering
<ordering>` and optional ``singlethread`` argument. The ``acquire`` and
``acq_rel`` orderings aren't valid on ``store`` instructions. Atomic loads
produce :ref:`defined <memmodel>` results when they may see multiple atomic
stores. The type of the pointee must be an integer, pointer, or floating-point
type whose bit width is a power of two greater than or equal to eight and less
than or equal to a target-specific size limit. ``align`` must be explicitly
specified on atomic stores, and the store has undefined behavior if the
alignment is not set to a value which is at least the size in bytes of the
pointee. ``!nontemporal`` does not have any defined semantics for atomic stores.
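
For comparison with atomic loads, a few atomic stores that satisfy these
constraints are sketched below. ``%p`` is assumed to be a suitably aligned
``i32*`` and is not defined in this document.

.. code-block:: llvm

   ; The ordering is mandatory and align must be spelled out explicitly.
   store atomic i32 1, i32* %p unordered, align 4
   store atomic i32 1, i32* %p release, align 4
   store atomic volatile i32 1, i32* %p singlethread seq_cst, align 4
   ; Not valid on a store: acquire, acq_rel.
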

The optional constant ``align`` argument specifies the alignment of the
operation (that is, the alignment of the memory address). A value of 0
or an omitted ``align`` argument means that the operation has the ABI
alignment for the target. It is the responsibility of the code emitter
to ensure that the alignment information is correct. Overestimating the
alignment results in undefined behavior. Underestimating the
alignment may produce less efficient code. An alignment of 1 is always
safe. The maximum possible alignment is ``1 << 29``. An alignment
value higher than the size of the stored type implies that memory up to
the alignment value in bytes can be stored to without trapping in the
default address space. Storing to the higher bytes, however, may result in
data races if another thread can access the same address. Introducing a
data race is not allowed. Storing to the extra bytes is not allowed
even in situations where a data race is known to not exist if the
function has the ``sanitize_address`` attribute.

The optional ``!nontemporal`` metadata must reference a single metadata
name ``<index>`` corresponding to a metadata node with one ``i32`` entry of
value 1. The existence of the ``!nontemporal`` metadata on the instruction
tells the optimizer and code generator that this store is not expected to
be reused in the cache. The code generator may select special
instructions to save cache bandwidth, such as the ``MOVNT`` instruction on
x86.

The optional ``!invariant.group`` metadata must reference a
single metadata name ``<index>``. See ``invariant.group`` metadata.

Semantics:
""""""""""

The contents of memory are updated to contain ``<value>`` at the
location specified by the ``<pointer>`` operand. If ``<value>`` is
of scalar type then the number of bytes written does not exceed the
minimum number of bytes needed to hold all bits of the type. For
example, storing an ``i24`` writes at most three bytes. When writing a
value of a type like ``i20`` with a size that is not an integral number
of bytes, it is unspecified what happens to the extra bits that do not
belong to the type, but they will typically be overwritten.

Example:
""""""""

.. code-block:: llvm

      %ptr = alloca i32                               ; yields i32*:ptr
      store i32 3, i32* %ptr                          ; yields void
      %val = load i32, i32* %ptr                      ; yields i32:val = i32 3

.. _i_fence:

'``fence``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      fence [singlethread] <ordering>                   ; yields void

Overview:
"""""""""

The '``fence``' instruction is used to introduce happens-before edges
between operations.

Arguments:
""""""""""

'``fence``' instructions take an :ref:`ordering <ordering>` argument which
defines what *synchronizes-with* edges they add. They can only be given
``acquire``, ``release``, ``acq_rel``, and ``seq_cst`` orderings.

Semantics:
""""""""""

A fence A which has (at least) ``release`` ordering semantics
*synchronizes with* a fence B with (at least) ``acquire`` ordering
semantics if and only if there exist atomic operations X and Y, both
operating on some atomic object M, such that A is sequenced before X, X
modifies M (either directly or through some side effect of a sequence
headed by X), Y is sequenced before B, and Y observes M. This provides a
*happens-before* dependency between A and B. Rather than an explicit
``fence``, one (but not both) of the atomic operations X or Y might
provide a ``release`` or ``acquire`` (resp.) ordering constraint and
still *synchronize-with* the explicit ``fence`` and establish the
*happens-before* edge.

A ``fence`` which has ``seq_cst`` ordering, in addition to having both
``acquire`` and ``release`` semantics specified above, participates in
the global program order of other ``seq_cst`` operations and/or fences.

The optional ":ref:`singlethread <singlethread>`" argument specifies
that the fence only synchronizes with other fences in the same thread.
(This is useful for interacting with signal handlers.)
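
The roles of A, B, X, Y and M in the rule above can be made concrete with a
message-passing sketch. The globals ``@data`` and ``@flag`` and the two
functions are hypothetical; ``@flag`` plays the role of M, and the two
functions are assumed to run on different threads.

.. code-block:: llvm

   @data = global i32 0
   @flag = global i32 0

   define void @producer() {
   entry:
     store i32 42, i32* @data, align 4                     ; plain store
     fence release                                         ; fence A
     store atomic i32 1, i32* @flag monotonic, align 4     ; X: modifies M
     ret void
   }

   define i32 @consumer() {
   entry:
     br label %spin

   spin:
     %f = load atomic i32, i32* @flag monotonic, align 4   ; Y: observes M
     %ready = icmp eq i32 %f, 1
     br i1 %ready, label %done, label %spin

   done:
     fence acquire                                         ; fence B
     %v = load i32, i32* @data, align 4                    ; ordered after the store of 42
     ret i32 %v
   }

Once Y reads the value written by X, fence A synchronizes with fence B, so
the plain store to ``@data`` happens before the plain load of it and the
load is expected to observe 42.
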

Example:
""""""""

.. code-block:: llvm

      fence acquire                          ; yields void
      fence singlethread seq_cst             ; yields void

.. _i_cmpxchg:

'``cmpxchg``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      cmpxchg [weak] [volatile] <ty>* <pointer>, <ty> <cmp>, <ty> <new> [singlethread] <success ordering> <failure ordering> ; yields  { ty, i1 }

Overview:
"""""""""

The '``cmpxchg``' instruction is used to atomically modify memory. It
loads a value in memory and compares it to a given value. If they are
equal, it tries to store a new value into the memory.

Arguments:
""""""""""

There are three arguments to the '``cmpxchg``' instruction: an address
to operate on, a value to compare to the value currently at that
address, and a new value to place at that address if the compared values
are equal. The type of '``<cmp>``' must be an integer or pointer type whose
bit width is a power of two greater than or equal to eight and less
than or equal to a target-specific size limit. '``<cmp>``' and '``<new>``'
must have the same type, and the type of '``<pointer>``' must be a pointer
to that type. If the ``cmpxchg`` is marked as ``volatile``, then the
optimizer is not allowed to modify the number or order of execution of
this ``cmpxchg`` with other :ref:`volatile operations <volatile>`.

The success and failure :ref:`ordering <ordering>` arguments specify how this
``cmpxchg`` synchronizes with other atomic operations. Both ordering parameters
must be at least ``monotonic``, the ordering constraint on failure must be no
stronger than that on success, and the failure ordering cannot be either
``release`` or ``acq_rel``.
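
The ordering constraints can be illustrated with a few combinations. The
operands ``%p``, ``%old`` and ``%new`` are hypothetical and assumed to have
types ``i32*``, ``i32`` and ``i32``.

.. code-block:: llvm

   ; Accepted: both orderings are at least monotonic and the failure
   ; ordering is no stronger than the success ordering.
   %r0 = cmpxchg i32* %p, i32 %old, i32 %new seq_cst seq_cst
   %r1 = cmpxchg i32* %p, i32 %old, i32 %new acq_rel acquire
   %r2 = cmpxchg weak i32* %p, i32 %old, i32 %new release monotonic

   ; Rejected, shown as comments only:
   ;   cmpxchg i32* %p, i32 %old, i32 %new monotonic acquire  (failure stronger than success)
   ;   cmpxchg i32* %p, i32 %old, i32 %new acq_rel release    (failure cannot be release or acq_rel)
   ;   cmpxchg i32* %p, i32 %old, i32 %new unordered unordered (below monotonic)
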

The optional "``singlethread``" argument declares that the ``cmpxchg``
is only atomic with respect to code (usually signal handlers) running in
the same thread as the ``cmpxchg``. Otherwise the cmpxchg is atomic with
respect to all other code in the system.

The pointer passed into cmpxchg must have alignment greater than or
equal to the size in memory of the operand.

Semantics:
""""""""""

The contents of memory at the location specified by the '``<pointer>``' operand
are read and compared to '``<cmp>``'; if the read value is equal,
'``<new>``' is written. The original value at the location is returned, together
with a flag indicating success (true) or failure (false).

If the cmpxchg operation is marked as ``weak`` then a spurious failure is
permitted: the operation may not write ``<new>`` even if the comparison
matched.

If the cmpxchg operation is strong (the default), the ``i1`` value is 1 if and
only if the value loaded equals ``cmp``.

A successful ``cmpxchg`` is a read-modify-write instruction for the purpose of
identifying release sequences. A failed ``cmpxchg`` is equivalent to an atomic
load with an ordering parameter determined by the second ordering parameter.

Example:
""""""""

.. code-block:: llvm

    entry:
      %orig = load atomic i32, i32* %ptr unordered, align 4                      ; yields i32
      br label %loop

    loop:
      %cmp = phi i32 [ %orig, %entry ], [%value_loaded, %loop]
      %squared = mul i32 %cmp, %cmp
      %val_success = cmpxchg i32* %ptr, i32 %cmp, i32 %squared acq_rel monotonic ; yields  { i32, i1 }
      %value_loaded = extractvalue { i32, i1 } %val_success, 0
      %success = extractvalue { i32, i1 } %val_success, 1
      br i1 %success, label %done, label %loop

    done:
      ...

.. _i_atomicrmw:

'``atomicrmw``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      atomicrmw [volatile] <operation> <ty>* <pointer>, <ty> <value> [singlethread] <ordering>                   ; yields ty

Overview:
"""""""""

The '``atomicrmw``' instruction is used to atomically modify memory.

Arguments:
""""""""""

There are three arguments to the '``atomicrmw``' instruction: an
operation to apply, an address whose value to modify, and an argument to
the operation. The operation must be one of the following keywords:

-  xchg
-  add
-  sub
-  and
-  nand
-  or
-  xor
-  max
-  min
-  umax
-  umin

The type of '``<value>``' must be an integer type whose bit width is a power
of two greater than or equal to eight and less than or equal to a
target-specific size limit. The type of the '``<pointer>``' operand must
be a pointer to that type. If the ``atomicrmw`` is marked as
``volatile``, then the optimizer is not allowed to modify the number or
order of execution of this ``atomicrmw`` with other :ref:`volatile
operations <volatile>`.

Semantics:
""""""""""

The contents of memory at the location specified by the '``<pointer>``'
operand are atomically read, modified, and written back. The original
value at the location is returned. The modification is specified by the
operation argument:

-  xchg: ``*ptr = val``
-  add: ``*ptr = *ptr + val``
-  sub: ``*ptr = *ptr - val``
-  and: ``*ptr = *ptr & val``
-  nand: ``*ptr = ~(*ptr & val)``
-  or: ``*ptr = *ptr | val``
-  xor: ``*ptr = *ptr ^ val``
-  max: ``*ptr = *ptr > val ? *ptr : val`` (using a signed comparison)
-  min: ``*ptr = *ptr < val ? *ptr : val`` (using a signed comparison)
-  umax: ``*ptr = *ptr > val ? *ptr : val`` (using an unsigned
   comparison)
-  umin: ``*ptr = *ptr < val ? *ptr : val`` (using an unsigned
   comparison)
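
One way to read the table above is to spell an operation out as a
compare-and-swap loop. The sketch below does this for ``max``; the function
name ``@atomic_max`` and its value names are hypothetical, and the loop is
meant as an illustration of the semantics, not as a description of how any
target lowers ``atomicrmw``.

.. code-block:: llvm

   ; Behaves roughly like:  %old = atomicrmw max i32* %ptr, i32 %val seq_cst
   define i32 @atomic_max(i32* %ptr, i32 %val) {
   entry:
     %init = load atomic i32, i32* %ptr monotonic, align 4
     br label %loop

   loop:
     %cur = phi i32 [ %init, %entry ], [ %seen, %loop ]
     %gt = icmp sgt i32 %cur, %val                 ; signed comparison, as for max
     %new = select i1 %gt, i32 %cur, i32 %val      ; *ptr > val ? *ptr : val
     %pair = cmpxchg i32* %ptr, i32 %cur, i32 %new seq_cst seq_cst
     %seen = extractvalue { i32, i1 } %pair, 0     ; value actually found in memory
     %ok = extractvalue { i32, i1 } %pair, 1
     br i1 %ok, label %done, label %loop           ; retry with the fresh value

   done:
     ret i32 %cur                                  ; the original value, as atomicrmw returns
   }

Replacing ``icmp sgt`` with ``icmp ugt`` would give the ``umax`` variant.
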

Example:
""""""""

.. code-block:: llvm

      %old = atomicrmw add i32* %ptr, i32 1 acquire                        ; yields i32

.. _i_getelementptr:

'``getelementptr``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = getelementptr <ty>, <ty>* <ptrval>{, <ty> <idx>}*
      <result> = getelementptr inbounds <ty>, <ty>* <ptrval>{, <ty> <idx>}*
      <result> = getelementptr <ty>, <ptr vector> <ptrval>, <vector index type> <idx>

Overview:
"""""""""

The '``getelementptr``' instruction is used to get the address of a
subelement of an :ref:`aggregate <t_aggregate>` data structure. It performs
address calculation only and does not access memory. The instruction can also
be used to calculate a vector of such addresses.

Arguments:
""""""""""

The first argument is always a type used as the basis for the calculations.
The second argument is always a pointer or a vector of pointers, and is the
base address to start from. The remaining arguments are indices
that indicate which of the elements of the aggregate object are indexed.
The interpretation of each index is dependent on the type being indexed
into. The first index always indexes the pointer value given as the
second argument, the second index indexes a value of the type pointed to
(not necessarily the value directly pointed to, since the first index
can be non-zero), etc. The first type indexed into must be a pointer
value; subsequent types can be arrays, vectors, and structs. Note that
subsequent types being indexed into can never be pointers, since that
would require loading the pointer before continuing calculation.

The type of each index argument depends on the type it is indexing into.
When indexing into a (optionally packed) structure, only ``i32`` integer
**constants** are allowed (when using a vector of indices they must all
be the **same** ``i32`` integer constant). When indexing into an array,
pointer or vector, integers of any width are allowed, and they are not
required to be constant. These integers are treated as signed values
where relevant.

For example, let's consider a C code fragment and how it gets compiled
to LLVM:

.. code-block:: c

    struct RT {
      char A;
      int B[10][20];
      char C;
    };
    struct ST {
      int X;
      double Y;
      struct RT Z;
    };

    int *foo(struct ST *s) {
      return &s[1].Z.B[5][13];
    }

The LLVM code generated by Clang is:

.. code-block:: llvm

    %struct.RT = type { i8, [10 x [20 x i32]], i8 }
    %struct.ST = type { i32, double, %struct.RT }

    define i32* @foo(%struct.ST* %s) nounwind uwtable readnone optsize ssp {
    entry:
      %arrayidx = getelementptr inbounds %struct.ST, %struct.ST* %s, i64 1, i32 2, i32 1, i64 5, i64 13
      ret i32* %arrayidx
    }

Semantics:
""""""""""

In the example above, the first index is indexing into the
'``%struct.ST*``' type, which is a pointer, yielding a '``%struct.ST``'
= '``{ i32, double, %struct.RT }``' type, a structure. The second index
indexes into the third element of the structure, yielding a
'``%struct.RT``' = '``{ i8 , [10 x [20 x i32]], i8 }``' type, another
structure. The third index indexes into the second element of the
structure, yielding a '``[10 x [20 x i32]]``' type, an array. The two
dimensions of the array are subscripted into, yielding an '``i32``'
type. The '``getelementptr``' instruction returns a pointer to this
element, thus computing a value of '``i32*``' type.
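
The same address can be written as a single byte offset once a concrete
layout is fixed. The sketch below assumes a layout that this document does
not specify: ``i32`` occupies 4 bytes with 4-byte alignment, ``double``
occupies 8 bytes with 8-byte alignment, and neither structure is packed. The
function name ``@foo_bytes`` is hypothetical.

.. code-block:: llvm

   %struct.RT = type { i8, [10 x [20 x i32]], i8 }
   %struct.ST = type { i32, double, %struct.RT }

   ; Under the assumed layout:
   ;   %struct.RT: i8 at 0, array at 4 (800 bytes), i8 at 804, size 808
   ;   %struct.ST: i32 at 0, double at 8, %struct.RT at 16,     size 824
   ;   index 1  (whole %struct.ST)   1 * 824 = 824
   ;   field 2  (%struct.RT)                    16
   ;   field 1  (the array)                      4
   ;   index 5  ([20 x i32] rows)    5 * 80  = 400
   ;   index 13 (i32 elements)       13 * 4  =  52
   ;   total                                  1296
   define i32* @foo_bytes(%struct.ST* %s) {
   entry:
     %base = bitcast %struct.ST* %s to i8*
     %addr = getelementptr i8, i8* %base, i64 1296
     %res = bitcast i8* %addr to i32*
     ret i32* %res
   }

On a target with a different data layout the constants change, which is one
reason to leave the arithmetic to ``getelementptr``.
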

Note that it is perfectly legal to index partially through a structure,
returning a pointer to an inner element. Because of this, the LLVM code
for the given testcase is equivalent to:

.. code-block:: llvm

    define i32* @foo(%struct.ST* %s) {
      %t1 = getelementptr %struct.ST, %struct.ST* %s, i32 1                        ; yields %struct.ST*:%t1
      %t2 = getelementptr %struct.ST, %struct.ST* %t1, i32 0, i32 2                ; yields %struct.RT*:%t2
      %t3 = getelementptr %struct.RT, %struct.RT* %t2, i32 0, i32 1                ; yields [10 x [20 x i32]]*:%t3
      %t4 = getelementptr [10 x [20 x i32]], [10 x [20 x i32]]* %t3, i32 0, i32 5  ; yields [20 x i32]*:%t4
      %t5 = getelementptr [20 x i32], [20 x i32]* %t4, i32 0, i32 13               ; yields i32*:%t5
      ret i32* %t5
    }

If the ``inbounds`` keyword is present, the result value of the
``getelementptr`` is a :ref:`poison value <poisonvalues>` if the base
pointer is not an *in bounds* address of an allocated object, or if any
of the addresses that would be formed by successive addition of the
offsets implied by the indices to the base address with infinitely
precise signed arithmetic are not an *in bounds* address of that
allocated object. The *in bounds* addresses for an allocated object are
all the addresses that point into the object, plus the address one byte
past the end. In cases where the base is a vector of pointers the
``inbounds`` keyword applies to each of the computations element-wise.

If the ``inbounds`` keyword is not present, the offsets are added to the
base address with silently-wrapping two's complement arithmetic. If the
offsets have a different width from the pointer, they are sign-extended
or truncated to the width of the pointer. The result value of the
``getelementptr`` may be outside the object pointed to by the base
pointer. The result value may not necessarily be used to access memory
though, even if it happens to point into allocated storage. See the
:ref:`Pointer Aliasing Rules <pointeraliasing>` section for more
information.
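
The difference between the two forms can be sketched with a small array. The
global ``@arr4`` is hypothetical, and the instructions are assumed to appear
inside some function body.

.. code-block:: llvm

   @arr4 = global [4 x i32] zeroinitializer    ; a 16-byte allocated object

   ; Offset 16: one byte past the end, still an in bounds address.
   %end = getelementptr inbounds [4 x i32], [4 x i32]* @arr4, i64 0, i64 4

   ; Offset 20: not an in bounds address, so the result is a poison value.
   %bad = getelementptr inbounds [4 x i32], [4 x i32]* @arr4, i64 0, i64 5

   ; Offset 20 without inbounds: an ordinary (wrapping) address computation,
   ; although the result may still not be usable to access memory.
   %far = getelementptr [4 x i32], [4 x i32]* @arr4, i64 0, i64 5
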

The getelementptr instruction is often confusing. For some more insight
into how it works, see :doc:`the getelementptr FAQ <GetElementPtr>`.

Example:
""""""""

.. code-block:: llvm

        ; yields [12 x i8]*:aptr
        %aptr = getelementptr {i32, [12 x i8]}, {i32, [12 x i8]}* %saptr, i64 0, i32 1
        ; yields i8*:vptr
        %vptr = getelementptr {i32, <2 x i8>}, {i32, <2 x i8>}* %svptr, i64 0, i32 1, i32 1
        ; yields i8*:eptr
        %eptr = getelementptr [12 x i8], [12 x i8]* %aptr, i64 0, i32 1
        ; yields i32*:iptr
        %iptr = getelementptr [10 x i32], [10 x i32]* @arr, i16 0, i16 0

Elena Demikhovsky37a4da82015-07-09 07:42:48 +00007588Vector of pointers:
7589"""""""""""""""""""
7590
7591The ``getelementptr`` returns a vector of pointers, instead of a single address,
7592when one or more of its arguments is a vector. In such cases, all vector
7593arguments should have the same number of elements, and every scalar argument
7594will be effectively broadcast into a vector during address calculation.
Sean Silvab084af42012-12-07 10:36:55 +00007595
7596.. code-block:: llvm
7597
Elena Demikhovsky37a4da82015-07-09 07:42:48 +00007598 ; All arguments are vectors:
7599 ; A[i] = ptrs[i] + offsets[i]*sizeof(i8)
7600 %A = getelementptr i8, <4 x i8*> %ptrs, <4 x i64> %offsets
Sean Silva706fba52015-08-06 22:56:24 +00007601
Elena Demikhovsky37a4da82015-07-09 07:42:48 +00007602 ; Add the same scalar offset to each pointer of a vector:
7603 ; A[i] = ptrs[i] + offset*sizeof(i8)
7604 %A = getelementptr i8, <4 x i8*> %ptrs, i64 %offset
Sean Silva706fba52015-08-06 22:56:24 +00007605
Elena Demikhovsky37a4da82015-07-09 07:42:48 +00007606 ; Add distinct offsets to the same pointer:
7607 ; A[i] = ptr + offsets[i]*sizeof(i8)
7608 %A = getelementptr i8, i8* %ptr, <4 x i64> %offsets
Sean Silva706fba52015-08-06 22:56:24 +00007609
Elena Demikhovsky37a4da82015-07-09 07:42:48 +00007610 ; In all cases described above the type of the result is <4 x i8*>
7611
7612The two following instructions are equivalent:
7613
7614.. code-block:: llvm
7615
7616 getelementptr %struct.ST, <4 x %struct.ST*> %s, <4 x i64> %ind1,
7617 <4 x i32> <i32 2, i32 2, i32 2, i32 2>,
7618 <4 x i32> <i32 1, i32 1, i32 1, i32 1>,
7619 <4 x i32> %ind4,
7620 <4 x i64> <i64 13, i64 13, i64 13, i64 13>
Sean Silva706fba52015-08-06 22:56:24 +00007621
Elena Demikhovsky37a4da82015-07-09 07:42:48 +00007622 getelementptr %struct.ST, <4 x %struct.ST*> %s, <4 x i64> %ind1,
7623 i32 2, i32 1, <4 x i32> %ind4, i64 13
7624
The following C loop is a case where the vector version of
``getelementptr`` is useful:

.. code-block:: c

   // Let's assume that we vectorize the following loop:
   double *A, *B; int *C;
   for (int i = 0; i < size; ++i) {
     A[i] = B[C[i]];
   }

.. code-block:: llvm

   ; get pointers for 8 elements from array B
   %ptrs = getelementptr double, double* %B, <8 x i32> %C
   ; load 8 elements from array B into A
   %A = call <8 x double> @llvm.masked.gather.v8f64(<8 x double*> %ptrs,
        i32 8, <8 x i1> %mask, <8 x double> %passthru)
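
For comparison, a scalar C sketch of what one 8-wide gather step computes is
shown below. The function name ``gather8`` is hypothetical; the ``mask`` and
``passthru`` parameters mirror the operands of the intrinsic call above, and
the sketch makes no claim about how a target actually lowers the gather:

.. code-block:: c

   #include <stdbool.h>
   #include <stddef.h>

   /* Lane i reads B[C[i]] when mask[i] is set and keeps passthru[i]
    * otherwise. Masked-off lanes never touch memory. */
   static void gather8(double out[8], const double *B, const int C[8],
                       const bool mask[8], const double passthru[8]) {
       for (size_t i = 0; i < 8; ++i)
           out[i] = mask[i] ? B[C[i]] : passthru[i];
   }

The indexing ``B[C[i]]`` is the per-lane address that the vector
``getelementptr`` produces in ``%ptrs``.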
Sean Silvab084af42012-12-07 10:36:55 +00007643
7644Conversion Operations
7645---------------------
7646
7647The instructions in this category are the conversion instructions
7648(casting) which all take a single operand and a type. They perform
7649various bit conversions on the operand.
7650
7651'``trunc .. to``' Instruction
7652^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7653
7654Syntax:
7655"""""""
7656
7657::
7658
7659 <result> = trunc <ty> <value> to <ty2> ; yields ty2
7660
7661Overview:
7662"""""""""
7663
7664The '``trunc``' instruction truncates its operand to the type ``ty2``.
7665
7666Arguments:
7667""""""""""
7668
The '``trunc``' instruction takes a value to truncate and a type to
truncate it to. Both types must be :ref:`integer <t_integer>` types, or
vectors of the same number of integers. The bit size of the ``value`` must
be larger than the bit size of the destination type, ``ty2``. Equal sized
types are not allowed.
7674
7675Semantics:
7676""""""""""
7677
7678The '``trunc``' instruction truncates the high order bits in ``value``
7679and converts the remaining bits to ``ty2``. Since the source size must
7680be larger than the destination size, ``trunc`` cannot be a *no-op cast*.
7681It will always truncate bits.
7682
7683Example:
7684""""""""
7685
7686.. code-block:: llvm
7687
7688 %X = trunc i32 257 to i8 ; yields i8:1
7689 %Y = trunc i32 123 to i1 ; yields i1:true
7690 %Z = trunc i32 122 to i1 ; yields i1:false
7691 %W = trunc <2 x i16> <i16 8, i16 7> to <2 x i8> ; yields <i8 8, i8 7>
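
The scalar examples above can be mirrored in C with unsigned arithmetic,
where discarding high bits is well defined. This is only a sketch of the bit
manipulation; the helper names are hypothetical:

.. code-block:: c

   #include <stdint.h>

   /* trunc i32 -> i8: keep the low 8 bits, drop the rest. */
   static uint8_t trunc_i32_to_i8(uint32_t value) {
       return (uint8_t)(value & 0xFFu);
   }

   /* trunc i32 -> i1: only the lowest bit survives. */
   static unsigned trunc_i32_to_i1(uint32_t value) {
       return value & 1u;
   }

With these helpers, 257 (``0x101``) truncates to 1, 123 (odd) truncates to
``true`` and 122 (even) truncates to ``false``.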
7692
7693'``zext .. to``' Instruction
7694^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7695
7696Syntax:
7697"""""""
7698
7699::
7700
7701 <result> = zext <ty> <value> to <ty2> ; yields ty2
7702
7703Overview:
7704"""""""""
7705
7706The '``zext``' instruction zero extends its operand to type ``ty2``.
7707
7708Arguments:
7709""""""""""
7710
The '``zext``' instruction takes a value to cast, and a type to cast it
to. Both types must be :ref:`integer <t_integer>` types, or vectors of
the same number of integers. The bit size of the ``value`` must be
smaller than the bit size of the destination type, ``ty2``.
7715
7716Semantics:
7717""""""""""
7718
7719The ``zext`` fills the high order bits of the ``value`` with zero bits
7720until it reaches the size of the destination type, ``ty2``.
7721
7722When zero extending from i1, the result will always be either 0 or 1.
7723
7724Example:
7725""""""""
7726
7727.. code-block:: llvm
7728
7729 %X = zext i32 257 to i64 ; yields i64:257
7730 %Y = zext i1 true to i32 ; yields i32:1
7731 %Z = zext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>
7732
7733'``sext .. to``' Instruction
7734^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7735
7736Syntax:
7737"""""""
7738
7739::
7740
7741 <result> = sext <ty> <value> to <ty2> ; yields ty2
7742
7743Overview:
7744"""""""""
7745
7746The '``sext``' sign extends ``value`` to the type ``ty2``.
7747
7748Arguments:
7749""""""""""
7750
The '``sext``' instruction takes a value to cast, and a type to cast it
to. Both types must be :ref:`integer <t_integer>` types, or vectors of
the same number of integers. The bit size of the ``value`` must be
smaller than the bit size of the destination type, ``ty2``.
7755
7756Semantics:
7757""""""""""
7758
7759The '``sext``' instruction performs a sign extension by copying the sign
7760bit (highest order bit) of the ``value`` until it reaches the bit size
7761of the type ``ty2``.
7762
7763When sign extending from i1, the extension always results in -1 or 0.
7764
7765Example:
7766""""""""
7767
7768.. code-block:: llvm
7769
   %X = sext i8  -1 to i16              ; yields i16:-1 (bit pattern 0xFFFF, i.e. 65535 if read as unsigned)
   %Y = sext i1 true to i32             ; yields i32:-1
   %Z = sext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>
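
The difference between ``zext`` and ``sext`` is easiest to see on a single
bit pattern. The C sketch below works purely on unsigned bit patterns so that
nothing depends on implementation-defined signed conversions; the helper
names are hypothetical:

.. code-block:: c

   #include <stdbool.h>
   #include <stdint.h>

   /* zext i8 -> i16: the new high bits are zero. */
   static uint16_t zext_i8_to_i16(uint8_t value) {
       return (uint16_t)value;
   }

   /* sext i8 -> i16: the new high bits are copies of bit 7. */
   static uint16_t sext_i8_to_i16(uint8_t value) {
       return (value & 0x80u) ? (uint16_t)(0xFF00u | value) : (uint16_t)value;
   }

   /* sext i1 -> i32: the single bit is replicated into every position. */
   static uint32_t sext_i1_to_i32(bool value) {
       return value ? 0xFFFFFFFFu : 0u;
   }

For the input ``0xFF``, zero extension gives ``0x00FF`` (255) while sign
extension gives ``0xFFFF``, which is -1 as a signed ``i16``.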
7773
7774'``fptrunc .. to``' Instruction
7775^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7776
7777Syntax:
7778"""""""
7779
7780::
7781
7782 <result> = fptrunc <ty> <value> to <ty2> ; yields ty2
7783
7784Overview:
7785"""""""""
7786
7787The '``fptrunc``' instruction truncates ``value`` to type ``ty2``.
7788
7789Arguments:
7790""""""""""
7791
7792The '``fptrunc``' instruction takes a :ref:`floating point <t_floating>`
7793value to cast and a :ref:`floating point <t_floating>` type to cast it to.
7794The size of ``value`` must be larger than the size of ``ty2``. This
7795implies that ``fptrunc`` cannot be used to make a *no-op cast*.
7796
7797Semantics:
7798""""""""""
7799
The '``fptrunc``' instruction casts a ``value`` from a larger
:ref:`floating point <t_floating>` type to a smaller :ref:`floating
point <t_floating>` type. If the value cannot fit (i.e. overflows) within the
destination type, ``ty2``, then the results are undefined. If the cast produces
an inexact result, how rounding is performed (e.g. truncation, also known as
round to zero) is undefined.

Example:
""""""""
7809
7810.. code-block:: llvm
7811
7812 %X = fptrunc double 123.0 to float ; yields float:123.0
7813 %Y = fptrunc double 1.0E+300 to float ; yields undefined
7814
7815'``fpext .. to``' Instruction
7816^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7817
7818Syntax:
7819"""""""
7820
7821::
7822
7823 <result> = fpext <ty> <value> to <ty2> ; yields ty2
7824
7825Overview:
7826"""""""""
7827
7828The '``fpext``' extends a floating point ``value`` to a larger floating
7829point value.
7830
7831Arguments:
7832""""""""""
7833
7834The '``fpext``' instruction takes a :ref:`floating point <t_floating>`
7835``value`` to cast, and a :ref:`floating point <t_floating>` type to cast it
7836to. The source type must be smaller than the destination type.
7837
7838Semantics:
7839""""""""""
7840
7841The '``fpext``' instruction extends the ``value`` from a smaller
7842:ref:`floating point <t_floating>` type to a larger :ref:`floating
7843point <t_floating>` type. The ``fpext`` cannot be used to make a
7844*no-op cast* because it always changes bits. Use ``bitcast`` to make a
7845*no-op cast* for a floating point cast.
7846
7847Example:
7848""""""""
7849
7850.. code-block:: llvm
7851
7852 %X = fpext float 3.125 to double ; yields double:3.125000e+00
7853 %Y = fpext double %X to fp128 ; yields fp128:0xL00000000000000004000900000000000
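
The asymmetry between ``fpext`` and ``fptrunc`` can be observed from C, whose
``float`` to ``double`` conversion is likewise exact and whose ``double`` to
``float`` conversion may round. The sketch assumes finite, non-NaN inputs
that lie within the range of ``float``; the helper names are hypothetical:

.. code-block:: c

   #include <stdbool.h>

   /* Extending and then truncating again always recovers the input. */
   static bool fpext_round_trips(float value) {
       double wide = (double)value;   /* like fpext: no information lost */
       return (float)wide == value;   /* like fptrunc of a representable value */
   }

   /* Truncating and then extending only recovers the input when the
    * double was exactly representable as a float to begin with. */
   static bool fptrunc_is_exact(double value) {
       float narrow = (float)value;   /* like fptrunc: may round */
       return (double)narrow == value;
   }

For instance, 123.0 survives truncation unchanged, whereas 0.1 (which has no
exact ``float`` representation) does not.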
7854
7855'``fptoui .. to``' Instruction
7856^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7857
7858Syntax:
7859"""""""
7860
7861::
7862
7863 <result> = fptoui <ty> <value> to <ty2> ; yields ty2
7864
7865Overview:
7866"""""""""
7867
7868The '``fptoui``' converts a floating point ``value`` to its unsigned
7869integer equivalent of type ``ty2``.
7870
7871Arguments:
7872""""""""""
7873
The '``fptoui``' instruction takes a value to cast, which must be a
scalar or vector :ref:`floating point <t_floating>` value, and a type to
cast it to, ``ty2``, which must be an :ref:`integer <t_integer>` type. If
``ty`` is a vector floating point type, ``ty2`` must be a vector integer
type with the same number of elements as ``ty``.
7879
7880Semantics:
7881""""""""""
7882
7883The '``fptoui``' instruction converts its :ref:`floating
7884point <t_floating>` operand into the nearest (rounding towards zero)
7885unsigned integer value. If the value cannot fit in ``ty2``, the results
7886are undefined.
7887
7888Example:
7889""""""""
7890
7891.. code-block:: llvm
7892
7893 %X = fptoui double 123.0 to i32 ; yields i32:123
7894 %Y = fptoui float 1.0E+300 to i1 ; yields undefined:1
7895 %Z = fptoui float 1.04E+17 to i8 ; yields undefined:1
7896
7897'``fptosi .. to``' Instruction
7898^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7899
7900Syntax:
7901"""""""
7902
7903::
7904
7905 <result> = fptosi <ty> <value> to <ty2> ; yields ty2
7906
7907Overview:
7908"""""""""
7909
7910The '``fptosi``' instruction converts :ref:`floating point <t_floating>`
7911``value`` to type ``ty2``.
7912
7913Arguments:
7914""""""""""
7915
The '``fptosi``' instruction takes a value to cast, which must be a
scalar or vector :ref:`floating point <t_floating>` value, and a type to
cast it to, ``ty2``, which must be an :ref:`integer <t_integer>` type. If
``ty`` is a vector floating point type, ``ty2`` must be a vector integer
type with the same number of elements as ``ty``.
7921
7922Semantics:
7923""""""""""
7924
7925The '``fptosi``' instruction converts its :ref:`floating
7926point <t_floating>` operand into the nearest (rounding towards zero)
7927signed integer value. If the value cannot fit in ``ty2``, the results
7928are undefined.
7929
7930Example:
7931""""""""
7932
7933.. code-block:: llvm
7934
7935 %X = fptosi double -123.0 to i32 ; yields i32:-123
7936 %Y = fptosi float 1.0E-247 to i1 ; yields undefined:1
7937 %Z = fptosi float 1.04E+17 to i8 ; yields undefined:1
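
Both ``fptoui`` and ``fptosi`` round towards zero, which matches the behavior
of a C cast from a floating point type to an integer type. C also leaves
out-of-range conversions undefined, so the sketch below only makes sense for
inputs whose truncated value fits the result type. The helper names are
hypothetical:

.. code-block:: c

   #include <stdint.h>

   /* Like fptosi double -> i32: the fraction is discarded. */
   static int32_t fptosi_f64_to_i32(double value) {
       return (int32_t)value;
   }

   /* Like fptoui double -> i32. */
   static uint32_t fptoui_f64_to_i32(double value) {
       return (uint32_t)value;
   }

Note that -123.9 converts to -123 rather than -124, and that 3000000000.0 is
in range for the unsigned conversion but not for the signed one.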
7938
7939'``uitofp .. to``' Instruction
7940^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7941
7942Syntax:
7943"""""""
7944
7945::
7946
7947 <result> = uitofp <ty> <value> to <ty2> ; yields ty2
7948
7949Overview:
7950"""""""""
7951
7952The '``uitofp``' instruction regards ``value`` as an unsigned integer
7953and converts that value to the ``ty2`` type.
7954
7955Arguments:
7956""""""""""
7957
The '``uitofp``' instruction takes a value to cast, which must be a
scalar or vector :ref:`integer <t_integer>` value, and a type to cast it
to, ``ty2``, which must be a :ref:`floating point <t_floating>` type. If
``ty`` is a vector integer type, ``ty2`` must be a vector floating point
type with the same number of elements as ``ty``.
7963
7964Semantics:
7965""""""""""
7966
7967The '``uitofp``' instruction interprets its operand as an unsigned
7968integer quantity and converts it to the corresponding floating point
7969value. If the value cannot fit in the floating point value, the results
7970are undefined.
7971
7972Example:
7973""""""""
7974
7975.. code-block:: llvm
7976
7977 %X = uitofp i32 257 to float ; yields float:257.0
7978 %Y = uitofp i8 -1 to double ; yields double:255.0
7979
7980'``sitofp .. to``' Instruction
7981^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7982
7983Syntax:
7984"""""""
7985
7986::
7987
7988 <result> = sitofp <ty> <value> to <ty2> ; yields ty2
7989
7990Overview:
7991"""""""""
7992
7993The '``sitofp``' instruction regards ``value`` as a signed integer and
7994converts that value to the ``ty2`` type.
7995
7996Arguments:
7997""""""""""
7998
The '``sitofp``' instruction takes a value to cast, which must be a
scalar or vector :ref:`integer <t_integer>` value, and a type to cast it
to, ``ty2``, which must be a :ref:`floating point <t_floating>` type. If
``ty`` is a vector integer type, ``ty2`` must be a vector floating point
type with the same number of elements as ``ty``.
8004
8005Semantics:
8006""""""""""
8007
8008The '``sitofp``' instruction interprets its operand as a signed integer
8009quantity and converts it to the corresponding floating point value. If
8010the value cannot fit in the floating point value, the results are
8011undefined.
8012
8013Example:
8014""""""""
8015
8016.. code-block:: llvm
8017
8018 %X = sitofp i32 257 to float ; yields float:257.0
8019 %Y = sitofp i8 -1 to double ; yields double:-1.0
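
The only difference between ``uitofp`` and ``sitofp`` is how the integer bits
are interpreted before conversion. The following C sketch feeds the same
8-bit pattern to both interpretations; the helper names are hypothetical:

.. code-block:: c

   #include <stdint.h>

   /* Like uitofp i8 -> double: the bits are read as an unsigned quantity. */
   static double uitofp_i8_to_f64(uint8_t bits) {
       return (double)bits;
   }

   /* Like sitofp i8 -> double: the same bits are read as two's complement. */
   static double sitofp_i8_to_f64(uint8_t bits) {
       int value = (bits & 0x80u) ? (int)bits - 256 : (int)bits;
       return (double)value;
   }

The pattern ``0xFF`` (written ``i8 -1`` in the examples above) therefore
becomes 255.0 under ``uitofp`` and -1.0 under ``sitofp``.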
8020
8021.. _i_ptrtoint:
8022
8023'``ptrtoint .. to``' Instruction
8024^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8025
8026Syntax:
8027"""""""
8028
8029::
8030
8031 <result> = ptrtoint <ty> <value> to <ty2> ; yields ty2
8032
8033Overview:
8034"""""""""
8035
8036The '``ptrtoint``' instruction converts the pointer or a vector of
8037pointers ``value`` to the integer (or vector of integers) type ``ty2``.
8038
8039Arguments:
8040""""""""""
8041
The '``ptrtoint``' instruction takes a ``value`` to cast, which must be
a value of type :ref:`pointer <t_pointer>` or a vector of pointers, and a
type to cast it to, ``ty2``, which must be an :ref:`integer <t_integer>`
type or a vector of integers.
8046
8047Semantics:
8048""""""""""
8049
8050The '``ptrtoint``' instruction converts ``value`` to integer type
8051``ty2`` by interpreting the pointer value as an integer and either
8052truncating or zero extending that value to the size of the integer type.
8053If ``value`` is smaller than ``ty2`` then a zero extension is done. If
8054``value`` is larger than ``ty2`` then a truncation is done. If they are
8055the same size, then nothing is done (*no-op cast*) other than a type
8056change.
8057
8058Example:
8059""""""""
8060
8061.. code-block:: llvm
8062
8063 %X = ptrtoint i32* %P to i8 ; yields truncation on 32-bit architecture
8064 %Y = ptrtoint i32* %P to i64 ; yields zero extension on 32-bit architecture
8065 %Z = ptrtoint <4 x i32*> %P to <4 x i64>; yields vector zero extension for a vector of addresses on 32-bit architecture
8066
8067.. _i_inttoptr:
8068
8069'``inttoptr .. to``' Instruction
8070^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8071
8072Syntax:
8073"""""""
8074
8075::
8076
8077 <result> = inttoptr <ty> <value> to <ty2> ; yields ty2
8078
8079Overview:
8080"""""""""
8081
8082The '``inttoptr``' instruction converts an integer ``value`` to a
8083pointer type, ``ty2``.
8084
8085Arguments:
8086""""""""""
8087
8088The '``inttoptr``' instruction takes an :ref:`integer <t_integer>` value to
8089cast, and a type to cast it to, which must be a :ref:`pointer <t_pointer>`
8090type.
8091
8092Semantics:
8093""""""""""
8094
8095The '``inttoptr``' instruction converts ``value`` to type ``ty2`` by
8096applying either a zero extension or a truncation depending on the size
8097of the integer ``value``. If ``value`` is larger than the size of a
8098pointer then a truncation is done. If ``value`` is smaller than the size
8099of a pointer then a zero extension is done. If they are the same size,
8100nothing is done (*no-op cast*).
8101
8102Example:
8103""""""""
8104
8105.. code-block:: llvm
8106
   %X = inttoptr i32 255 to i32*          ; yields zero extension on 64-bit architecture
   %Y = inttoptr i32 255 to i32*          ; yields no-op on 32-bit architecture
   %Z = inttoptr i64 0 to i32*            ; yields truncation on 32-bit architecture
   %W = inttoptr <4 x i32> %G to <4 x i8*> ; converts each element of vector G to a pointer
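
C offers a loose analogue of ``ptrtoint`` and ``inttoptr`` in its casts
between pointers and ``uintptr_t``. The sketch below assumes a platform that
provides ``uintptr_t`` (an optional type in C); the helper names are
hypothetical:

.. code-block:: c

   #include <stdint.h>

   /* Pointer -> pointer-sized integer -> pointer: the same-size case,
    * where neither conversion changes any bits. */
   static void *round_trip(void *pointer) {
       uintptr_t bits = (uintptr_t)pointer;   /* like ptrtoint */
       return (void *)bits;                   /* like inttoptr */
   }

   /* Converting to a narrower integer keeps only the low bits, as in
    * ``ptrtoint i32* %P to i8``. */
   static uint8_t low_byte(void *pointer) {
       return (uint8_t)((uintptr_t)pointer & 0xFFu);
   }

A pointer converted to a narrower integer cannot, in general, be recovered by
converting back.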
8111
8112.. _i_bitcast:
8113
8114'``bitcast .. to``' Instruction
8115^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8116
8117Syntax:
8118"""""""
8119
8120::
8121
8122 <result> = bitcast <ty> <value> to <ty2> ; yields ty2
8123
8124Overview:
8125"""""""""
8126
8127The '``bitcast``' instruction converts ``value`` to type ``ty2`` without
8128changing any bits.
8129
8130Arguments:
8131""""""""""
8132
The '``bitcast``' instruction takes a value to cast, which must be a
non-aggregate first class value, and a type to cast it to, which must
also be a non-aggregate :ref:`first class <t_firstclass>` type. The
bit sizes of ``value`` and the destination type, ``ty2``, must be
identical. If the source type is a pointer, the destination type must
also be a pointer of the same size. This instruction supports bitwise
conversion of vectors to integers and to vectors of other types (as
long as they have the same size).

Semantics:
""""""""""
8144
The '``bitcast``' instruction converts ``value`` to type ``ty2``. It
is always a *no-op cast* because no bits change with this
conversion. The conversion is done as if the ``value`` had been stored
to memory and read back as type ``ty2``. Pointer (or vector of
pointers) types may only be converted to other pointer (or vector of
pointers) types with the same address space through this instruction.
To convert pointers to other types, use the :ref:`inttoptr <i_inttoptr>`
or :ref:`ptrtoint <i_ptrtoint>` instructions first.

Example:
""""""""
8156
.. code-block:: text

   %X = bitcast i8 255 to i8                ; yields i8:-1
   %Y = bitcast i32* %x to i16*             ; yields i16*:%x
   %Z = bitcast <2 x i32> %V to i64         ; yields i64:%V
   %W = bitcast <2 x i32*> %P to <2 x i64*> ; yields <2 x i64*>
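
The "stored to memory and read back" description corresponds closely to the
``memcpy`` idiom in C. The sketch below assumes that ``float`` is an IEEE-754
binary32 value, as it is on common targets; the helper names are
hypothetical:

.. code-block:: c

   #include <stdint.h>
   #include <string.h>

   /* Like bitcast float -> i32: same 32 bits, new type. */
   static uint32_t bitcast_f32_to_i32(float value) {
       uint32_t bits;
       memcpy(&bits, &value, sizeof bits);
       return bits;
   }

   /* Like bitcast i32 -> float: the inverse reinterpretation. */
   static float bitcast_i32_to_f32(uint32_t bits) {
       float value;
       memcpy(&value, &bits, sizeof value);
       return value;
   }

Under that assumption ``1.0f`` has the bit pattern ``0x3F800000``. This is a
different operation from ``fptoui``, which would convert ``1.0`` to the
integer 1.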
8163
.. _i_addrspacecast:
8165
8166'``addrspacecast .. to``' Instruction
8167^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
8168
8169Syntax:
8170"""""""
8171
8172::
8173
8174 <result> = addrspacecast <pty> <ptrval> to <pty2> ; yields pty2
8175
8176Overview:
8177"""""""""
8178
8179The '``addrspacecast``' instruction converts ``ptrval`` from ``pty`` in
8180address space ``n`` to type ``pty2`` in address space ``m``.
8181
8182Arguments:
8183""""""""""
8184
The '``addrspacecast``' instruction takes a pointer or vector of pointers
value to cast and a pointer type to cast it to, which must have a different
address space.
8188
8189Semantics:
8190""""""""""
8191
The '``addrspacecast``' instruction converts the pointer value
``ptrval`` to type ``pty2``. It can be a *no-op cast* or a complex
value modification, depending on the target and the address space
pair. Pointer conversions within the same address space must be
performed with the ``bitcast`` instruction. Note that if the address space
conversion is legal then both result and operand refer to the same memory
location.
8199
8200Example:
8201""""""""
8202
8203.. code-block:: llvm
8204
   %X = addrspacecast i32* %x to i32 addrspace(1)*    ; yields i32 addrspace(1)*:%x
   %Y = addrspacecast i32 addrspace(1)* %y to i64 addrspace(2)*    ; yields i64 addrspace(2)*:%y
   %Z = addrspacecast <4 x i32*> %z to <4 x float addrspace(3)*>   ; yields <4 x float addrspace(3)*>:%z

.. _otherops:
8210
8211Other Operations
8212----------------
8213
8214The instructions in this category are the "miscellaneous" instructions,
8215which defy better classification.
8216
8217.. _i_icmp:
8218
8219'``icmp``' Instruction
8220^^^^^^^^^^^^^^^^^^^^^^
8221
8222Syntax:
8223"""""""
8224
8225::
8226
   <result> = icmp <cond> <ty> <op1>, <op2>   ; yields i1 or <N x i1>:result

Overview:
"""""""""
8231
8232The '``icmp``' instruction returns a boolean value or a vector of
8233boolean values based on comparison of its two integer, integer vector,
8234pointer, or pointer vector operands.
8235
8236Arguments:
8237""""""""""
8238
The '``icmp``' instruction takes three operands. The first operand is
the condition code indicating the kind of comparison to perform. It is
not a value, just a keyword. The possible condition codes are:

#. ``eq``: equal
#. ``ne``: not equal
#. ``ugt``: unsigned greater than
#. ``uge``: unsigned greater or equal
#. ``ult``: unsigned less than
#. ``ule``: unsigned less or equal
#. ``sgt``: signed greater than
#. ``sge``: signed greater or equal
#. ``slt``: signed less than
#. ``sle``: signed less or equal
8253
The remaining two arguments must be of :ref:`integer <t_integer>` or
:ref:`pointer <t_pointer>` type, or a :ref:`vector <t_vector>` of integers
or pointers. They must also have identical types.
8257
8258Semantics:
8259""""""""""
8260
8261The '``icmp``' compares ``op1`` and ``op2`` according to the condition
8262code given as ``cond``. The comparison performed always yields either an
8263:ref:`i1 <t_integer>` or vector of ``i1`` result, as follows:
8264
8265#. ``eq``: yields ``true`` if the operands are equal, ``false``
8266 otherwise. No sign interpretation is necessary or performed.
8267#. ``ne``: yields ``true`` if the operands are unequal, ``false``
8268 otherwise. No sign interpretation is necessary or performed.
8269#. ``ugt``: interprets the operands as unsigned values and yields
8270 ``true`` if ``op1`` is greater than ``op2``.
8271#. ``uge``: interprets the operands as unsigned values and yields
8272 ``true`` if ``op1`` is greater than or equal to ``op2``.
8273#. ``ult``: interprets the operands as unsigned values and yields
8274 ``true`` if ``op1`` is less than ``op2``.
8275#. ``ule``: interprets the operands as unsigned values and yields
8276 ``true`` if ``op1`` is less than or equal to ``op2``.
8277#. ``sgt``: interprets the operands as signed values and yields ``true``
8278 if ``op1`` is greater than ``op2``.
8279#. ``sge``: interprets the operands as signed values and yields ``true``
8280 if ``op1`` is greater than or equal to ``op2``.
8281#. ``slt``: interprets the operands as signed values and yields ``true``
8282 if ``op1`` is less than ``op2``.
8283#. ``sle``: interprets the operands as signed values and yields ``true``
8284 if ``op1`` is less than or equal to ``op2``.
8285
8286If the operands are :ref:`pointer <t_pointer>` typed, the pointer values
8287are compared as if they were integers.
8288
8289If the operands are integer vectors, then they are compared element by
8290element. The result is an ``i1`` vector with the same number of elements
8291as the values being compared. Otherwise, the result is an ``i1``.
8292
8293Example:
8294""""""""
8295
.. code-block:: text

   <result> = icmp eq i32 4, 5          ; yields: result=false
   <result> = icmp ne float* %X, %X     ; yields: result=false
   <result> = icmp ult i16  4, 5        ; yields: result=true
   <result> = icmp sgt i16  4, 5        ; yields: result=false
   <result> = icmp ule i16 -4, 5        ; yields: result=false
   <result> = icmp sge i16  4, 5        ; yields: result=false
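
The ``icmp ule i16 -4, 5`` example shows why the signedness is part of the
condition code rather than of the type. The C sketch below compares the same
16-bit patterns both ways, operating on unsigned bit patterns throughout; the
helper names are hypothetical:

.. code-block:: c

   #include <stdbool.h>
   #include <stdint.h>

   /* Like icmp ule i16: both operands are read as unsigned. */
   static bool icmp_ule_i16(uint16_t op1, uint16_t op2) {
       return op1 <= op2;
   }

   /* Like icmp sle i16: the same bits are read as two's complement.
    * Flipping the sign bit maps signed order onto unsigned order. */
   static bool icmp_sle_i16(uint16_t op1, uint16_t op2) {
       return (uint16_t)(op1 ^ 0x8000u) <= (uint16_t)(op2 ^ 0x8000u);
   }

The bit pattern of ``i16 -4`` is ``0xFFFC``, which is 65532 when read as
unsigned, so ``ule`` yields ``false`` against 5 while ``sle`` yields ``true``.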
8304
.. _i_fcmp:
8306
8307'``fcmp``' Instruction
8308^^^^^^^^^^^^^^^^^^^^^^
8309
8310Syntax:
8311"""""""
8312
8313::
8314
   <result> = fcmp [fast-math flags]* <cond> <ty> <op1>, <op2>     ; yields i1 or <N x i1>:result

Overview:
"""""""""
8319
8320The '``fcmp``' instruction returns a boolean value or vector of boolean
8321values based on comparison of its operands.
8322
8323If the operands are floating point scalars, then the result type is a
8324boolean (:ref:`i1 <t_integer>`).
8325
8326If the operands are floating point vectors, then the result type is a
8327vector of boolean with the same number of elements as the operands being
8328compared.
8329
8330Arguments:
8331""""""""""
8332
The '``fcmp``' instruction takes three operands. The first operand is
the condition code indicating the kind of comparison to perform. It is
not a value, just a keyword. The possible condition codes are:

#. ``false``: no comparison, always returns false
#. ``oeq``: ordered and equal
#. ``ogt``: ordered and greater than
#. ``oge``: ordered and greater than or equal
#. ``olt``: ordered and less than
#. ``ole``: ordered and less than or equal
#. ``one``: ordered and not equal
#. ``ord``: ordered (no nans)
#. ``ueq``: unordered or equal
#. ``ugt``: unordered or greater than
#. ``uge``: unordered or greater than or equal
#. ``ult``: unordered or less than
#. ``ule``: unordered or less than or equal
#. ``une``: unordered or not equal
#. ``uno``: unordered (either nans)
#. ``true``: no comparison, always returns true
8353
8354*Ordered* means that neither operand is a QNAN while *unordered* means
8355that either operand may be a QNAN.
8356
Each of the ``op1`` and ``op2`` arguments must be either a :ref:`floating
point <t_floating>` type or a :ref:`vector <t_vector>` of floating point
type. They must have identical types.
8360
8361Semantics:
8362""""""""""
8363
8364The '``fcmp``' instruction compares ``op1`` and ``op2`` according to the
8365condition code given as ``cond``. If the operands are vectors, then the
8366vectors are compared element by element. Each comparison performed
8367always yields an :ref:`i1 <t_integer>` result, as follows:
8368
8369#. ``false``: always yields ``false``, regardless of operands.
8370#. ``oeq``: yields ``true`` if both operands are not a QNAN and ``op1``
8371 is equal to ``op2``.
8372#. ``ogt``: yields ``true`` if both operands are not a QNAN and ``op1``
8373 is greater than ``op2``.
8374#. ``oge``: yields ``true`` if both operands are not a QNAN and ``op1``
8375 is greater than or equal to ``op2``.
8376#. ``olt``: yields ``true`` if both operands are not a QNAN and ``op1``
8377 is less than ``op2``.
8378#. ``ole``: yields ``true`` if both operands are not a QNAN and ``op1``
8379 is less than or equal to ``op2``.
8380#. ``one``: yields ``true`` if both operands are not a QNAN and ``op1``
8381 is not equal to ``op2``.
8382#. ``ord``: yields ``true`` if both operands are not a QNAN.
8383#. ``ueq``: yields ``true`` if either operand is a QNAN or ``op1`` is
8384 equal to ``op2``.
8385#. ``ugt``: yields ``true`` if either operand is a QNAN or ``op1`` is
8386 greater than ``op2``.
8387#. ``uge``: yields ``true`` if either operand is a QNAN or ``op1`` is
8388 greater than or equal to ``op2``.
8389#. ``ult``: yields ``true`` if either operand is a QNAN or ``op1`` is
8390 less than ``op2``.
8391#. ``ule``: yields ``true`` if either operand is a QNAN or ``op1`` is
8392 less than or equal to ``op2``.
8393#. ``une``: yields ``true`` if either operand is a QNAN or ``op1`` is
8394 not equal to ``op2``.
8395#. ``uno``: yields ``true`` if either operand is a QNAN.
8396#. ``true``: always yields ``true``, regardless of operands.
8397
The ``fcmp`` instruction can also optionally take any number of
:ref:`fast-math flags <fastmath>`, which are optimization hints to enable
otherwise unsafe floating point optimizations.

Any set of fast-math flags is legal on an ``fcmp`` instruction, but the
only flags that have any effect on its semantics are those that allow
assumptions to be made about the values of input arguments; namely
``nnan``, ``ninf``, and ``nsz``. See :ref:`fastmath` for more information.
8406
Example:
""""""""

.. code-block:: text

   <result> = fcmp oeq float 4.0, 5.0    ; yields: result=false
   <result> = fcmp one float 4.0, 5.0    ; yields: result=true
   <result> = fcmp olt float 4.0, 5.0    ; yields: result=true
   <result> = fcmp ueq double 1.0, 2.0   ; yields: result=false
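
The examples above do not involve NaN operands, which is where the ordered
and unordered predicates differ. The C sketch below models a few of them
following the definitions in the Semantics section. It treats any NaN
reported by ``isnan`` as unordered, and the helper names are hypothetical:

.. code-block:: c

   #include <math.h>
   #include <stdbool.h>

   static bool fcmp_uno(double op1, double op2) {
       return isnan(op1) || isnan(op2);
   }

   static bool fcmp_ord(double op1, double op2) {
       return !fcmp_uno(op1, op2);
   }

   static bool fcmp_oeq(double op1, double op2) {
       return fcmp_ord(op1, op2) && op1 == op2;
   }

   static bool fcmp_ueq(double op1, double op2) {
       return fcmp_uno(op1, op2) || op1 == op2;
   }

   static bool fcmp_one(double op1, double op2) {
       return fcmp_ord(op1, op2) && op1 != op2;
   }

   static bool fcmp_une(double op1, double op2) {
       return fcmp_uno(op1, op2) || op1 != op2;
   }

With a NaN operand every ordered predicate in this sketch yields ``false``
and every unordered predicate yields ``true``, regardless of the other
operand.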
8416
.. _i_phi:
8418
8419'``phi``' Instruction
8420^^^^^^^^^^^^^^^^^^^^^
8421
8422Syntax:
8423"""""""
8424
8425::
8426
8427 <result> = phi <ty> [ <val0>, <label0>], ...
8428
8429Overview:
8430"""""""""
8431
8432The '``phi``' instruction is used to implement the φ node in the SSA
8433graph representing the function.
8434
8435Arguments:
8436""""""""""
8437
8438The type of the incoming values is specified with the first type field.
8439After this, the '``phi``' instruction takes a list of pairs as
8440arguments, with one pair for each predecessor basic block of the current
8441block. Only values of :ref:`first class <t_firstclass>` type may be used as
8442the value arguments to the PHI node. Only labels may be used as the
8443label arguments.
8444
8445There must be no non-phi instructions between the start of a basic block
8446and the PHI instructions: i.e. PHI instructions must be first in a basic
8447block.
8448
8449For the purposes of the SSA form, the use of each incoming value is
8450deemed to occur on the edge from the corresponding predecessor block to
8451the current block (but after any definition of an '``invoke``'
8452instruction's return value on the same edge).
8453
8454Semantics:
8455""""""""""
8456
8457At runtime, the '``phi``' instruction logically takes on the value
8458specified by the pair corresponding to the predecessor basic block that
8459executed just prior to the current block.
8460
8461Example:
8462""""""""
8463
8464.. code-block:: llvm
8465
8466 Loop: ; Infinite loop that counts from 0 on up...
8467 %indvar = phi i32 [ 0, %LoopHeader ], [ %nextindvar, %Loop ]
8468 %nextindvar = add i32 %indvar, 1
8469 br label %Loop
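
One way to picture the φ node in this loop is to track the predecessor block
explicitly. The C sketch below does so with a hypothetical ``pred`` variable.
The ``limit`` parameter and the exit test are additions made only so that the
sketch terminates; the IR example above loops forever:

.. code-block:: c

   enum block { LOOP_HEADER, LOOP };

   static int count_up(int limit) {
       enum block pred = LOOP_HEADER;   /* block control arrived from */
       int indvar = 0;
       int nextindvar = 0;

       for (;;) {
           /* %indvar = phi i32 [ 0, %LoopHeader ], [ %nextindvar, %Loop ] */
           indvar = (pred == LOOP_HEADER) ? 0 : nextindvar;
           if (indvar >= limit)         /* exit test, not in the IR example */
               return indvar;
           nextindvar = indvar + 1;     /* %nextindvar = add i32 %indvar, 1 */
           pred = LOOP;                 /* br label %Loop */
       }
   }

On the first pass ``indvar`` takes the value paired with ``%LoopHeader``; on
every later pass it takes the value paired with ``%Loop``.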
8470
8471.. _i_select:
8472
8473'``select``' Instruction
8474^^^^^^^^^^^^^^^^^^^^^^^^
8475
8476Syntax:
8477"""""""
8478
8479::
8480
8481 <result> = select selty <cond>, <ty> <val1>, <ty> <val2> ; yields ty
8482
8483 selty is either i1 or {<N x i1>}
8484
8485Overview:
8486"""""""""
8487
The '``select``' instruction is used to choose one value based on a
condition, without IR-level branching.

Arguments:
""""""""""
8493
The '``select``' instruction requires an 'i1' value or a vector of 'i1'
values indicating the condition, and two values of the same :ref:`first
class <t_firstclass>` type.

Semantics:
""""""""""
8500
8501If the condition is an i1 and it evaluates to 1, the instruction returns
8502the first value argument; otherwise, it returns the second value
8503argument.
8504
8505If the condition is a vector of i1, then the value arguments must be
8506vectors of the same size, and the selection is done element by element.
8507
If the condition is an i1 and the value arguments are vectors of the
same size, then an entire vector is selected.
8510
Example:
""""""""

.. code-block:: llvm

   %X = select i1 true, i8 17, i8 42          ; yields i8:17
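
The scalar and element-wise forms described in the Semantics section can be
sketched in C as follows. The helper names and the fixed width of four lanes
are hypothetical, chosen only for illustration:

.. code-block:: c

   #include <stdbool.h>
   #include <stddef.h>
   #include <stdint.h>

   /* Like select i1 %cond, i8 %a, i8 %b */
   static uint8_t select_i8(bool cond, uint8_t a, uint8_t b) {
       return cond ? a : b;
   }

   /* Like select <4 x i1> %cond, <4 x i8> %a, <4 x i8> %b:
    * one independent choice per lane. */
   static void select_v4i8(uint8_t out[4], const bool cond[4],
                           const uint8_t a[4], const uint8_t b[4]) {
       for (size_t i = 0; i < 4; ++i)
           out[i] = cond[i] ? a[i] : b[i];
   }

Both value operands are already computed before the choice is made, which is
why no branch is needed at the IR level.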
8517
8518.. _i_call:
8519
8520'``call``' Instruction
8521^^^^^^^^^^^^^^^^^^^^^^
8522
8523Syntax:
8524"""""""
8525
8526::
8527
   <result> = [tail | musttail | notail ] call [fast-math flags] [cconv] [ret attrs] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs]
                [ operand bundles ]

Overview:
"""""""""
8533
8534The '``call``' instruction represents a simple function call.
8535
8536Arguments:
8537""""""""""
8538
8539This instruction requires several arguments:
8540
#. The optional ``tail`` and ``musttail`` markers indicate that the optimizers
   should perform tail call optimization. The ``tail`` marker is a hint that
   `can be ignored <CodeGenerator.html#sibcallopt>`_. The ``musttail`` marker
   means that the call must be tail call optimized in order for the program to
   be correct. The ``musttail`` marker provides these guarantees:

   #. The call will not cause unbounded stack growth if it is part of a
      recursive cycle in the call graph.
   #. Arguments with the :ref:`inalloca <attr_inalloca>` attribute are
      forwarded in place.

   Both markers imply that the callee does not access allocas or varargs from
   the caller. Calls marked ``musttail`` must obey the following additional
   rules:

   - The call must immediately precede a :ref:`ret <i_ret>` instruction,
     or a pointer bitcast followed by a ret instruction.
   - The ret instruction must return the (possibly bitcasted) value
     produced by the call or void.
   - The caller and callee prototypes must match. Pointer types of
     parameters or return types may differ in pointee type, but not
     in address space.
   - The calling conventions of the caller and callee must match.
   - All ABI-impacting function attributes, such as sret, byval, inreg,
     returned, and inalloca, must match.
   - The callee must be varargs iff the caller is varargs. Bitcasting a
     non-varargs function to the appropriate varargs type is legal so
     long as the non-varargs prefixes obey the other rules.

   Tail call optimization for calls marked ``tail`` is guaranteed to occur if
   the following conditions are met:

   - Caller and callee both have the calling convention ``fastcc``.
   - The call is in tail position (ret immediately follows call and ret
     uses value of call or is void).
   - Option ``-tailcallopt`` is enabled, or
     ``llvm::GuaranteedTailCallOpt`` is ``true``.
   - `Platform-specific constraints are
     met. <CodeGenerator.html#tailcallopt>`_

#. The optional ``notail`` marker indicates that the optimizers should not add
   ``tail`` or ``musttail`` markers to the call. It is used to prevent tail
   call optimization from being performed on the call.

#. The optional ``fast-math flags`` marker indicates that the call has one or more
   :ref:`fast-math flags <fastmath>`, which are optimization hints to enable
   otherwise unsafe floating-point optimizations. Fast-math flags are only valid
   for calls that return a floating-point scalar or vector type.

#. The optional "cconv" marker indicates which :ref:`calling
   convention <callingconv>` the call should use. If none is
   specified, the call defaults to using C calling conventions. The
   calling convention of the call must match the calling convention of
   the target function, or else the behavior is undefined.
#. The optional :ref:`Parameter Attributes <paramattrs>` list for return
   values. Only '``zeroext``', '``signext``', and '``inreg``' attributes
   are valid here.
#. '``ty``': the type of the call instruction itself which is also the
   type of the return value. Functions that return no value are marked
   ``void``.
#. '``fnty``': shall be the signature of the function being called. The
   argument types must match the types implied by this signature. This
   type can be omitted if the function is not varargs.
#. '``fnptrval``': An LLVM value containing a pointer to a function to
   be called. In most cases, this is a direct function call, but
   indirect calls through an arbitrary pointer-to-function value are
   equally possible.
#. '``function args``': argument list whose types match the function
   signature argument types and parameter attributes. All arguments must
   be of :ref:`first class <t_firstclass>` type. If the function signature
   indicates the function accepts a variable number of arguments, the
   extra arguments can be specified.
#. The optional :ref:`function attributes <fnattrs>` list. Only
   '``noreturn``', '``nounwind``', '``readonly``', '``readnone``',
   and '``convergent``' attributes are valid here.
#. The optional :ref:`operand bundles <opbundles>` list.

Semantics:
""""""""""

The '``call``' instruction is used to cause control flow to transfer to
a specified function, with its incoming arguments bound to the specified
values. Upon a '``ret``' instruction in the called function, control
flow continues with the instruction after the function call, and the
return value of the function is bound to the result argument.

Example:
""""""""

.. code-block:: llvm

      %retval = call i32 @test(i32 %argc)
      call i32 (i8*, ...) @printf(i8* %msg, i32 12, i8 42)         ; yields i32
      %X = tail call i32 @foo()                                    ; yields i32
      %Y = tail call fastcc i32 @foo()                             ; yields i32
      call void %foo(i8 signext 97)

      %struct.A = type { i32, i8 }
      %r = call %struct.A @foo()                        ; yields { i32, i8 }
      %gr = extractvalue %struct.A %r, 0                ; yields i32
      %gr1 = extractvalue %struct.A %r, 1               ; yields i8
      call void @foo() noreturn                         ; indicates that @foo never returns normally
      %ZZ = call zeroext i32 @bar()                     ; return value is zero extended

LLVM treats calls to some functions with names and arguments that match
the standard C99 library as being the C99 library functions, and may
perform optimizations or generate code for them under that assumption.
This is something we'd like to change in the future to provide better
support for freestanding environments and non-C-based languages.

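The ``musttail`` rules and the ``notail`` marker listed under Arguments can
be sketched as follows. This is a minimal illustration, not a complete
program; the function names ``@callee``, ``@caller`` and ``@no_tail`` are
hypothetical.

.. code-block:: llvm

      declare i32 @callee(i32)

      ; Caller and callee share the prototype i32 (i32) and the default
      ; C calling convention. The call immediately precedes the ret, and
      ; the ret returns the value produced by the call.
      define i32 @caller(i32 %x) {
        %r = musttail call i32 @callee(i32 %x)
        ret i32 %r
      }

      ; notail asks the optimizers not to add tail or musttail markers
      ; to this call.
      define i32 @no_tail(i32 %x) {
        %r = notail call i32 @callee(i32 %x)
        ret i32 %r
      }
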
.. _i_va_arg:

'``va_arg``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <resultval> = va_arg <va_list*> <arglist>, <argty>

Overview:
"""""""""

The '``va_arg``' instruction is used to access arguments passed through
the "variable argument" area of a function call. It is used to implement
the ``va_arg`` macro in C.

Arguments:
""""""""""

This instruction takes a ``va_list*`` value and the type of the
argument. It returns a value of the specified argument type and
increments the ``va_list`` to point to the next argument. The actual
type of ``va_list`` is target specific.

Semantics:
""""""""""

The '``va_arg``' instruction loads an argument of the specified type
from the specified ``va_list`` and causes the ``va_list`` to point to
the next argument. For more information, see the variable argument
handling :ref:`Intrinsic Functions <int_varargs>`.

It is legal for this instruction to be called in a function which does
not take a variable number of arguments, for example, the ``vfprintf``
function.

``va_arg`` is an LLVM instruction instead of an :ref:`intrinsic
function <intrinsics>` because it takes a type as an argument.

Example:
""""""""

See the :ref:`variable argument processing <int_varargs>` section.

Note that the code generator does not yet fully support va\_arg on many
targets. Also, it does not currently support va\_arg with aggregate
types on any target.

.. _i_landingpad:

'``landingpad``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <resultval> = landingpad <resultty> <clause>+
      <resultval> = landingpad <resultty> cleanup <clause>*

      <clause> := catch <type> <value>
      <clause> := filter <array constant type> <array constant>

Overview:
"""""""""

The '``landingpad``' instruction is used by `LLVM's exception handling
system <ExceptionHandling.html#overview>`_ to specify that a basic block
is a landing pad --- one where the exception lands, and corresponds to the
code found in the ``catch`` portion of a ``try``/``catch`` sequence. It
defines values supplied by the :ref:`personality function <personalityfn>` upon
re-entry to the function. The ``resultval`` has the type ``resultty``.

Arguments:
""""""""""

The optional ``cleanup`` flag indicates that the landing pad block is a
cleanup.

A ``clause`` begins with the clause type --- ``catch`` or ``filter`` --- and
contains the global variable representing the "type" that may be caught
or filtered respectively. Unlike the ``catch`` clause, the ``filter``
clause takes an array constant as its argument. Use
"``[0 x i8**] undef``" for a filter which cannot throw. The
'``landingpad``' instruction must contain *at least* one ``clause`` or
the ``cleanup`` flag.

Semantics:
""""""""""

The '``landingpad``' instruction defines the values which are set by the
:ref:`personality function <personalityfn>` upon re-entry to the function, and
therefore the "result type" of the ``landingpad`` instruction. As with
calling conventions, how the personality function results are
represented in LLVM IR is target specific.

The clauses are applied in order from top to bottom. If two
``landingpad`` instructions are merged together through inlining, the
clauses from the calling function are appended to the list of clauses.
When the call stack is being unwound due to an exception being thrown,
the exception is compared against each ``clause`` in turn. If it doesn't
match any of the clauses, and the ``cleanup`` flag is not set, then
unwinding continues further up the call stack.

The ``landingpad`` instruction has several restrictions:

-  A landing pad block is a basic block which is the unwind destination
   of an '``invoke``' instruction.
-  A landing pad block must have a '``landingpad``' instruction as its
   first non-PHI instruction.
-  There can be only one '``landingpad``' instruction within the landing
   pad block.
-  A basic block that is not a landing pad block may not include a
   '``landingpad``' instruction.

Example:
""""""""

.. code-block:: llvm

      ;; A landing pad which can catch an integer.
      %res = landingpad { i8*, i32 }
               catch i8** @_ZTIi
      ;; A landing pad that is a cleanup.
      %res = landingpad { i8*, i32 }
               cleanup
      ;; A landing pad which can catch an integer and can only throw a double.
      %res = landingpad { i8*, i32 }
               catch i8** @_ZTIi
               filter [1 x i8**] [i8** @_ZTId]

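The restrictions above can be seen in context in the following sketch, which
places a cleanup landing pad at the unwind destination of an ``invoke``. It
assumes the Itanium C++ personality routine ``@__gxx_personality_v0``; the
names ``@may_throw`` and ``@f`` are hypothetical.

.. code-block:: llvm

      declare void @may_throw()
      declare i32 @__gxx_personality_v0(...)

      define void @f() personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
      entry:
        ; %lpad is a landing pad block because it is the unwind
        ; destination of this invoke.
        invoke void @may_throw()
                to label %cont unwind label %lpad

      cont:
        ret void

      lpad:
        ; The landingpad is the first non-PHI instruction of its block.
        %exn = landingpad { i8*, i32 }
                 cleanup
        ; Nothing to clean up in this sketch; continue unwinding.
        resume { i8*, i32 } %exn
      }
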
.. _i_catchpad:

'``catchpad``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <resultval> = catchpad within <catchswitch> [<args>*]

Overview:
"""""""""

The '``catchpad``' instruction is used by `LLVM's exception handling
system <ExceptionHandling.html#overview>`_ to specify that a basic block
begins a catch handler --- one where a personality routine attempts to transfer
control to catch an exception.

Arguments:
""""""""""

The ``catchswitch`` operand must always be a token produced by a
:ref:`catchswitch <i_catchswitch>` instruction in a predecessor block. This
ensures that each ``catchpad`` has exactly one predecessor block, and it always
terminates in a ``catchswitch``.

The ``args`` correspond to whatever information the personality routine
requires to know if this is an appropriate handler for the exception. Control
will transfer to the ``catchpad`` if this is the first appropriate handler for
the exception.

The ``resultval`` has the type :ref:`token <t_token>` and is used to match the
``catchpad`` to corresponding :ref:`catchrets <i_catchret>` and other nested EH
pads.

Semantics:
""""""""""

When the call stack is being unwound due to an exception being thrown, the
exception is compared against the ``args``. If it doesn't match, control will
not reach the ``catchpad`` instruction. The representation of ``args`` is
entirely target and personality function-specific.

Like the :ref:`landingpad <i_landingpad>` instruction, the ``catchpad``
instruction must be the first non-phi of its parent basic block.

The meaning of the tokens produced and consumed by ``catchpad`` and other "pad"
instructions is described in the
`Windows exception handling documentation\ <ExceptionHandling.html#wineh>`_.

When a ``catchpad`` has been "entered" but not yet "exited" (as
described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
it is undefined behavior to execute a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
that does not carry an appropriate :ref:`"funclet" bundle <ob_funclet>`.

Example:
""""""""

.. code-block:: text

    dispatch:
      %cs = catchswitch within none [label %handler0] unwind to caller
      ;; A catch block which can catch an integer.
    handler0:
      %tok = catchpad within %cs [i8** @_ZTIi]

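A fuller sketch of how a ``catchpad`` fits into a function is given below. It
assumes the MSVC C++ personality routine ``@__CxxFrameHandler3`` and a
catch-all argument list of the form that personality is understood to use;
since ``args`` are personality-specific, treat those operands as an
assumption. The names ``@may_throw``, ``@handle`` and ``@f`` are
hypothetical.

.. code-block:: llvm

      declare void @may_throw()
      declare void @handle()
      declare i32 @__CxxFrameHandler3(...)

      define void @f() personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
      entry:
        invoke void @may_throw()
                to label %cont unwind label %dispatch

      dispatch:
        %cs = catchswitch within none [label %handler0] unwind to caller

      handler0:
        ; The catchpad is the first non-PHI instruction of the handler block.
        %tok = catchpad within %cs [i8* null, i32 64, i8* null]
        ; Calls made while the catchpad is entered carry a "funclet" bundle.
        call void @handle() [ "funclet"(token %tok) ]
        ; catchret exits the pad identified by %tok.
        catchret from %tok to label %cont

      cont:
        ret void
      }
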
.. _i_cleanuppad:

'``cleanuppad``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <resultval> = cleanuppad within <parent> [<args>*]

Overview:
"""""""""

The '``cleanuppad``' instruction is used by `LLVM's exception handling
system <ExceptionHandling.html#overview>`_ to specify that a basic block
is a cleanup block --- one where a personality routine attempts to
transfer control to run cleanup actions.
The ``args`` correspond to whatever additional
information the :ref:`personality function <personalityfn>` requires to
execute the cleanup.
The ``resultval`` has the type :ref:`token <t_token>` and is used to
match the ``cleanuppad`` to corresponding :ref:`cleanuprets <i_cleanupret>`.
The ``parent`` argument is the token of the funclet that contains the
``cleanuppad`` instruction. If the ``cleanuppad`` is not inside a funclet,
this operand may be the token ``none``.

Arguments:
""""""""""

The instruction takes a list of arbitrary values which are interpreted
by the :ref:`personality function <personalityfn>`.

Semantics:
""""""""""

When the call stack is being unwound due to an exception being thrown,
the :ref:`personality function <personalityfn>` transfers control to the
``cleanuppad`` with the aid of the personality-specific arguments.
As with calling conventions, how the personality function results are
represented in LLVM IR is target specific.

The ``cleanuppad`` instruction has several restrictions:

-  A cleanup block is a basic block which is the unwind destination of
   an exceptional instruction.
-  A cleanup block must have a '``cleanuppad``' instruction as its
   first non-PHI instruction.
-  There can be only one '``cleanuppad``' instruction within the
   cleanup block.
-  A basic block that is not a cleanup block may not include a
   '``cleanuppad``' instruction.

When a ``cleanuppad`` has been "entered" but not yet "exited" (as
described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
it is undefined behavior to execute a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
that does not carry an appropriate :ref:`"funclet" bundle <ob_funclet>`.

Example:
""""""""

.. code-block:: text

      %tok = cleanuppad within %cs []

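For context, the following sketch shows a top-level cleanup block (parent
token ``none``) that runs one cleanup action and then continues unwinding.
It assumes the MSVC C++ personality routine ``@__CxxFrameHandler3``; the
names ``@may_throw``, ``@release`` and ``@f`` are hypothetical.

.. code-block:: llvm

      declare void @may_throw()
      declare void @release()
      declare i32 @__CxxFrameHandler3(...)

      define void @f() personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
      entry:
        invoke void @may_throw()
                to label %cont unwind label %cleanup

      cont:
        ret void

      cleanup:
        ; Not nested inside another funclet, so the parent is none.
        %tok = cleanuppad within none []
        ; Calls made while the cleanuppad is entered carry a "funclet" bundle.
        call void @release() [ "funclet"(token %tok) ]
        ; Exit the pad and continue unwinding into the caller.
        cleanupret from %tok unwind to caller
      }
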
.. _intrinsics:

Intrinsic Functions
===================

LLVM supports the notion of an "intrinsic function". These functions
have well known names and semantics and are required to follow certain
restrictions. Overall, these intrinsics represent an extension mechanism
for the LLVM language that does not require changing all of the
transformations in LLVM when adding to the language (or the bitcode
reader/writer, the parser, etc.).

Intrinsic function names must all start with an "``llvm.``" prefix. This
prefix is reserved in LLVM for intrinsic names; thus, function names may
not begin with this prefix. Intrinsic functions must always be external
functions: you cannot define the body of intrinsic functions. Intrinsic
functions may only be used in call or invoke instructions: it is illegal
to take the address of an intrinsic function. Additionally, because
intrinsic functions are part of the LLVM language, any intrinsic that is
added must be documented here.

Some intrinsic functions can be overloaded, i.e., the intrinsic
represents a family of functions that perform the same operation but on
different data types. Because LLVM can represent over 8 million
different integer types, overloading is used commonly to allow an
intrinsic function to operate on any integer type. One or more of the
argument types or the result type can be overloaded to accept any
integer type. Argument types may also be defined as exactly matching a
previous argument's type or the result type. This allows an intrinsic
function which accepts multiple arguments, but needs all of them to be
of the same type, to only be overloaded with respect to a single
argument or the result.

An overloaded intrinsic has the names of its overloaded argument
types encoded into its function name, each preceded by a period. Only
those types which are overloaded result in a name suffix. Arguments
whose type is matched against another type do not. For example, the
``llvm.ctpop`` function can take an integer of any width and returns an
integer of exactly the same integer width. This leads to a family of
functions such as ``i8 @llvm.ctpop.i8(i8 %val)`` and
``i29 @llvm.ctpop.i29(i29 %val)``. Only one type, the return type, is
overloaded, and only one type suffix is required. Because the argument's
type is matched against the return type, it does not require its own
name suffix.

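The naming rule can be illustrated with the two ``llvm.ctpop`` overloads
mentioned above. The wrapper functions ``@popcount8`` and ``@popcount29``
are hypothetical names used only for this sketch.

.. code-block:: llvm

      ; One declaration per overload; the suffix names the overloaded type.
      declare i8 @llvm.ctpop.i8(i8)
      declare i29 @llvm.ctpop.i29(i29)

      define i8 @popcount8(i8 %val) {
        %n = call i8 @llvm.ctpop.i8(i8 %val)
        ret i8 %n
      }

      define i29 @popcount29(i29 %val) {
        %n = call i29 @llvm.ctpop.i29(i29 %val)
        ret i29 %n
      }
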
To learn how to add an intrinsic function, please see the `Extending
LLVM Guide <ExtendingLLVM.html>`_.

.. _int_varargs:

Variable Argument Handling Intrinsics
-------------------------------------

Variable argument support is defined in LLVM with the
:ref:`va_arg <i_va_arg>` instruction and these three intrinsic
functions. These functions are related to the similarly named macros
defined in the ``<stdarg.h>`` header file.

All of these functions operate on arguments that use a target-specific
value type "``va_list``". The LLVM assembly language reference manual
does not define what this type is, so all transformations should be
prepared to handle these functions regardless of the type used.

This example shows how the :ref:`va_arg <i_va_arg>` instruction and the
variable argument handling intrinsic functions are used.

.. code-block:: llvm

    ; This struct is different for every platform. For most platforms,
    ; it is merely an i8*.
    %struct.va_list = type { i8* }

    ; For Unix x86_64 platforms, va_list is the following struct:
    ; %struct.va_list = type { i32, i32, i8*, i8* }

    define i32 @test(i32 %X, ...) {
      ; Initialize variable argument processing
      %ap = alloca %struct.va_list
      %ap2 = bitcast %struct.va_list* %ap to i8*
      call void @llvm.va_start(i8* %ap2)

      ; Read a single integer argument
      %tmp = va_arg i8* %ap2, i32

      ; Demonstrate usage of llvm.va_copy and llvm.va_end
      %aq = alloca i8*
      %aq2 = bitcast i8** %aq to i8*
      call void @llvm.va_copy(i8* %aq2, i8* %ap2)
      call void @llvm.va_end(i8* %aq2)

      ; Stop processing of arguments.
      call void @llvm.va_end(i8* %ap2)
      ret i32 %tmp
    }

    declare void @llvm.va_start(i8*)
    declare void @llvm.va_copy(i8*, i8*)
    declare void @llvm.va_end(i8*)

.. _int_va_start:

'``llvm.va_start``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.va_start(i8* <arglist>)

Overview:
"""""""""

The '``llvm.va_start``' intrinsic initializes ``*<arglist>`` for
subsequent use by ``va_arg``.

Arguments:
""""""""""

The argument is a pointer to a ``va_list`` element to initialize.

Semantics:
""""""""""

The '``llvm.va_start``' intrinsic works just like the ``va_start`` macro
available in C. In a target-dependent way, it initializes the
``va_list`` element to which the argument points, so that the next call
to ``va_arg`` will produce the first variable argument passed to the
function. Unlike the C ``va_start`` macro, this intrinsic does not need
to know the last argument of the function as the compiler can figure
that out.

'``llvm.va_end``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.va_end(i8* <arglist>)

Overview:
"""""""""

The '``llvm.va_end``' intrinsic destroys ``*<arglist>``, which has been
initialized previously with ``llvm.va_start`` or ``llvm.va_copy``.

Arguments:
""""""""""

The argument is a pointer to a ``va_list`` to destroy.

Semantics:
""""""""""

The '``llvm.va_end``' intrinsic works just like the ``va_end`` macro
available in C. In a target-dependent way, it destroys the ``va_list``
element to which the argument points. Calls to
:ref:`llvm.va_start <int_va_start>` and
:ref:`llvm.va_copy <int_va_copy>` must be matched exactly with calls to
``llvm.va_end``.

.. _int_va_copy:

'``llvm.va_copy``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.va_copy(i8* <destarglist>, i8* <srcarglist>)

Overview:
"""""""""

The '``llvm.va_copy``' intrinsic copies the current argument position
from the source argument list to the destination argument list.

Arguments:
""""""""""

The first argument is a pointer to a ``va_list`` element to initialize.
The second argument is a pointer to a ``va_list`` element to copy from.

Semantics:
""""""""""

The '``llvm.va_copy``' intrinsic works just like the ``va_copy`` macro
available in C. In a target-dependent way, it copies the source
``va_list`` element into the destination ``va_list`` element. This
intrinsic is necessary because the ``llvm.va_start`` intrinsic may be
arbitrarily complex and require, for example, memory allocation.

Accurate Garbage Collection Intrinsics
--------------------------------------

LLVM's support for `Accurate Garbage Collection <GarbageCollection.html>`_
(GC) requires the frontend to generate code containing appropriate intrinsic
calls and select an appropriate GC strategy which knows how to lower these
intrinsics in a manner which is appropriate for the target collector.

These intrinsics allow identification of :ref:`GC roots on the
stack <int_gcroot>`, as well as garbage collector implementations that
require :ref:`read <int_gcread>` and :ref:`write <int_gcwrite>` barriers.
Frontends for type-safe garbage collected languages should generate
these intrinsics to make use of the LLVM garbage collectors. For more
details, see `Garbage Collection with LLVM <GarbageCollection.html>`_.

Experimental Statepoint Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

LLVM provides a second, experimental set of intrinsics for describing garbage
collection safepoints in compiled code. These intrinsics are an alternative
to the ``llvm.gcroot`` intrinsics, but are compatible with the ones for
:ref:`read <int_gcread>` and :ref:`write <int_gcwrite>` barriers. The
differences in approach are covered in the `Garbage Collection with LLVM
<GarbageCollection.html>`_ documentation. The intrinsics themselves are
described in :doc:`Statepoints`.

.. _int_gcroot:

'``llvm.gcroot``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.gcroot(i8** %ptrloc, i8* %metadata)

Overview:
"""""""""

The '``llvm.gcroot``' intrinsic declares the existence of a GC root to
the code generator, and allows some metadata to be associated with it.

Arguments:
""""""""""

The first argument specifies the address of a stack object that contains
the root pointer. The second pointer (which must be either a constant or
a global value address) contains the meta-data to be associated with the
root.

Semantics:
""""""""""

At runtime, a call to this intrinsic stores a null pointer into the
"ptrloc" location. At compile-time, the code generator generates
information to allow the runtime to find the pointer at GC safe points.
The '``llvm.gcroot``' intrinsic may only be used in a function which
:ref:`specifies a GC algorithm <gc>`.

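A minimal sketch of registering a root follows. It assumes the
``"shadow-stack"`` GC strategy as the function's GC algorithm and passes a
null metadata pointer; the names ``@f`` and ``%root`` are hypothetical.

.. code-block:: llvm

      declare void @llvm.gcroot(i8** %ptrloc, i8* %metadata)

      define void @f() gc "shadow-stack" {
      entry:
        ; Stack slot that will hold the root pointer.
        %root = alloca i8*
        ; Register the slot as a GC root with no associated metadata.
        call void @llvm.gcroot(i8** %root, i8* null)
        ret void
      }
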
.. _int_gcread:

'``llvm.gcread``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.gcread(i8* %ObjPtr, i8** %Ptr)

Overview:
"""""""""

The '``llvm.gcread``' intrinsic identifies reads of references from heap
locations, allowing garbage collector implementations that require read
barriers.

Arguments:
""""""""""

The second argument is the address to read from, which should be an
address allocated from the garbage collector. The first argument is a
pointer to the start of the referenced object, if needed by the language
runtime (otherwise null).

Semantics:
""""""""""

The '``llvm.gcread``' intrinsic has the same semantics as a load
instruction, but may be replaced with substantially more complex code by
the garbage collector runtime, as needed. The '``llvm.gcread``'
intrinsic may only be used in a function which :ref:`specifies a GC
algorithm <gc>`.

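As a brief sketch, a read barrier on a reference field might look like the
following. The ``"shadow-stack"`` strategy and the names ``@read_field``,
``%obj`` and ``%field`` are assumptions made for illustration.

.. code-block:: llvm

      declare i8* @llvm.gcread(i8* %ObjPtr, i8** %Ptr)

      define i8* @read_field(i8* %obj, i8** %field) gc "shadow-stack" {
        ; %obj is the start of the object, %field the address to read from.
        %ref = call i8* @llvm.gcread(i8* %obj, i8** %field)
        ret i8* %ref
      }
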
.. _int_gcwrite:

'``llvm.gcwrite``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.gcwrite(i8* %P1, i8* %Obj, i8** %P2)

Overview:
"""""""""

The '``llvm.gcwrite``' intrinsic identifies writes of references to heap
locations, allowing garbage collector implementations that require write
barriers (such as generational or reference counting collectors).

Arguments:
""""""""""

The first argument is the reference to store, the second is the start of
the object to store it to, and the third is the address of the field of
Obj to store to. If the runtime does not require a pointer to the
object, Obj may be null.

Semantics:
""""""""""

The '``llvm.gcwrite``' intrinsic has the same semantics as a store
instruction, but may be replaced with substantially more complex code by
the garbage collector runtime, as needed. The '``llvm.gcwrite``'
intrinsic may only be used in a function which :ref:`specifies a GC
algorithm <gc>`.

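The argument order can be sketched as follows. The ``"shadow-stack"``
strategy and the names ``@write_field``, ``%ref``, ``%obj`` and ``%field``
are assumptions made for illustration.

.. code-block:: llvm

      declare void @llvm.gcwrite(i8* %P1, i8* %Obj, i8** %P2)

      define void @write_field(i8* %ref, i8* %obj, i8** %field) gc "shadow-stack" {
        ; Store reference %ref into the field at %field of the object
        ; that starts at %obj.
        call void @llvm.gcwrite(i8* %ref, i8* %obj, i8** %field)
        ret void
      }
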
Code Generator Intrinsics
-------------------------

These intrinsics are provided by LLVM to expose special features that
may only be implemented with code generator support.

'``llvm.returnaddress``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.returnaddress(i32 <level>)

Overview:
"""""""""

The '``llvm.returnaddress``' intrinsic attempts to compute a
target-specific value indicating the return address of the current
function or one of its callers.

Arguments:
""""""""""

The argument to this intrinsic indicates which function to return the
address for. Zero indicates the calling function, one indicates its
caller, etc. The argument is **required** to be a constant integer
value.

Semantics:
""""""""""

The '``llvm.returnaddress``' intrinsic either returns a pointer
indicating the return address of the specified call frame, or zero if it
cannot be identified. The value returned by this intrinsic is likely to
be incorrect or 0 for arguments other than zero, so it should only be
used for debugging purposes.

Note that calling this intrinsic does not prevent function inlining or
other aggressive transformations, so the value returned may not be that
of the obvious source-language caller.

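A minimal usage sketch, with the hypothetical function name
``@my_return_address``, passes the constant level 0 to ask for the return
address of the function containing the call:

.. code-block:: llvm

      declare i8* @llvm.returnaddress(i32)

      define i8* @my_return_address() {
        ; The level must be a constant integer; 0 selects this function.
        %ra = call i8* @llvm.returnaddress(i32 0)
        ret i8* %ra
      }
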
9294'``llvm.frameaddress``' Intrinsic
9295^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9296
9297Syntax:
9298"""""""
9299
9300::
9301
9302 declare i8* @llvm.frameaddress(i32 <level>)
9303
9304Overview:
9305"""""""""
9306
9307The '``llvm.frameaddress``' intrinsic attempts to return the
9308target-specific frame pointer value for the specified stack frame.
9309
9310Arguments:
9311""""""""""
9312
9313The argument to this intrinsic indicates which function to return the
9314frame pointer for. Zero indicates the calling function, one indicates
9315its caller, etc. The argument is **required** to be a constant integer
9316value.
9317
9318Semantics:
9319""""""""""
9320
9321The '``llvm.frameaddress``' intrinsic either returns a pointer
9322indicating the frame address of the specified call frame, or zero if it
9323cannot be identified. The value returned by this intrinsic is likely to
9324be incorrect or 0 for arguments other than zero, so it should only be
9325used for debugging purposes.
9326
9327Note that calling this intrinsic does not prevent function inlining or
9328other aggressive transformations, so the value returned may not be that
9329of the obvious source-language caller.
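
A minimal sketch of a debugging helper that asks for the frame address of the
function containing the call. The function name ``@own_frame_address`` is
hypothetical.

.. code-block:: llvm

    declare i8* @llvm.frameaddress(i32)

    ; Hypothetical helper: returns a target-specific frame pointer value.
    define i8* @own_frame_address() noinline {
    entry:
      ; The level must be a constant integer; 0 selects the calling function.
      %fp = call i8* @llvm.frameaddress(i32 0)
      ret i8* %fp
    }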

'``llvm.localescape``' and '``llvm.localrecover``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.localescape(...)
      declare i8* @llvm.localrecover(i8* %func, i8* %fp, i32 %idx)

Overview:
"""""""""

The '``llvm.localescape``' intrinsic escapes offsets of a collection of static
allocas, and the '``llvm.localrecover``' intrinsic applies those offsets to a
live frame pointer to recover the address of the allocation. The offset is
computed during frame layout of the caller of ``llvm.localescape``.

Arguments:
""""""""""

All arguments to '``llvm.localescape``' must be pointers to static allocas or
casts of static allocas. Each function can only call '``llvm.localescape``'
once, and it can only do so from the entry block.

The ``func`` argument to '``llvm.localrecover``' must be a constant
bitcasted pointer to a function defined in the current module. The code
generator cannot determine the frame allocation offset of functions defined in
other modules.

The ``fp`` argument to '``llvm.localrecover``' must be a frame pointer of a
call frame that is currently live. The return value of '``llvm.localaddress``'
is one way to produce such a value, but various runtimes also expose a suitable
pointer in platform-specific ways.

The ``idx`` argument to '``llvm.localrecover``' indicates which alloca passed to
'``llvm.localescape``' to recover. It is zero-indexed.

Semantics:
""""""""""

These intrinsics allow a group of functions to share access to a set of local
stack allocations of a single parent function. The parent function may call the
'``llvm.localescape``' intrinsic once from the function entry block, and the
child functions can use '``llvm.localrecover``' to access the escaped allocas.
The '``llvm.localescape``' intrinsic blocks inlining, as inlining changes where
the escaped allocas are allocated, which would break attempts to use
'``llvm.localrecover``'.
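
The parent/child arrangement described above can be sketched as follows. The
functions ``@parent`` and ``@child`` are hypothetical, and the sketch assumes
that '``llvm.localaddress``' is declared as ``i8* ()`` and yields a suitable
frame pointer on the target in use.

.. code-block:: llvm

    declare void @llvm.localescape(...)
    declare i8* @llvm.localrecover(i8*, i8*, i32)
    declare i8* @llvm.localaddress()

    define void @parent() {
    entry:
      %a = alloca i32
      %b = alloca i32
      ; Called once, from the entry block, on static allocas only.
      call void (...) @llvm.localescape(i32* %a, i32* %b)
      %fp = call i8* @llvm.localaddress()
      call void @child(i8* %fp)
      ret void
    }

    define internal void @child(i8* %fp) {
    entry:
      ; Index 1 selects %b, the second pointer passed to llvm.localescape.
      %b.raw = call i8* @llvm.localrecover(
                   i8* bitcast (void ()* @parent to i8*), i8* %fp, i32 1)
      %b = bitcast i8* %b.raw to i32*
      store i32 42, i32* %b
      ret void
    }

The frame of ``@parent`` is live for the whole call to ``@child``, which is
what makes the ``fp`` argument valid here.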

.. _int_read_register:
.. _int_write_register:

'``llvm.read_register``' and '``llvm.write_register``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.read_register.i32(metadata)
      declare i64 @llvm.read_register.i64(metadata)
      declare void @llvm.write_register.i32(metadata, i32 %value)
      declare void @llvm.write_register.i64(metadata, i64 %value)
      !0 = !{!"sp\00"}

Overview:
"""""""""

The '``llvm.read_register``' and '``llvm.write_register``' intrinsics
provide access to the named register. The register must be valid on
the architecture being compiled for. The type needs to be compatible
with the register being read or written.

Semantics:
""""""""""

The '``llvm.read_register``' intrinsic returns the current value of the
register, where possible. The '``llvm.write_register``' intrinsic sets
the current value of the register, where possible.

This is useful to implement named register global variables that need
to always be mapped to a specific register, as is common practice on
bare-metal programs including OS kernels.

The compiler doesn't check for register availability or for uses of the
register in surrounding code, including inline assembly. Because of that,
allocatable registers are not supported.

Warning: So far it only works with the stack pointer on selected
architectures (ARM, AArch64, PowerPC and x86_64). A significant amount of
work is needed to support other registers and, even more so, allocatable
registers.
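
A minimal sketch of reading the stack pointer through named-register
metadata. It assumes a 64-bit target on which ``sp`` is a valid register
name; the function name ``@get_stack_pointer`` is hypothetical.

.. code-block:: llvm

    declare i64 @llvm.read_register.i64(metadata)

    ; Hypothetical helper: returns the current stack pointer value.
    define i64 @get_stack_pointer() {
    entry:
      %sp = call i64 @llvm.read_register.i64(metadata !0)
      ret i64 %sp
    }

    ; The register is named by a metadata string.
    !0 = !{!"sp\00"}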

.. _int_stacksave:

'``llvm.stacksave``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.stacksave()

Overview:
"""""""""

The '``llvm.stacksave``' intrinsic is used to remember the current state
of the function stack, for use with
:ref:`llvm.stackrestore <int_stackrestore>`. This is useful for
implementing language features like scoped automatic variable-sized
arrays in C99.

Semantics:
""""""""""

This intrinsic returns an opaque pointer value that can be passed to
:ref:`llvm.stackrestore <int_stackrestore>`. When an
``llvm.stackrestore`` intrinsic is executed with a value saved from
``llvm.stacksave``, it effectively restores the state of the stack to
the state it was in when the ``llvm.stacksave`` intrinsic executed. In
practice, this pops any :ref:`alloca <i_alloca>` blocks from the stack that
were allocated after the ``llvm.stacksave`` was executed.

.. _int_stackrestore:

'``llvm.stackrestore``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.stackrestore(i8* %ptr)

Overview:
"""""""""

The '``llvm.stackrestore``' intrinsic is used to restore the state of
the function stack to the state it was in when the corresponding
:ref:`llvm.stacksave <int_stacksave>` intrinsic executed. This is
useful for implementing language features like scoped automatic
variable-sized arrays in C99.

Semantics:
""""""""""

See the description for :ref:`llvm.stacksave <int_stacksave>`.
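
The pairing of the two intrinsics can be sketched for a scoped variable-sized
array as follows. The function ``@scoped_vla`` and the callee ``@fill`` are
hypothetical.

.. code-block:: llvm

    declare i8* @llvm.stacksave()
    declare void @llvm.stackrestore(i8*)
    declare void @fill(i32*, i32)          ; hypothetical consumer

    define void @scoped_vla(i32 %n) {
    entry:
      ; Remember the stack state at scope entry.
      %saved = call i8* @llvm.stacksave()
      ; Dynamic alloca: its size is only known at run time.
      %vla = alloca i32, i32 %n
      call void @fill(i32* %vla, i32 %n)
      ; Scope exit: pops %vla and anything allocated after the save.
      call void @llvm.stackrestore(i8* %saved)
      ret void
    }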

.. _int_get_dynamic_area_offset:

'``llvm.get.dynamic.area.offset``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.get.dynamic.area.offset.i32()
      declare i64 @llvm.get.dynamic.area.offset.i64()

Overview:
"""""""""

The '``llvm.get.dynamic.area.offset.*``' intrinsic family is used to
get the offset from the native stack pointer to the address of the most
recent dynamic alloca on the caller's stack. These intrinsics are
intended for use in combination with
:ref:`llvm.stacksave <int_stacksave>` to get a
pointer to the most recent dynamic alloca. This is useful, for example,
for AddressSanitizer's stack unpoisoning routines.

Semantics:
""""""""""

These intrinsics return a non-negative integer value that can be used to
get the address of the most recent dynamic alloca, allocated by
:ref:`alloca <i_alloca>`, on the caller's stack. In particular, for targets
where the stack grows downwards, adding this offset to the native stack
pointer yields the address of the most recent dynamic alloca. For targets
where the stack grows upwards, the situation is a bit more complicated,
because subtracting this value from the stack pointer yields the address
one past the end of the most recent dynamic alloca.

Although for most targets
:ref:`llvm.get.dynamic.area.offset <int_get_dynamic_area_offset>` returns
just a zero, for others, such as PowerPC and PowerPC64, it returns a
compile-time-known constant value.

The return value type of
:ref:`llvm.get.dynamic.area.offset <int_get_dynamic_area_offset>` must match
the target's generic address space's (address space 0) pointer type.
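
A minimal sketch of the combination with ``llvm.stacksave`` described above.
It assumes a 64-bit target whose stack grows downwards; the function
``@locate_dynamic_alloca`` and the callee ``@use`` are hypothetical.

.. code-block:: llvm

    declare i8* @llvm.stacksave()
    declare i64 @llvm.get.dynamic.area.offset.i64()
    declare void @use(i8*)                 ; hypothetical consumer

    define void @locate_dynamic_alloca(i64 %n) {
    entry:
      ; A dynamic alloca: the size is not a compile-time constant.
      %buf = alloca i8, i64 %n
      %sp = call i8* @llvm.stacksave()
      %off = call i64 @llvm.get.dynamic.area.offset.i64()
      ; Downward-growing stack: stack pointer plus offset.
      %addr = getelementptr i8, i8* %sp, i64 %off
      call void @use(i8* %addr)
      ret void
    }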

'``llvm.prefetch``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.prefetch(i8* <address>, i32 <rw>, i32 <locality>, i32 <cache type>)

Overview:
"""""""""

The '``llvm.prefetch``' intrinsic is a hint to the code generator to
insert a prefetch instruction if supported; otherwise, it is a no-op.
Prefetches have no effect on the behavior of the program but can change
its performance characteristics.

Arguments:
""""""""""

``address`` is the address to be prefetched, ``rw`` is the specifier
determining if the fetch should be for a read (0) or write (1), and
``locality`` is a temporal locality specifier ranging from (0) - no
locality, to (3) - extremely local, keep in cache. The ``cache type``
specifies whether the prefetch is performed on the data (1) or
instruction (0) cache. The ``rw``, ``locality`` and ``cache type``
arguments must be constant integers.

Semantics:
""""""""""

This intrinsic does not modify the behavior of the program. In
particular, prefetches cannot trap and do not produce a value. On
targets that support this intrinsic, the prefetch can provide hints to
the processor cache for better performance.
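
For illustration, a read prefetch into the data cache with maximum temporal
locality could be written as below. The function name ``@warm`` is
hypothetical.

.. code-block:: llvm

    declare void @llvm.prefetch(i8*, i32, i32, i32)

    define void @warm(i8* %p) {
    entry:
      ; rw = 0 (read), locality = 3 (keep in cache), cache type = 1 (data).
      ; The last three arguments must be constant integers.
      call void @llvm.prefetch(i8* %p, i32 0, i32 3, i32 1)
      ret void
    }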

'``llvm.pcmarker``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.pcmarker(i32 <id>)

Overview:
"""""""""

The '``llvm.pcmarker``' intrinsic is a method to export a Program
Counter (PC) in a region of code to simulators and other tools. The
method is target specific, but it is expected that the marker will use
exported symbols to transmit the PC of the marker. The marker makes no
guarantees that it will remain with any specific instruction after
optimizations. It is possible that the presence of a marker will inhibit
optimizations. The intended use is to be inserted after optimizations to
allow correlations of simulation runs.

Arguments:
""""""""""

``id`` is a numerical id identifying the marker.

Semantics:
""""""""""

This intrinsic does not modify the behavior of the program. Backends
that do not support this intrinsic may ignore it.

'``llvm.readcyclecounter``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i64 @llvm.readcyclecounter()

Overview:
"""""""""

The '``llvm.readcyclecounter``' intrinsic provides access to the cycle
counter register (or similar low latency, high accuracy clocks) on those
targets that support it. On X86, it should map to RDTSC. On Alpha, it
should map to RPCC. As the backing counters overflow quickly (on the
order of 9 seconds on Alpha), this should only be used for small
timings.

Semantics:
""""""""""

When directly supported, reading the cycle counter should not modify any
memory. Implementations are allowed to return either an
application-specific value or a system-wide value. On backends without
support, this is lowered to a constant 0.

Note that runtime support may be conditional on the privilege level the
code is running at and on the host platform.
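
A minimal sketch of timing a short region by differencing two counter reads.
The function ``@time_work`` and the callee ``@work`` are hypothetical; on a
backend without support both reads are the constant 0, so the result is 0.

.. code-block:: llvm

    declare i64 @llvm.readcyclecounter()
    declare void @work()                   ; hypothetical region to time

    define i64 @time_work() {
    entry:
      %start = call i64 @llvm.readcyclecounter()
      call void @work()
      %end = call i64 @llvm.readcyclecounter()
      ; Only meaningful for short intervals, before the counter wraps.
      %elapsed = sub i64 %end, %start
      ret i64 %elapsed
    }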

'``llvm.clear_cache``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.clear_cache(i8*, i8*)

Overview:
"""""""""

The '``llvm.clear_cache``' intrinsic ensures visibility of modifications
in the specified range to the execution unit of the processor. On
targets with non-unified instruction and data cache, the implementation
flushes the instruction cache.

Semantics:
""""""""""

On platforms with coherent instruction and data caches (e.g. x86), this
intrinsic is a nop. On platforms with non-coherent instruction and data
cache (e.g. ARM, MIPS), the intrinsic is lowered either to appropriate
instructions or a system call, if cache flushing requires special
privileges.

The default behavior is to emit a call to ``__clear_cache`` from the
runtime library.

This intrinsic does *not* empty the instruction pipeline. Modifications
of the current function are outside the scope of the intrinsic.
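
A minimal sketch of publishing freshly written code, for example in a JIT.
It assumes the two pointer arguments are the start and the end of the
modified range, matching the ``__clear_cache`` runtime routine named above;
the function name ``@publish_code`` is hypothetical.

.. code-block:: llvm

    declare void @llvm.clear_cache(i8*, i8*)

    define void @publish_code(i8* %begin, i64 %size) {
    entry:
      ; Compute the end of the modified range.
      %end = getelementptr i8, i8* %begin, i64 %size
      call void @llvm.clear_cache(i8* %begin, i8* %end)
      ret void
    }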

'``llvm.instrprof_increment``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.instrprof_increment(i8* <name>, i64 <hash>,
                                             i32 <num-counters>, i32 <index>)

Overview:
"""""""""

The '``llvm.instrprof_increment``' intrinsic can be emitted by a
frontend for use with instrumentation based profiling. These will be
lowered by the ``-instrprof`` pass to generate execution counts of a
program at runtime.

Arguments:
""""""""""

The first argument is a pointer to a global variable containing the
name of the entity being instrumented. This should generally be the
(mangled) function name for a set of counters.

The second argument is a hash value that can be used by the consumer
of the profile data to detect changes to the instrumented source, and
the third is the number of counters associated with ``name``. It is an
error if ``hash`` or ``num-counters`` differ between two instances of
``instrprof_increment`` that refer to the same name.

The last argument refers to which of the counters for ``name`` should
be incremented. It should be a value between 0 (inclusive) and
``num-counters`` (exclusive).

Semantics:
""""""""""

This intrinsic represents an increment of a profiling counter. It will
cause the ``-instrprof`` pass to generate the appropriate data
structures and the code to increment the appropriate value, in a
format that can be written out by a compiler runtime and consumed via
the ``llvm-profdata`` tool.
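
The argument rules above can be sketched for a function with two counters.
The global ``@__profn_foo``, the function ``@foo`` and the hash value 0 are
placeholders chosen for this illustration.

.. code-block:: llvm

    @__profn_foo = private constant [3 x i8] c"foo"

    declare void @llvm.instrprof_increment(i8*, i64, i32, i32)

    define void @foo(i1 %cond) {
    entry:
      ; Counter 0 of 2: function entry.
      call void @llvm.instrprof_increment(
          i8* getelementptr inbounds ([3 x i8], [3 x i8]* @__profn_foo, i32 0, i32 0),
          i64 0, i32 2, i32 0)
      br i1 %cond, label %then, label %exit

    then:
      ; Counter 1 of 2: same name, hash and num-counters as above.
      call void @llvm.instrprof_increment(
          i8* getelementptr inbounds ([3 x i8], [3 x i8]* @__profn_foo, i32 0, i32 0),
          i64 0, i32 2, i32 1)
      br label %exit

    exit:
      ret void
    }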

'``llvm.instrprof_value_profile``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.instrprof_value_profile(i8* <name>, i64 <hash>,
                                                 i64 <value>, i32 <value_kind>,
                                                 i32 <index>)

Overview:
"""""""""

The '``llvm.instrprof_value_profile``' intrinsic can be emitted by a
frontend for use with instrumentation based profiling. This will be
lowered by the ``-instrprof`` pass to find out the target values that
instrumented expressions take in a program at runtime.

Arguments:
""""""""""

The first argument is a pointer to a global variable containing the
name of the entity being instrumented. ``name`` should generally be the
(mangled) function name for a set of counters.

The second argument is a hash value that can be used by the consumer
of the profile data to detect changes to the instrumented source. It
is an error if ``hash`` differs between two instances of
``llvm.instrprof_*`` that refer to the same name.

The third argument is the value of the expression being profiled. The profiled
expression's value should be representable as an unsigned 64-bit value. The
fourth argument represents the kind of value profiling that is being done. The
supported value profiling kinds are enumerated through the
``InstrProfValueKind`` type declared in the
``<include/llvm/ProfileData/InstrProf.h>`` header file. The last argument is the
index of the instrumented expression within ``name``. It should be >= 0.

Semantics:
""""""""""

This intrinsic represents the point where a call to a runtime routine
should be inserted for value profiling of target expressions. The
``-instrprof`` pass will generate the appropriate data structures and
replace the ``llvm.instrprof_value_profile`` intrinsic with a call to the
profile runtime library with the proper arguments.
Marcin Koscielnicki3fdc2572016-04-19 20:51:05 +00009753'``llvm.thread.pointer``' Intrinsic
9754^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9755
9756Syntax:
9757"""""""
9758
9759::
9760
9761 declare i8* @llvm.thread.pointer()
9762
9763Overview:
9764"""""""""
9765
9766The '``llvm.thread.pointer``' intrinsic returns the value of the thread
9767pointer.
9768
9769Semantics:
9770""""""""""
9771
9772The '``llvm.thread.pointer``' intrinsic returns a pointer to the TLS area
9773for the current thread. The exact semantics of this value are target
9774specific: it may point to the start of TLS area, to the end, or somewhere
9775in the middle. Depending on the target, this intrinsic may read a register,
9776call a helper function, read from an alternate memory space, or perform
9777other operations necessary to locate the TLS area. Not all targets support
9778this intrinsic.
9779
Sean Silvab084af42012-12-07 10:36:55 +00009780Standard C Library Intrinsics
9781-----------------------------
9782
9783LLVM provides intrinsics for a few important standard C library
9784functions. These intrinsics allow source-language front-ends to pass
9785information about the alignment of the pointer arguments to the code
9786generator, providing opportunity for more efficient code generation.
9787
9788.. _int_memcpy:
9789
9790'``llvm.memcpy``' Intrinsic
9791^^^^^^^^^^^^^^^^^^^^^^^^^^^
9792
9793Syntax:
9794"""""""
9795
9796This is an overloaded intrinsic. You can use ``llvm.memcpy`` on any
9797integer bit width and for different address spaces. Not all targets
9798support all bit widths however.
9799
9800::
9801
9802 declare void @llvm.memcpy.p0i8.p0i8.i32(i8* <dest>, i8* <src>,
9803 i32 <len>, i32 <align>, i1 <isvolatile>)
9804 declare void @llvm.memcpy.p0i8.p0i8.i64(i8* <dest>, i8* <src>,
9805 i64 <len>, i32 <align>, i1 <isvolatile>)
9806
9807Overview:
9808"""""""""
9809
9810The '``llvm.memcpy.*``' intrinsics copy a block of memory from the
9811source location to the destination location.
9812
9813Note that, unlike the standard libc function, the ``llvm.memcpy.*``
9814intrinsics do not return a value, takes extra alignment/isvolatile
9815arguments and the pointers can be in specified address spaces.
9816
9817Arguments:
9818""""""""""
9819
9820The first argument is a pointer to the destination, the second is a
9821pointer to the source. The third argument is an integer argument
9822specifying the number of bytes to copy, the fourth argument is the
9823alignment of the source and destination locations, and the fifth is a
9824boolean indicating a volatile access.
9825
9826If the call to this intrinsic has an alignment value that is not 0 or 1,
9827then the caller guarantees that both the source and destination pointers
9828are aligned to that boundary.
9829
9830If the ``isvolatile`` parameter is ``true``, the ``llvm.memcpy`` call is
9831a :ref:`volatile operation <volatile>`. The detailed access behavior is not
9832very cleanly specified and it is unwise to depend on it.
9833
9834Semantics:
9835""""""""""
9836
9837The '``llvm.memcpy.*``' intrinsics copy a block of memory from the
9838source location to the destination location, which are not allowed to
9839overlap. It copies "len" bytes of memory over. If the argument is known
9840to be aligned to some boundary, this can be specified as the fourth
Bill Wendling61163152013-10-18 23:26:55 +00009841argument, otherwise it should be set to 0 or 1 (both meaning no alignment).
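
A minimal sketch of a call that uses the five-argument form declared above.
The function name ``@copy16`` is hypothetical, and the sketch assumes the
caller guarantees that both pointers are 4-byte aligned and do not overlap.

.. code-block:: llvm

    declare void @llvm.memcpy.p0i8.p0i8.i64(i8*, i8*, i64, i32, i1)

    define void @copy16(i8* %dst, i8* %src) {
    entry:
      ; len = 16 bytes, align = 4, isvolatile = false.
      call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dst, i8* %src,
                                           i64 16, i32 4, i1 false)
      ret void
    }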

'``llvm.memmove``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.memmove`` on any integer
bit width and for different address spaces. Not all targets support all
bit widths however.

::

      declare void @llvm.memmove.p0i8.p0i8.i32(i8* <dest>, i8* <src>,
                                               i32 <len>, i32 <align>, i1 <isvolatile>)
      declare void @llvm.memmove.p0i8.p0i8.i64(i8* <dest>, i8* <src>,
                                               i64 <len>, i32 <align>, i1 <isvolatile>)

Overview:
"""""""""

The '``llvm.memmove.*``' intrinsics move a block of memory from the
source location to the destination location. It is similar to the
'``llvm.memcpy``' intrinsic but allows the two memory locations to
overlap.

Note that, unlike the standard libc function, the ``llvm.memmove.*``
intrinsics do not return a value, take extra alignment/isvolatile
arguments, and the pointers can be in specified address spaces.

Arguments:
""""""""""

The first argument is a pointer to the destination, the second is a
pointer to the source. The third argument is an integer argument
specifying the number of bytes to copy, the fourth argument is the
alignment of the source and destination locations, and the fifth is a
boolean indicating a volatile access.

If the call to this intrinsic has an alignment value that is not 0 or 1,
then the caller guarantees that the source and destination pointers are
aligned to that boundary.

If the ``isvolatile`` parameter is ``true``, the ``llvm.memmove`` call
is a :ref:`volatile operation <volatile>`. The detailed access behavior is
not very cleanly specified and it is unwise to depend on it.

Semantics:
""""""""""

The '``llvm.memmove.*``' intrinsics copy a block of memory from the
source location to the destination location, which may overlap. It
copies "len" bytes of memory over. If the arguments are known to be
aligned to some boundary, this can be specified as the fourth argument,
otherwise it should be set to 0 or 1 (both meaning no alignment).
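
The overlapping case can be sketched as shifting a buffer down by one byte.
The function name ``@shift_left_by_one`` is hypothetical, and the sketch
assumes ``%len`` is at least 1.

.. code-block:: llvm

    declare void @llvm.memmove.p0i8.p0i8.i64(i8*, i8*, i64, i32, i1)

    define void @shift_left_by_one(i8* %buf, i64 %len) {
    entry:
      ; Source starts one byte past the destination, so the ranges overlap.
      %src = getelementptr i8, i8* %buf, i64 1
      %n = sub i64 %len, 1
      ; align = 1 (no alignment guarantee), isvolatile = false.
      call void @llvm.memmove.p0i8.p0i8.i64(i8* %buf, i8* %src,
                                            i64 %n, i32 1, i1 false)
      ret void
    }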

'``llvm.memset.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.memset`` on any integer
bit width and for different address spaces. However, not all targets
support all bit widths.

::

      declare void @llvm.memset.p0i8.i32(i8* <dest>, i8 <val>,
                                         i32 <len>, i32 <align>, i1 <isvolatile>)
      declare void @llvm.memset.p0i8.i64(i8* <dest>, i8 <val>,
                                         i64 <len>, i32 <align>, i1 <isvolatile>)

Overview:
"""""""""

The '``llvm.memset.*``' intrinsics fill a block of memory with a
particular byte value.

Note that, unlike the standard libc function, the ``llvm.memset``
intrinsic does not return a value and takes extra alignment/volatile
arguments. Also, the destination can be in an arbitrary address space.

Arguments:
""""""""""

The first argument is a pointer to the destination to fill, the second
is the byte value with which to fill it, the third argument is an
integer argument specifying the number of bytes to fill, the fourth
argument is the known alignment of the destination location, and the
fifth is a boolean indicating a volatile access.

If the call to this intrinsic has an alignment value that is not 0 or 1,
then the caller guarantees that the destination pointer is aligned to
that boundary.

If the ``isvolatile`` parameter is ``true``, the ``llvm.memset`` call is
a :ref:`volatile operation <volatile>`. The detailed access behavior is not
very cleanly specified and it is unwise to depend on it.

Semantics:
""""""""""

The '``llvm.memset.*``' intrinsics fill "len" bytes of memory starting
at the destination location. If the argument is known to be aligned to
some boundary, this can be specified as the fourth argument, otherwise
it should be set to 0 or 1 (both meaning no alignment).
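
A minimal sketch that zeroes a 64-byte block. The function name ``@zero64``
is hypothetical, and the sketch assumes the caller guarantees that ``%p`` is
8-byte aligned.

.. code-block:: llvm

    declare void @llvm.memset.p0i8.i64(i8*, i8, i64, i32, i1)

    define void @zero64(i8* %p) {
    entry:
      ; val = 0, len = 64 bytes, align = 8, isvolatile = false.
      call void @llvm.memset.p0i8.i64(i8* %p, i8 0, i64 64, i32 8, i1 false)
      ret void
    }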

'``llvm.sqrt.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.sqrt`` on any
floating point or vector of floating point type. Not all targets support
all types however.

::

      declare float     @llvm.sqrt.f32(float %Val)
      declare double    @llvm.sqrt.f64(double %Val)
      declare x86_fp80  @llvm.sqrt.f80(x86_fp80 %Val)
      declare fp128     @llvm.sqrt.f128(fp128 %Val)
      declare ppc_fp128 @llvm.sqrt.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.sqrt``' intrinsics return the sqrt of the specified operand,
returning the same value as the libm '``sqrt``' functions would. Unlike
``sqrt`` in libm, however, ``llvm.sqrt`` has undefined behavior for
negative numbers other than -0.0 (which allows for better optimization,
because there is no need to worry about errno being set).
``llvm.sqrt(-0.0)`` is defined to return -0.0 like IEEE sqrt.

Arguments:
""""""""""

The argument and return value are floating point numbers of the same
type.

Semantics:
""""""""""

This function returns the sqrt of the specified operand if it is a
nonnegative floating point number.
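
A minimal sketch that stays within the defined domain by taking the square
root of a sum of squares. The function name ``@length2d`` is hypothetical.

.. code-block:: llvm

    declare double @llvm.sqrt.f64(double)

    define double @length2d(double %x, double %y) {
    entry:
      %xx = fmul double %x, %x
      %yy = fmul double %y, %y
      ; A sum of squares is never a negative number.
      %sum = fadd double %xx, %yy
      %len = call double @llvm.sqrt.f64(double %sum)
      ret double %len
    }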

'``llvm.powi.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.powi`` on any
floating point or vector of floating point type. Not all targets support
all types however.

::

      declare float     @llvm.powi.f32(float %Val, i32 %power)
      declare double    @llvm.powi.f64(double %Val, i32 %power)
      declare x86_fp80  @llvm.powi.f80(x86_fp80 %Val, i32 %power)
      declare fp128     @llvm.powi.f128(fp128 %Val, i32 %power)
      declare ppc_fp128 @llvm.powi.ppcf128(ppc_fp128 %Val, i32 %power)

Overview:
"""""""""

The '``llvm.powi.*``' intrinsics return the first operand raised to the
specified (positive or negative) power. The order of evaluation of
multiplications is not defined. When a vector of floating point type is
used, the second argument remains a scalar integer value.

Arguments:
""""""""""

The second argument is an integer power, and the first is a value to
raise to that power.

Semantics:
""""""""""

This function returns the first value raised to the second power with an
unspecified sequence of rounding operations.
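
A minimal sketch of the scalar and vector forms. It assumes the vector
overload is named ``llvm.powi.v4f32``; the function names ``@cube`` and
``@cube4`` are hypothetical.

.. code-block:: llvm

    declare double @llvm.powi.f64(double, i32)
    declare <4 x float> @llvm.powi.v4f32(<4 x float>, i32)

    define double @cube(double %x) {
    entry:
      %r = call double @llvm.powi.f64(double %x, i32 3)
      ret double %r
    }

    define <4 x float> @cube4(<4 x float> %v) {
    entry:
      ; The power stays a scalar i32 even for a vector operand.
      %r = call <4 x float> @llvm.powi.v4f32(<4 x float> %v, i32 3)
      ret <4 x float> %r
    }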

'``llvm.sin.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.sin`` on any
floating point or vector of floating point type. Not all targets support
all types however.

::

      declare float     @llvm.sin.f32(float %Val)
      declare double    @llvm.sin.f64(double %Val)
      declare x86_fp80  @llvm.sin.f80(x86_fp80 %Val)
      declare fp128     @llvm.sin.f128(fp128 %Val)
      declare ppc_fp128 @llvm.sin.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.sin.*``' intrinsics return the sine of the operand.

Arguments:
""""""""""

The argument and return value are floating point numbers of the same
type.

Semantics:
""""""""""

This function returns the sine of the specified operand, returning the
same values as the libm ``sin`` functions would, and handles error
conditions in the same way.

'``llvm.cos.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.cos`` on any
floating point or vector of floating point type. Not all targets support
all types however.

::

      declare float     @llvm.cos.f32(float %Val)
      declare double    @llvm.cos.f64(double %Val)
      declare x86_fp80  @llvm.cos.f80(x86_fp80 %Val)
      declare fp128     @llvm.cos.f128(fp128 %Val)
      declare ppc_fp128 @llvm.cos.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.cos.*``' intrinsics return the cosine of the operand.

Arguments:
""""""""""

The argument and return value are floating point numbers of the same
type.

Semantics:
""""""""""

This function returns the cosine of the specified operand, returning the
same values as the libm ``cos`` functions would, and handles error
conditions in the same way.
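
A minimal sketch that uses the sine and cosine intrinsics together to build
a unit vector from an angle. The function name ``@unit_vector`` is
hypothetical.

.. code-block:: llvm

    declare float @llvm.sin.f32(float)
    declare float @llvm.cos.f32(float)

    ; Returns the pair (cos(theta), sin(theta)).
    define { float, float } @unit_vector(float %theta) {
    entry:
      %c = call float @llvm.cos.f32(float %theta)
      %s = call float @llvm.sin.f32(float %theta)
      %v0 = insertvalue { float, float } undef, float %c, 0
      %v1 = insertvalue { float, float } %v0, float %s, 1
      ret { float, float } %v1
    }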

'``llvm.pow.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.pow`` on any
floating point or vector of floating point type. Not all targets support
all types, however.

::

      declare float @llvm.pow.f32(float %Val, float %Power)
      declare double @llvm.pow.f64(double %Val, double %Power)
      declare x86_fp80 @llvm.pow.f80(x86_fp80 %Val, x86_fp80 %Power)
      declare fp128 @llvm.pow.f128(fp128 %Val, fp128 %Power)
      declare ppc_fp128 @llvm.pow.ppcf128(ppc_fp128 %Val, ppc_fp128 %Power)

Overview:
"""""""""

The '``llvm.pow.*``' intrinsics return the first operand raised to the
specified (positive or negative) power.

Arguments:
""""""""""

The first argument is the value to raise to a power, and the second
argument is the floating point power. The arguments and return value
are floating point numbers of the same type.

Semantics:
""""""""""

This function returns the first value raised to the second power,
returning the same values as the libm ``pow`` functions would, and
handles error conditions in the same way.
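
Examples:
"""""""""

The following is a minimal, illustrative sketch rather than normative
text; the result names are hypothetical. It raises 2.0 to the powers 3.0
and -1.0:

.. code-block:: llvm

      %cube = call double @llvm.pow.f64(double 2.0, double 3.0)    ; 8.0
      %inv = call double @llvm.pow.f64(double 2.0, double -1.0)    ; 0.5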

'``llvm.exp.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.exp`` on any
floating point or vector of floating point type. Not all targets support
all types, however.

::

      declare float @llvm.exp.f32(float %Val)
      declare double @llvm.exp.f64(double %Val)
      declare x86_fp80 @llvm.exp.f80(x86_fp80 %Val)
      declare fp128 @llvm.exp.f128(fp128 %Val)
      declare ppc_fp128 @llvm.exp.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.exp.*``' intrinsics return the base-e exponential of the
operand.

Arguments:
""""""""""

The argument and return value are floating point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``exp`` functions
would, and handles error conditions in the same way.

'``llvm.exp2.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.exp2`` on any
floating point or vector of floating point type. Not all targets support
all types, however.

::

      declare float @llvm.exp2.f32(float %Val)
      declare double @llvm.exp2.f64(double %Val)
      declare x86_fp80 @llvm.exp2.f80(x86_fp80 %Val)
      declare fp128 @llvm.exp2.f128(fp128 %Val)
      declare ppc_fp128 @llvm.exp2.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.exp2.*``' intrinsics return the base-2 exponential of the
operand.

Arguments:
""""""""""

The argument and return value are floating point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``exp2`` functions
would, and handles error conditions in the same way.

'``llvm.log.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.log`` on any
floating point or vector of floating point type. Not all targets support
all types, however.

::

      declare float @llvm.log.f32(float %Val)
      declare double @llvm.log.f64(double %Val)
      declare x86_fp80 @llvm.log.f80(x86_fp80 %Val)
      declare fp128 @llvm.log.f128(fp128 %Val)
      declare ppc_fp128 @llvm.log.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.log.*``' intrinsics return the base-e logarithm of the
operand.

Arguments:
""""""""""

The argument and return value are floating point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``log`` functions
would, and handles error conditions in the same way.

'``llvm.log10.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.log10`` on any
floating point or vector of floating point type. Not all targets support
all types, however.

::

      declare float @llvm.log10.f32(float %Val)
      declare double @llvm.log10.f64(double %Val)
      declare x86_fp80 @llvm.log10.f80(x86_fp80 %Val)
      declare fp128 @llvm.log10.f128(fp128 %Val)
      declare ppc_fp128 @llvm.log10.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.log10.*``' intrinsics return the base-10 logarithm of the
operand.

Arguments:
""""""""""

The argument and return value are floating point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``log10`` functions
would, and handles error conditions in the same way.

'``llvm.log2.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.log2`` on any
floating point or vector of floating point type. Not all targets support
all types, however.

::

      declare float @llvm.log2.f32(float %Val)
      declare double @llvm.log2.f64(double %Val)
      declare x86_fp80 @llvm.log2.f80(x86_fp80 %Val)
      declare fp128 @llvm.log2.f128(fp128 %Val)
      declare ppc_fp128 @llvm.log2.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.log2.*``' intrinsics return the base-2 logarithm of the
operand.

Arguments:
""""""""""

The argument and return value are floating point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``log2`` functions
would, and handles error conditions in the same way.

'``llvm.fma.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.fma`` on any
floating point or vector of floating point type. Not all targets support
all types, however.

::

      declare float @llvm.fma.f32(float %a, float %b, float %c)
      declare double @llvm.fma.f64(double %a, double %b, double %c)
      declare x86_fp80 @llvm.fma.f80(x86_fp80 %a, x86_fp80 %b, x86_fp80 %c)
      declare fp128 @llvm.fma.f128(fp128 %a, fp128 %b, fp128 %c)
      declare ppc_fp128 @llvm.fma.ppcf128(ppc_fp128 %a, ppc_fp128 %b, ppc_fp128 %c)

Overview:
"""""""""

The '``llvm.fma.*``' intrinsics perform the fused multiply-add
operation.

Arguments:
""""""""""

The arguments and return value are floating point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``fma`` functions
would, and does not set errno.
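
Examples:
"""""""""

As an illustrative sketch (the value names are hypothetical), the call
below computes ``%a * %b + %c`` with a single rounding step, as the libm
``fma`` functions do. The separate ``fmul``/``fadd`` pair is shown only
for contrast: it rounds the product before the addition and may
therefore produce a slightly different result.

.. code-block:: llvm

      ; Fused: one rounding at the end.
      %fused = call float @llvm.fma.f32(float %a, float %b, float %c)

      ; Unfused: the product is rounded, then the sum is rounded.
      %mul = fmul float %a, %b
      %unfused = fadd float %mul, %c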

'``llvm.fabs.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.fabs`` on any
floating point or vector of floating point type. Not all targets support
all types, however.

::

      declare float @llvm.fabs.f32(float %Val)
      declare double @llvm.fabs.f64(double %Val)
      declare x86_fp80 @llvm.fabs.f80(x86_fp80 %Val)
      declare fp128 @llvm.fabs.f128(fp128 %Val)
      declare ppc_fp128 @llvm.fabs.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.fabs.*``' intrinsics return the absolute value of the
operand.

Arguments:
""""""""""

The argument and return value are floating point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``fabs`` functions
would, and handles error conditions in the same way.

'``llvm.minnum.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.minnum`` on any
floating point or vector of floating point type. Not all targets support
all types, however.

::

      declare float @llvm.minnum.f32(float %Val0, float %Val1)
      declare double @llvm.minnum.f64(double %Val0, double %Val1)
      declare x86_fp80 @llvm.minnum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
      declare fp128 @llvm.minnum.f128(fp128 %Val0, fp128 %Val1)
      declare ppc_fp128 @llvm.minnum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)

Overview:
"""""""""

The '``llvm.minnum.*``' intrinsics return the minimum of the two
arguments.

Arguments:
""""""""""

The arguments and return value are floating point numbers of the same
type.

Semantics:
""""""""""

Follows the IEEE-754 semantics for minNum, which also match those of
libm's ``fmin``.

If either operand is a NaN, returns the other non-NaN operand. Returns
NaN only if both operands are NaN. If the operands compare equal,
returns a value that compares equal to both operands. This means that
``fmin(+/-0.0, +/-0.0)`` could return either -0.0 or 0.0.
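
Examples:
"""""""""

The calls below are a small sketch of the rules above, not normative
text. The result names are hypothetical, and ``0x7FF8000000000000`` is
one encoding of a quiet NaN as a ``double`` constant.

.. code-block:: llvm

      %a = call double @llvm.minnum.f64(double 1.0, double 2.0)                 ; 1.0
      %b = call double @llvm.minnum.f64(double 0x7FF8000000000000, double 2.0)  ; 2.0
      %c = call double @llvm.minnum.f64(double 0.0, double -0.0)                ; 0.0 or -0.0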

'``llvm.maxnum.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.maxnum`` on any
floating point or vector of floating point type. Not all targets support
all types, however.

::

      declare float @llvm.maxnum.f32(float %Val0, float %Val1)
      declare double @llvm.maxnum.f64(double %Val0, double %Val1)
      declare x86_fp80 @llvm.maxnum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
      declare fp128 @llvm.maxnum.f128(fp128 %Val0, fp128 %Val1)
      declare ppc_fp128 @llvm.maxnum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)

Overview:
"""""""""

The '``llvm.maxnum.*``' intrinsics return the maximum of the two
arguments.

Arguments:
""""""""""

The arguments and return value are floating point numbers of the same
type.

Semantics:
""""""""""

Follows the IEEE-754 semantics for maxNum, which also match those of
libm's ``fmax``.

If either operand is a NaN, returns the other non-NaN operand. Returns
NaN only if both operands are NaN. If the operands compare equal,
returns a value that compares equal to both operands. This means that
``fmax(+/-0.0, +/-0.0)`` could return either -0.0 or 0.0.
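
Examples:
"""""""""

One possible use, sketched here with hypothetical value names, is
clamping ``%x`` to the range [0.0, 1.0] by combining ``llvm.maxnum`` with
``llvm.minnum``. Under the NaN rule above, a NaN ``%x`` yields 0.0
rather than propagating.

.. code-block:: llvm

      %lo = call float @llvm.maxnum.f32(float %x, float 0.0)        ; at least 0.0
      %clamped = call float @llvm.minnum.f32(float %lo, float 1.0)  ; at most 1.0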

'``llvm.copysign.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.copysign`` on any
floating point or vector of floating point type. Not all targets support
all types, however.

::

      declare float @llvm.copysign.f32(float %Mag, float %Sgn)
      declare double @llvm.copysign.f64(double %Mag, double %Sgn)
      declare x86_fp80 @llvm.copysign.f80(x86_fp80 %Mag, x86_fp80 %Sgn)
      declare fp128 @llvm.copysign.f128(fp128 %Mag, fp128 %Sgn)
      declare ppc_fp128 @llvm.copysign.ppcf128(ppc_fp128 %Mag, ppc_fp128 %Sgn)

Overview:
"""""""""

The '``llvm.copysign.*``' intrinsics return a value with the magnitude of the
first operand and the sign of the second operand.

Arguments:
""""""""""

The arguments and return value are floating point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``copysign``
functions would, and handles error conditions in the same way.
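
Examples:
"""""""""

A brief illustrative sketch (the result names are hypothetical): the
magnitude comes from the first operand and the sign from the second.

.. code-block:: llvm

      %r = call float @llvm.copysign.f32(float 3.0, float -1.0)    ; -3.0
      %s = call float @llvm.copysign.f32(float -3.0, float 2.0)    ; 3.0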

'``llvm.floor.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.floor`` on any
floating point or vector of floating point type. Not all targets support
all types, however.

::

      declare float @llvm.floor.f32(float %Val)
      declare double @llvm.floor.f64(double %Val)
      declare x86_fp80 @llvm.floor.f80(x86_fp80 %Val)
      declare fp128 @llvm.floor.f128(fp128 %Val)
      declare ppc_fp128 @llvm.floor.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.floor.*``' intrinsics return the floor of the operand.

Arguments:
""""""""""

The argument and return value are floating point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``floor`` functions
would, and handles error conditions in the same way.

'``llvm.ceil.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.ceil`` on any
floating point or vector of floating point type. Not all targets support
all types, however.

::

      declare float @llvm.ceil.f32(float %Val)
      declare double @llvm.ceil.f64(double %Val)
      declare x86_fp80 @llvm.ceil.f80(x86_fp80 %Val)
      declare fp128 @llvm.ceil.f128(fp128 %Val)
      declare ppc_fp128 @llvm.ceil.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.ceil.*``' intrinsics return the ceiling of the operand.

Arguments:
""""""""""

The argument and return value are floating point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``ceil`` functions
would, and handles error conditions in the same way.

'``llvm.trunc.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.trunc`` on any
floating point or vector of floating point type. Not all targets support
all types, however.

::

      declare float @llvm.trunc.f32(float %Val)
      declare double @llvm.trunc.f64(double %Val)
      declare x86_fp80 @llvm.trunc.f80(x86_fp80 %Val)
      declare fp128 @llvm.trunc.f128(fp128 %Val)
      declare ppc_fp128 @llvm.trunc.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.trunc.*``' intrinsics return the operand rounded to the
nearest integer not larger in magnitude than the operand.

Arguments:
""""""""""

The argument and return value are floating point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``trunc`` functions
would, and handles error conditions in the same way.

'``llvm.rint.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.rint`` on any
floating point or vector of floating point type. Not all targets support
all types, however.

::

      declare float @llvm.rint.f32(float %Val)
      declare double @llvm.rint.f64(double %Val)
      declare x86_fp80 @llvm.rint.f80(x86_fp80 %Val)
      declare fp128 @llvm.rint.f128(fp128 %Val)
      declare ppc_fp128 @llvm.rint.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.rint.*``' intrinsics return the operand rounded to the
nearest integer. They may raise an inexact floating-point exception if
the operand isn't an integer.

Arguments:
""""""""""

The argument and return value are floating point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``rint`` functions
would, and handles error conditions in the same way.

'``llvm.nearbyint.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.nearbyint`` on any
floating point or vector of floating point type. Not all targets support
all types, however.

::

      declare float @llvm.nearbyint.f32(float %Val)
      declare double @llvm.nearbyint.f64(double %Val)
      declare x86_fp80 @llvm.nearbyint.f80(x86_fp80 %Val)
      declare fp128 @llvm.nearbyint.f128(fp128 %Val)
      declare ppc_fp128 @llvm.nearbyint.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.nearbyint.*``' intrinsics return the operand rounded to the
nearest integer.

Arguments:
""""""""""

The argument and return value are floating point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``nearbyint``
functions would, and handles error conditions in the same way.

'``llvm.round.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.round`` on any
floating point or vector of floating point type. Not all targets support
all types, however.

::

      declare float @llvm.round.f32(float %Val)
      declare double @llvm.round.f64(double %Val)
      declare x86_fp80 @llvm.round.f80(x86_fp80 %Val)
      declare fp128 @llvm.round.f128(fp128 %Val)
      declare ppc_fp128 @llvm.round.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.round.*``' intrinsics return the operand rounded to the
nearest integer.

Arguments:
""""""""""

The argument and return value are floating point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``round``
functions would, and handles error conditions in the same way.
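
Examples:
"""""""""

The sketch below (hypothetical result names, not normative text)
contrasts ``llvm.round`` with ``llvm.floor``, ``llvm.ceil`` and
``llvm.trunc`` on the halfway value -2.5. It assumes the libm behavior
that these intrinsics defer to, in which ``round`` rounds halfway cases
away from zero.

.. code-block:: llvm

      %f = call double @llvm.floor.f64(double -2.5)   ; -3.0
      %c = call double @llvm.ceil.f64(double -2.5)    ; -2.0
      %t = call double @llvm.trunc.f64(double -2.5)   ; -2.0
      %r = call double @llvm.round.f64(double -2.5)   ; -3.0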

Bit Manipulation Intrinsics
---------------------------

LLVM provides intrinsics for a few important bit manipulation
operations. These allow efficient code generation for some algorithms.

'``llvm.bitreverse.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic function. You can use
``llvm.bitreverse`` on any integer type.

::

      declare i16 @llvm.bitreverse.i16(i16 <id>)
      declare i32 @llvm.bitreverse.i32(i32 <id>)
      declare i64 @llvm.bitreverse.i64(i64 <id>)

Overview:
"""""""""

The '``llvm.bitreverse``' family of intrinsics is used to reverse the
bit pattern of an integer value; for example ``0b10110110`` becomes
``0b01101101``.

Semantics:
""""""""""

The ``llvm.bitreverse.iN`` intrinsic returns an ``iN`` value that has bit
``M`` in the input moved to bit ``N-M-1`` in the output, where bits are
numbered from 0 (least significant) to ``N-1``.
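
Examples:
"""""""""

As an illustrative sketch (the result names are hypothetical), with bits
numbered from 0 at the least significant end:

.. code-block:: llvm

      ; Bit 0 of the input moves to bit 31 of the output.
      %r = call i32 @llvm.bitreverse.i32(i32 1)     ; only bit 31 set (-2147483648)

      ; 0b0000000010110110 becomes 0b0110110100000000.
      %s = call i16 @llvm.bitreverse.i16(i16 182)   ; 27904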

'``llvm.bswap.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic function. You can use ``llvm.bswap`` on
any integer type that is an even number of bytes (i.e.
``BitWidth % 16 == 0``).

::

      declare i16 @llvm.bswap.i16(i16 <id>)
      declare i32 @llvm.bswap.i32(i32 <id>)
      declare i64 @llvm.bswap.i64(i64 <id>)

Overview:
"""""""""

The '``llvm.bswap``' family of intrinsics is used to byte swap integer
values with an even number of bytes (positive multiple of 16 bits).
These are useful for performing operations on data that is not in the
target's native byte order.

Semantics:
""""""""""

The ``llvm.bswap.i16`` intrinsic returns an i16 value that has the high
and low byte of the input i16 swapped. Similarly, the ``llvm.bswap.i32``
intrinsic returns an i32 value that has the four bytes of the input i32
swapped, so that if the input bytes are numbered 0, 1, 2, 3 then the
returned i32 will have its bytes in 3, 2, 1, 0 order. The
``llvm.bswap.i48``, ``llvm.bswap.i64`` and other intrinsics extend this
concept to additional even-byte lengths (6 bytes, 8 bytes and more,
respectively).
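
Examples:
"""""""""

A short sketch with hypothetical result names. The hexadecimal values in
the comments are for readability only; the operands are written in
decimal.

.. code-block:: llvm

      %h = call i16 @llvm.bswap.i16(i16 258)         ; 0x0102 -> 0x0201 (513)
      %w = call i32 @llvm.bswap.i32(i32 305419896)   ; 0x12345678 -> 0x78563412 (2018915346)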

'``llvm.ctpop.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.ctpop`` on any
integer bit width, or on any vector with integer elements. Not all
targets support all bit widths or vector types, however.

::

      declare i8 @llvm.ctpop.i8(i8 <src>)
      declare i16 @llvm.ctpop.i16(i16 <src>)
      declare i32 @llvm.ctpop.i32(i32 <src>)
      declare i64 @llvm.ctpop.i64(i64 <src>)
      declare i256 @llvm.ctpop.i256(i256 <src>)
      declare <2 x i32> @llvm.ctpop.v2i32(<2 x i32> <src>)

Overview:
"""""""""

The '``llvm.ctpop``' family of intrinsics counts the number of bits set
in a value.

Arguments:
""""""""""

The only argument is the value to be counted. The argument may be of any
integer type, or a vector with integer elements. The return type must
match the argument type.

Semantics:
""""""""""

The '``llvm.ctpop``' intrinsic counts the 1 bits in a value, or within
each element of a vector.
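
Examples:
"""""""""

An illustrative sketch (hypothetical result names) covering a scalar and
a vector operand:

.. code-block:: llvm

      %n = call i32 @llvm.ctpop.i32(i32 255)                            ; 8
      %v = call <2 x i32> @llvm.ctpop.v2i32(<2 x i32> <i32 7, i32 0>)   ; <i32 3, i32 0>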

'``llvm.ctlz.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.ctlz`` on any
integer bit width, or any vector whose elements are integers. Not all
targets support all bit widths or vector types, however.

::

      declare i8 @llvm.ctlz.i8(i8 <src>, i1 <is_zero_undef>)
      declare i16 @llvm.ctlz.i16(i16 <src>, i1 <is_zero_undef>)
      declare i32 @llvm.ctlz.i32(i32 <src>, i1 <is_zero_undef>)
      declare i64 @llvm.ctlz.i64(i64 <src>, i1 <is_zero_undef>)
      declare i256 @llvm.ctlz.i256(i256 <src>, i1 <is_zero_undef>)
      declare <2 x i32> @llvm.ctlz.v2i32(<2 x i32> <src>, i1 <is_zero_undef>)

Overview:
"""""""""

The '``llvm.ctlz``' family of intrinsic functions counts the number of
leading zeros in a variable.

Arguments:
""""""""""

The first argument is the value to be counted. This argument may be of
any integer type, or a vector with integer element type. The return
type must match the first argument type.

The second argument must be a constant and is a flag to indicate whether
the intrinsic should ensure that a zero as the first argument produces a
defined result. Historically, some architectures did not provide a
defined result for zero values as efficiently, and many algorithms are
now predicated on avoiding zero-value inputs.

Semantics:
""""""""""

The '``llvm.ctlz``' intrinsic counts the leading (most significant)
zeros in a variable, or within each element of the vector. If
``src == 0`` then the result is the size in bits of the type of ``src``
if ``is_zero_undef == 0`` and ``undef`` otherwise. For example,
``llvm.ctlz(i32 2) = 30``.
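
Examples:
"""""""""

The calls below sketch the effect of the ``is_zero_undef`` flag; the
result names are hypothetical.

.. code-block:: llvm

      %a = call i32 @llvm.ctlz.i32(i32 2, i1 false)   ; 30
      %b = call i32 @llvm.ctlz.i32(i32 0, i1 false)   ; 32, the bit width of i32
      %c = call i32 @llvm.ctlz.i32(i32 0, i1 true)    ; undef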

'``llvm.cttz.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.cttz`` on any
integer bit width, or any vector of integer elements. Not all targets
support all bit widths or vector types, however.

::

      declare i8 @llvm.cttz.i8(i8 <src>, i1 <is_zero_undef>)
      declare i16 @llvm.cttz.i16(i16 <src>, i1 <is_zero_undef>)
      declare i32 @llvm.cttz.i32(i32 <src>, i1 <is_zero_undef>)
      declare i64 @llvm.cttz.i64(i64 <src>, i1 <is_zero_undef>)
      declare i256 @llvm.cttz.i256(i256 <src>, i1 <is_zero_undef>)
      declare <2 x i32> @llvm.cttz.v2i32(<2 x i32> <src>, i1 <is_zero_undef>)

Overview:
"""""""""

The '``llvm.cttz``' family of intrinsic functions counts the number of
trailing zeros in a variable.

Arguments:
""""""""""

The first argument is the value to be counted. This argument may be of
any integer type, or a vector with integer element type. The return
type must match the first argument type.

The second argument must be a constant and is a flag to indicate whether
the intrinsic should ensure that a zero as the first argument produces a
defined result. Historically, some architectures did not provide a
defined result for zero values as efficiently, and many algorithms are
now predicated on avoiding zero-value inputs.

Semantics:
""""""""""

The '``llvm.cttz``' intrinsic counts the trailing (least significant)
zeros in a variable, or within each element of a vector. If ``src == 0``
then the result is the size in bits of the type of ``src`` if
``is_zero_undef == 0`` and ``undef`` otherwise. For example,
``llvm.cttz(i32 2) = 1``.
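
Examples:
"""""""""

A small sketch with hypothetical result names:

.. code-block:: llvm

      %a = call i32 @llvm.cttz.i32(i32 2, i1 false)    ; 1
      %b = call i32 @llvm.cttz.i32(i32 40, i1 false)   ; 3 (40 is 0b101000)
      %c = call i32 @llvm.cttz.i32(i32 0, i1 false)    ; 32, the bit width of i32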

.. _int_overflow:

Arithmetic with Overflow Intrinsics
-----------------------------------

LLVM provides intrinsics for fast arithmetic overflow checking.

Each of these intrinsics returns a two-element struct. The first
element of this struct contains the result of the corresponding
arithmetic operation modulo 2\ :sup:`n`\ , where n is the bit width of
the result. Therefore, for example, the first element of the struct
returned by ``llvm.sadd.with.overflow.i32`` is always the same as the
result of a 32-bit ``add`` instruction with the same operands, where
the ``add`` is *not* modified by an ``nsw`` or ``nuw`` flag.

The second element of the result is an ``i1`` that is 1 if the
arithmetic operation overflowed and 0 otherwise. An operation
overflows if, for any values of its operands ``A`` and ``B`` and for
any ``N`` larger than the operands' width, ``ext(A op B) to iN`` is
not equal to ``(ext(A) to iN) op (ext(B) to iN)`` where ``ext`` is
``sext`` for signed overflow and ``zext`` for unsigned overflow, and
``op`` is the underlying arithmetic operation.

The behavior of these intrinsics is well-defined for all argument
values.
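
As a worked illustration of these definitions (a sketch only; the result
names are hypothetical), consider 8-bit operands:

.. code-block:: llvm

      ; Signed: 127 + 1 wraps to -128 modulo 2^8. sext(127) + sext(1) is 128,
      ; which differs from sext(-128), so the overflow bit is 1.
      %s = call {i8, i1} @llvm.sadd.with.overflow.i8(i8 127, i8 1)    ; {-128, 1}

      ; Unsigned: 255 + 1 wraps to 0 modulo 2^8. zext(255) + zext(1) is 256,
      ; which differs from zext(0), so the overflow bit is 1.
      %u = call {i8, i1} @llvm.uadd.with.overflow.i8(i8 255, i8 1)    ; {0, 1}

      ; 100 + 27 is 127, which fits in a signed i8, so the overflow bit is 0.
      %n = call {i8, i1} @llvm.sadd.with.overflow.i8(i8 100, i8 27)   ; {127, 0}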

'``llvm.sadd.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.sadd.with.overflow``
on any integer bit width.

::

      declare {i16, i1} @llvm.sadd.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.sadd.with.overflow.i64(i64 %a, i64 %b)

Overview:
"""""""""

The '``llvm.sadd.with.overflow``' family of intrinsic functions performs
a signed addition of the two arguments, and indicates whether an
overflow occurred during the signed summation.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) and the first element of the result
structure may be of integer types of any bit width, but they must have
the same bit width. The second element of the result structure must be
of type ``i1``. ``%a`` and ``%b`` are the two values that will undergo
signed addition.

Semantics:
""""""""""

The '``llvm.sadd.with.overflow``' family of intrinsic functions performs
a signed addition of the two arguments. These intrinsics return a
structure whose first element is the signed sum and whose second element
is a bit specifying whether the signed summation resulted in an
overflow.

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
      %sum = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal

'``llvm.uadd.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.uadd.with.overflow``
on any integer bit width.

::

      declare {i16, i1} @llvm.uadd.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.uadd.with.overflow.i64(i64 %a, i64 %b)

Overview:
"""""""""

The '``llvm.uadd.with.overflow``' family of intrinsic functions performs
an unsigned addition of the two arguments, and indicates whether a carry
occurred during the unsigned summation.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) and the first element of the result
structure may be of integer types of any bit width, but they must have
the same bit width. The second element of the result structure must be
of type ``i1``. ``%a`` and ``%b`` are the two values that will undergo
unsigned addition.

Semantics:
""""""""""

The '``llvm.uadd.with.overflow``' family of intrinsic functions performs
an unsigned addition of the two arguments. These intrinsics return a
structure whose first element is the sum and whose second element is a
bit specifying whether the unsigned summation resulted in a carry.

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b)
      %sum = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %carry, label %normal
11042
11043'``llvm.ssub.with.overflow.*``' Intrinsics
11044^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11045
11046Syntax:
11047"""""""
11048
11049This is an overloaded intrinsic. You can use ``llvm.ssub.with.overflow``
11050on any integer bit width.
11051
11052::
11053
11054 declare {i16, i1} @llvm.ssub.with.overflow.i16(i16 %a, i16 %b)
11055 declare {i32, i1} @llvm.ssub.with.overflow.i32(i32 %a, i32 %b)
11056 declare {i64, i1} @llvm.ssub.with.overflow.i64(i64 %a, i64 %b)
11057
11058Overview:
11059"""""""""
11060
11061The '``llvm.ssub.with.overflow``' family of intrinsic functions perform
11062a signed subtraction of the two arguments, and indicate whether an
11063overflow occurred during the signed subtraction.
11064
11065Arguments:
11066""""""""""
11067
11068The arguments (%a and %b) and the first element of the result structure
11069may be of integer types of any bit width, but they must have the same
11070bit width. The second element of the result structure must be of type
11071``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
11072subtraction.
11073
11074Semantics:
11075""""""""""
11076
11077The '``llvm.ssub.with.overflow``' family of intrinsic functions perform
Dmitri Gribenkoe8131122013-01-19 20:34:20 +000011078a signed subtraction of the two arguments. They return a structure --- the
Sean Silvab084af42012-12-07 10:36:55 +000011079first element of which is the subtraction, and the second element of
11080which is a bit specifying if the signed subtraction resulted in an
11081overflow.
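
As a rough illustration of these semantics for the ``i32`` case, the following
C sketch computes the exact difference in a wider type and compares it against
the ``i32`` range. The names ``ssub_result`` and ``ssub_with_overflow_i32`` are
hypothetical, and the sketch assumes that the first element holds the
difference wrapped to 32 bits. It is a reference model only, not a description
of how the intrinsic is lowered.

.. code-block:: c

      #include <stdbool.h>
      #include <stdint.h>

      /* Hypothetical C stand-in for the {i32, i1} result structure. */
      struct ssub_result {
          int32_t diff;   /* element 0: difference, wrapped to 32 bits (assumed) */
          bool overflow;  /* element 1: set if the exact difference does not fit */
      };

      static struct ssub_result ssub_with_overflow_i32(int32_t a, int32_t b) {
          /* The exact difference of two 32-bit values always fits in 64 bits. */
          int64_t wide = (int64_t)a - (int64_t)b;
          struct ssub_result r;
          /* Wrap via unsigned arithmetic; converting back to int32_t assumes a
             two's complement target. */
          r.diff = (int32_t)((uint32_t)a - (uint32_t)b);
          r.overflow = wide < INT32_MIN || wide > INT32_MAX;
          return r;
      }

Under these assumptions, subtracting 1 from ``INT32_MIN`` sets the overflow bit
and wraps the difference around to ``INT32_MAX``.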

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.ssub.with.overflow.i32(i32 %a, i32 %b)
      %diff = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal

'``llvm.usub.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.usub.with.overflow``
on any integer bit width.

::

      declare {i16, i1} @llvm.usub.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.usub.with.overflow.i64(i64 %a, i64 %b)

Overview:
"""""""""

The '``llvm.usub.with.overflow``' family of intrinsic functions perform
an unsigned subtraction of the two arguments, and indicate whether an
overflow occurred during the unsigned subtraction.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) and the first element of the result structure
may be of integer types of any bit width, but they must have the same
bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
subtraction.

Semantics:
""""""""""

The '``llvm.usub.with.overflow``' family of intrinsic functions perform
an unsigned subtraction of the two arguments. They return a structure ---
the first element of which is the difference, and the second element of
which is a bit specifying if the unsigned subtraction resulted in an
overflow.
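
For unsigned subtraction the overflow bit is a borrow, which can be sketched
in C without a wider type. The names ``usub_result`` and
``usub_with_overflow_i32`` are hypothetical, and the wrapped first element is
an assumption of this model.

.. code-block:: c

      #include <stdbool.h>
      #include <stdint.h>

      /* Hypothetical C stand-in for the {i32, i1} result structure. */
      struct usub_result {
          uint32_t diff;  /* element 0: difference modulo 2^32 (assumed) */
          bool overflow;  /* element 1: set if a borrow occurred */
      };

      static struct usub_result usub_with_overflow_i32(uint32_t a, uint32_t b) {
          struct usub_result r;
          r.diff = a - b;      /* unsigned arithmetic wraps modulo 2^32 */
          r.overflow = a < b;  /* a borrow occurs exactly when a < b */
          return r;
      }

In this model, ``3 - 5`` yields ``0xFFFFFFFE`` with the overflow bit set.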

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b)
      %diff = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal

'``llvm.smul.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.smul.with.overflow``
on any integer bit width.

::

      declare {i16, i1} @llvm.smul.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.smul.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.smul.with.overflow.i64(i64 %a, i64 %b)

Overview:
"""""""""

The '``llvm.smul.with.overflow``' family of intrinsic functions perform
a signed multiplication of the two arguments, and indicate whether an
overflow occurred during the signed multiplication.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) and the first element of the result structure
may be of integer types of any bit width, but they must have the same
bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
multiplication.

Semantics:
""""""""""

The '``llvm.smul.with.overflow``' family of intrinsic functions perform
a signed multiplication of the two arguments. They return a structure ---
the first element of which is the product, and the second element
of which is a bit specifying if the signed multiplication resulted in an
overflow.
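
A minimal C sketch of the ``i32`` case, using a widening multiply to obtain
the exact product. The names ``smul_result`` and ``smul_with_overflow_i32``
are hypothetical, and the sketch assumes that the first element holds the low
32 bits of the product.

.. code-block:: c

      #include <stdbool.h>
      #include <stdint.h>

      /* Hypothetical C stand-in for the {i32, i1} result structure. */
      struct smul_result {
          int32_t prod;   /* element 0: low 32 bits of the product (assumed) */
          bool overflow;  /* element 1: set if the exact product does not fit */
      };

      static struct smul_result smul_with_overflow_i32(int32_t a, int32_t b) {
          /* The exact product of two 32-bit values always fits in 64 bits. */
          int64_t wide = (int64_t)a * (int64_t)b;
          struct smul_result r;
          /* Truncate through unsigned types; the final conversion to int32_t
             assumes a two's complement target. */
          r.prod = (int32_t)(uint32_t)(uint64_t)wide;
          r.overflow = wide < INT32_MIN || wide > INT32_MAX;
          return r;
      }

In this model, ``65536 * 65536`` overflows and leaves 0 in the first element.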

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.smul.with.overflow.i32(i32 %a, i32 %b)
      %prod = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal

'``llvm.umul.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.umul.with.overflow``
on any integer bit width.

::

      declare {i16, i1} @llvm.umul.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.umul.with.overflow.i64(i64 %a, i64 %b)

Overview:
"""""""""

The '``llvm.umul.with.overflow``' family of intrinsic functions perform
an unsigned multiplication of the two arguments, and indicate whether an
overflow occurred during the unsigned multiplication.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) and the first element of the result structure
may be of integer types of any bit width, but they must have the same
bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
multiplication.

Semantics:
""""""""""

The '``llvm.umul.with.overflow``' family of intrinsic functions perform
an unsigned multiplication of the two arguments. They return a structure ---
the first element of which is the product, and the second
element of which is a bit specifying if the unsigned multiplication
resulted in an overflow.
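
When no wider integer type is available, the overflow condition can be checked
with a division instead. The following C sketch shows this for the ``i32``
case. The names ``umul_result`` and ``umul_with_overflow_i32`` are
hypothetical, the wrapped first element is an assumption, and the sketch
assumes ``int`` is no wider than 32 bits.

.. code-block:: c

      #include <stdbool.h>
      #include <stdint.h>

      /* Hypothetical C stand-in for the {i32, i1} result structure. */
      struct umul_result {
          uint32_t prod;  /* element 0: product modulo 2^32 (assumed) */
          bool overflow;  /* element 1: set if the exact product does not fit */
      };

      static struct umul_result umul_with_overflow_i32(uint32_t a, uint32_t b) {
          struct umul_result r;
          r.prod = a * b;  /* unsigned arithmetic wraps modulo 2^32 */
          /* a * b exceeds UINT32_MAX exactly when b > UINT32_MAX / a. */
          r.overflow = a != 0 && b > UINT32_MAX / a;
          return r;
      }

For example, ``65535 * 65537`` equals ``UINT32_MAX`` and does not overflow,
while ``65536 * 65536`` does.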

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
      %prod = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal

Specialised Arithmetic Intrinsics
---------------------------------

'``llvm.canonicalize.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.canonicalize.f32(float %a)
      declare double @llvm.canonicalize.f64(double %b)

Overview:
"""""""""

The '``llvm.canonicalize.*``' intrinsic returns the platform specific canonical
encoding of a floating point number. This canonicalization is useful for
implementing certain numeric primitives such as ``frexp``. The canonical encoding is
defined by IEEE-754-2008 to be:

::

      2.1.8 canonical encoding: The preferred encoding of a floating-point
      representation in a format. Applied to declets, significands of finite
      numbers, infinities, and NaNs, especially in decimal formats.

This operation can also be considered equivalent to the IEEE-754-2008
conversion of a floating-point value to the same format. NaNs are handled
according to section 6.2.

Examples of non-canonical encodings:

- x87 pseudo denormals, pseudo NaNs, pseudo Infinity, Unnormals. These are
  converted to a canonical representation per hardware-specific protocol.
- Many normal decimal floating point numbers have non-canonical alternative
  encodings.
- Some machines, like GPUs or ARMv7 NEON, do not support subnormal values.
  These are treated as non-canonical encodings of zero and will be flushed to
  a zero of the same sign by this operation.

Note that per IEEE-754-2008 6.2, systems that support signaling NaNs with
default exception handling must signal an invalid exception, and produce a
quiet NaN result.

This function should always be implementable as multiplication by 1.0, provided
that the compiler does not constant fold the operation. Likewise, division by
1.0 and ``llvm.minnum(x, x)`` are possible implementations. Addition with
-0.0 is also sufficient provided that the rounding mode is not -Infinity.
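
The multiplication-by-1.0 strategy can be sketched in C as follows. The helper
name ``canonicalize_f32`` is hypothetical, and ``volatile`` is used here only
as one way to stop the compiler from folding the multiplication away. Whether
subnormals are flushed or NaN payloads change depends on the target's
floating-point environment, so the sketch makes no claim about those cases.

.. code-block:: c

      /* Hypothetical helper: canonicalize a float by multiplying it by 1.0. */
      static float canonicalize_f32(float x) {
          /* Reading 1.0 through a volatile object keeps the compiler from
             simplifying x * 1.0f back into x. */
          volatile float one = 1.0f;
          return x * one;
      }

On an IEEE-754 target this keeps ordinary values unchanged and preserves the
sign of zero.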

``@llvm.canonicalize`` must preserve the equality relation. That is:

- ``(@llvm.canonicalize(x) == x)`` is equivalent to ``(x == x)``
- ``(@llvm.canonicalize(x) == @llvm.canonicalize(y))`` is equivalent to
  ``(x == y)``

Additionally, the sign of zero must be conserved:
``@llvm.canonicalize(-0.0) = -0.0`` and ``@llvm.canonicalize(+0.0) = +0.0``

The payload bits of a NaN must be conserved, with two exceptions.
First, environments which use only a single canonical representation of NaN
must perform said canonicalization. Second, SNaNs must be quieted per the
usual methods.

The canonicalization operation may be optimized away if:

- The input is known to be canonical. For example, it was produced by a
  floating-point operation that is required by the standard to be canonical.
- The result is consumed only by (or fused with) other floating-point
  operations. That is, the bits of the floating point value are not examined.

'``llvm.fmuladd.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.fmuladd.f32(float %a, float %b, float %c)
      declare double @llvm.fmuladd.f64(double %a, double %b, double %c)

Overview:
"""""""""

The '``llvm.fmuladd.*``' intrinsic functions represent multiply-add
expressions that can be fused if the code generator determines that (a) the
target instruction set has support for a fused operation, and (b) that the
fused operation is more efficient than the equivalent, separate pair of mul
and add instructions.

Arguments:
""""""""""

The '``llvm.fmuladd.*``' intrinsics each take three arguments: two
multiplicands, a and b, and an addend c.

Semantics:
""""""""""

The expression:

::

      %0 = call float @llvm.fmuladd.f32(float %a, float %b, float %c)

is equivalent to the expression a \* b + c, except that rounding will
not be performed between the multiplication and addition steps if the
code generator fuses the operations. Fusion is not guaranteed, even if
the target platform supports it. If a fused multiply-add is required, the
corresponding '``llvm.fma.*``' intrinsic function should be used
instead. Like '``llvm.fma.*``', this intrinsic never sets errno.
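
The effect of the intermediate rounding can be seen in a small C sketch. Both
function names are hypothetical. The fused variant is only modelled: it
computes in ``double``, where the product of two ``float`` values is exact, so
it is an illustration rather than a correctly rounded fused multiply-add. An
``llvm.fmuladd`` call may behave like either function.

.. code-block:: c

      /* Separate multiply and add: the product is rounded to float first. */
      static float muladd_unfused(float a, float b, float c) {
          /* Storing to a volatile float forces the intermediate rounding. */
          volatile float prod = a * b;
          return prod + c;
      }

      /* Model of the fused form: the product is exact in double, so the
         float result is rounded only at the end. The double addition can
         itself round in rare cases, which a real fused operation would not. */
      static float muladd_fused_model(float a, float b, float c) {
          return (float)((double)a * (double)b + (double)c);
      }

With ``a = b = 1 + 2^-12`` and ``c = -(1 + 2^-11)``, the unfused form loses the
low bit of the product and returns 0, while the fused model returns ``2^-24``.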

Examples:
"""""""""

.. code-block:: llvm

      %r2 = call float @llvm.fmuladd.f32(float %a, float %b, float %c) ; yields float:r2 = (a * b) + c

Half Precision Floating Point Intrinsics
----------------------------------------

For most target platforms, half precision floating point is a
storage-only format. This means that it is a dense encoding (in memory)
but does not support computation in the format.

This means that code must first load the half-precision floating point
value as an i16, then convert it to float with
:ref:`llvm.convert.from.fp16 <int_convert_from_fp16>`. Computation can
then be performed on the float value (including extending to double
etc). To store the value back to memory, it is first converted to float
if needed, then converted to i16 with
:ref:`llvm.convert.to.fp16 <int_convert_to_fp16>`, and then stored as an
i16 value.

.. _int_convert_to_fp16:

'``llvm.convert.to.fp16``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i16 @llvm.convert.to.fp16.f32(float %a)
      declare i16 @llvm.convert.to.fp16.f64(double %a)

Overview:
"""""""""

The '``llvm.convert.to.fp16``' intrinsic function performs a conversion from a
conventional floating point type to half precision floating point format.

Arguments:
""""""""""

The intrinsic function takes a single argument: the value to be
converted.

Semantics:
""""""""""

The '``llvm.convert.to.fp16``' intrinsic function performs a conversion from a
conventional floating point format to half precision floating point format. The
return value is an ``i16`` which contains the converted number.

Examples:
"""""""""

.. code-block:: llvm

      %res = call i16 @llvm.convert.to.fp16.f32(float %a)
      store i16 %res, i16* @x, align 2

.. _int_convert_from_fp16:

'``llvm.convert.from.fp16``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.convert.from.fp16.f32(i16 %a)
      declare double @llvm.convert.from.fp16.f64(i16 %a)

Overview:
"""""""""

The '``llvm.convert.from.fp16``' intrinsic function performs a
conversion from half precision floating point format to a conventional
floating point format.

Arguments:
""""""""""

The intrinsic function takes a single argument: the value to be
converted.

Semantics:
""""""""""

The '``llvm.convert.from.fp16``' intrinsic function performs a
conversion from half precision floating point format to a conventional
floating point format. The input half-float value is
represented by an ``i16`` value.
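
To make the storage-only encoding concrete, the following C sketch decodes an
IEEE-754 binary16 bit pattern into a ``float`` by hand. The function name
``half_to_float`` is hypothetical, and the sketch assumes that ``float`` is
IEEE-754 binary32. It models the value conversion only and says nothing about
how a target implements the intrinsic.

.. code-block:: c

      #include <stdint.h>
      #include <string.h>

      /* Hypothetical software decoder for a half stored in a 16-bit integer. */
      static float half_to_float(uint16_t h) {
          uint32_t sign = (uint32_t)(h & 0x8000u) << 16;  /* sign bit to bit 31 */
          uint32_t exp = (h >> 10) & 0x1Fu;               /* 5-bit exponent */
          uint32_t man = h & 0x3FFu;                      /* 10-bit mantissa */
          uint32_t bits;

          if (exp == 0 && man == 0) {
              bits = sign;                                /* signed zero */
          } else if (exp == 0) {
              /* Subnormal half: shift until the implicit leading bit appears,
                 then rebias. Every half subnormal is a normal float. */
              uint32_t shift = 0;
              while ((man & 0x400u) == 0) {
                  man <<= 1;
                  shift++;
              }
              man &= 0x3FFu;
              bits = sign | ((113u - shift) << 23) | (man << 13);
          } else if (exp == 0x1Fu) {
              bits = sign | 0x7F800000u | (man << 13);    /* infinity or NaN */
          } else {
              /* Normal number: change the exponent bias from 15 to 127. */
              bits = sign | ((exp + 112u) << 23) | (man << 13);
          }

          float f;
          memcpy(&f, &bits, sizeof f);                    /* reinterpret the bits */
          return f;
      }

For instance, the pattern ``0x3C00`` decodes to 1.0 and ``0x7BFF`` to 65504.0,
the largest finite half precision value.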

Examples:
"""""""""

.. code-block:: llvm

      %a = load i16, i16* @x, align 2
      %res = call float @llvm.convert.from.fp16.f32(i16 %a)

.. _dbg_intrinsics:

Debugger Intrinsics
-------------------

The LLVM debugger intrinsics (which all start with the ``llvm.dbg.``
prefix) are described in the `LLVM Source Level
Debugging <SourceLevelDebugging.html#format_common_intrinsics>`_
document.

Exception Handling Intrinsics
-----------------------------

The LLVM exception handling intrinsics (which all start with the
``llvm.eh.`` prefix) are described in the `LLVM Exception
Handling <ExceptionHandling.html#format_common_intrinsics>`_ document.

.. _int_trampoline:

Trampoline Intrinsics
---------------------

These intrinsics make it possible to excise one parameter, marked with
the :ref:`nest <nest>` attribute, from a function. The result is a
callable function pointer lacking the nest parameter - the caller does
not need to provide a value for it. Instead, the value to use is stored
in advance in a "trampoline", a block of memory usually allocated on the
stack, which also contains code to splice the nest value into the
argument list. This is used to implement the GCC nested function address
extension.

For example, if the function is ``i32 f(i8* nest %c, i32 %x, i32 %y)``
then the resulting function pointer has signature ``i32 (i32, i32)*``.
It can be created as follows:

.. code-block:: llvm

      %tramp = alloca [10 x i8], align 4 ; size and alignment only correct for X86
      %tramp1 = getelementptr [10 x i8], [10 x i8]* %tramp, i32 0, i32 0
      call void @llvm.init.trampoline(i8* %tramp1, i8* bitcast (i32 (i8*, i32, i32)* @f to i8*), i8* %nval)
      %p = call i8* @llvm.adjust.trampoline(i8* %tramp1)
      %fp = bitcast i8* %p to i32 (i32, i32)*

The call ``%val = call i32 %fp(i32 %x, i32 %y)`` is then equivalent to
``%val = call i32 @f(i8* %nval, i32 %x, i32 %y)``.

.. _int_it:

'``llvm.init.trampoline``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.init.trampoline(i8* <tramp>, i8* <func>, i8* <nval>)

Overview:
"""""""""

This fills the memory pointed to by ``tramp`` with executable code,
turning it into a trampoline.

Arguments:
""""""""""

The ``llvm.init.trampoline`` intrinsic takes three arguments, all
pointers. The ``tramp`` argument must point to a sufficiently large and
sufficiently aligned block of memory; this memory is written to by the
intrinsic. Note that the size and the alignment are target-specific -
LLVM currently provides no portable way of determining them, so a
front-end that generates this intrinsic needs to have some
target-specific knowledge. The ``func`` argument must hold a function
bitcast to an ``i8*``.

Semantics:
""""""""""

The block of memory pointed to by ``tramp`` is filled with target
dependent code, turning it into a function. Then ``tramp`` needs to be
passed to :ref:`llvm.adjust.trampoline <int_at>` to get a pointer which can
be :ref:`bitcast (to a new function) and called <int_trampoline>`. The new
function's signature is the same as that of ``func`` with any arguments
marked with the ``nest`` attribute removed. At most one such ``nest``
argument is allowed, and it must be of pointer type. Calling the new
function is equivalent to calling ``func`` with the same argument list,
but with ``nval`` used for the missing ``nest`` argument. If, after
calling ``llvm.init.trampoline``, the memory pointed to by ``tramp`` is
modified, then the effect of any later call to the returned function
pointer is undefined.

.. _int_at:

'``llvm.adjust.trampoline``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.adjust.trampoline(i8* <tramp>)

Overview:
"""""""""

This performs any required machine-specific adjustment to the address of
a trampoline (passed as ``tramp``).

Arguments:
""""""""""

``tramp`` must point to a block of memory which already has trampoline
code filled in by a previous call to
:ref:`llvm.init.trampoline <int_it>`.

Semantics:
""""""""""

On some architectures the address of the code to be executed needs to be
different than the address where the trampoline is actually stored. This
intrinsic returns the executable address corresponding to ``tramp``
after performing the required machine specific adjustments. The pointer
returned can then be :ref:`bitcast and executed <int_trampoline>`.

.. _int_mload_mstore:

Masked Vector Load and Store Intrinsics
---------------------------------------

LLVM provides intrinsics for predicated vector load and store operations. The predicate is specified by a mask operand, which holds one bit per vector element, switching the associated vector lane on or off. The memory addresses corresponding to the "off" lanes are not accessed. When all bits of the mask are on, the intrinsic is identical to a regular vector load or store. When all bits are off, no memory is accessed.

.. _int_mload:

'``llvm.masked.load.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. The loaded data is a vector of any integer, floating point or pointer data type.

::

      declare <16 x float> @llvm.masked.load.v16f32.p0v16f32 (<16 x float>* <ptr>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>)
      declare <2 x double> @llvm.masked.load.v2f64.p0v2f64 (<2 x double>* <ptr>, i32 <alignment>, <2 x i1> <mask>, <2 x double> <passthru>)
      ;; The data is a vector of pointers to double
      declare <8 x double*> @llvm.masked.load.v8p0f64.p0v8p0f64 (<8 x double*>* <ptr>, i32 <alignment>, <8 x i1> <mask>, <8 x double*> <passthru>)
      ;; The data is a vector of function pointers
      declare <8 x i32 ()*> @llvm.masked.load.v8p0f_i32f.p0v8p0f_i32f (<8 x i32 ()*>* <ptr>, i32 <alignment>, <8 x i1> <mask>, <8 x i32 ()*> <passthru>)

Overview:
"""""""""

Reads a vector from memory according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes. The masked-off lanes in the result vector are taken from the corresponding lanes of the '``passthru``' operand.


Arguments:
""""""""""

The first operand is the base pointer for the load. The second operand is the alignment of the source location. It must be a constant integer value. The third operand, mask, is a vector of boolean values with the same number of elements as the return type. The fourth is a pass-through value that is used to fill the masked-off lanes of the result. The return type, underlying type of the base pointer and the type of the '``passthru``' operand are the same vector types.


Semantics:
""""""""""

The '``llvm.masked.load``' intrinsic is designed for conditional reading of selected vector elements in a single IR operation. It is useful for targets that support vector masked loads and allows vectorizing predicated basic blocks on these targets. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar load operations.
The result of this operation is equivalent to a regular vector load instruction followed by a 'select' between the loaded and the passthru values, predicated on the same mask. However, using this intrinsic prevents exceptions on memory access to masked-off lanes.


::

      %res = call <16 x float> @llvm.masked.load.v16f32.p0v16f32 (<16 x float>* %ptr, i32 4, <16 x i1> %mask, <16 x float> %passthru)

      ;; The result of the two following instructions is identical aside from potential memory access exception
      %loadlal = load <16 x float>, <16 x float>* %ptr, align 4
      %res = select <16 x i1> %mask, <16 x float> %loadlal, <16 x float> %passthru
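
The per-lane behaviour can also be written as a scalar C loop, which is close
to the branch-guarded lowering mentioned above. The function name
``masked_load_f32`` and its array-based signature are hypothetical; the sketch
is a reference model and ignores alignment.

.. code-block:: c

      #include <stdbool.h>
      #include <stddef.h>

      /* Hypothetical scalar model of a masked load over `lanes` float lanes. */
      static void masked_load_f32(float *result, const float *ptr,
                                  const bool *mask, const float *passthru,
                                  size_t lanes) {
          for (size_t i = 0; i < lanes; i++) {
              /* Memory at ptr[i] is read only when lane i is switched on. */
              result[i] = mask[i] ? ptr[i] : passthru[i];
          }
      }

Because the conditional expression evaluates only one of its two branches,
masked-off lanes never touch ``ptr``, matching the guarantee that no memory
exception can arise from those lanes.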

.. _int_mstore:

'``llvm.masked.store.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. The data stored in memory is a vector of any integer, floating point or pointer data type.

::

      declare void @llvm.masked.store.v8i32.p0v8i32 (<8 x i32> <value>, <8 x i32>* <ptr>, i32 <alignment>, <8 x i1> <mask>)
      declare void @llvm.masked.store.v16f32.p0v16f32 (<16 x float> <value>, <16 x float>* <ptr>, i32 <alignment>, <16 x i1> <mask>)
      ;; The data is a vector of pointers to double
      declare void @llvm.masked.store.v8p0f64.p0v8p0f64 (<8 x double*> <value>, <8 x double*>* <ptr>, i32 <alignment>, <8 x i1> <mask>)
      ;; The data is a vector of function pointers
      declare void @llvm.masked.store.v4p0f_i32f.p0v4p0f_i32f (<4 x i32 ()*> <value>, <4 x i32 ()*>* <ptr>, i32 <alignment>, <4 x i1> <mask>)

Overview:
"""""""""

Writes a vector to memory according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes.

Arguments:
""""""""""

The first operand is the vector value to be written to memory. The second operand is the base pointer for the store; it has the same underlying type as the value operand. The third operand is the alignment of the destination location. The fourth operand, mask, is a vector of boolean values. The types of the mask and the value operand must have the same number of vector elements.


Semantics:
""""""""""

The '``llvm.masked.store``' intrinsic is designed for conditional writing of selected vector elements in a single IR operation. It is useful for targets that support vector masked stores and allows vectorizing predicated basic blocks on these targets. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar store operations.
The result of this operation is equivalent to a load-modify-store sequence. However, using this intrinsic prevents exceptions and data races on memory access to masked-off lanes.

::

      call void @llvm.masked.store.v16f32.p0v16f32(<16 x float> %value, <16 x float>* %ptr, i32 4, <16 x i1> %mask)

      ;; The result of the following instructions is identical aside from potential data races and memory access exceptions
      %oldval = load <16 x float>, <16 x float>* %ptr, align 4
      %res = select <16 x i1> %mask, <16 x float> %value, <16 x float> %oldval
      store <16 x float> %res, <16 x float>* %ptr, align 4
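
A scalar C model makes the difference from the load-modify-store sequence
visible: masked-off lanes are neither read nor written. The function name
``masked_store_f32`` and its array-based signature are hypothetical, and
alignment is ignored.

.. code-block:: c

      #include <stdbool.h>
      #include <stddef.h>

      /* Hypothetical scalar model of a masked store over `lanes` float lanes. */
      static void masked_store_f32(const float *value, float *ptr,
                                   const bool *mask, size_t lanes) {
          for (size_t i = 0; i < lanes; i++) {
              if (mask[i]) {
                  ptr[i] = value[i];  /* only enabled lanes touch memory */
              }
          }
      }

Since disabled lanes are skipped entirely, a concurrent writer to those
locations cannot race with this loop, which is the property the intrinsic
provides over the plain load, select and store sequence.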
11683
11684
Elena Demikhovsky82cdd652015-05-07 12:25:11 +000011685Masked Vector Gather and Scatter Intrinsics
11686-------------------------------------------
11687
11688LLVM provides intrinsics for vector gather and scatter operations. They are similar to :ref:`Masked Vector Load and Store <int_mload_mstore>`, except they are designed for arbitrary memory accesses, rather than sequential memory accesses. Gather and scatter also employ a mask operand, which holds one bit per vector element, switching the associated vector lane on or off. The memory addresses corresponding to the "off" lanes are not accessed. When all bits are off, no memory is accessed.
11689
11690.. _int_mgather:
11691
11692'``llvm.masked.gather.*``' Intrinsics
11693^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11694
11695Syntax:
11696"""""""
Elena Demikhovsky1ca72e12015-11-19 07:17:16 +000011697This is an overloaded intrinsic. The loaded data are multiple scalar values of any integer, floating point or pointer data type gathered together into one vector.
Elena Demikhovsky82cdd652015-05-07 12:25:11 +000011698
11699::
11700
Elena Demikhovsky1ca72e12015-11-19 07:17:16 +000011701 declare <16 x float> @llvm.masked.gather.v16f32 (<16 x float*> <ptrs>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>)
11702 declare <2 x double> @llvm.masked.gather.v2f64 (<2 x double*> <ptrs>, i32 <alignment>, <2 x i1> <mask>, <2 x double> <passthru>)
11703 declare <8 x float*> @llvm.masked.gather.v8p0f32 (<8 x float**> <ptrs>, i32 <alignment>, <8 x i1> <mask>, <8 x float*> <passthru>)
Elena Demikhovsky82cdd652015-05-07 12:25:11 +000011704
11705Overview:
11706"""""""""
11707
11708Reads scalar values from arbitrary memory locations and gathers them into one vector. The memory locations are provided in the vector of pointers '``ptrs``'. The memory is accessed according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes. The masked-off lanes in the result vector are taken from the corresponding lanes of the '``passthru``' operand.
11709
11710
11711Arguments:
11712""""""""""
11713
11714The first operand is a vector of pointers which holds all memory addresses to read. The second operand is an alignment of the source addresses. It must be a constant integer value. The third operand, mask, is a vector of boolean values with the same number of elements as the return type. The fourth is a pass-through value that is used to fill the masked-off lanes of the result. The return type, underlying type of the vector of pointers and the type of the '``passthru``' operand are the same vector types.
11715
11716
11717Semantics:
11718""""""""""
11719
11720The '``llvm.masked.gather``' intrinsic is designed for conditional reading of multiple scalar values from arbitrary memory locations in a single IR operation. It is useful for targets that support vector masked gathers and allows vectorizing basic blocks with data and control divergence. Other targets may support this intrinsic differently, for example by lowering it into a sequence of scalar load operations.
11721The semantics of this operation are equivalent to a sequence of conditional scalar loads with subsequent gathering all loaded values into a single vector. The mask restricts memory access to certain lanes and facilitates vectorization of predicated basic blocks.


::

      %res = call <4 x double> @llvm.masked.gather.v4f64 (<4 x double*> %ptrs, i32 8, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, <4 x double> undef)

      ;; The gather with all-true mask is equivalent to the following instruction sequence
      %ptr0 = extractelement <4 x double*> %ptrs, i32 0
      %ptr1 = extractelement <4 x double*> %ptrs, i32 1
      %ptr2 = extractelement <4 x double*> %ptrs, i32 2
      %ptr3 = extractelement <4 x double*> %ptrs, i32 3

      %val0 = load double, double* %ptr0, align 8
      %val1 = load double, double* %ptr1, align 8
      %val2 = load double, double* %ptr2, align 8
      %val3 = load double, double* %ptr3, align 8

      %vec0    = insertelement <4 x double> undef, double %val0, i32 0
      %vec01   = insertelement <4 x double> %vec0, double %val1, i32 1
      %vec012  = insertelement <4 x double> %vec01, double %val2, i32 2
      %vec0123 = insertelement <4 x double> %vec012, double %val3, i32 3
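
The role of the mask and of '``passthru``' can be illustrated with a partially
enabled gather. The following is only a sketch: the constant mask and the
``%passthru`` value are chosen for illustration, and a real lowering may
produce a different but equivalent sequence.

.. code-block:: llvm

    ;; Illustrative gather where only lanes 0 and 2 are enabled
    %res = call <4 x double> @llvm.masked.gather.v4f64 (<4 x double*> %ptrs, i32 8, <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x double> %passthru)

    ;; Under that mask, a scalarized sketch reads memory only for lanes 0 and 2
    %ptr0 = extractelement <4 x double*> %ptrs, i32 0
    %ptr2 = extractelement <4 x double*> %ptrs, i32 2
    %val0 = load double, double* %ptr0, align 8
    %val2 = load double, double* %ptr2, align 8

    ;; Lanes 1 and 3 keep the values already present in %passthru
    %vec0  = insertelement <4 x double> %passthru, double %val0, i32 0
    %vec02 = insertelement <4 x double> %vec0, double %val2, i32 2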

.. _int_mscatter:

'``llvm.masked.scatter.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. The data stored in memory is a vector of any
integer, floating point or pointer data type. Each vector element is stored in
an arbitrary memory address. Scatter with overlapping addresses is guaranteed
to be ordered from least-significant to most-significant element.

::

      declare void @llvm.masked.scatter.v8i32 (<8 x i32> <value>, <8 x i32*> <ptrs>, i32 <alignment>, <8 x i1> <mask>)
      declare void @llvm.masked.scatter.v16f32 (<16 x float> <value>, <16 x float*> <ptrs>, i32 <alignment>, <16 x i1> <mask>)
      declare void @llvm.masked.scatter.v4p0f64 (<4 x double*> <value>, <4 x double**> <ptrs>, i32 <alignment>, <4 x i1> <mask>)

Overview:
"""""""""

Writes each element from the value vector to the corresponding memory address.
The memory addresses are represented as a vector of pointers. Writing is done
according to the provided mask. The mask holds a bit for each vector lane, and
is used to prevent memory accesses to the masked-off lanes.

Arguments:
""""""""""

The first operand is a vector value to be written to memory. The second operand
is a vector of pointers, pointing to where the value elements should be stored.
It has the same underlying type as the value operand. The third operand is the
alignment of the destination addresses. The fourth operand, mask, is a vector
of boolean values. The types of the mask and the value operand must have the
same number of vector elements.


Semantics:
""""""""""

The '``llvm.masked.scatter``' intrinsic is designed for writing selected vector
elements to arbitrary memory addresses in a single IR operation. The operation
may be conditional, when not all bits in the mask are switched on. It is useful
for targets that support vector masked scatter and allows vectorizing basic
blocks with data and control divergence. Other targets may support this
intrinsic differently, for example by lowering it into a sequence of branches
that guard scalar store operations.

::

      ;; This instruction unconditionally stores the data vector to multiple addresses
      call void @llvm.masked.scatter.v8i32 (<8 x i32> %value, <8 x i32*> %ptrs, i32 4, <8 x i1> <true, true, .. true>)

      ;; It is equivalent to a list of scalar stores
      %val0 = extractelement <8 x i32> %value, i32 0
      %val1 = extractelement <8 x i32> %value, i32 1
      ..
      %val7 = extractelement <8 x i32> %value, i32 7
      %ptr0 = extractelement <8 x i32*> %ptrs, i32 0
      %ptr1 = extractelement <8 x i32*> %ptrs, i32 1
      ..
      %ptr7 = extractelement <8 x i32*> %ptrs, i32 7
      ;; Note: the order of the following stores is important when they overlap:
      store i32 %val0, i32* %ptr0, align 4
      store i32 %val1, i32* %ptr1, align 4
      ..
      store i32 %val7, i32* %ptr7, align 4
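
For a conditional scatter, the masked-off lanes are simply never written. The
sketch below assumes a 4-lane overload named ``@llvm.masked.scatter.v4i32``
(following the naming pattern of the declarations above) and a constant mask
chosen for illustration.

.. code-block:: llvm

    ;; Illustrative 4-lane scatter in which only lanes 1 and 3 are enabled
    call void @llvm.masked.scatter.v4i32 (<4 x i32> %value, <4 x i32*> %ptrs, i32 4, <4 x i1> <i1 false, i1 true, i1 false, i1 true>)

    ;; Sketch of an equivalent scalar sequence: lanes 0 and 2 are not stored
    %val1 = extractelement <4 x i32> %value, i32 1
    %val3 = extractelement <4 x i32> %value, i32 3
    %ptr1 = extractelement <4 x i32*> %ptrs, i32 1
    %ptr3 = extractelement <4 x i32*> %ptrs, i32 3
    store i32 %val1, i32* %ptr1, align 4
    store i32 %val3, i32* %ptr3, align 4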


Memory Use Markers
------------------

This class of intrinsics provides information about the lifetime of
memory objects and ranges where variables are immutable.

.. _int_lifestart:

'``llvm.lifetime.start``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.lifetime.start(i64 <size>, i8* nocapture <ptr>)

Overview:
"""""""""

The '``llvm.lifetime.start``' intrinsic specifies the start of a memory
object's lifetime.

Arguments:
""""""""""

The first argument is a constant integer representing the size of the
object, or -1 if it is variable sized. The second argument is a pointer
to the object.

Semantics:
""""""""""

This intrinsic indicates that before this point in the code, the value
of the memory pointed to by ``ptr`` is dead. This means that it is known
to never be used and has an undefined value. A load from the pointer
that precedes this intrinsic can be replaced with ``'undef'``.

.. _int_lifeend:

'``llvm.lifetime.end``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.lifetime.end(i64 <size>, i8* nocapture <ptr>)

Overview:
"""""""""

The '``llvm.lifetime.end``' intrinsic specifies the end of a memory
object's lifetime.

Arguments:
""""""""""

The first argument is a constant integer representing the size of the
object, or -1 if it is variable sized. The second argument is a pointer
to the object.

Semantics:
""""""""""

This intrinsic indicates that after this point in the code, the value of
the memory pointed to by ``ptr`` is dead. This means that it is known to
never be used and has an undefined value. Any stores into the memory
object following this intrinsic may be removed as dead.
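
As an illustration, the two lifetime markers are typically paired around the
uses of a stack object. The function below is a minimal sketch; the function
name and the values stored are assumptions made for the example.

.. code-block:: llvm

    declare void @llvm.lifetime.start(i64, i8* nocapture)
    declare void @llvm.lifetime.end(i64, i8* nocapture)

    define i32 @lifetime_example() {
    entry:
      %buf = alloca i32, align 4
      %buf.i8 = bitcast i32* %buf to i8*
      ;; The 4 bytes behind %buf.i8 are live from here ...
      call void @llvm.lifetime.start(i64 4, i8* %buf.i8)
      store i32 7, i32* %buf, align 4
      %v = load i32, i32* %buf, align 4
      ;; ... until here; stores to %buf after this call may be removed as dead
      call void @llvm.lifetime.end(i64 4, i8* %buf.i8)
      ret i32 %v
    }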

'``llvm.invariant.start``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. The memory object can belong to any address
space.

::

      declare {}* @llvm.invariant.start.p0i8(i64 <size>, i8* nocapture <ptr>)

Overview:
"""""""""

The '``llvm.invariant.start``' intrinsic specifies that the contents of
a memory object will not change.

Arguments:
""""""""""

The first argument is a constant integer representing the size of the
object, or -1 if it is variable sized. The second argument is a pointer
to the object.

Semantics:
""""""""""

This intrinsic indicates that until an ``llvm.invariant.end`` that uses
the return value, the referenced memory location is constant and
unchanging.

'``llvm.invariant.end``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. The memory object can belong to any address
space.

::

      declare void @llvm.invariant.end.p0i8({}* <start>, i64 <size>, i8* nocapture <ptr>)

Overview:
"""""""""

The '``llvm.invariant.end``' intrinsic specifies that the contents of a
memory object are mutable.

Arguments:
""""""""""

The first argument is the matching ``llvm.invariant.start`` intrinsic.
The second argument is a constant integer representing the size of the
object, or -1 if it is variable sized. The third argument is a pointer
to the object.

Semantics:
""""""""""

This intrinsic indicates that the memory is mutable again.
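
The pairing of the two invariant markers can be sketched as follows. The
function name and the one-byte object are assumptions for the example; the
value returned by ``llvm.invariant.start`` is what ties the later
``llvm.invariant.end`` to it.

.. code-block:: llvm

    declare {}* @llvm.invariant.start.p0i8(i64, i8* nocapture)
    declare void @llvm.invariant.end.p0i8({}*, i64, i8* nocapture)

    define i8 @invariant_example(i8* %p) {
    entry:
      store i8 1, i8* %p
      ;; From here on the single byte at %p is stated not to change
      %inv = call {}* @llvm.invariant.start.p0i8(i64 1, i8* %p)
      %a = load i8, i8* %p
      %b = load i8, i8* %p      ; may be treated as the same value as %a
      ;; The value returned by the start call identifies the range being ended
      call void @llvm.invariant.end.p0i8({}* %inv, i64 1, i8* %p)
      store i8 2, i8* %p        ; the memory is mutable again
      %sum = add i8 %a, %b
      ret i8 %sum
    }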

'``llvm.invariant.group.barrier``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.invariant.group.barrier(i8* <ptr>)

Overview:
"""""""""

The '``llvm.invariant.group.barrier``' intrinsic can be used when an invariant
established by invariant.group metadata no longer holds, to obtain a new pointer
value that does not carry the invariant information.


Arguments:
""""""""""

The ``llvm.invariant.group.barrier`` takes only one argument, which is
the pointer to the memory for which the ``invariant.group`` no longer holds.

Semantics:
""""""""""

Returns another pointer that aliases its argument but which is considered different
for the purposes of ``load``/``store`` ``invariant.group`` metadata.
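
A rough sketch of the intended usage is shown below. The function name and the
metadata node ``!0`` are made up for the example; the point is only that
accesses through ``%newptr`` are not bound by the invariant established
through ``%ptr``.

.. code-block:: llvm

    declare i8* @llvm.invariant.group.barrier(i8*)

    define i8 @barrier_example(i8* %ptr) {
    entry:
      store i8 42, i8* %ptr, !invariant.group !0
      ;; %newptr aliases %ptr but is a different pointer for invariant.group purposes
      %newptr = call i8* @llvm.invariant.group.barrier(i8* %ptr)
      store i8 7, i8* %newptr
      %v = load i8, i8* %newptr
      ret i8 %v
    }

    !0 = !{!"illustrative group"}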

General Intrinsics
------------------

This class of intrinsics is designed to be generic and has no specific
purpose.

'``llvm.var.annotation``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.var.annotation(i8* <val>, i8* <str>, i8* <str>, i32 <int>)

Overview:
"""""""""

The '``llvm.var.annotation``' intrinsic.

Arguments:
""""""""""

The first argument is a pointer to a value, the second is a pointer to a
global string, the third is a pointer to a global string which is the
source file name, and the last argument is the line number.

Semantics:
""""""""""

This intrinsic allows annotation of local variables with arbitrary
strings. This can be useful for special purpose optimizations that want
to look for these annotations. These have no other defined use; they are
ignored by code generation and optimization.
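
As an illustration of the four arguments, the sketch below annotates a local
variable. The global names, the annotation string ``my_annot``, the file name
``test.c`` and the line number are all made up for the example.

.. code-block:: llvm

    @.str.annot = private unnamed_addr constant [9 x i8] c"my_annot\00", section "llvm.metadata"
    @.str.file = private unnamed_addr constant [7 x i8] c"test.c\00", section "llvm.metadata"

    declare void @llvm.var.annotation(i8*, i8*, i8*, i32)

    define void @annotated_local() {
    entry:
      %x = alloca i32, align 4
      %x.i8 = bitcast i32* %x to i8*
      ;; Attach "my_annot" to %x, recording test.c line 3 as the source location
      call void @llvm.var.annotation(i8* %x.i8,
          i8* getelementptr inbounds ([9 x i8], [9 x i8]* @.str.annot, i32 0, i32 0),
          i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.str.file, i32 0, i32 0),
          i32 3)
      ret void
    }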

'``llvm.ptr.annotation.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use '``llvm.ptr.annotation``' on a
pointer to an integer of any width. *NOTE* you must specify an address space for
the pointer. The identifier for the default address space is the integer
'``0``'.

::

      declare i8*   @llvm.ptr.annotation.p<address space>i8(i8* <val>, i8* <str>, i8* <str>, i32 <int>)
      declare i16*  @llvm.ptr.annotation.p<address space>i16(i16* <val>, i8* <str>, i8* <str>, i32 <int>)
      declare i32*  @llvm.ptr.annotation.p<address space>i32(i32* <val>, i8* <str>, i8* <str>, i32 <int>)
      declare i64*  @llvm.ptr.annotation.p<address space>i64(i64* <val>, i8* <str>, i8* <str>, i32 <int>)
      declare i256* @llvm.ptr.annotation.p<address space>i256(i256* <val>, i8* <str>, i8* <str>, i32 <int>)

Overview:
"""""""""

The '``llvm.ptr.annotation``' intrinsic.

Arguments:
""""""""""

The first argument is a pointer to an integer value of arbitrary bitwidth
(result of some expression), the second is a pointer to a global string, the
third is a pointer to a global string which is the source file name, and the
last argument is the line number. It returns the value of the first argument.

Semantics:
""""""""""

This intrinsic allows annotation of a pointer to an integer with arbitrary
strings. This can be useful for special purpose optimizations that want to look
for these annotations. These have no other defined use; they are ignored by code
generation and optimization.

'``llvm.annotation.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use '``llvm.annotation``' on
any integer bit width.

::

      declare i8   @llvm.annotation.i8(i8 <val>, i8* <str>, i8* <str>, i32 <int>)
      declare i16  @llvm.annotation.i16(i16 <val>, i8* <str>, i8* <str>, i32 <int>)
      declare i32  @llvm.annotation.i32(i32 <val>, i8* <str>, i8* <str>, i32 <int>)
      declare i64  @llvm.annotation.i64(i64 <val>, i8* <str>, i8* <str>, i32 <int>)
      declare i256 @llvm.annotation.i256(i256 <val>, i8* <str>, i8* <str>, i32 <int>)

Overview:
"""""""""

The '``llvm.annotation``' intrinsic.

Arguments:
""""""""""

The first argument is an integer value (result of some expression), the
second is a pointer to a global string, the third is a pointer to a
global string which is the source file name, and the last argument is
the line number. It returns the value of the first argument.

Semantics:
""""""""""

This intrinsic allows annotations to be put on arbitrary expressions
with arbitrary strings. This can be useful for special purpose
optimizations that want to look for these annotations. These have no
other defined use; they are ignored by code generation and optimization.

'``llvm.trap``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.trap() noreturn nounwind

Overview:
"""""""""

The '``llvm.trap``' intrinsic.

Arguments:
""""""""""

None.

Semantics:
""""""""""

This intrinsic is lowered to the target dependent trap instruction. If
the target does not have a trap instruction, this intrinsic will be
lowered to a call of the ``abort()`` function.
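
A common pattern is to branch to a block that traps when a check fails. The
function below is a sketch with an assumed name and an assumed check (division
by zero); it is not taken from any particular frontend.

.. code-block:: llvm

    declare void @llvm.trap() noreturn nounwind

    define i32 @checked_div(i32 %a, i32 %b) {
    entry:
      %is.zero = icmp eq i32 %b, 0
      br i1 %is.zero, label %fail, label %ok

    fail:
      ;; Control never continues past the trap, so the block ends in unreachable
      call void @llvm.trap()
      unreachable

    ok:
      %q = udiv i32 %a, %b
      ret i32 %q
    }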

'``llvm.debugtrap``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.debugtrap() nounwind

Overview:
"""""""""

The '``llvm.debugtrap``' intrinsic.

Arguments:
""""""""""

None.

Semantics:
""""""""""

This intrinsic is lowered to code which is intended to cause an
execution trap with the intention of requesting the attention of a
debugger.

'``llvm.stackprotector``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.stackprotector(i8* <guard>, i8** <slot>)

Overview:
"""""""""

The ``llvm.stackprotector`` intrinsic takes the ``guard`` and stores it
onto the stack at ``slot``. The stack slot is adjusted to ensure that it
is placed on the stack before local variables.

Arguments:
""""""""""

The ``llvm.stackprotector`` intrinsic requires two pointer arguments.
The first argument is the value loaded from the stack guard
``@__stack_chk_guard``. The second argument is an ``alloca`` that has
enough space to hold the value of the guard.

Semantics:
""""""""""

This intrinsic causes the prologue/epilogue inserter to force the position of
the ``AllocaInst`` stack slot to be before local variables on the stack. This is
to ensure that if a local variable on the stack is overwritten, it will destroy
the value of the guard. When the function exits, the guard on the stack is
checked against the original guard by ``llvm.stackprotectorcheck``. If they are
different, then ``llvm.stackprotectorcheck`` causes the program to abort by
calling the ``__stack_chk_fail()`` function.
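
The prologue side of this scheme can be sketched as follows. The function
name, the ``%buf`` local and the value names are assumptions for the example,
and the check performed on function exit is deliberately left out.

.. code-block:: llvm

    @__stack_chk_guard = external global i8*

    declare void @llvm.stackprotector(i8*, i8**)

    define void @protected() {
    entry:
      ;; Slot that will hold the on-stack copy of the guard
      %StackGuardSlot = alloca i8*
      ;; A local buffer that the guard is meant to protect against overruns
      %buf = alloca [16 x i8], align 1
      ;; Load the guard value and store it into the slot
      %StackGuard = load i8*, i8** @__stack_chk_guard
      call void @llvm.stackprotector(i8* %StackGuard, i8** %StackGuardSlot)
      ret void
    }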

'``llvm.stackguard``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.stackguard()

Overview:
"""""""""

The ``llvm.stackguard`` intrinsic returns the system stack guard value.

It should not be generated by frontends, since it is only for internal usage.
This intrinsic exists because the IR form of Stack Protector is still supported
in FastISel.

Arguments:
""""""""""

None.

Semantics:
""""""""""

On some platforms, the value returned by this intrinsic remains unchanged
between loads in the same thread. On other platforms, it returns the same
global variable value, if any, e.g. ``@__stack_chk_guard``.

Currently some platforms have IR-level customized stack guard loading (e.g.
X86 Linux) that is not handled by ``llvm.stackguard()``, although it should be
in the future.

'``llvm.objectsize``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.objectsize.i32(i8* <object>, i1 <min>)
      declare i64 @llvm.objectsize.i64(i8* <object>, i1 <min>)

Overview:
"""""""""

The ``llvm.objectsize`` intrinsic is designed to provide information to
the optimizers to determine at compile time whether a) an operation
(like memcpy) will overflow a buffer that corresponds to an object, or
b) a runtime check for overflow isn't necessary. An object in this
context means an allocation of a specific class, structure, array, or
other object.

Arguments:
""""""""""

The ``llvm.objectsize`` intrinsic takes two arguments. The first
argument is a pointer to or into the ``object``. The second argument is
a boolean and determines whether ``llvm.objectsize`` returns 0 (if true)
or -1 (if false) when the object size is unknown. The second argument
only accepts constants.

Semantics:
""""""""""

The ``llvm.objectsize`` intrinsic is lowered to a constant representing
the size of the object concerned. If the size cannot be determined at
compile time, ``llvm.objectsize`` returns ``i32/i64 -1 or 0`` (depending
on the ``min`` argument).
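
The sketch below queries the size of a fixed-size stack object through a
pointer into its interior. The function name and the 16-byte buffer are
assumptions for the example. Since the object is a known 16-byte allocation
and the pointer is 4 bytes into it, one would expect the call to be folded to
a constant describing the bytes that remain, but the exact result depends on
what the optimizer can prove.

.. code-block:: llvm

    declare i64 @llvm.objectsize.i64(i8*, i1)

    define i64 @remaining_bytes() {
    entry:
      %buf = alloca [16 x i8], align 1
      ;; Pointer 4 bytes into the 16-byte object
      %p = getelementptr inbounds [16 x i8], [16 x i8]* %buf, i64 0, i64 4
      ;; <min> is false, so an unknown size would be reported as -1
      %size = call i64 @llvm.objectsize.i64(i8* %p, i1 false)
      ret i64 %size
    }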

'``llvm.expect``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.expect`` on any
integer bit width.

::

      declare i1 @llvm.expect.i1(i1 <val>, i1 <expected_val>)
      declare i32 @llvm.expect.i32(i32 <val>, i32 <expected_val>)
      declare i64 @llvm.expect.i64(i64 <val>, i64 <expected_val>)

Overview:
"""""""""

The ``llvm.expect`` intrinsic provides information about the expected (most
probable) value of ``val``, which can be used by optimizers.

Arguments:
""""""""""

The ``llvm.expect`` intrinsic takes two arguments. The first argument is
a value. The second argument is the expected value. It must be a constant;
variables are not allowed.

Semantics:
""""""""""

This intrinsic is lowered to the ``val``.
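
A typical use feeds the result of ``llvm.expect`` into a conditional branch so
that the optimizer can treat one successor as the likely one. The function
below is a sketch; its name and the null check are assumptions for the
example.

.. code-block:: llvm

    declare i1 @llvm.expect.i1(i1, i1)

    define i32 @likely_nonnull(i32* %p) {
    entry:
      %is.null = icmp eq i32* %p, null
      ;; Hint that %is.null is expected to be false
      %hint = call i1 @llvm.expect.i1(i1 %is.null, i1 false)
      br i1 %hint, label %null, label %nonnull

    null:
      ret i32 0

    nonnull:
      %v = load i32, i32* %p, align 4
      ret i32 %v
    }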

.. _int_assume:

'``llvm.assume``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.assume(i1 %cond)

Overview:
"""""""""

The ``llvm.assume`` intrinsic allows the optimizer to assume that the provided
condition is true. This information can then be used in simplifying other parts
of the code.

Arguments:
""""""""""

The condition which the optimizer may assume is always true.

Semantics:
""""""""""

The intrinsic allows the optimizer to assume that the provided condition is
always true whenever the control flow reaches the intrinsic call. No code is
generated for this intrinsic, and instructions that contribute only to the
provided condition are not used for code generation. If the condition is
violated during execution, the behavior is undefined.

Note that the optimizer might limit the transformations performed on values
used by the ``llvm.assume`` intrinsic in order to preserve the instructions
only used to form the intrinsic's input argument. This might prove undesirable
if the extra information provided by the ``llvm.assume`` intrinsic does not cause
sufficient overall improvement in code quality. For this reason,
``llvm.assume`` should not be used to document basic mathematical invariants
that the optimizer can otherwise deduce or facts that are of little use to the
optimizer.
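
For illustration, the sketch below states that a pointer is 16-byte aligned, a
fact the optimizer usually cannot deduce on its own. The function name and the
alignment value are assumptions for the example.

.. code-block:: llvm

    declare void @llvm.assume(i1)

    define i32 @load_aligned(i32* %p) {
    entry:
      ;; The low four address bits being zero means %p is 16-byte aligned
      %addr = ptrtoint i32* %p to i64
      %low = and i64 %addr, 15
      %aligned = icmp eq i64 %low, 0
      call void @llvm.assume(i1 %aligned)
      %v = load i32, i32* %p, align 4
      ret i32 %v
    }

The ``ptrtoint``, ``and`` and ``icmp`` instructions exist only to form the
condition, so per the semantics above they are not used for code generation.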

.. _type.test:

'``llvm.type.test``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i1 @llvm.type.test(i8* %ptr, metadata %type) nounwind readnone


Arguments:
""""""""""

The first argument is a pointer to be tested. The second argument is a
metadata object representing a :doc:`type identifier <TypeMetadata>`.

Overview:
"""""""""

The ``llvm.type.test`` intrinsic tests whether the given pointer is associated
with the given type identifier.
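
A check built on this intrinsic usually branches on its result. In the sketch
below, the function name and the type identifier ``!"typeid.A"`` are made up
for the example; a failing test is handled by trapping.

.. code-block:: llvm

    declare i1 @llvm.type.test(i8*, metadata) nounwind readnone
    declare void @llvm.trap() noreturn nounwind

    define void @checked_use(i8* %ptr) {
    entry:
      ;; !"typeid.A" is an illustrative type identifier
      %ok = call i1 @llvm.type.test(i8* %ptr, metadata !"typeid.A")
      br i1 %ok, label %cont, label %fail

    fail:
      call void @llvm.trap()
      unreachable

    cont:
      ret void
    }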

'``llvm.type.checked.load``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare {i8*, i1} @llvm.type.checked.load(i8* %ptr, i32 %offset, metadata %type) argmemonly nounwind readonly


Arguments:
""""""""""

The first argument is a pointer from which to load a function pointer. The
second argument is the byte offset from which to load the function pointer. The
third argument is a metadata object representing a :doc:`type identifier
<TypeMetadata>`.

Overview:
"""""""""

The ``llvm.type.checked.load`` intrinsic safely loads a function pointer from a
virtual table pointer using type metadata. This intrinsic is used to implement
control flow integrity in conjunction with virtual call optimization. The
virtual call optimization pass will optimize away ``llvm.type.checked.load``
intrinsics associated with devirtualized calls, thereby removing the type
check in cases where it is not needed to enforce the control flow integrity
constraint.

If the given pointer is associated with a type metadata identifier, this
function returns true as the second element of its return value. (Note that
the function may also return true if the given pointer is not associated
with a type metadata identifier.) If the function's return value's second
element is true, the following rules apply to the first element:

- If the given pointer is associated with the given type metadata identifier,
  it is the function pointer loaded from the given byte offset from the given
  pointer.

- If the given pointer is not associated with the given type metadata
  identifier, it is one of the following (the choice of which is unspecified):

  1. The function pointer that would have been loaded from an arbitrarily chosen
     (through an unspecified mechanism) pointer associated with the type
     metadata.

  2. If the function has a non-void return type, a pointer to a function that
     returns an unspecified value without causing side effects.

If the function's return value's second element is false, the value of the
first element is undefined.
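
The two elements of the returned struct are typically unpacked with
``extractvalue``. The following sketch assumes a virtual table whose first
slot holds a ``void (i8*)`` function; the function name, the type identifier
``!"typeid.A"`` and the callee signature are made up for the example.

.. code-block:: llvm

    declare {i8*, i1} @llvm.type.checked.load(i8*, i32, metadata)
    declare void @llvm.trap() noreturn nounwind

    define void @virtual_call(i8* %vtable, i8* %obj) {
    entry:
      ;; Load the function pointer stored at byte offset 0 of the vtable
      %pair = call {i8*, i1} @llvm.type.checked.load(i8* %vtable, i32 0, metadata !"typeid.A")
      %fptr.i8 = extractvalue {i8*, i1} %pair, 0
      %ok = extractvalue {i8*, i1} %pair, 1
      br i1 %ok, label %cont, label %fail

    fail:
      ;; The first element is undefined when the second element is false
      call void @llvm.trap()
      unreachable

    cont:
      %fptr = bitcast i8* %fptr.i8 to void (i8*)*
      call void %fptr(i8* %obj)
      ret void
    }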


'``llvm.donothing``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.donothing() nounwind readnone

Overview:
"""""""""

The ``llvm.donothing`` intrinsic doesn't perform any operation. It's one of only
three intrinsics (besides ``llvm.experimental.patchpoint`` and
``llvm.experimental.gc.statepoint``) that can be called with an invoke
instruction.

Arguments:
""""""""""

None.

Semantics:
""""""""""

This intrinsic does nothing, and it's removed by optimizers and ignored
by codegen.

'``llvm.experimental.deoptimize``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare type @llvm.experimental.deoptimize(...) [ "deopt"(...) ]

Overview:
"""""""""

This intrinsic, together with :ref:`deoptimization operand bundles
<deopt_opbundles>`, allows frontends to express transfer of control and
frame-local state from the currently executing (typically more specialized,
hence faster) version of a function into another (typically more generic, hence
slower) version.

In languages with a fully integrated managed runtime like Java and JavaScript,
this intrinsic can be used to implement "uncommon trap" or "side exit" like
functionality. In unmanaged languages like C and C++, this intrinsic can be
used to represent the slow paths of specialized functions.


Arguments:
""""""""""

The intrinsic takes an arbitrary number of arguments, whose meaning is
decided by the :ref:`lowering strategy<deoptimize_lowering>`.

Semantics:
""""""""""

The ``@llvm.experimental.deoptimize`` intrinsic executes an attached
deoptimization continuation (denoted using a :ref:`deoptimization
operand bundle <deopt_opbundles>`) and returns the value returned by
the deoptimization continuation. Defining the semantic properties of
the continuation itself is out of scope of the language reference --
as far as LLVM is concerned, the deoptimization continuation can
invoke arbitrary side effects, including reading from and writing to
the entire heap.

Deoptimization continuations expressed using ``"deopt"`` operand bundles always
continue execution to the end of the physical frame containing them, so all
calls to ``@llvm.experimental.deoptimize`` must be in "tail position":

   - ``@llvm.experimental.deoptimize`` cannot be invoked.
   - The call must immediately precede a :ref:`ret <i_ret>` instruction.
   - The ``ret`` instruction must return the value produced by the
     ``@llvm.experimental.deoptimize`` call if there is one, or void.

Note that the above restrictions imply that the return type for a call to
``@llvm.experimental.deoptimize`` will match the return type of its immediate
caller.

The inliner composes the ``"deopt"`` continuations of the caller into the
``"deopt"`` continuations present in the inlinee, and also updates calls to this
intrinsic to return directly from the frame of the function it inlined into.

All declarations of ``@llvm.experimental.deoptimize`` must share the
same calling convention.

.. _deoptimize_lowering:

Lowering:
"""""""""

Calls to ``@llvm.experimental.deoptimize`` are lowered to calls to the
symbol ``__llvm_deoptimize`` (it is the frontend's responsibility to
ensure that this symbol is defined). The call arguments to
``@llvm.experimental.deoptimize`` are lowered as if they were formal
arguments of the specified types, and not as varargs.
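
The "tail position" rules above can be illustrated with a function that
deoptimizes on its slow path. This is a sketch under assumptions: the ``.i32``
suffix on the intrinsic name is assumed to be how the return type overload is
spelled, and the function name, the argument ``i32 0`` and the contents of the
``"deopt"`` bundle are made up for the example.

.. code-block:: llvm

    declare i32 @llvm.experimental.deoptimize.i32(...)

    define i32 @fast_path(i32 %x) {
    entry:
      %is.small = icmp ult i32 %x, 100
      br i1 %is.small, label %fast, label %slow

    fast:
      %r = add i32 %x, 1
      ret i32 %r

    slow:
      ;; The "deopt" bundle carries the frame state; here it is just %x
      %v = call i32 (...) @llvm.experimental.deoptimize.i32(i32 0) [ "deopt"(i32 %x) ]
      ;; The call immediately precedes a ret that returns its value
      ret i32 %v
    }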


'``llvm.experimental.guard``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.experimental.guard(i1, ...) [ "deopt"(...) ]

Overview:
"""""""""

This intrinsic, together with :ref:`deoptimization operand bundles
<deopt_opbundles>`, allows frontends to express guards or checks on
optimistic assumptions made during compilation. The semantics of
``@llvm.experimental.guard`` is defined in terms of
``@llvm.experimental.deoptimize`` -- its body is defined to be
equivalent to:

.. code-block:: text

  define void @llvm.experimental.guard(i1 %pred, <args...>) {
    %realPred = and i1 %pred, undef
    br i1 %realPred, label %continue, label %leave [, !make.implicit !{}]

  leave:
    call void @llvm.experimental.deoptimize(<args...>) [ "deopt"() ]
    ret void

  continue:
    ret void
  }


with the optional ``[, !make.implicit !{}]`` present if and only if it
is present on the call site. For more details on ``!make.implicit``,
see :doc:`FaultMaps`.

In words, ``@llvm.experimental.guard`` executes the attached
``"deopt"`` continuation if (but **not** only if) its first argument
is ``false``. Since the optimizer is allowed to replace the ``undef``
with an arbitrary value, it can optimize guard to fail "spuriously",
i.e. without the original condition being false (hence the "not only
if"); and this allows for "check widening" type optimizations.

``@llvm.experimental.guard`` cannot be invoked.
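
From the caller's side, a guard is an ordinary call carrying a ``"deopt"``
operand bundle. The sketch below guards a bounds check; the function name, the
condition and the bundle contents are assumptions for the example.

.. code-block:: llvm

    declare void @llvm.experimental.guard(i1, ...)

    define i32 @guarded_load(i32* %p, i32 %idx, i32 %len) {
    entry:
      %in.bounds = icmp ult i32 %idx, %len
      ;; Deoptimizes if %in.bounds is false, and possibly spuriously as well
      call void (i1, ...) @llvm.experimental.guard(i1 %in.bounds) [ "deopt"(i32 %idx) ]
      ;; Code after the guard may rely on %in.bounds being true
      %addr = getelementptr inbounds i32, i32* %p, i32 %idx
      %v = load i32, i32* %addr, align 4
      ret i32 %v
    }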


'``llvm.load.relative``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.load.relative.iN(i8* %ptr, iN %offset) argmemonly nounwind readonly

Overview:
"""""""""

This intrinsic loads a 32-bit value from the address ``%ptr + %offset``,
adds ``%ptr`` to that value and returns it. The constant folder specifically
recognizes the form of this intrinsic and the constant initializers it may
load from; if a loaded constant initializer is known to have the form
``i32 trunc(x - %ptr)``, the intrinsic call is folded to ``x``.

LLVM provides that the calculation of such a constant initializer will
not overflow at link time under the medium code model if ``x`` is an
``unnamed_addr`` function. However, it does not provide this guarantee for
a constant initializer folded into a function body. This intrinsic can be
used to avoid the possibility of overflows when loading from such a constant.
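
The computation described above can be sketched as a plain instruction
sequence. This is an illustration of the arithmetic for the ``i32`` overload,
not a statement of how any particular pass expands the call; the alignment on
the load is an assumption.

.. code-block:: llvm

    ;; Sketch of what "call i8* @llvm.load.relative.i32(i8* %ptr, i32 %offset)" computes
    %addr = getelementptr i8, i8* %ptr, i32 %offset
    %addr.i32 = bitcast i8* %addr to i32*
    ;; The 32-bit value stored at %ptr + %offset ...
    %rel = load i32, i32* %addr.i32, align 4
    ;; ... is added back to %ptr to form the result
    %result = getelementptr i8, i8* %ptr, i32 %rel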

Stack Map Intrinsics
--------------------

LLVM provides experimental intrinsics to support runtime patching
mechanisms commonly desired in dynamic language JITs. These intrinsics
are described in :doc:`StackMaps`.