==============================
LLVM Language Reference Manual
==============================

.. contents::
   :local:
   :depth: 4

Abstract
========

This document is a reference manual for the LLVM assembly language. LLVM
is a Static Single Assignment (SSA) based representation that provides
type safety, low-level operations, flexibility, and the capability of
representing 'all' high-level languages cleanly. It is the common code
representation used throughout all phases of the LLVM compilation
strategy.

Introduction
============

The LLVM code representation is designed to be used in three different
forms: as an in-memory compiler IR, as an on-disk bitcode representation
(suitable for fast loading by a Just-In-Time compiler), and as a human
readable assembly language representation. This allows LLVM to provide a
powerful intermediate representation for efficient compiler
transformations and analysis, while providing a natural means to debug
and visualize the transformations. The three different forms of LLVM are
all equivalent. This document describes the human readable
representation and notation.

The LLVM representation aims to be light-weight and low-level while
being expressive, typed, and extensible at the same time. It aims to be
a "universal IR" of sorts, by being at a low enough level that
high-level ideas may be cleanly mapped to it (similar to how
microprocessors are "universal IR's", allowing many source languages to
be mapped to them). By providing type information, LLVM can be used as
the target of optimizations: for example, through pointer analysis, it
can be proven that a C automatic variable is never accessed outside of
the current function, allowing it to be promoted to a simple SSA value
instead of a memory location.

.. _wellformed:

Well-Formedness
---------------

It is important to note that this document describes 'well formed' LLVM
assembly language. There is a difference between what the parser accepts
and what is considered 'well formed'. For example, the following
instruction is syntactically okay, but not well formed:

.. code-block:: llvm

    %x = add i32 1, %x

because the definition of ``%x`` does not dominate all of its uses. The
LLVM infrastructure provides a verification pass that may be used to
verify that an LLVM module is well formed. This pass is automatically
run by the parser after parsing input assembly and by the optimizer
before it outputs bitcode. The violations pointed out by the verifier
pass indicate bugs in transformation passes or input to the parser.

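For contrast, the following sketch is well formed, because every use is
dominated by its definition. The function name ``@increment`` and the value
names are illustrative assumptions, not names used elsewhere in this document:

.. code-block:: llvm

    define i32 @increment(i32 %x) {
    entry:
      ; %x is a function parameter, so its definition dominates this use.
      %y = add i32 1, %x
      ret i32 %y
    }
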
.. _identifiers:

Identifiers
===========

LLVM identifiers come in two basic types: global and local. Global
identifiers (functions, global variables) begin with the ``'@'``
character. Local identifiers (register names, types) begin with the
``'%'`` character. Additionally, there are three different formats for
identifiers, for different purposes:

#. Named values are represented as a string of characters with their
   prefix. For example, ``%foo``, ``@DivisionByZero``,
   ``%a.really.long.identifier``. The actual regular expression used is
   '``[%@][-a-zA-Z$._][-a-zA-Z$._0-9]*``'. Identifiers that require other
   characters in their names can be surrounded with quotes. Special
   characters may be escaped using ``"\xx"`` where ``xx`` is the ASCII
   code for the character in hexadecimal. In this way, any character can
   be used in a name value, even quotes themselves. The ``"\01"`` prefix
   can be used on global values to suppress mangling.
#. Unnamed values are represented as an unsigned numeric value with
   their prefix. For example, ``%12``, ``@2``, ``%44``.
#. Constants, which are described in the section Constants_ below.

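As a brief sketch of the quoting and escaping rules for named values, the
globals below use names that would not match the regular expression above. All
of the names are hypothetical and chosen only for illustration:

.. code-block:: llvm

    ; A name containing spaces must be quoted.
    @"a name with spaces" = global i32 0

    ; \22 is the hexadecimal ASCII code for the double-quote character.
    @"quote\22inside" = global i32 1

    ; The \01 prefix asks that the rest of the name be used without mangling.
    @"\01_raw_symbol" = global i32 2
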
LLVM requires that values start with a prefix for two reasons: Compilers
don't need to worry about name clashes with reserved words, and the set
of reserved words may be expanded in the future without penalty.
Additionally, unnamed identifiers allow a compiler to quickly come up
with a temporary variable without having to avoid symbol table
conflicts.

Reserved words in LLVM are very similar to reserved words in other
languages. There are keywords for different opcodes ('``add``',
'``bitcast``', '``ret``', etc...), for primitive type names ('``void``',
'``i32``', etc...), and others. These reserved words cannot conflict
with variable names, because none of them start with a prefix character
(``'%'`` or ``'@'``).

Here is an example of LLVM code to multiply the integer variable
'``%X``' by 8:

The easy way:

.. code-block:: llvm

    %result = mul i32 %X, 8

After strength reduction:

.. code-block:: llvm

    %result = shl i32 %X, 3

And the hard way:

.. code-block:: llvm

    %0 = add i32 %X, %X           ; yields i32:%0
    %1 = add i32 %0, %0           ; yields i32:%1
    %result = add i32 %1, %1

This last way of multiplying ``%X`` by 8 illustrates several important
lexical features of LLVM:

#. Comments are delimited with a '``;``' and go until the end of line.
#. Unnamed temporaries are created when the result of a computation is
   not assigned to a named value.
#. Unnamed temporaries are numbered sequentially (using a per-function
   incrementing counter, starting with 0). Note that basic blocks and unnamed
   function parameters are included in this numbering. For example, if the
   entry basic block is not given a label name and all function parameters are
   named, then it will get number 0.

It also shows a convention that we follow in this document. When
demonstrating instructions, we will follow an instruction with a comment
that defines the type and name of the value produced.

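The numbering rule for unnamed values can be sketched with two small
functions. The names ``@twice`` and ``@twice_named`` are made up for this
illustration; the point is only how the per-function counter is consumed by
unnamed parameters and by the unlabeled entry block:

.. code-block:: llvm

    define i32 @twice(i32) {
      ; The unnamed parameter is %0 and the unlabeled entry block takes
      ; number 1, so the first unnamed temporary is %2.
      %2 = add i32 %0, %0
      ret i32 %2
    }

    define i32 @twice_named(i32 %x) {
      ; All parameters are named, so the unlabeled entry block takes
      ; number 0 and the first unnamed temporary is %1.
      %1 = add i32 %x, %x
      ret i32 %1
    }
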
High Level Structure
====================

Module Structure
----------------

LLVM programs are composed of ``Module``'s, each of which is a
translation unit of the input programs. Each module consists of
functions, global variables, and symbol table entries. Modules may be
combined together with the LLVM linker, which merges function (and
global variable) definitions, resolves forward declarations, and merges
symbol table entries. Here is an example of the "hello world" module:

.. code-block:: llvm

    ; Declare the string constant as a global constant.
    @.str = private unnamed_addr constant [13 x i8] c"hello world\0A\00"

    ; External declaration of the puts function
    declare i32 @puts(i8* nocapture) nounwind

    ; Definition of main function
    define i32 @main() {   ; i32()*
      ; Convert [13 x i8]* to i8*...
      %cast210 = getelementptr [13 x i8], [13 x i8]* @.str, i64 0, i64 0

      ; Call puts function to write out the string to stdout.
      call i32 @puts(i8* %cast210)
      ret i32 0
    }

    ; Named metadata
    !0 = !{i32 42, null, !"string"}
    !foo = !{!0}

This example is made up of a :ref:`global variable <globalvars>` named
"``.str``", an external declaration of the "``puts``" function, a
:ref:`function definition <functionstructure>` for "``main``" and
:ref:`named metadata <namedmetadatastructure>` "``foo``".

In general, a module is made up of a list of global values (where both
functions and global variables are global values). Global values are
represented by a pointer to a memory location (in this case, a pointer
to an array of char, and a pointer to a function), and have one of the
following :ref:`linkage types <linkage>`.

.. _linkage:

Linkage Types
-------------

All Global Variables and Functions have one of the following types of
linkage:

``private``
    Global values with "``private``" linkage are only directly
    accessible by objects in the current module. In particular, linking
    code into a module with a private global value may cause the
    private to be renamed as necessary to avoid collisions. Because the
    symbol is private to the module, all references can be updated. This
    doesn't show up in any symbol table in the object file.
``internal``
    Similar to private, but the value shows as a local symbol
    (``STB_LOCAL`` in the case of ELF) in the object file. This
    corresponds to the notion of the '``static``' keyword in C.
``available_externally``
    Globals with "``available_externally``" linkage are never emitted into
    the object file corresponding to the LLVM module. From the linker's
    perspective, an ``available_externally`` global is equivalent to
    an external declaration. They exist to allow inlining and other
    optimizations to take place given knowledge of the definition of the
    global, which is known to be somewhere outside the module. Globals
    with ``available_externally`` linkage are allowed to be discarded at
    will, and allow inlining and other optimizations. This linkage type is
    only allowed on definitions, not declarations.
``linkonce``
    Globals with "``linkonce``" linkage are merged with other globals of
    the same name when linkage occurs. This can be used to implement
    some forms of inline functions, templates, or other code which must
    be generated in each translation unit that uses it, but where the
    body may be overridden with a more definitive definition later.
    Unreferenced ``linkonce`` globals are allowed to be discarded. Note
    that ``linkonce`` linkage does not actually allow the optimizer to
    inline the body of this function into callers because it doesn't
    know if this definition of the function is the definitive definition
    within the program or whether it will be overridden by a stronger
    definition. To enable inlining and other optimizations, use
    "``linkonce_odr``" linkage.
``weak``
    "``weak``" linkage has the same merging semantics as ``linkonce``
    linkage, except that unreferenced globals with ``weak`` linkage may
    not be discarded. This is used for globals that are declared "weak"
    in C source code.
``common``
    "``common``" linkage is most similar to "``weak``" linkage, but it
    is used for tentative definitions in C, such as "``int X;``" at
    global scope. Symbols with "``common``" linkage are merged in the
    same way as ``weak`` symbols, and they may not be deleted if
    unreferenced. ``common`` symbols may not have an explicit section,
    must have a zero initializer, and may not be marked
    ':ref:`constant <globalvars>`'. Functions and aliases may not have
    common linkage.

.. _linkage_appending:

``appending``
    "``appending``" linkage may only be applied to global variables of
    pointer to array type. When two global variables with appending
    linkage are linked together, the two global arrays are appended
    together. This is the LLVM, typesafe, equivalent of having the
    system linker append together "sections" with identical names when
    .o files are linked.

    Unfortunately this doesn't correspond to any feature in .o files, so it
    can only be used for variables like ``llvm.global_ctors`` which llvm
    interprets specially.

``extern_weak``
    The semantics of this linkage follow the ELF object file model: the
    symbol is weak until linked; if not linked, the symbol becomes null
    instead of being an undefined reference.
``linkonce_odr``, ``weak_odr``
    Some languages allow differing globals to be merged, such as two
    functions with different semantics. Other languages, such as
    ``C++``, ensure that only equivalent globals are ever merged (the
    "one definition rule" --- "ODR"). Such languages can use the
    ``linkonce_odr`` and ``weak_odr`` linkage types to indicate that the
    global will only be merged with equivalent globals. These linkage
    types are otherwise the same as their non-``odr`` versions.
``external``
    If none of the above identifiers are used, the global is externally
    visible, meaning that it participates in linkage and can be used to
    resolve external symbol references.

It is illegal for a function *declaration* to have any linkage type
other than ``external`` or ``extern_weak``.

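As a rough sketch of how several of these linkage types are spelled in the
assembly language, consider the fragment below. Every global and function name
in it is hypothetical; only the linkage keywords come from the list above:

.. code-block:: llvm

    ; Similar to a C 'static' variable: a local symbol in the object file.
    @counter = internal global i32 0

    ; Private: does not show up in any symbol table in the object file.
    @.msg = private constant [3 x i8] c"hi\00"

    ; A C tentative definition such as "int tentative;" (zero initializer).
    @tentative = common global i32 0

    ; Becomes null if no definition is linked in.
    @maybe_present = extern_weak global i32

    ; Mergeable with equivalent definitions from other translation units.
    define linkonce_odr i32 @inline_helper(i32 %a) {
      %r = add i32 %a, 1
      ret i32 %r
    }

    ; Function declarations may only be external (the default) or extern_weak.
    declare extern_weak void @optional_hook()
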
.. _callingconv:

Calling Conventions
-------------------

LLVM :ref:`functions <functionstructure>`, :ref:`calls <i_call>` and
:ref:`invokes <i_invoke>` can all have an optional calling convention
specified for the call. The calling convention of any pair of dynamic
caller/callee must match, or the behavior of the program is undefined.
The following calling conventions are supported by LLVM, and more may be
added in the future:

"``ccc``" - The C calling convention
    This calling convention (the default if no other calling convention
    is specified) matches the target C calling conventions. This calling
    convention supports varargs function calls and tolerates some
    mismatch in the declared prototype and implemented declaration of
    the function (as does normal C).
"``fastcc``" - The fast calling convention
    This calling convention attempts to make calls as fast as possible
    (e.g. by passing things in registers). This calling convention
    allows the target to use whatever tricks it wants to produce fast
    code for the target, without having to conform to an externally
    specified ABI (Application Binary Interface). `Tail calls can only
    be optimized when this, the GHC or the HiPE convention is
    used. <CodeGenerator.html#id80>`_ This calling convention does not
    support varargs and requires the prototype of all callees to exactly
    match the prototype of the function definition.
"``coldcc``" - The cold calling convention
    This calling convention attempts to make code in the caller as
    efficient as possible under the assumption that the call is not
    commonly executed. As such, these calls often preserve all registers
    so that the call does not break any live ranges in the caller side.
    This calling convention does not support varargs and requires the
    prototype of all callees to exactly match the prototype of the
    function definition. Furthermore the inliner doesn't consider such function
    calls for inlining.
"``cc 10``" - GHC convention
    This calling convention has been implemented specifically for use by
    the `Glasgow Haskell Compiler (GHC) <http://www.haskell.org/ghc>`_.
    It passes everything in registers, going to extremes to achieve this
    by disabling callee save registers. This calling convention should
    not be used lightly but only for specific situations such as an
    alternative to the *register pinning* performance technique often
    used when implementing functional programming languages. At the
    moment only X86 supports this convention and it has the following
    limitations:

    -  On *X86-32*, it only supports up to 4 bit type parameters. No
       floating-point types are supported.
    -  On *X86-64*, it only supports up to 10 bit type parameters and 6
       floating-point parameters.

    This calling convention supports `tail call
    optimization <CodeGenerator.html#id80>`_ but requires that both the
    caller and callee use it.
"``cc 11``" - The HiPE calling convention
    This calling convention has been implemented specifically for use by
    the `High-Performance Erlang
    (HiPE) <http://www.it.uu.se/research/group/hipe/>`_ compiler, *the*
    native code compiler of the `Ericsson's Open Source Erlang/OTP
    system <http://www.erlang.org/download.shtml>`_. It uses more
    registers for argument passing than the ordinary C calling
    convention and defines no callee-saved registers. The calling
    convention properly supports `tail call
    optimization <CodeGenerator.html#id80>`_ but requires that both the
    caller and the callee use it. It uses a *register pinning*
    mechanism, similar to GHC's convention, for keeping frequently
    accessed runtime components pinned to specific hardware registers.
    At the moment only X86 supports this convention (both 32 and 64
    bit).
"``webkit_jscc``" - WebKit's JavaScript calling convention
    This calling convention has been implemented for `WebKit FTL JIT
    <https://trac.webkit.org/wiki/FTLJIT>`_. It passes arguments on the
    stack right to left (as cdecl does), and returns a value in the
    platform's customary return register.
"``anyregcc``" - Dynamic calling convention for code patching
    This is a special convention that supports patching an arbitrary code
    sequence in place of a call site. This convention forces the call
    arguments into registers but allows them to be dynamically
    allocated. This can currently only be used with calls to
    llvm.experimental.patchpoint because only this intrinsic records
    the location of its arguments in a side table. See :doc:`StackMaps`.
"``preserve_mostcc``" - The `PreserveMost` calling convention
    This calling convention attempts to make the code in the caller as
    unintrusive as possible. This convention behaves identically to the `C`
    calling convention on how arguments and return values are passed, but it
    uses a different set of caller/callee-saved registers. This alleviates the
    burden of saving and recovering a large register set before and after the
    call in the caller. If the arguments are passed in callee-saved registers,
    then they will be preserved by the callee across the call. This doesn't
    apply for values returned in callee-saved registers.

    - On X86-64 the callee preserves all general purpose registers, except for
      R11. R11 can be used as a scratch register. Floating-point registers
      (XMMs/YMMs) are not preserved and need to be saved by the caller.

    The idea behind this convention is to support calls to runtime functions
    that have a hot path and a cold path. The hot path is usually a small piece
    of code that doesn't use many registers. The cold path might need to call out to
    another function and therefore only needs to preserve the caller-saved
    registers, which haven't already been saved by the caller. The
    `PreserveMost` calling convention is very similar to the `cold` calling
    convention in terms of caller/callee-saved registers, but they are used for
    different types of function calls. `coldcc` is for function calls that are
    rarely executed, whereas `preserve_mostcc` function calls are intended to be
    on the hot path and definitely executed a lot. Furthermore `preserve_mostcc`
    doesn't prevent the inliner from inlining the function call.

    This calling convention will be used by a future version of the ObjectiveC
    runtime and should therefore still be considered experimental at this time.
    Although this convention was created to optimize certain runtime calls to
    the ObjectiveC runtime, it is not limited to this runtime and might be used
    by other runtimes in the future too. The current implementation only
    supports X86-64, but the intention is to support more architectures in the
    future.
"``preserve_allcc``" - The `PreserveAll` calling convention
    This calling convention attempts to make the code in the caller even less
    intrusive than the `PreserveMost` calling convention. This calling
    convention also behaves identically to the `C` calling convention on how
    arguments and return values are passed, but it uses a different set of
    caller/callee-saved registers. This removes the burden of saving and
    recovering a large register set before and after the call in the caller. If
    the arguments are passed in callee-saved registers, then they will be
    preserved by the callee across the call. This doesn't apply for values
    returned in callee-saved registers.

    - On X86-64 the callee preserves all general purpose registers, except for
      R11. R11 can be used as a scratch register. Furthermore it also preserves
      all floating-point registers (XMMs/YMMs).

    The idea behind this convention is to support calls to runtime functions
    that don't need to call out to any other functions.

    This calling convention, like the `PreserveMost` calling convention, will be
    used by a future version of the ObjectiveC runtime and should be considered
    experimental at this time.
"``cxx_fast_tlscc``" - The `CXX_FAST_TLS` calling convention for access functions
    Clang generates an access function to access C++-style TLS. The access
    function generally has an entry block, an exit block and an initialization
    block that is run the first time. The entry and exit blocks can access
    a few TLS IR variables; each access will be lowered to a platform-specific
    sequence.

    This calling convention aims to minimize overhead in the caller by
    preserving as many registers as possible (all the registers that are
    preserved on the fast path, composed of the entry and exit blocks).

    This calling convention behaves identically to the `C` calling convention on
    how arguments and return values are passed, but it uses a different set of
    caller/callee-saved registers.

    Given that each platform has its own lowering sequence, hence its own set
    of preserved registers, we can't use the existing `PreserveMost`.

    - On X86-64 the callee preserves all general purpose registers, except for
      RDI and RAX.
"``swiftcc``" - This calling convention is used for the Swift language.
    - On X86-64 RCX and R8 are available for additional integer returns, and
      XMM2 and XMM3 are available for additional FP/vector returns.
    - On iOS platforms, we use the AAPCS-VFP calling convention.
"``cc <n>``" - Numbered convention
    Any calling convention may be specified by number, allowing
    target-specific calling conventions to be used. Target specific
    calling conventions start at 64.

More calling conventions can be added/defined on an as-needed basis, to
support Pascal conventions or any other well-known target-independent
convention.

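The requirement that caller and callee agree on the convention can be sketched
as follows. The function names are hypothetical; the convention keyword simply
appears on both the definition (or declaration) and the call:

.. code-block:: llvm

    define fastcc i32 @add_one(i32 %v) {
      %r = add i32 %v, 1
      ret i32 %r
    }

    ; A rarely executed helper, declared with the cold convention.
    declare coldcc void @report_error()

    define i32 @caller(i32 %v) {
      ; The convention named on each call matches the callee's convention;
      ; a mismatch would make the behavior of the program undefined.
      %r = call fastcc i32 @add_one(i32 %v)
      call coldcc void @report_error()
      ret i32 %r
    }
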
.. _visibilitystyles:

Visibility Styles
-----------------

All Global Variables and Functions have one of the following visibility
styles:

"``default``" - Default style
    On targets that use the ELF object file format, default visibility
    means that the declaration is visible to other modules and, in
    shared libraries, means that the declared entity may be overridden.
    On Darwin, default visibility means that the declaration is visible
    to other modules. Default visibility corresponds to "external
    linkage" in the language.
"``hidden``" - Hidden style
    Two declarations of an object with hidden visibility refer to the
    same object if they are in the same shared object. Usually, hidden
    visibility indicates that the symbol will not be placed into the
    dynamic symbol table, so no other module (executable or shared
    library) can reference it directly.
"``protected``" - Protected style
    On ELF, protected visibility indicates that the symbol will be
    placed in the dynamic symbol table, but that references within the
    defining module will bind to the local symbol. That is, the symbol
    cannot be overridden by another module.

A symbol with ``internal`` or ``private`` linkage must have ``default``
visibility.

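A minimal sketch of how visibility styles are written, using made-up symbol
names:

.. code-block:: llvm

    ; No keyword: default visibility.
    @exported = global i32 1

    ; Usually kept out of the dynamic symbol table.
    @impl_detail = hidden global i32 0

    ; In the dynamic symbol table, but references from within the defining
    ; module bind to this definition.
    define protected void @api_entry() {
      ret void
    }
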
.. _dllstorageclass:

DLL Storage Classes
-------------------

All Global Variables, Functions and Aliases can have one of the following
DLL storage classes:

``dllimport``
    "``dllimport``" causes the compiler to reference a function or variable via
    a global pointer to a pointer that is set up by the DLL exporting the
    symbol. On Microsoft Windows targets, the pointer name is formed by
    combining ``__imp_`` and the function or variable name.
``dllexport``
    "``dllexport``" causes the compiler to provide a global pointer to a pointer
    in a DLL, so that it can be referenced with the ``dllimport`` attribute. On
    Microsoft Windows targets, the pointer name is formed by combining
    ``__imp_`` and the function or variable name. Since this storage class
    exists for defining a dll interface, the compiler, assembler and linker know
    it is externally referenced and must refrain from deleting the symbol.

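As an illustrative sketch (the symbol names are assumptions), the storage
class keyword is written on the declaration or definition it applies to:

.. code-block:: llvm

    ; Referenced through a pointer set up by the exporting DLL.
    @imported_var = external dllimport global i32
    declare dllimport void @imported_fn()

    ; Part of this module's DLL interface, so it must not be deleted.
    define dllexport void @exported_fn() {
      ret void
    }
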
.. _tls_model:

Thread Local Storage Models
---------------------------

A variable may be defined as ``thread_local``, which means that it will
not be shared by threads (each thread will have a separate copy of the
variable). Not all targets support thread-local variables. Optionally, a
TLS model may be specified:

``localdynamic``
    For variables that are only used within the current shared library.
``initialexec``
    For variables in modules that will not be loaded dynamically.
``localexec``
    For variables defined in the executable and only used within it.

If no explicit model is given, the "general dynamic" model is used.

The models correspond to the ELF TLS models; see `ELF Handling For
Thread-Local Storage <http://people.redhat.com/drepper/tls.pdf>`_ for
more information on under which circumstances the different models may
be used. The target may choose a different TLS model if the specified
model is not supported, or if a better choice of model can be made.

A model can also be specified in an alias, but then it only governs how
the alias is accessed. It will not have any effect on the aliasee.

For platforms without linker support of ELF TLS model, the -femulated-tls
flag can be used to generate GCC compatible emulated TLS code.

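The following sketch shows how a TLS model is attached to a variable. The
variable and function names are hypothetical:

.. code-block:: llvm

    ; No explicit model: the "general dynamic" model is used.
    @tls_default = thread_local global i32 0

    ; Explicit models are written in parentheses after thread_local.
    @tls_ie = thread_local(initialexec) global i32 0
    @tls_le = thread_local(localexec) global i32 0

    define i32 @read_tls() {
      ; Each thread reads its own copy of @tls_ie.
      %v = load i32, i32* @tls_ie
      ret i32 %v
    }
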
.. _runtime_preemption_model:

Runtime Preemption Specifiers
-----------------------------

Global variables, functions and aliases may have an optional runtime preemption
specifier. If a preemption specifier isn't given explicitly, then a
symbol is assumed to be ``dso_preemptable``.

``dso_preemptable``
    Indicates that the function or variable may be replaced by a symbol from
    outside the linkage unit at runtime.

``dso_local``
    The compiler may assume that a function or variable marked as ``dso_local``
    will resolve to a symbol within the same linkage unit. Direct access will
    be generated even if the definition is not within this compilation unit.

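A short sketch of the two specifiers, with made-up names:

.. code-block:: llvm

    ; Assumed to resolve within the same linkage unit.
    @local_counter = dso_local global i32 0

    ; dso_local may also be used when the definition lives in another
    ; compilation unit of the same linkage unit.
    declare dso_local void @defined_elsewhere_in_this_dso()

    ; No specifier: implicitly dso_preemptable.
    declare void @possibly_interposed()
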
.. _namedtypes:

Structure Types
---------------

LLVM IR allows you to specify both "identified" and "literal" :ref:`structure
types <t_struct>`. Literal types are uniqued structurally, but identified types
are never uniqued. An :ref:`opaque structural type <t_opaque>` can also be used
to forward declare a type that is not yet available.

An example of an identified structure specification is:

.. code-block:: llvm

    %mytype = type { %mytype*, i32 }

Prior to the LLVM 3.0 release, identified types were structurally uniqued. Only
literal types are uniqued in recent versions of LLVM.

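To illustrate the difference between the kinds of structure types, here is a
small sketch. All type, global, and function names are hypothetical:

.. code-block:: llvm

    ; Two identified types with the same body remain distinct types.
    %pair = type { i32, i32 }
    %also_pair = type { i32, i32 }

    ; An opaque type forward declares a structure whose body is not yet known.
    %handle = type opaque

    @a = global %pair { i32 1, i32 2 }

    ; A literal structure type is written inline and is uniqued structurally.
    @b = global { i32, i32 } { i32 1, i32 2 }

    ; Pointers to an opaque type can still be passed around.
    declare void @use_handle(%handle*)
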
Sanjoy Dasc6af5ea2016-07-28 23:43:38 +0000567.. _nointptrtype:
568
569Non-Integral Pointer Type
570-------------------------
571
572Note: non-integral pointer types are a work in progress, and they should be
573considered experimental at this time.
574
575LLVM IR optionally allows the frontend to denote pointers in certain address
Sanjoy Das63752e62016-08-10 21:48:24 +0000576spaces as "non-integral" via the :ref:`datalayout string<langref_datalayout>`.
577Non-integral pointer types represent pointers that have an *unspecified* bitwise
578representation; that is, the integral representation may be target dependent or
579unstable (not backed by a fixed integer).
Sanjoy Dasc6af5ea2016-07-28 23:43:38 +0000580
581``inttoptr`` instructions converting integers to non-integral pointer types are
582ill-typed, and so are ``ptrtoint`` instructions converting values of
583non-integral pointer types to integers. Vector versions of said instructions
584are ill-typed as well.
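
A minimal sketch, assuming address space 1 is the one declared non-integral
(the function name ``@pass_through`` is hypothetical):

.. code-block:: llvm

    ; "ni:1" marks pointers in address space 1 as non-integral.
    target datalayout = "e-ni:1"

    define i8 addrspace(1)* @pass_through(i8 addrspace(1)* %p) {
      ; The following would be rejected as ill-typed, because %p has a
      ; non-integral pointer type:
      ;   %i = ptrtoint i8 addrspace(1)* %p to i64
      ret i8 addrspace(1)* %p
    }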

.. _globalvars:

Global Variables
----------------

Global variables define regions of memory allocated at compilation time
instead of run-time.

Global variable definitions must be initialized.

Global variables in other translation units can also be declared, in which
case they don't have an initializer.

Either global variable definitions or declarations may have an explicit section
to be placed in and may have an optional explicit alignment specified. If there
is a mismatch between the explicit or inferred section information for the
variable declaration and its definition, the resulting behavior is undefined.

A variable may be defined as a global ``constant``, which indicates that
the contents of the variable will **never** be modified (enabling better
optimization, allowing the global data to be placed in the read-only
section of an executable, etc.). Note that variables that need runtime
initialization cannot be marked ``constant`` as there is a store to the
variable.

LLVM explicitly allows *declarations* of global variables to be marked
constant, even if the final definition of the global is not. This
capability can be used to enable slightly better optimization of the
program, but requires the language definition to guarantee that
optimizations based on the 'constantness' are valid for the translation
units that do not include the definition.

As SSA values, global variables define pointer values that are in scope
(i.e. they dominate) all basic blocks in the program. Global variables
always define a pointer to their "content" type because they describe a
region of memory, and all memory objects in LLVM are accessed through
pointers.

Global variables can be marked with ``unnamed_addr`` which indicates
that the address is not significant, only the content. Constants marked
like this can be merged with other constants if they have the same
initializer. Note that a constant with significant address *can* be
merged with an ``unnamed_addr`` constant, the result being a constant
whose address is significant.
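
For instance, a front end might emit a string literal as shown below. This is
a sketch; the name ``@.str`` and the ``private`` linkage are illustrative
choices:

.. code-block:: llvm

    ; The contents never change and the address is not significant, so
    ; this constant may be merged with another one that has the same
    ; initializer.
    @.str = private unnamed_addr constant [6 x i8] c"hello\00", align 1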

If the ``local_unnamed_addr`` attribute is given, the address is known to
not be significant within the module.

A global variable may be declared to reside in a target-specific
numbered address space. For targets that support them, address spaces
may affect how optimizations are performed and/or what target
instructions are used to access the variable. The default address space
is zero. The address space qualifier must precede any other attributes.

LLVM allows an explicit section to be specified for globals. If the
target supports it, it will emit globals to the section specified.
Additionally, the global can be placed in a comdat if the target has the
necessary support.

External declarations may have an explicit section specified. Section
information is retained in LLVM IR for targets that make use of this
information. Attaching section information to an external declaration is an
assertion that its definition is located in the specified section. If the
definition is located in a different section, the behavior is undefined.

By default, global initializers are optimized by assuming that global
variables defined within the module are not modified from their
initial values before the start of the global initializer. This is
true even for variables potentially accessible from outside the
module, including those with external linkage or appearing in
``@llvm.used`` or dllexported variables. This assumption may be suppressed
by marking the variable with ``externally_initialized``.
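
A sketch of the marker in use (``@device_state`` is a hypothetical name):

.. code-block:: llvm

    ; Something outside the module may write to this variable before the
    ; global initializers run, so the initial value 0 must not be assumed.
    @device_state = externally_initialized global i32 0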

An explicit alignment may be specified for a global, which must be a
power of 2. If not present, or if the alignment is set to zero, the
alignment of the global is set by the target to whatever it feels
convenient. If an explicit alignment is specified, the global is forced
to have exactly that alignment. Targets and optimizers are not allowed
to over-align the global if the global has an assigned section. In this
case, the extra alignment could be observable: for example, code could
assume that the globals are densely packed in their section and try to
iterate over them as an array; alignment padding would break this
iteration. The maximum alignment is ``1 << 29``.

Globals can also have a :ref:`DLL storage class <dllstorageclass>`,
an optional :ref:`runtime preemption specifier <runtime_preemption_model>`,
optional :ref:`global attributes <glattrs>` and
an optional list of attached :ref:`metadata <metadata>`.

Variables and aliases can have a
:ref:`Thread Local Storage Model <tls_model>`.

Syntax::

    @<GlobalVarName> = [Linkage] [PreemptionSpecifier] [Visibility]
                       [DLLStorageClass] [ThreadLocal]
                       [(unnamed_addr|local_unnamed_addr)] [AddrSpace]
                       [ExternallyInitialized]
                       <global | constant> <Type> [<InitializerConstant>]
                       [, section "name"] [, comdat [($name)]]
                       [, align <Alignment>] (, !name !N)*

For example, the following defines a global in a numbered address space
with an initializer, section, and alignment:

.. code-block:: llvm

    @G = addrspace(5) constant float 1.0, section "foo", align 4

The following example just declares a global variable:

.. code-block:: llvm

    @G = external global i32

The following example defines a thread-local global with the
``initialexec`` TLS model:

.. code-block:: llvm

    @G = thread_local(initialexec) global i32 0, align 4

.. _functionstructure:

Functions
---------

LLVM function definitions consist of the "``define``" keyword, an
optional :ref:`linkage type <linkage>`, an optional :ref:`runtime preemption
specifier <runtime_preemption_model>`, an optional :ref:`visibility
style <visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`,
an optional :ref:`calling convention <callingconv>`,
an optional ``unnamed_addr`` attribute, a return type, an optional
:ref:`parameter attribute <paramattrs>` for the return type, a function
name, a (possibly empty) argument list (each with optional :ref:`parameter
attributes <paramattrs>`), optional :ref:`function attributes <fnattrs>`,
an optional section, an optional alignment,
an optional :ref:`comdat <langref_comdats>`,
an optional :ref:`garbage collector name <gc>`, an optional :ref:`prefix <prefixdata>`,
an optional :ref:`prologue <prologuedata>`,
an optional :ref:`personality <personalityfn>`,
an optional list of attached :ref:`metadata <metadata>`,
an opening curly brace, a list of basic blocks, and a closing curly brace.

LLVM function declarations consist of the "``declare``" keyword, an
optional :ref:`linkage type <linkage>`, an optional :ref:`visibility style
<visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`, an
optional :ref:`calling convention <callingconv>`, an optional ``unnamed_addr``
or ``local_unnamed_addr`` attribute, a return type, an optional :ref:`parameter
attribute <paramattrs>` for the return type, a function name, a possibly
empty list of arguments, an optional alignment, an optional :ref:`garbage
collector name <gc>`, an optional :ref:`prefix <prefixdata>`, and an optional
:ref:`prologue <prologuedata>`.

A function definition contains a list of basic blocks, forming the CFG (Control
Flow Graph) for the function. Each basic block may optionally start with a label
(giving the basic block a symbol table entry), contains a list of instructions,
and ends with a :ref:`terminator <terminators>` instruction (such as a branch or
function return). If an explicit label is not provided, a block is assigned an
implicit numbered label, using the next value from the same counter as used for
unnamed temporaries (:ref:`see above<identifiers>`). For example, if a function
entry block does not have an explicit label, it will be assigned label "%0",
then the first unnamed temporary in that block will be "%1", etc.
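
The numbering can be illustrated with a small function (``@increment`` is a
hypothetical name):

.. code-block:: llvm

    define i32 @increment(i32 %x) {
      ; No explicit label: the entry block is implicitly "%0",
      ; so the first unnamed temporary is "%1".
      %1 = add i32 %x, 1
      ret i32 %1
    }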

The first basic block in a function is special in two ways: it is
immediately executed on entrance to the function, and it is not allowed
to have predecessor basic blocks (i.e. there cannot be any branches to
the entry block of a function). Because the block can have no
predecessors, it also cannot have any :ref:`PHI nodes <i_phi>`.

LLVM allows an explicit section to be specified for functions. If the
target supports it, it will emit functions to the section specified.
Additionally, the function can be placed in a COMDAT.

An explicit alignment may be specified for a function. If not present,
or if the alignment is set to zero, the alignment of the function is set
by the target to whatever it feels convenient. If an explicit alignment
is specified, the function is forced to have at least that much
alignment. All alignments must be a power of 2.

If the ``unnamed_addr`` attribute is given, the address is known to not
be significant and two identical functions can be merged.

If the ``local_unnamed_addr`` attribute is given, the address is known to
not be significant within the module.

Syntax::

    define [linkage] [PreemptionSpecifier] [visibility] [DLLStorageClass]
           [cconv] [ret attrs]
           <ResultType> @<FunctionName> ([argument list])
           [(unnamed_addr|local_unnamed_addr)] [fn Attrs] [section "name"]
           [comdat [($name)]] [align N] [gc] [prefix Constant]
           [prologue Constant] [personality Constant] (!name !N)* { ... }

The argument list is a comma-separated sequence of arguments where each
argument is of the following form:

Syntax::

    <type> [parameter Attrs] [name]
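
Putting the pieces together, a definition using several of the optional
elements might look like the following sketch (the name ``@add`` and the
chosen attributes are illustrative only):

.. code-block:: llvm

    ; internal linkage, fastcc calling convention, unnamed_addr,
    ; a function attribute, and an explicit alignment
    define internal fastcc i32 @add(i32 %a, i32 %b) unnamed_addr nounwind align 16 {
    entry:
      %sum = add i32 %a, %b
      ret i32 %sum
    }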


.. _langref_aliases:

Aliases
-------

Aliases, unlike functions or variables, don't create any new data. They
are just a new symbol and metadata for an existing position.

Aliases have a name and an aliasee that is either a global value or a
constant expression.

Aliases may have an optional :ref:`linkage type <linkage>`, an optional
:ref:`runtime preemption specifier <runtime_preemption_model>`, an optional
:ref:`visibility style <visibility>`, an optional :ref:`DLL storage class
<dllstorageclass>` and an optional :ref:`tls model <tls_model>`.

Syntax::

    @<Name> = [Linkage] [PreemptionSpecifier] [Visibility] [DLLStorageClass] [ThreadLocal] [(unnamed_addr|local_unnamed_addr)] alias <AliaseeTy>, <AliaseeTy>* @<Aliasee>

The linkage must be one of ``private``, ``internal``, ``linkonce``, ``weak``,
``linkonce_odr``, ``weak_odr``, or ``external``. Note that some system linkers
might not correctly handle dropping a weak symbol that is aliased.

Aliases that are not ``unnamed_addr`` are guaranteed to have the same address as
the aliasee expression. ``unnamed_addr`` ones are only guaranteed to point
to the same content.

If the ``local_unnamed_addr`` attribute is given, the address is known to
not be significant within the module.

Since aliases are only a second name, some restrictions apply, of which
some can only be checked when producing an object file:

* The expression defining the aliasee must be computable at assembly
  time. Since it is just a name, no relocations can be used.

* No alias in the expression can be weak as the possibility of the
  intermediate alias being overridden cannot be represented in an
  object file.

* No global value in the expression can be a declaration, since that
  would require a relocation, which is not possible.
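
A short sketch of alias definitions, using hypothetical names:

.. code-block:: llvm

    @var = global i32 42

    define void @impl() {
      ret void
    }

    ; A second name for @var.
    @var_alias = alias i32, i32* @var

    ; A weak alias to a function; the aliasee is a definition, as required.
    @impl_alias = weak alias void (), void ()* @impl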

.. _langref_ifunc:

IFuncs
------

IFuncs, like aliases, don't create any new data or functions. They are just a
new symbol that the dynamic linker resolves at runtime by calling a resolver
function.

IFuncs have a name and a resolver, which is a function called by the dynamic
linker that returns the address of another function associated with the name.

IFuncs may have an optional :ref:`linkage type <linkage>` and an optional
:ref:`visibility style <visibility>`.

Syntax::

    @<Name> = [Linkage] [Visibility] ifunc <IFuncTy>, <ResolverTy>* @<Resolver>
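
A sketch of an ifunc and its resolver, following the syntax above. All names
are hypothetical, and the resolver here simply returns a fixed implementation:

.. code-block:: llvm

    define i32 @impl(i32 %x) {
      ret i32 %x
    }

    ; Called by the dynamic linker; returns the address of the function
    ; that @f should resolve to.
    define i32 (i32)* @resolver() {
      ret i32 (i32)* @impl
    }

    @f = ifunc i32 (i32), i32 (i32)* ()* @resolver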


.. _langref_comdats:

Comdats
-------

Comdat IR provides access to COFF and ELF object file COMDAT functionality.

Comdats have a name which represents the COMDAT key. All global objects that
specify this key will only end up in the final object file if the linker chooses
that key over some other key. Aliases are placed in the same COMDAT that their
aliasee computes to, if any.

Comdats have a selection kind to provide input on how the linker should
choose between keys in two different object files.

Syntax::

    $<Name> = comdat SelectionKind

The selection kind must be one of the following:

``any``
    The linker may choose any COMDAT key; the choice is arbitrary.
``exactmatch``
    The linker may choose any COMDAT key but the sections must contain the
    same data.
``largest``
    The linker will choose the section containing the largest COMDAT key.
``noduplicates``
    The linker requires that only one section with this COMDAT key exist.
``samesize``
    The linker may choose any COMDAT key but the sections must contain the
    same amount of data.

Note that the Mach-O platform doesn't support COMDATs, and ELF and WebAssembly
only support ``any`` as a selection kind.

Here is an example of a COMDAT group where a function will only be selected if
the COMDAT key's section is the largest:

.. code-block:: text

    $foo = comdat largest
    @foo = global i32 2, comdat($foo)

    define void @bar() comdat($foo) {
      ret void
    }

As syntactic sugar, the ``$name`` can be omitted if the name is the same as
the global name:

.. code-block:: text

    $foo = comdat any
    @foo = global i32 2, comdat


In a COFF object file, this will create a COMDAT section with selection kind
``IMAGE_COMDAT_SELECT_LARGEST`` containing the contents of the ``@foo`` symbol
and another COMDAT section with selection kind
``IMAGE_COMDAT_SELECT_ASSOCIATIVE`` which is associated with the first COMDAT
section and contains the contents of the ``@bar`` symbol.

There are some restrictions on the properties of the global object.
It, or an alias to it, must have the same name as the COMDAT group when
targeting COFF.
The contents and size of this object may be used during link-time to determine
which COMDAT groups get selected depending on the selection kind.
Because the name of the object must match the name of the COMDAT group, the
linkage of the global object must not be local; local symbols can get renamed
if a collision occurs in the symbol table.

The combined use of COMDATs and section attributes may yield surprising results.
For example:

.. code-block:: text

    $foo = comdat any
    $bar = comdat any
    @g1 = global i32 42, section "sec", comdat($foo)
    @g2 = global i32 42, section "sec", comdat($bar)

From the object file perspective, this requires the creation of two sections
with the same name. This is necessary because both globals belong to different
COMDAT groups and COMDATs, at the object file level, are represented by
sections.

Note that certain IR constructs like global variables and functions may
create COMDATs in the object file in addition to any which are specified using
COMDAT IR. This arises when the code generator is configured to emit globals
in individual sections (e.g. when ``-data-sections`` or ``-function-sections``
is supplied to ``llc``).

.. _namedmetadatastructure:

Named Metadata
--------------

Named metadata is a collection of metadata. :ref:`Metadata
nodes <metadata>` (but not metadata strings) are the only valid
operands for a named metadata.

#. Named metadata are represented as a string of characters with the
   metadata prefix. The rules for metadata names are the same as for
   identifiers, but quoted names are not allowed. ``"\xx"`` type escapes
   are still valid, which allows any character to be part of a name.

Syntax::

    ; Some unnamed metadata nodes, which are referenced by the named metadata.
    !0 = !{!"zero"}
    !1 = !{!"one"}
    !2 = !{!"two"}
    ; A named metadata.
    !name = !{!0, !1, !2}

.. _paramattrs:

Parameter Attributes
--------------------

The return type and each parameter of a function type may have a set of
*parameter attributes* associated with them. Parameter attributes are
used to communicate additional information about the result or
parameters of a function. Parameter attributes are considered to be part
of the function, not of the function type, so functions with different
parameter attributes can have the same function type.

Parameter attributes are simple keywords that follow the type specified.
If multiple parameter attributes are needed, they are space separated.
For example:

.. code-block:: llvm

    declare i32 @printf(i8* noalias nocapture, ...)
    declare i32 @atoi(i8 zeroext)
    declare signext i8 @returns_signed_char()

Note that any attributes for the function result (such as ``signext``) come
immediately before the return type, whereas function attributes (``nounwind``,
``readonly``) come immediately after the argument list.

Currently, only the following parameter attributes are defined:

``zeroext``
    This indicates to the code generator that the parameter or return
    value should be zero-extended to the extent required by the target's
    ABI by the caller (for a parameter) or the callee (for a return value).
``signext``
    This indicates to the code generator that the parameter or return
    value should be sign-extended to the extent required by the target's
    ABI (which is usually 32 bits) by the caller (for a parameter) or
    the callee (for a return value).
``inreg``
    This indicates that this parameter or return value should be treated
    in a special target-dependent fashion while emitting code for
    a function call or return (usually, by putting it in a register as
    opposed to memory, though some targets use it to distinguish between
    two different kinds of registers). Use of this attribute is
    target-specific.
``byval``
    This indicates that the pointer parameter should really be passed by
    value to the function. The attribute implies that a hidden copy of
    the pointee is made between the caller and the callee, so the callee
    is unable to modify the value in the caller. This attribute is only
    valid on LLVM pointer arguments. It is generally used to pass
    structs and arrays by value, but is also valid on pointers to
    scalars. The copy is considered to belong to the caller, not the
    callee (for example, ``readonly`` functions should not write to
    ``byval`` parameters). This is not a valid attribute for return
    values.

    The byval attribute also supports specifying an alignment with the
    align attribute. It indicates the alignment of the stack slot to
    form and the known alignment of the pointer specified to the call
    site. If the alignment is not specified, then the code generator
    makes a target-specific assumption.
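
    As a sketch, passing a structure by value with an explicit alignment
    might be declared as follows (``%struct.S`` and ``@consume`` are
    hypothetical names):

    .. code-block:: llvm

        %struct.S = type { i32, i32 }

        ; The callee receives a hidden copy of the pointee, held in a
        ; stack slot with 4-byte alignment.
        declare void @consume(%struct.S* byval align 4)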

.. _attr_inalloca:

``inalloca``

    The ``inalloca`` argument attribute allows the caller to take the
    address of outgoing stack arguments. An ``inalloca`` argument must
    be a pointer to stack memory produced by an ``alloca`` instruction.
    The alloca, or argument allocation, must also be tagged with the
    inalloca keyword. Only the last argument may have the ``inalloca``
    attribute, and that argument is guaranteed to be passed in memory.

    An argument allocation may be used by a call at most once because
    the call may deallocate it. The ``inalloca`` attribute cannot be
    used in conjunction with other attributes that affect argument
    storage, like ``inreg``, ``nest``, ``sret``, or ``byval``. The
    ``inalloca`` attribute also disables LLVM's implicit lowering of
    large aggregate return values, which means that frontend authors
    must lower them with ``sret`` pointers.

    When the call site is reached, the argument allocation must have
    been the most recent stack allocation that is still live, or the
    results are undefined. It is possible to allocate additional stack
    space after an argument allocation and before its call site, but it
    must be cleared off with :ref:`llvm.stackrestore
    <int_stackrestore>`.

    See :doc:`InAlloca` for more information on how to use this
    attribute.

``sret``
    This indicates that the pointer parameter specifies the address of a
    structure that is the return value of the function in the source
    program. This pointer must be guaranteed by the caller to be valid:
    loads and stores to the structure may be assumed by the callee not
    to trap and to be properly aligned. This is not a valid attribute
    for return values.

.. _attr_align:

``align <n>``
    This indicates that the pointer value may be assumed by the optimizer to
    have the specified alignment.

    Note that this attribute has additional semantics when combined with the
    ``byval`` attribute.

.. _noalias:

``noalias``
    This indicates that objects accessed via pointer values
    :ref:`based <pointeraliasing>` on the argument or return value are not also
    accessed, during the execution of the function, via pointer values not
    *based* on the argument or return value. The attribute on a return value
    also has additional semantics described below. The caller shares the
    responsibility with the callee for ensuring that these requirements are met.
    For further details, please see the discussion of the NoAlias response in
    :ref:`alias analysis <Must, May, or No>`.

    Note that this definition of ``noalias`` is intentionally similar
    to the definition of ``restrict`` in C99 for function arguments.

    For function return values, C99's ``restrict`` is not meaningful,
    while LLVM's ``noalias`` is. Furthermore, the semantics of the ``noalias``
    attribute on return values are stronger than the semantics of the attribute
    when used on function arguments. On function return values, the ``noalias``
    attribute indicates that the function acts like a system memory allocation
    function, returning a pointer to allocated storage disjoint from the
    storage for any other object accessible to the caller.
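
    A sketch of both uses (the function names are hypothetical):

    .. code-block:: llvm

        ; On arguments: memory accessed through %dst or %src during the
        ; call is not also accessed through pointers not based on them.
        declare void @copy_bytes(i8* noalias %dst, i8* noalias %src, i64 %n)

        ; On the return value: behaves like an allocation function.
        declare noalias i8* @my_alloc(i64 %size)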

``nocapture``
    This indicates that the callee does not make any copies of the
    pointer that outlive the callee itself. This is not a valid
    attribute for return values. Addresses used in volatile operations
    are considered to be captured.

.. _nest:

``nest``
    This indicates that the pointer parameter can be excised using the
    :ref:`trampoline intrinsics <int_trampoline>`. This is not a valid
    attribute for return values and can only be applied to one parameter.

``returned``
    This indicates that the function always returns the argument as its return
    value. This is a hint to the optimizer and code generator used when
    generating the caller, allowing value propagation, tail call optimization,
    and omission of register saves and restores in some cases; it is not
    checked or enforced when generating the callee. The parameter and the
    function return type must be valid operands for the
    :ref:`bitcast instruction <i_bitcast>`. This is not a valid attribute for
    return values and can only be applied to one parameter.

``nonnull``
    This indicates that the parameter or return pointer is not null. This
    attribute may only be applied to pointer-typed parameters. This is not
    checked or enforced by LLVM; the caller must ensure that the pointer
    passed in is non-null, or the callee must ensure that the returned pointer
    is non-null.

``dereferenceable(<n>)``
    This indicates that the parameter or return pointer is dereferenceable. This
    attribute may only be applied to pointer-typed parameters. A pointer that
    is dereferenceable can be loaded from speculatively without a risk of
    trapping. The number of bytes known to be dereferenceable must be provided
    in parentheses. It is legal for the number of bytes to be less than the
    size of the pointee type. The ``nonnull`` attribute does not imply
    dereferenceability (consider a pointer to one element past the end of an
    array). However, ``dereferenceable(<n>)`` does imply ``nonnull`` in
    ``addrspace(0)`` (which is the default address space).

``dereferenceable_or_null(<n>)``
    This indicates that the parameter or return value isn't both
    non-null and non-dereferenceable (up to ``<n>`` bytes) at the same
    time. All non-null pointers tagged with
    ``dereferenceable_or_null(<n>)`` are ``dereferenceable(<n>)``.
    For address space 0, ``dereferenceable_or_null(<n>)`` implies that
    a pointer is exactly one of ``dereferenceable(<n>)`` or ``null``,
    and in other address spaces ``dereferenceable_or_null(<n>)``
    implies that a pointer is at least one of ``dereferenceable(<n>)``
    or ``null`` (i.e. it may be both ``null`` and
    ``dereferenceable(<n>)``). This attribute may only be applied to
    pointer-typed parameters.
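
    As a small sketch of how these pointer attributes are written (the
    function names are hypothetical), the first function below may load from
    its argument unconditionally, while a caller of the second must still
    check the result for null before loading from it:

    .. code-block:: llvm

        ; %p is non-null, and at least 4 bytes at %p can be loaded without
        ; trapping.
        define i32 @load_first(i32* nonnull dereferenceable(4) %p) {
          %v = load i32, i32* %p
          ret i32 %v
        }

        ; The result is either null or points to at least 16 dereferenceable
        ; bytes (address space 0).
        declare dereferenceable_or_null(16) i8* @maybe_alloc(i64)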

``swiftself``
    This indicates that the parameter is the self/context parameter. This is not
    a valid attribute for return values and can only be applied to one
    parameter.

``swifterror``
    This attribute exists to model and optimize Swift error handling. It
    can be applied to a parameter with pointer to pointer type or a
    pointer-sized alloca. At the call site, the actual argument that corresponds
    to a ``swifterror`` parameter has to come from a ``swifterror`` alloca or
    the ``swifterror`` parameter of the caller. A ``swifterror`` value (either
    the parameter or the alloca) can only be loaded from, stored to, or used as
    a ``swifterror`` argument. This is not a valid attribute for return values
    and can only be applied to one parameter.

    These constraints allow the calling convention to optimize access to
    ``swifterror`` variables by associating them with a specific register at
    call boundaries rather than placing them in memory. Since this does change
    the calling convention, a function which uses the ``swifterror`` attribute
    on a parameter is not ABI-compatible with one which does not.

    These constraints also allow LLVM to assume that a ``swifterror`` argument
    does not alias any other memory visible within a function and that a
    ``swifterror`` alloca passed as an argument does not escape.
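
    A minimal sketch of the usage rules above, assuming a hypothetical callee
    ``@may_fail`` and an error value represented as ``i8*``:

    .. code-block:: llvm

        declare void @may_fail(i8** swifterror)

        define i8* @caller() {
        entry:
          ; The error slot is a swifterror alloca. It is only loaded from,
          ; stored to, or passed as a swifterror argument.
          %err = alloca swifterror i8*
          store i8* null, i8** %err
          call void @may_fail(i8** swifterror %err)
          %e = load i8*, i8** %err
          ret i8* %e
        }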

.. _gc:

Garbage Collector Strategy Names
--------------------------------

Each function may specify a garbage collector strategy name, which is simply a
string:

.. code-block:: llvm

    define void @f() gc "name" { ... }

The supported values of *name* include those :ref:`built in to LLVM
<builtin-gc-strategies>` and any provided by loaded plugins. Specifying a GC
strategy will cause the compiler to alter its output in order to support the
named garbage collection algorithm. Note that LLVM itself does not contain a
garbage collector; this functionality is restricted to generating machine code
which can interoperate with a collector provided externally.

.. _prefixdata:

Prefix Data
-----------

Prefix data is data associated with a function which the code
generator will emit immediately before the function's entrypoint.
The purpose of this feature is to allow frontends to associate
language-specific runtime metadata with specific functions and make it
available through the function pointer while still allowing the
function pointer to be called.

To access the data for a given function, a program may bitcast the
function pointer to a pointer to the constant's type and dereference
index -1. This implies that the IR symbol points just past the end of
the prefix data. For instance, take the example of a function annotated
with a single ``i32``,

.. code-block:: llvm

    define void @f() prefix i32 123 { ... }

The prefix data can be referenced as,

.. code-block:: llvm

    %0 = bitcast void ()* @f to i32*
    %a = getelementptr inbounds i32, i32* %0, i32 -1
    %b = load i32, i32* %a

Prefix data is laid out as if it were an initializer for a global variable
of the prefix data's type. The function will be placed such that the
beginning of the prefix data is aligned. This means that if the size
of the prefix data is not a multiple of the alignment size, the
function's entrypoint will not be aligned. If alignment of the
function's entrypoint is desired, padding must be added to the prefix
data.

A function may have prefix data but no body. This has similar semantics
to the ``available_externally`` linkage in that the data may be used by the
optimizers but will not be emitted in the object file.

.. _prologuedata:

Prologue Data
-------------

The ``prologue`` attribute allows arbitrary code (encoded as bytes) to
be inserted prior to the function body. This can be used for enabling
function hot-patching and instrumentation.

To maintain the semantics of ordinary function calls, the prologue data must
have a particular format. Specifically, it must begin with a sequence of
bytes which decode to a sequence of machine instructions, valid for the
module's target, which transfer control to the point immediately succeeding
the prologue data, without performing any other visible action. This allows
the inliner and other passes to reason about the semantics of the function
definition without needing to reason about the prologue data. Obviously this
makes the format of the prologue data highly target dependent.

A trivial example of valid prologue data for the x86 architecture is ``i8 144``,
which encodes the ``nop`` instruction:

.. code-block:: text

    define void @f() prologue i8 144 { ... }

Generally prologue data can be formed by encoding a relative branch instruction
which skips the metadata, as in this example of valid prologue data for the
x86_64 architecture, where the first two bytes encode ``jmp .+10``:

.. code-block:: text

    %0 = type <{ i8, i8, i8* }>

    define void @f() prologue %0 <{ i8 235, i8 8, i8* @md}> { ... }

A function may have prologue data but no body. This has similar semantics
to the ``available_externally`` linkage in that the data may be used by the
optimizers but will not be emitted in the object file.

.. _personalityfn:

Personality Function
--------------------

The ``personality`` attribute permits functions to specify what function
to use for exception handling.
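
As a hedged illustration, a function that uses the Itanium C++ ABI
personality routine ``__gxx_personality_v0`` could be written as follows
(``@may_throw`` is a hypothetical callee):

.. code-block:: llvm

    declare i32 @__gxx_personality_v0(...)
    declare void @may_throw()

    define void @f() personality i32 (...)* @__gxx_personality_v0 {
    entry:
      invoke void @may_throw()
              to label %cont unwind label %lpad

    cont:
      ret void

    lpad:
      ; Run cleanups, then continue unwinding.
      %exn = landingpad { i8*, i32 }
              cleanup
      resume { i8*, i32 } %exn
    }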

.. _attrgrp:

Attribute Groups
----------------

Attribute groups are groups of attributes that are referenced by objects within
the IR. They are important for keeping ``.ll`` files readable, because a lot of
functions will use the same set of attributes. In the degenerate case of a
``.ll`` file that corresponds to a single ``.c`` file, the single attribute
group will capture the important command line flags used to build that file.

An attribute group is a module-level object. To use an attribute group, an
object references the attribute group's ID (e.g. ``#37``). An object may refer
to more than one attribute group. In that situation, the attributes from the
different groups are merged.

Here is an example of attribute groups for a function that should always be
inlined, has a stack alignment of 4, and which shouldn't use SSE instructions:

.. code-block:: llvm

    ; Target-independent attributes:
    attributes #0 = { alwaysinline alignstack=4 }

    ; Target-dependent attributes:
    attributes #1 = { "no-sse" }

    ; Function @f has attributes: alwaysinline, alignstack=4, and "no-sse".
    define void @f() #0 #1 { ... }

.. _fnattrs:

Function Attributes
-------------------

Function attributes are set to communicate additional information about
a function. Function attributes are considered to be part of the
function, not of the function type, so functions with different function
attributes can have the same function type.

Function attributes are simple keywords that follow the type specified.
If multiple attributes are needed, they are space separated. For
example:

.. code-block:: llvm

    define void @f() noinline { ... }
    define void @f() alwaysinline { ... }
    define void @f() alwaysinline optsize { ... }
    define void @f() optsize { ... }

``alignstack(<n>)``
    This attribute indicates that, when emitting the prologue and
    epilogue, the backend should forcibly align the stack pointer.
    Specify the desired alignment, which must be a power of two, in
    parentheses.
``allocsize(<EltSizeParam>[, <NumEltsParam>])``
    This attribute indicates that the annotated function will always return at
    least a given number of bytes (or null). Its arguments are zero-indexed
    parameter numbers; if one argument is provided, then it's assumed that at
    least ``CallSite.Args[EltSizeParam]`` bytes will be available at the
    returned pointer. If two are provided, then it's assumed that
    ``CallSite.Args[EltSizeParam] * CallSite.Args[NumEltsParam]`` bytes are
    available. The referenced parameters must be integer types. No assumptions
    are made about the contents of the returned block of memory.
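
    For illustration, hypothetical ``malloc``-like and ``calloc``-like
    declarations (these names are not part of LLVM) could be annotated as
    follows:

    .. code-block:: llvm

        ; At least as many bytes as the value of parameter 0 are available.
        declare i8* @my_malloc(i64) allocsize(0)

        ; At least (parameter 0 * parameter 1) bytes are available.
        declare i8* @my_calloc(i64, i64) allocsize(0, 1)

    With these declarations, a call such as ``@my_calloc(i64 8, i64 4)`` is
    assumed to return either null or a pointer to at least 32 bytes.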
``alwaysinline``
    This attribute indicates that the inliner should attempt to inline
    this function into callers whenever possible, ignoring any active
    inlining size threshold for this caller.
``builtin``
    This indicates that the callee function at a call site should be
    recognized as a built-in function, even though the function's declaration
    uses the ``nobuiltin`` attribute. This is only valid at call sites for
    direct calls to functions that are declared with the ``nobuiltin``
    attribute.
``cold``
    This attribute indicates that this function is rarely called. When
    computing edge weights, basic blocks post-dominated by a cold
    function call are also considered to be cold; and, thus, given low
    weight.
``convergent``
    In some parallel execution models, there exist operations that cannot be
    made control-dependent on any additional values. We call such operations
    ``convergent``, and mark them with this attribute.

    The ``convergent`` attribute may appear on functions or call/invoke
    instructions. When it appears on a function, it indicates that calls to
    this function should not be made control-dependent on additional values.
    For example, the intrinsic ``llvm.nvvm.barrier0`` is ``convergent``, so
    calls to this intrinsic cannot be made control-dependent on additional
    values.

    When it appears on a call/invoke, the ``convergent`` attribute indicates
    that we should treat the call as though we're calling a convergent
    function. This is particularly useful on indirect calls; without this we
    may treat such calls as though the target is non-convergent.

    The optimizer may remove the ``convergent`` attribute on functions when it
    can prove that the function does not execute any convergent operations.
    Similarly, the optimizer may remove ``convergent`` on calls/invokes when it
    can prove that the call/invoke cannot call a convergent function.
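
    A brief sketch of both placements, using a hypothetical barrier function
    ``@my_barrier`` and an indirect call through ``%fp``:

    .. code-block:: llvm

        declare void @my_barrier() convergent

        define void @kernel(void ()* %fp) {
          ; Calls to @my_barrier must not be made control-dependent on
          ; additional values.
          call void @my_barrier()
          ; The callee is unknown here, so the call site itself is marked
          ; convergent.
          call void %fp() convergent
          ret void
        }

    Without the call-site attribute, the indirect call could be treated as
    though its target were non-convergent.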
``inaccessiblememonly``
    This attribute indicates that the function may only access memory that
    is not accessible by the module being compiled. This is a weaker form
    of ``readnone``.
``inaccessiblemem_or_argmemonly``
    This attribute indicates that the function may only access memory that is
    either not accessible by the module being compiled, or is pointed to
    by its pointer arguments. This is a weaker form of ``argmemonly``.
``inlinehint``
    This attribute indicates that the source code contained a hint that
    inlining this function is desirable (such as the "inline" keyword in
    C/C++). It is just a hint; it imposes no requirements on the
    inliner.
``jumptable``
    This attribute indicates that the function should be added to a
    jump-instruction table at code-generation time, and that all address-taken
    references to this function should be replaced with a reference to the
    appropriate jump-instruction-table function pointer. Note that this creates
    a new pointer for the original function, which means that code that depends
    on function-pointer identity can break. So, any function annotated with
    ``jumptable`` must also be ``unnamed_addr``.
``minsize``
    This attribute suggests that optimization passes and code generator
    passes make choices that keep the code size of this function as small
    as possible and perform optimizations that may sacrifice runtime
    performance in order to minimize the size of the generated code.
``naked``
    This attribute disables prologue / epilogue emission for the
    function. This can have very system-specific consequences.
``"no-jump-tables"``
    When this attribute is set to ``"true"``, the jump tables and lookup tables
    that can be generated from a switch case lowering are disabled.
``nobuiltin``
    This indicates that the callee function at a call site is not recognized as
    a built-in function. LLVM will retain the original call and not replace it
    with equivalent code based on the semantics of the built-in function, unless
    the call site uses the ``builtin`` attribute. This is valid at call sites
    and on function declarations and definitions.
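
    The interaction between ``nobuiltin`` and ``builtin`` can be sketched as
    follows. ``@my_new`` is a hypothetical allocation function, and the
    ``builtin`` attribute is supplied through an attribute group:

    .. code-block:: llvm

        ; Calls to @my_new are not treated as calls to a built-in function...
        declare i8* @my_new(i64) nobuiltin

        define i8* @f() {
          ; ...unless a direct call site opts back in with builtin.
          %p = call i8* @my_new(i64 4) #0
          ret i8* %p
        }

        attributes #0 = { builtin }

    A call to ``@my_new`` that does not reference ``#0`` is retained as
    written.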
``noduplicate``
    This attribute indicates that calls to the function cannot be
    duplicated. A call to a ``noduplicate`` function may be moved
    within its parent function, but may not be duplicated within
    its parent function.

    A function containing a ``noduplicate`` call may still
    be an inlining candidate, provided that the call is not
    duplicated by inlining. That implies that the function has
    internal linkage and only has one call site, so the original
    call is dead after inlining.
``noimplicitfloat``
    This attribute disables implicit floating-point instructions.
``noinline``
    This attribute indicates that the inliner should never inline this
    function in any situation. This attribute may not be used together
    with the ``alwaysinline`` attribute.
``nonlazybind``
    This attribute suppresses lazy symbol binding for the function. This
    may make calls to the function faster, at the cost of extra program
    startup time if the function is not called during program startup.
``noredzone``
    This attribute indicates that the code generator should not use a
    red zone, even if the target-specific ABI normally permits it.
``noreturn``
    This function attribute indicates that the function never returns
    normally. This produces undefined behavior at runtime if the
    function ever does dynamically return.
``norecurse``
    This function attribute indicates that the function does not call itself
    either directly or indirectly down any possible call path. This produces
    undefined behavior at runtime if the function ever does recurse.
``nounwind``
    This function attribute indicates that the function never raises an
    exception. If the function does raise an exception, its runtime
    behavior is undefined. However, functions marked ``nounwind`` may still
    trap or generate asynchronous exceptions. Exception handling schemes
    that are recognized by LLVM to handle asynchronous exceptions, such
    as SEH, will still provide their implementation-defined semantics.
``"null-pointer-is-valid"``
    If ``"null-pointer-is-valid"`` is set to ``"true"``, then the ``null``
    address in address space 0 is considered to be a valid address for memory
    loads and stores. Any analysis or optimization should not treat
    dereferencing a pointer to ``null`` as undefined behavior in this function.
    Note: Comparing the address of a global variable to ``null`` may still
    evaluate to false because of a limitation in querying this attribute inside
    constant expressions.
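
    A minimal sketch of a function using this attribute (the function name is
    hypothetical):

    .. code-block:: llvm

        define i32 @read_address_zero() "null-pointer-is-valid"="true" {
          ; Within this function, the load from null is not to be treated as
          ; undefined behavior.
          %v = load i32, i32* null
          ret i32 %v
        }

    Whether address 0 is actually mapped at runtime is outside the scope of
    the attribute.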
``optforfuzzing``
    This attribute indicates that this function should be optimized
    for maximum fuzzing signal.
``optnone``
    This function attribute indicates that most optimization passes will skip
    this function, with the exception of interprocedural optimization passes.
    Code generation defaults to the "fast" instruction selector.
    This attribute cannot be used together with the ``alwaysinline``
    attribute; this attribute is also incompatible
    with the ``minsize`` attribute and the ``optsize`` attribute.

    This attribute requires the ``noinline`` attribute to be specified on
    the function as well, so the function is never inlined into any caller.
    Only functions with the ``alwaysinline`` attribute are valid
    candidates for inlining into the body of this function.
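
    For example, a function that should be left largely unoptimized (the name
    is hypothetical) carries both attributes:

    .. code-block:: llvm

        ; optnone must be paired with noinline.
        define void @debug_me() optnone noinline {
          ret void
        }

    Adding ``alwaysinline``, ``minsize`` or ``optsize`` to this definition
    would conflict with the restrictions described above.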
``optsize``
    This attribute suggests that optimization passes and code generator
    passes make choices that keep the code size of this function low,
    and otherwise do optimizations specifically to reduce code size as
    long as they do not significantly impact runtime performance.
``"patchable-function"``
    This attribute tells the code generator that the code
    generated for this function needs to follow certain conventions that
    make it possible for a runtime function to patch over it later.
    The exact effect of this attribute depends on its string value,
    for which there currently is one legal possibility:

    * ``"prologue-short-redirect"`` - This style of patchable
      function is intended to support patching a function prologue to
      redirect control away from the function in a thread-safe
      manner. It guarantees that the first instruction of the
      function will be large enough to accommodate a short jump
      instruction, and will be sufficiently aligned to allow being
      fully changed via an atomic compare-and-swap instruction.
      While the first requirement can be satisfied by inserting a large
      enough NOP, LLVM can and will try to re-purpose an existing
      instruction (i.e. one that would have to be emitted anyway) as
      the patchable instruction larger than a short jump.

      ``"prologue-short-redirect"`` is currently only supported on
      x86-64.

    This attribute by itself does not imply restrictions on
    inter-procedural optimizations. All of the semantic effects the
    patching may have must be separately conveyed via the linkage type.
``"probe-stack"``
    This attribute indicates that the function will trigger a guard region
    at the end of the stack. It ensures that each access to the stack is
    no further than the size of the guard region from a previous
    access to the stack. It takes one required string value, the name of
    the stack probing function that will be called.

    If a function that has a ``"probe-stack"`` attribute is inlined into
    a function with another ``"probe-stack"`` attribute, the resulting
    function has the ``"probe-stack"`` attribute of the caller. If a
    function that has a ``"probe-stack"`` attribute is inlined into a
    function that has no ``"probe-stack"`` attribute at all, the resulting
    function has the ``"probe-stack"`` attribute of the callee.
``readnone``
    On a function, this attribute indicates that the function computes its
    result (or decides to unwind an exception) based strictly on its arguments,
    without dereferencing any pointer arguments or otherwise accessing
    any mutable state (e.g. memory, control registers, etc) visible to
    caller functions. It does not write through any pointer arguments
    (including ``byval`` arguments) and never changes any state visible
    to callers. This means that while it cannot unwind exceptions by calling
    the ``C++`` exception throwing methods (since they write to memory), there
    may be non-``C++`` mechanisms that throw exceptions without writing to
    LLVM-visible memory.

    On an argument, this attribute indicates that the function does not
    dereference that pointer argument, even though it may read or write the
    memory that the pointer points to if accessed through other pointers.
``readonly``
    On a function, this attribute indicates that the function does not write
    through any pointer arguments (including ``byval`` arguments) or otherwise
    modify any state (e.g. memory, control registers, etc) visible to
    caller functions. It may dereference pointer arguments and read
    state that may be set in the caller. A readonly function always
    returns the same value (or unwinds an exception identically) when
    called with the same set of arguments and global state. This means that
    while it cannot unwind exceptions by calling the ``C++`` exception throwing
    methods (since they write to memory), there may be non-``C++`` mechanisms
    that throw exceptions without writing to LLVM-visible memory.

    On an argument, this attribute indicates that the function does not write
    through this pointer argument, even though it may write to the memory that
    the pointer points to.
``"stack-probe-size"``
    This attribute controls the behavior of stack probes: either
    the ``"probe-stack"`` attribute, or ABI-required stack probes, if any.
    It defines the size of the guard region. It ensures that if the function
    may use more stack space than the size of the guard region, a stack probing
    sequence will be emitted. It takes one required integer value, which
    is 4096 by default.

    If a function that has a ``"stack-probe-size"`` attribute is inlined into
    a function with another ``"stack-probe-size"`` attribute, the resulting
    function has the ``"stack-probe-size"`` attribute that has the lower
    numeric value. If a function that has a ``"stack-probe-size"`` attribute is
    inlined into a function that has no ``"stack-probe-size"`` attribute
    at all, the resulting function has the ``"stack-probe-size"`` attribute
    of the callee.
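
    As a sketch of how the two stack-probe attributes combine, assuming a
    hypothetical probing routine named ``__my_probestack``:

    .. code-block:: llvm

        define void @big_frame() "probe-stack"="__my_probestack" "stack-probe-size"="8192" {
          ; This frame may exceed the 8192-byte guard region configured above.
          %buf = alloca [16384 x i8]
          ret void
        }

    On a target that implements these attributes, a probing sequence would be
    expected here because the frame may use more stack than the guard region
    size.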
``"no-stack-arg-probe"``
    This attribute disables ABI-required stack probes, if any.
``writeonly``
    On a function, this attribute indicates that the function may write to but
    does not read from memory.

    On an argument, this attribute indicates that the function may write to but
    does not read through this pointer argument (even though it may read from
    the memory that the pointer points to).
``argmemonly``
    This attribute indicates that the only memory accesses inside the function
    are loads and stores from objects pointed to by its pointer-typed arguments,
    with arbitrary offsets. In other words, all memory operations in the
    function can refer to memory only using pointers based on its function
    arguments.
    Note that ``argmemonly`` can be used together with the ``readonly``
    attribute in order to specify that the function reads only from its
    arguments.
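
    The memory-related attributes ``readnone``, ``readonly``, ``writeonly``
    and ``argmemonly`` can be sketched on a few hypothetical declarations:

    .. code-block:: llvm

        ; Result depends only on the argument value; no memory is accessed.
        declare i32 @square(i32) readnone

        ; Only reads memory, and only through its pointer argument.
        declare i64 @my_strlen(i8* nocapture readonly) argmemonly readonly

        ; Only writes memory, and only through its pointer argument.
        declare void @my_fill(i8* writeonly, i8, i64) argmemonly writeonly

    The second declaration shows the ``argmemonly`` plus ``readonly``
    combination mentioned above.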
Sean Silvab084af42012-12-07 10:36:55 +00001592``returns_twice``
1593 This attribute indicates that this function can return twice. The C
1594 ``setjmp`` is an example of such a function. The compiler disables
1595 some optimizations (like tail calls) in the caller of these
1596 functions.
Peter Collingbourne82437bf2015-06-15 21:07:11 +00001597``safestack``
1598 This attribute indicates that
1599 `SafeStack <http://clang.llvm.org/docs/SafeStack.html>`_
1600 protection is enabled for this function.
1601
1602 If a function that has a ``safestack`` attribute is inlined into a
1603 function that doesn't have a ``safestack`` attribute or which has an
1604 ``ssp``, ``sspstrong`` or ``sspreq`` attribute, then the resulting
1605 function will have a ``safestack`` attribute.
``sanitize_address``
    This attribute indicates that AddressSanitizer checks
    (dynamic address safety analysis) are enabled for this function.
``sanitize_memory``
    This attribute indicates that MemorySanitizer checks (dynamic detection
    of accesses to uninitialized memory) are enabled for this function.
``sanitize_thread``
    This attribute indicates that ThreadSanitizer checks
    (dynamic thread safety analysis) are enabled for this function.
``sanitize_hwaddress``
    This attribute indicates that HWAddressSanitizer checks
    (dynamic address safety analysis based on tagged pointers) are enabled for
    this function.
``speculatable``
    This function attribute indicates that the function does not have any
    effects besides calculating its result and does not have undefined behavior.
    Note that ``speculatable`` is not enough to conclude that along any
    particular execution path the number of calls to this function will not be
    externally observable. This attribute is only valid on functions
    and declarations, not on individual call sites. If a function is
    incorrectly marked as ``speculatable`` and really does exhibit
    undefined behavior, the undefined behavior may be observed even
    if the call site is dead code.

``ssp``
    This attribute indicates that the function should emit a stack
    smashing protector. It is in the form of a "canary" --- a random value
    placed on the stack before the local variables that's checked upon
    return from the function to see if it has been overwritten. A
    heuristic is used to determine if a function needs stack protectors
    or not. The heuristic used will enable protectors for functions with:

    - Character arrays larger than ``ssp-buffer-size`` (default 8).
    - Aggregates containing character arrays larger than ``ssp-buffer-size``.
    - Calls to ``alloca()`` with variable sizes or constant sizes greater than
      ``ssp-buffer-size``.

    Variables that are identified as requiring a protector will be arranged
    on the stack such that they are adjacent to the stack protector guard.

    If a function that has an ``ssp`` attribute is inlined into a
    function that doesn't have an ``ssp`` attribute, then the resulting
    function will have an ``ssp`` attribute.
``sspreq``
    This attribute indicates that the function should *always* emit a
    stack smashing protector. This overrides the ``ssp`` function
    attribute.

    Variables that are identified as requiring a protector will be arranged
    on the stack such that they are adjacent to the stack protector guard.
    The specific layout rules are:

    #. Large arrays and structures containing large arrays
       (``>= ssp-buffer-size``) are closest to the stack protector.
    #. Small arrays and structures containing small arrays
       (``< ssp-buffer-size``) are 2nd closest to the protector.
    #. Variables that have had their address taken are 3rd closest to the
       protector.

    If a function that has an ``sspreq`` attribute is inlined into a
    function that doesn't have an ``sspreq`` attribute or which has an
    ``ssp`` or ``sspstrong`` attribute, then the resulting function will have
    an ``sspreq`` attribute.
``sspstrong``
    This attribute indicates that the function should emit a stack smashing
    protector. This attribute causes a strong heuristic to be used when
    determining if a function needs stack protectors. The strong heuristic
    will enable protectors for functions with:

    - Arrays of any size and type.
    - Aggregates containing an array of any size and type.
    - Calls to ``alloca()``.
    - Local variables that have had their address taken.

    Variables that are identified as requiring a protector will be arranged
    on the stack such that they are adjacent to the stack protector guard.
    The specific layout rules are:

    #. Large arrays and structures containing large arrays
       (``>= ssp-buffer-size``) are closest to the stack protector.
    #. Small arrays and structures containing small arrays
       (``< ssp-buffer-size``) are 2nd closest to the protector.
    #. Variables that have had their address taken are 3rd closest to the
       protector.

    This overrides the ``ssp`` function attribute.

    If a function that has an ``sspstrong`` attribute is inlined into a
    function that doesn't have an ``sspstrong`` attribute, then the
    resulting function will have an ``sspstrong`` attribute.
``strictfp``
    This attribute indicates that the function was called from a scope that
    requires strict floating-point semantics. LLVM will not attempt any
    optimizations that require assumptions about the floating-point rounding
    mode or that might alter the state of floating-point status flags that
    might otherwise be set or cleared by calling this function.
``"thunk"``
    This attribute indicates that the function will delegate to some other
    function with a tail call. The prototype of a thunk should not be used for
    optimization purposes. The caller is expected to cast the thunk prototype to
    match the thunk target prototype.
``uwtable``
    This attribute indicates that the ABI being targeted requires that
    an unwind table entry be produced for this function even if we can
    show that no exceptions pass through it. This is normally the case for
    the ELF x86-64 ABI, but it can be disabled for some compilation
    units.
``nocf_check``
    This attribute indicates that no control-flow check will be performed on
    the attributed entity. It disables ``-fcf-protection=<>`` for a specific
    entity, allowing fine-grained control of the hardware control-flow
    protection mechanism. The flag is target independent and currently
    applies to a function or function pointer.
``shadowcallstack``
    This attribute indicates that the ShadowCallStack checks are enabled for
    the function. The instrumentation checks that the return address for the
    function has not changed between the function prolog and epilog. It is
    currently x86_64-specific.
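
As a brief illustration of how a few of the attributes above appear in IR,
consider the following sketch. The functions ``@sum_bytes`` and ``@fill`` and
the attribute group ``#0`` are hypothetical names chosen for this example
only.

.. code-block:: llvm

      ; C's setjmp may return more than once.
      declare i32 @setjmp(i8*) returns_twice

      ; Accesses memory only through its pointer argument, and only reads it.
      declare i32 @sum_bytes(i8* nocapture readonly, i32) argmemonly readonly

      ; The local array makes this function a candidate for a stack
      ; protector under the sspstrong heuristic.
      define void @fill() #0 {
        %buf = alloca [4 x i8]
        %p = getelementptr [4 x i8], [4 x i8]* %buf, i64 0, i64 0
        store i8 0, i8* %p
        ret void
      }

      attributes #0 = { sspstrong }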

.. _glattrs:

Global Attributes
-----------------

Attributes may be set to communicate additional information about a global variable.
Unlike :ref:`function attributes <fnattrs>`, attributes on a global variable
are grouped into a single :ref:`attribute group <attrgrp>`.
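
A minimal sketch of a global variable that refers to an attribute group is
shown here. The global ``@counter`` and the string attribute
``"my-tool-key"`` are made up for illustration and carry no meaning to LLVM
itself.

.. code-block:: llvm

      ; The global refers to attribute group #0.
      @counter = global i32 0 #0

      ; A string (key/value) attribute interpreted only by a hypothetical tool.
      attributes #0 = { "my-tool-key"="some-value" }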

.. _opbundles:

Operand Bundles
---------------

Operand bundles are tagged sets of SSA values that can be associated
with certain LLVM instructions (currently only ``call`` s and
``invoke`` s). In a way they are like metadata, but dropping them is
incorrect and will change program semantics.

Syntax::

    operand bundle set ::= '[' operand bundle (, operand bundle )* ']'
    operand bundle ::= tag '(' [ bundle operand ] (, bundle operand )* ')'
    bundle operand ::= SSA value
    tag ::= string constant

Operand bundles are **not** part of a function's signature, and a
given function may be called from multiple places with different kinds
of operand bundles. This reflects the fact that the operand bundles
are conceptually a part of the ``call`` (or ``invoke``), not the
callee being dispatched to.
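
For instance, the same callee can be called with no bundles, with one
bundle, or with several. In this sketch the tags ``"tag-a"`` and ``"tag-b"``
and the function names are hypothetical; LLVM treats such tags as "unknown"
operand bundles.

.. code-block:: llvm

      declare void @callee(i32)

      define void @caller(i32 %x, i8* %p) {
        ; No operand bundles.
        call void @callee(i32 %x)
        ; One bundle with a single operand.
        call void @callee(i32 %x) [ "tag-a"(i32 %x) ]
        ; Two bundles; a bundle may have zero or more operands.
        call void @callee(i32 %x) [ "tag-a"(), "tag-b"(i8* %p, i32 7) ]
        ret void
      }

The signature of ``@callee`` is the same at all three call sites; only the
calls differ.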

Operand bundles are a generic mechanism intended to support
runtime-introspection-like functionality for managed languages. While
the exact semantics of an operand bundle depend on the bundle tag,
there are certain limitations to how much the presence of an operand
bundle can influence the semantics of a program. These restrictions
are described as the semantics of an "unknown" operand bundle. As
long as the behavior of an operand bundle is describable within these
restrictions, LLVM does not need to have special knowledge of the
operand bundle to not miscompile programs containing it.

- The bundle operands for an unknown operand bundle escape in unknown
  ways before control is transferred to the callee or invokee.
- Calls and invokes with operand bundles have unknown read / write
  effect on the heap on entry and exit (even if the call target is
  ``readnone`` or ``readonly``), unless they're overridden with
  callsite specific attributes.
- An operand bundle at a call site cannot change the implementation
  of the called function. Inter-procedural optimizations work as
  usual as long as they take into account the first two properties.

More specific types of operand bundles are described below.

.. _deopt_opbundles:

Deoptimization Operand Bundles
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Deoptimization operand bundles are characterized by the ``"deopt"``
operand bundle tag. These operand bundles represent an alternate
"safe" continuation for the call site they're attached to, and can be
used by a suitable runtime to deoptimize the compiled frame at the
specified call site. There can be at most one ``"deopt"`` operand
bundle attached to a call site. Exact details of deoptimization are
out of scope for the language reference, but it usually involves
rewriting a compiled frame into a set of interpreted frames.

From the compiler's perspective, deoptimization operand bundles make
the call sites they're attached to at least ``readonly``. They read
through all of their pointer typed operands (even if they're not
otherwise escaped) and the entire visible heap. Deoptimization
operand bundles do not capture their operands except during
deoptimization, in which case control will not be returned to the
compiled frame.

The inliner knows how to inline through calls that have deoptimization
operand bundles. Just like inlining through a normal call site
involves composing the normal and exceptional continuations, inlining
through a call site with a deoptimization operand bundle needs to
appropriately compose the "safe" deoptimization continuation. The
inliner does this by prepending the parent's deoptimization
continuation to every deoptimization continuation in the inlined body.
E.g. inlining ``@f`` into ``@g`` in the following example

.. code-block:: llvm

      define void @f() {
        call void @x()  ;; no deopt state
        call void @y() [ "deopt"(i32 10) ]
        call void @y() [ "deopt"(i32 10), "unknown"(i8* null) ]
        ret void
      }

      define void @g() {
        call void @f() [ "deopt"(i32 20) ]
        ret void
      }

will result in

.. code-block:: llvm

      define void @g() {
        call void @x()  ;; still no deopt state
        call void @y() [ "deopt"(i32 20, i32 10) ]
        call void @y() [ "deopt"(i32 20, i32 10), "unknown"(i8* null) ]
        ret void
      }

It is the frontend's responsibility to structure or encode the
deoptimization state in a way that syntactically prepending the
caller's deoptimization state to the callee's deoptimization state is
semantically equivalent to composing the caller's deoptimization
continuation after the callee's deoptimization continuation.

.. _ob_funclet:

Funclet Operand Bundles
^^^^^^^^^^^^^^^^^^^^^^^

Funclet operand bundles are characterized by the ``"funclet"``
operand bundle tag. These operand bundles indicate that a call site
is within a particular funclet. There can be at most one
``"funclet"`` operand bundle attached to a call site and it must have
exactly one bundle operand.

If any funclet EH pads have been "entered" but not "exited" (per the
`description in the EH doc\ <ExceptionHandling.html#wineh-constraints>`_),
it is undefined behavior to execute a ``call`` or ``invoke`` which:

* does not have a ``"funclet"`` bundle and is not a ``call`` to a nounwind
  intrinsic, or
* has a ``"funclet"`` bundle whose operand is not the most-recently-entered
  not-yet-exited funclet EH pad.

Similarly, if no funclet EH pads have been entered-but-not-yet-exited,
executing a ``call`` or ``invoke`` with a ``"funclet"`` bundle is undefined behavior.
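
The rules above can be illustrated with a small sketch. The functions
``@may_throw`` and ``@cleanup`` are hypothetical, and the personality routine
is assumed to be a funclet-based one such as ``__CxxFrameHandler3``.

.. code-block:: llvm

      declare void @may_throw()
      declare void @cleanup()
      declare i32 @__CxxFrameHandler3(...)

      define void @f() personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
      entry:
        ; No funclet EH pad has been entered here, so no "funclet" bundle.
        invoke void @may_throw()
                to label %exit unwind label %pad

      pad:
        %cp = cleanuppad within none []
        ; Inside the funclet, the call names the pad it belongs to.
        call void @cleanup() [ "funclet"(token %cp) ]
        cleanupret from %cp unwind to caller

      exit:
        ret void
      }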

GC Transition Operand Bundles
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

GC transition operand bundles are characterized by the
``"gc-transition"`` operand bundle tag. These operand bundles mark a
call as a transition between a function with one GC strategy and a
function with a different GC strategy. If coordinating the transition
between GC strategies requires additional code generation at the call
site, these bundles may contain any values that are needed by the
generated code. For more details, see :ref:`GC Transitions
<gc_transition_args>`.

.. _moduleasm:

Module-Level Inline Assembly
----------------------------

Modules may contain "module-level inline asm" blocks, which correspond
to the GCC "file scope inline asm" blocks. These blocks are internally
concatenated by LLVM and treated as a single unit, but may be separated
in the ``.ll`` file if desired. The syntax is very simple:

.. code-block:: llvm

      module asm "inline asm code goes here"
      module asm "more can go here"

The strings can contain any character by escaping non-printable
characters. The escape sequence used is simply "\\xx" where "xx" is the
two digit hex code for the number.

Note that the assembly string *must* be parseable by LLVM's integrated assembler
(unless it is disabled), even when emitting a ``.s`` file.
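
As a small sketch of the escape syntax, the following blocks define a
hypothetical symbol ``my_marker`` followed by a short string. The directives
assume a target whose assembler accepts GNU-style syntax.

.. code-block:: llvm

      module asm ".globl my_marker"
      module asm "my_marker:"
      ; \22 is the two-digit hex escape for the double quote character.
      module asm ".ascii \22hello\22"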

.. _langref_datalayout:

Data Layout
-----------

A module may specify a target-specific data layout string that specifies
how data is to be laid out in memory. The syntax for the data layout is
simply:

.. code-block:: llvm

      target datalayout = "layout specification"

The *layout specification* consists of a list of specifications
separated by the minus sign character ('-'). Each specification starts
with a letter and may include other information after the letter to
define some aspect of the data layout. The specifications accepted are
as follows:

``E``
    Specifies that the target lays out data in big-endian form. That is,
    the bits with the most significance have the lowest address
    location.
``e``
    Specifies that the target lays out data in little-endian form. That
    is, the bits with the least significance have the lowest address
    location.
``S<size>``
    Specifies the natural alignment of the stack in bits. Alignment
    promotion of stack variables is limited to the natural stack
    alignment to avoid dynamic stack realignment. The stack alignment
    must be a multiple of 8-bits. If omitted, the natural stack
    alignment defaults to "unspecified", which does not prevent any
    alignment promotions.
``P<address space>``
    Specifies the address space that corresponds to program memory.
    Harvard architectures can use this to specify what space LLVM
    should place things such as functions into. If omitted, the
    program memory space defaults to the default address space of 0,
    which corresponds to a Von Neumann architecture that has code
    and data in the same space.
``A<address space>``
    Specifies the address space of objects created by '``alloca``'.
    Defaults to the default address space of 0.
``p[n]:<size>:<abi>:<pref>:<idx>``
    This specifies the *size* of a pointer and its ``<abi>`` and
    ``<pref>``\erred alignments for address space ``n``. The fourth parameter
    ``<idx>`` is the size of the index that is used for address calculation.
    If not specified, the default index size is equal to the pointer size. All
    sizes are in bits. The address space, ``n``, is optional, and if not
    specified, denotes the default address space 0. The value of ``n`` must be
    in the range [1,2^23).
``i<size>:<abi>:<pref>``
    This specifies the alignment for an integer type of a given bit
    ``<size>``. The value of ``<size>`` must be in the range [1,2^23).
``v<size>:<abi>:<pref>``
    This specifies the alignment for a vector type of a given bit
    ``<size>``.
``f<size>:<abi>:<pref>``
    This specifies the alignment for a floating-point type of a given bit
    ``<size>``. Only values of ``<size>`` that are supported by the target
    will work. 32 (float) and 64 (double) are supported on all targets; 80
    or 128 (different flavors of long double) are also supported on some
    targets.
``a:<abi>:<pref>``
    This specifies the alignment for an object of aggregate type.
``m:<mangling>``
    If present, specifies that LLVM names are mangled in the output. Symbols
    prefixed with the mangling escape character ``\01`` are passed through
    directly to the assembler without the escape character. The mangling style
    options are

    * ``e``: ELF mangling: Private symbols get a ``.L`` prefix.
    * ``m``: Mips mangling: Private symbols get a ``$`` prefix.
    * ``o``: Mach-O mangling: Private symbols get an ``L`` prefix. Other
      symbols get a ``_`` prefix.
    * ``x``: Windows x86 COFF mangling: Private symbols get the usual prefix.
      Regular C symbols get a ``_`` prefix. Functions with ``__stdcall``,
      ``__fastcall``, and ``__vectorcall`` have custom mangling that appends
      ``@N`` where N is the number of bytes used to pass parameters. C++ symbols
      starting with ``?`` are not mangled in any way.
    * ``w``: Windows COFF mangling: Similar to ``x``, except that normal C
      symbols do not receive a ``_`` prefix.
``n<size1>:<size2>:<size3>...``
    This specifies a set of native integer widths for the target CPU in
    bits. For example, it might contain ``n32`` for 32-bit PowerPC,
    ``n32:64`` for PowerPC 64, or ``n8:16:32:64`` for X86-64. Elements of
    this set are considered to support most general arithmetic operations
    efficiently.
``ni:<address space0>:<address space1>:<address space2>...``
    This specifies pointer types with the specified address spaces
    as :ref:`Non-Integral Pointer Type <nointptrtype>` s. The ``0``
    address space cannot be specified as non-integral.

On every specification that takes a ``<abi>:<pref>``, specifying the
``<pref>`` alignment is optional. If omitted, the preceding ``:``
should be omitted too and ``<pref>`` will be equal to ``<abi>``.
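
To make the syntax concrete, here is a layout string of the kind typically
used for x86-64 Linux targets, shown only as an illustration; the string a
given module needs is determined by its code generator.

.. code-block:: llvm

      ; e              little-endian
      ; m:e            ELF name mangling
      ; i64:64         i64 has 64-bit ABI alignment (<pref> omitted, so equal)
      ; f80:128        80-bit floating point is 128-bit aligned
      ; n8:16:32:64    native integer widths
      ; S128           natural stack alignment of 128 bits
      target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"

Specifications that are not mentioned in the string, such as ``i32`` or the
pointer size, keep their default values.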

When constructing the data layout for a given target, LLVM starts with a
default set of specifications which are then (possibly) overridden by
the specifications in the ``datalayout`` keyword. The default
specifications are given in this list:

-  ``E`` - big endian
-  ``p:64:64:64`` - 64-bit pointers with 64-bit alignment.
-  ``p[n]:64:64:64`` - Other address spaces are assumed to be the
   same as the default address space.
-  ``S0`` - natural stack alignment is unspecified
-  ``i1:8:8`` - i1 is 8-bit (byte) aligned
-  ``i8:8:8`` - i8 is 8-bit (byte) aligned
-  ``i16:16:16`` - i16 is 16-bit aligned
-  ``i32:32:32`` - i32 is 32-bit aligned
-  ``i64:32:64`` - i64 has ABI alignment of 32-bits but preferred
   alignment of 64-bits
-  ``f16:16:16`` - half is 16-bit aligned
-  ``f32:32:32`` - float is 32-bit aligned
-  ``f64:64:64`` - double is 64-bit aligned
-  ``f128:128:128`` - quad is 128-bit aligned
-  ``v64:64:64`` - 64-bit vector is 64-bit aligned
-  ``v128:128:128`` - 128-bit vector is 128-bit aligned
-  ``a:0:64`` - aggregates are 64-bit aligned

When LLVM is determining the alignment for a given type, it uses the
following rules:

#. If the type sought is an exact match for one of the specifications,
   that specification is used.
#. If no match is found, and the type sought is an integer type, then
   the smallest integer type that is larger than the bitwidth of the
   sought type is used. If none of the specifications are larger than
   the bitwidth then the largest integer type is used. For example,
   given the default specifications above, the i7 type will use the
   alignment of i8 (next largest) while both i65 and i256 will use the
   alignment of i64 (largest specified).
#. If no match is found, and the type sought is a vector type, then the
   largest vector type that is smaller than the sought vector type will
   be used as a fall back. This happens because <128 x double> can be
   implemented in terms of 64 <2 x double>, for example.

The function of the data layout string may not be what you expect.
Notably, this is not a specification from the frontend of what alignment
the code generator should use.

Instead, if specified, the target data layout is required to match what
the ultimate *code generator* expects. This string is used by the
mid-level optimizers to improve code, and this only works if it matches
what the ultimate code generator uses. There is no way to generate IR
that does not embed this target-specific detail into the IR. If you
don't specify the string, the default specifications will be used to
generate a Data Layout and the optimization phases will operate
accordingly and introduce target specificity into the IR with respect to
these default specifications.

.. _langref_triple:

Target Triple
-------------

A module may specify a target triple string that describes the target
host. The syntax for the target triple is simply:

.. code-block:: llvm

      target triple = "x86_64-apple-macosx10.7.0"

The *target triple* string consists of a series of identifiers delimited
by the minus sign character ('-'). The canonical forms are:

::

    ARCHITECTURE-VENDOR-OPERATING_SYSTEM
    ARCHITECTURE-VENDOR-OPERATING_SYSTEM-ENVIRONMENT

This information is passed along to the backend so that it generates
code for the proper architecture. It's possible to override this on the
command line with the ``-mtriple`` command line option.
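
As an example of the four-component form, a module targeting x86-64 Linux
with the GNU environment might contain the following (shown as an
illustration, not as a recommended default):

.. code-block:: llvm

      ; ARCHITECTURE = x86_64, VENDOR = unknown,
      ; OPERATING_SYSTEM = linux, ENVIRONMENT = gnu
      target triple = "x86_64-unknown-linux-gnu"

Running ``llc`` on such a module with, for instance,
``-mtriple=aarch64-unknown-linux-gnu`` would override the triple recorded in
the file.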

.. _pointeraliasing:

Pointer Aliasing Rules
----------------------

Any memory access must be done through a pointer value associated with
an address range of the memory access, otherwise the behavior is
undefined. Pointer values are associated with address ranges according
to the following rules:

-  A pointer value is associated with the addresses associated with any
   value it is *based* on.
-  An address of a global variable is associated with the address range
   of the variable's storage.
-  The result value of an allocation instruction is associated with the
   address range of the allocated storage.
-  A null pointer in the default address-space is associated with no
   address.
-  An integer constant other than zero or a pointer value returned from
   a function not defined within LLVM may be associated with address
   ranges allocated through mechanisms other than those provided by
   LLVM. Such ranges shall not overlap with any ranges of addresses
   allocated by mechanisms provided by LLVM.

A pointer value is *based* on another pointer value according to the
following rules:

-  A pointer value formed from a scalar ``getelementptr`` operation is *based* on
   the pointer-typed operand of the ``getelementptr``.
-  The pointer in lane *l* of the result of a vector ``getelementptr`` operation
   is *based* on the pointer in lane *l* of the vector-of-pointers-typed operand
   of the ``getelementptr``.
-  The result value of a ``bitcast`` is *based* on the operand of the
   ``bitcast``.
-  A pointer value formed by an ``inttoptr`` is *based* on all pointer
   values that contribute (directly or indirectly) to the computation of
   the pointer's value.
-  The "*based* on" relationship is transitive.

Note that this definition of *"based"* is intentionally similar to the
definition of *"based"* in C99, though it is slightly weaker.

LLVM IR does not associate types with memory. The result type of a
``load`` merely indicates the size and alignment of the memory from
which to load, as well as the interpretation of the value. The first
operand type of a ``store`` similarly only indicates the size and
alignment of the store.

Consequently, type-based alias analysis, aka TBAA, aka
``-fstrict-aliasing``, is not applicable to general unadorned LLVM IR.
:ref:`Metadata <metadata>` may be used to encode additional information
which specialized optimization passes may use to implement type-based
alias analysis.
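
The following sketch, using a hypothetical global ``@g`` and function
``@f``, shows the *based* on relationship and the absence of memory types
together. The integer ``1065353216`` is the IEEE-754 single-precision bit
pattern ``0x3F800000``.

.. code-block:: llvm

      @g = global [4 x i32] zeroinitializer

      define float @f() {
        ; %p is based on @g, so it is associated with @g's storage.
        %p = getelementptr [4 x i32], [4 x i32]* @g, i64 0, i64 1
        ; %q is based on %p and, by transitivity, on @g.
        %q = bitcast i32* %p to float*
        store i32 1065353216, i32* %p
        ; Memory carries no type: loading the same four bytes as a float
        ; is well defined and reinterprets the stored bits.
        %v = load float, float* %q
        ret float %v
      }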

.. _volatile:

Volatile Memory Accesses
------------------------

Certain memory accesses, such as :ref:`load <i_load>`'s,
:ref:`store <i_store>`'s, and :ref:`llvm.memcpy <int_memcpy>`'s may be
marked ``volatile``. The optimizers must not change the number of
volatile operations or change their order of execution relative to other
volatile operations. The optimizers *may* change the order of volatile
operations relative to non-volatile operations. This is not Java's
"volatile" and has no cross-thread synchronization behavior.

IR-level volatile loads and stores cannot safely be optimized into
llvm.memcpy or llvm.memmove intrinsics even when those intrinsics are
flagged volatile. Likewise, the backend should never split or merge
target-legal volatile load/store instructions.

.. admonition:: Rationale

 Platforms may rely on volatile loads and stores of natively supported
 data width to be executed as a single instruction. For example, in C
 this holds for an l-value of volatile primitive type with native
 hardware support, but not necessarily for aggregate types. The
 frontend upholds these expectations, which are intentionally
 unspecified in the IR. The rules above ensure that IR transformations
 do not violate the frontend's contract with the language.
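
As a minimal sketch, consider a hypothetical function ``@poll`` that reads
the same location twice, as code polling a device register might do:

.. code-block:: llvm

      define i32 @poll(i32* %reg) {
        ; Both loads are volatile, so neither may be removed or merged
        ; with the other, and their relative order must be kept.
        %a = load volatile i32, i32* %reg
        %b = load volatile i32, i32* %reg
        %sum = add i32 %a, %b
        ret i32 %sum
      }

Without ``volatile``, an optimizer would be free to replace the second load
with the value of the first.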

.. _memmodel:

Memory Model for Concurrent Operations
--------------------------------------

The LLVM IR does not define any way to start parallel threads of
execution or to register signal handlers. Nonetheless, there are
platform-specific ways to create them, and we define LLVM IR's behavior
in their presence. This model is inspired by the C++0x memory model.

For a more informal introduction to this model, see the :doc:`Atomics`.

We define a *happens-before* partial order as the least partial order
that

-  Is a superset of single-thread program order, and
-  When a *synchronizes-with* ``b``, includes an edge from ``a`` to
   ``b``. *Synchronizes-with* pairs are introduced by platform-specific
   techniques, like pthread locks, thread creation, thread joining,
   etc., and by atomic instructions. (See also :ref:`Atomic Memory Ordering
   Constraints <ordering>`).

Note that program order does not introduce *happens-before* edges
between a thread and signals executing inside that thread.

Every (defined) read operation (load instructions, memcpy, atomic
loads/read-modify-writes, etc.) R reads a series of bytes written by
(defined) write operations (store instructions, atomic
stores/read-modify-writes, memcpy, etc.). For the purposes of this
section, initialized globals are considered to have a write of the
initializer which is atomic and happens before any other read or write
of the memory in question. For each byte of a read R, R\ :sub:`byte`
may see any write to the same byte, except:

-  If write\ :sub:`1` happens before write\ :sub:`2`, and
   write\ :sub:`2` happens before R\ :sub:`byte`, then
   R\ :sub:`byte` does not see write\ :sub:`1`.
-  If R\ :sub:`byte` happens before write\ :sub:`3`, then
   R\ :sub:`byte` does not see write\ :sub:`3`.

Given that definition, R\ :sub:`byte` is defined as follows:

-  If R is volatile, the result is target-dependent. (Volatile is
   supposed to give guarantees which can support ``sig_atomic_t`` in
   C/C++, and may be used for accesses to addresses that do not behave
   like normal memory. It does not generally provide cross-thread
   synchronization.)
-  Otherwise, if there is no write to the same byte that happens before
   R\ :sub:`byte`, R\ :sub:`byte` returns ``undef`` for that byte.
-  Otherwise, if R\ :sub:`byte` may see exactly one write,
   R\ :sub:`byte` returns the value written by that write.
-  Otherwise, if R is atomic, and all the writes R\ :sub:`byte` may
   see are atomic, it chooses one of the values written. See the :ref:`Atomic
   Memory Ordering Constraints <ordering>` section for additional
   constraints on how the choice is made.
-  Otherwise R\ :sub:`byte` returns ``undef``.
2213
2214R returns the value composed of the series of bytes it read. This
2215implies that some bytes within the value may be ``undef`` **without**
2216the entire value being ``undef``. Note that this only defines the
2217semantics of the operation; it doesn't mean that targets will emit more
2218than one instruction to read the series of bytes.
2219
2220Note that in cases where none of the atomic intrinsics are used, this
2221model places only one restriction on IR transformations on top of what
2222is required for single-threaded execution: introducing a store to a byte
2223which might not otherwise be stored is not allowed in general.
2224(Specifically, in the case where another thread might write to and read
2225from an address, introducing a store can change a load that may see
2226exactly one write into a load that may see multiple writes.)
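
The restriction on introduced stores can be illustrated with a small
sketch. The global ``@g`` and both function names are hypothetical and
exist only for this illustration. The first function writes ``@g`` only
when ``%cond`` is true; the second is what a naive if-conversion would
produce, and it is not a valid replacement in general because it stores
to ``@g`` on a path that previously performed no store.

.. code-block:: llvm

    @g = global i32 0

    ; Original: @g is written only when %cond is true.
    define void @maybe_store(i1 %cond) {
    entry:
      br i1 %cond, label %then, label %exit

    then:
      store i32 1, i32* @g
      br label %exit

    exit:
      ret void
    }

    ; Not a valid transformation in general: the store is now
    ; unconditional. If another thread writes to and reads from @g,
    ; its load may now see this introduced write in addition to its own.
    define void @maybe_store_if_converted(i1 %cond) {
    entry:
      %old = load i32, i32* @g
      %new = select i1 %cond, i32 1, i32 %old
      store i32 %new, i32* @g
      ret void
    }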

.. _ordering:

Atomic Memory Ordering Constraints
----------------------------------

Atomic instructions (:ref:`cmpxchg <i_cmpxchg>`,
:ref:`atomicrmw <i_atomicrmw>`, :ref:`fence <i_fence>`,
:ref:`atomic load <i_load>`, and :ref:`atomic store <i_store>`) take
ordering parameters that determine which other atomic instructions on
the same address they *synchronize with*. These semantics are borrowed
from Java and C++0x, but are somewhat more colloquial. If these
descriptions aren't precise enough, check those specs (see spec
references in the :doc:`atomics guide <Atomics>`).
:ref:`fence <i_fence>` instructions treat these orderings somewhat
differently since they don't take an address. See that instruction's
documentation for details.

For a simpler introduction to the ordering constraints, see the
:doc:`Atomics`.

``unordered``
    The set of values that can be read is governed by the happens-before
    partial order. A value cannot be read unless some operation wrote
    it. This is intended to provide a guarantee strong enough to model
    Java's non-volatile shared variables. This ordering cannot be
    specified for read-modify-write operations; it is not strong enough
    to make them atomic in any interesting way.
``monotonic``
    In addition to the guarantees of ``unordered``, there is a single
    total order for modifications by ``monotonic`` operations on each
    address. All modification orders must be compatible with the
    happens-before order. There is no guarantee that the modification
    orders can be combined to a global total order for the whole program
    (and this often will not be possible). The read in an atomic
    read-modify-write operation (:ref:`cmpxchg <i_cmpxchg>` and
    :ref:`atomicrmw <i_atomicrmw>`) reads the value in the modification
    order immediately before the value it writes. If one atomic read
    happens before another atomic read of the same address, the later
    read must see the same value or a later value in the address's
    modification order. This disallows reordering of ``monotonic`` (or
    stronger) operations on the same address. If an address is written
    ``monotonic``-ally by one thread, and other threads ``monotonic``-ally
    read that address repeatedly, the other threads must eventually see
    the write. This corresponds to the C++0x/C1x
    ``memory_order_relaxed``.
``acquire``
    In addition to the guarantees of ``monotonic``, a
    *synchronizes-with* edge may be formed with a ``release`` operation.
    This is intended to model C++'s ``memory_order_acquire``.
``release``
    In addition to the guarantees of ``monotonic``, if this operation
    writes a value which is subsequently read by an ``acquire``
    operation, it *synchronizes-with* that operation. (This isn't a
    complete description; see the C++0x definition of a release
    sequence.) This corresponds to the C++0x/C1x
    ``memory_order_release``.
``acq_rel`` (acquire+release)
    Acts as both an ``acquire`` and ``release`` operation on its
    address. This corresponds to the C++0x/C1x ``memory_order_acq_rel``.
``seq_cst`` (sequentially consistent)
    In addition to the guarantees of ``acq_rel`` (``acquire`` for an
    operation that only reads, ``release`` for an operation that only
    writes), there is a global total order on all
    sequentially-consistent operations on all addresses, which is
    consistent with the *happens-before* partial order and with the
    modification orders of all the affected addresses. Each
    sequentially-consistent read sees the last preceding write to the
    same address in this global order. This corresponds to the C++0x/C1x
    ``memory_order_seq_cst`` and Java volatile.
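
As a minimal sketch of how ``release`` and ``acquire`` pair up, consider
the message-passing pattern below. The globals ``@data`` and ``@flag``
and the two function names are hypothetical, and the functions are
assumed to run on different threads created by some platform-specific
means. Once the ``acquire`` load reads the value written by the
``release`` store, a *synchronizes-with* edge exists, so the plain store
to ``@data`` happens before the plain load of ``@data``.

.. code-block:: llvm

    @data = global i32 0
    @flag = global i32 0

    define void @producer() {
    entry:
      ; Plain (non-atomic) store of the payload.
      store i32 42, i32* @data, align 4
      ; Release store: publishes everything written before it.
      store atomic i32 1, i32* @flag release, align 4
      ret void
    }

    define i32 @consumer() {
    entry:
      br label %spin

    spin:
      ; Acquire load: pairs with the release store in @producer.
      %f = load atomic i32, i32* @flag acquire, align 4
      %ready = icmp eq i32 %f, 1
      br i1 %ready, label %done, label %spin

    done:
      ; The only write this load may see is the store of 42.
      %v = load i32, i32* @data, align 4
      ret i32 %v
    }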

.. _syncscope:

If an atomic operation is marked ``syncscope("singlethread")``, it only
*synchronizes with* and only participates in the seq\_cst total orderings of
other operations running in the same thread (for example, in signal handlers).

If an atomic operation is marked ``syncscope("<target-scope>")``, where
``<target-scope>`` is a target specific synchronization scope, then it is target
dependent if it *synchronizes with* and participates in the seq\_cst total
orderings of other operations.

Otherwise, an atomic operation that is not marked ``syncscope("singlethread")``
or ``syncscope("<target-scope>")`` *synchronizes with* and participates in the
seq\_cst total orderings of other operations that are not marked
``syncscope("singlethread")`` or ``syncscope("<target-scope>")``.
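
For illustration, the sketch below marks two operations with
``syncscope("singlethread")``. The global ``@counter`` and the function
name are hypothetical; the intended reading is code that only needs to
be ordered with respect to a signal handler running in the same thread.

.. code-block:: llvm

    @counter = global i32 0

    define void @bump_for_signal_handler() {
    entry:
      ; Atomic increment that synchronizes only with operations in the
      ; same thread, such as a signal handler.
      %old = atomicrmw add i32* @counter, i32 1 syncscope("singlethread") monotonic
      ; Fence that only participates in this thread's seq_cst ordering.
      fence syncscope("singlethread") seq_cst
      ret void
    }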

.. _floatenv:

Floating-Point Environment
--------------------------

The default LLVM floating-point environment assumes that floating-point
instructions do not have side effects. Results assume the round-to-nearest
rounding mode. No floating-point exception state is maintained in this
environment. Therefore, there is no attempt to create or preserve invalid
operation (SNaN) or division-by-zero exceptions in these examples:

.. code-block:: llvm

      %A = fdiv double 0x7ff0000000000001, %X  ; 64-bit SNaN hex value
      %B = fdiv double %X, 0.0
      ; Safe:
      ;   %A = NaN
      ;   %B = NaN

The benefit of this exception-free assumption is that floating-point
operations may be speculated freely without any other fast-math relaxations
to the floating-point model.

Code that requires different behavior than this should use the
:ref:`Constrained Floating-Point Intrinsics <constrainedfp>`.
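
A sketch of what "speculated freely" permits is shown below; the
function names are hypothetical. In the first function the division
executes only when ``%c`` is true. Because the default environment has
no exception state to preserve, a transformation may execute the
division unconditionally and select the result, as in the second
function.

.. code-block:: llvm

    ; Division guarded by a branch.
    define double @guarded_div(i1 %c, double %x, double %y) {
    entry:
      br i1 %c, label %then, label %exit

    then:
      %q = fdiv double %x, %y
      br label %exit

    exit:
      %r = phi double [ %q, %then ], [ 0.0, %entry ]
      ret double %r
    }

    ; Same result with the division speculated above the condition.
    define double @guarded_div_speculated(i1 %c, double %x, double %y) {
    entry:
      %q = fdiv double %x, %y
      %r = select i1 %c, double %q, double 0.0
      ret double %r
    }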

.. _fastmath:

Fast-Math Flags
---------------

LLVM IR floating-point operations (:ref:`fadd <i_fadd>`,
:ref:`fsub <i_fsub>`, :ref:`fmul <i_fmul>`, :ref:`fdiv <i_fdiv>`,
:ref:`frem <i_frem>`, :ref:`fcmp <i_fcmp>`) and :ref:`call <i_call>`
may use the following flags to enable otherwise unsafe
floating-point transformations.

``nnan``
   No NaNs - Allow optimizations to assume the arguments and result are not
   NaN. Such optimizations are required to retain defined behavior over
   NaNs, but the value of the result is undefined.

``ninf``
   No Infs - Allow optimizations to assume the arguments and result are not
   +/-Inf. Such optimizations are required to retain defined behavior over
   +/-Inf, but the value of the result is undefined.

``nsz``
   No Signed Zeros - Allow optimizations to treat the sign of a zero
   argument or result as insignificant.

``arcp``
   Allow Reciprocal - Allow optimizations to use the reciprocal of an
   argument rather than perform division.

``contract``
   Allow floating-point contraction (e.g. fusing a multiply followed by an
   addition into a fused multiply-and-add).

``afn``
   Approximate functions - Allow substitution of approximate calculations for
   functions (sin, log, sqrt, etc.). See floating-point intrinsic definitions
   for places where this can apply to LLVM's intrinsic math functions.

``reassoc``
   Allow reassociation transformations for floating-point instructions.
   This may dramatically change floating-point results.

``fast``
   This flag implies all of the others.
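
As a brief sketch of where the flags are written, the hypothetical
function below attaches different flags to individual instructions and
to a call. Flags apply per instruction, so a single function can mix
strict and relaxed operations.

.. code-block:: llvm

    declare float @llvm.sqrt.f32(float)

    define float @fmf_demo(float %a, float %b, float %c) {
    entry:
      ; Assume no NaN or infinite operands or results.
      %sum = fadd nnan ninf float %a, %b
      ; This multiply and the following add may be fused.
      %mul = fmul contract float %sum, %c
      %acc = fadd contract float %mul, %a
      ; The division may be rewritten as a multiply by a reciprocal.
      %quo = fdiv arcp float %acc, %c
      ; An approximate square root may be substituted.
      %root = call afn float @llvm.sqrt.f32(float %quo)
      ; 'fast' implies all of the other flags.
      %res = fadd fast float %root, %b
      ret float %res
    }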

.. _uselistorder:

Use-list Order Directives
-------------------------

Use-list directives encode the in-memory order of each use-list, allowing the
order to be recreated. ``<order-indexes>`` is a comma-separated list of
indexes that are assigned to the referenced value's uses. The referenced
value's use-list is immediately sorted by these indexes.

Use-list directives may appear at function scope or global scope. They are not
instructions, and have no effect on the semantics of the IR. When they're at
function scope, they must appear after the terminator of the final basic block.

If basic blocks have their address taken via ``blockaddress()`` expressions,
``uselistorder_bb`` can be used to reorder their use-lists from outside their
function's scope.

:Syntax:

::

    uselistorder <ty> <value>, { <order-indexes> }
    uselistorder_bb @function, %block, { <order-indexes> }

:Examples:

::

    define void @foo(i32 %arg1, i32 %arg2) {
    entry:
      ; ... instructions ...
    bb:
      ; ... instructions ...

      ; At function scope.
      uselistorder i32 %arg1, { 1, 0, 2 }
      uselistorder label %bb, { 1, 0 }
    }

    ; At global scope.
    uselistorder i32* @global, { 1, 2, 0 }
    uselistorder i32 7, { 1, 0 }
    uselistorder i32 (i32) @bar, { 1, 0 }
    uselistorder_bb @foo, %bb, { 5, 1, 3, 2, 0, 4 }

.. _source_filename:

Source Filename
---------------

The *source filename* string is set to the original module identifier,
which will be the name of the compiled source file when compiling from
source through the clang front end, for example. It is then preserved through
the IR and bitcode.

This is currently necessary to generate a consistent unique global
identifier for local functions used in profile data, which prepends the
source file name to the local function name.

The syntax for the source file name is simply:

.. code-block:: text

    source_filename = "/path/to/source.c"

.. _typesystem:

Type System
===========

The LLVM type system is one of the most important features of the
intermediate representation. Being typed enables a number of
optimizations to be performed on the intermediate representation
directly, without having to do extra analyses on the side before the
transformation. A strong type system makes it easier to read the
generated code and enables novel analyses and transformations that are
not feasible to perform on normal three address code representations.

.. _t_void:

Void Type
---------

:Overview:

The void type does not represent any value and has no size.

:Syntax:

::

    void

.. _t_function:

Function Type
-------------

:Overview:

The function type can be thought of as a function signature. It consists of a
return type and a list of formal parameter types. The return type of a function
type is a void type or a first class type, except for :ref:`label <t_label>`
and :ref:`metadata <t_metadata>` types.

:Syntax:

::

    <returntype> (<parameter list>)

...where '``<parameter list>``' is a comma-separated list of type
specifiers. Optionally, the parameter list may include a type ``...``, which
indicates that the function takes a variable number of arguments. Variable
argument functions can access their arguments with the :ref:`variable argument
handling intrinsic <int_varargs>` functions. '``<returntype>``' is any type
except :ref:`label <t_label>` and :ref:`metadata <t_metadata>`.

:Examples:

.. list-table::

   * - ``i32 (i32)``
     - A function taking an ``i32``, returning an ``i32``.

   * - ``float (i16, i32 *) *``
     - :ref:`Pointer <t_pointer>` to a function that takes an ``i16`` and a
       :ref:`pointer <t_pointer>` to ``i32``, returning ``float``.

   * - ``i32 (i8*, ...)``
     - A vararg function that takes at least one :ref:`pointer <t_pointer>`
       to ``i8`` (char in C), which returns an integer. This is the signature
       for ``printf`` in LLVM.

   * - ``{i32, i32} (i32)``
     - A function taking an ``i32``, returning a :ref:`structure <t_struct>`
       containing two ``i32`` values.
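
To show how these function types appear in a module, here is a small
sketch. The names ``@twice``, ``@apply``, and ``@fmt`` are hypothetical;
``@printf`` is declared with the vararg signature from the table above.
Note that a call to a vararg function spells out the full function type.

.. code-block:: llvm

    declare i32 @printf(i8*, ...)

    @fmt = private constant [4 x i8] c"%d\0A\00"

    ; A function of type i32 (i32).
    define i32 @twice(i32 %x) {
    entry:
      %r = add i32 %x, %x
      ret i32 %r
    }

    ; Takes a pointer to a function of type i32 (i32) and calls through it.
    define i32 @apply(i32 (i32)* %f, i32 %x) {
    entry:
      %v = call i32 %f(i32 %x)
      ret i32 %v
    }

    define i32 @main() {
    entry:
      %v = call i32 @apply(i32 (i32)* @twice, i32 21)
      %s = getelementptr [4 x i8], [4 x i8]* @fmt, i64 0, i64 0
      ; Vararg call: the callee's function type is written explicitly.
      %n = call i32 (i8*, ...) @printf(i8* %s, i32 %v)
      ret i32 0
    }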

.. _t_firstclass:

First Class Types
-----------------

The :ref:`first class <t_firstclass>` types are perhaps the most important.
Values of these types are the only ones which can be produced by
instructions.

.. _t_single_value:

Single Value Types
^^^^^^^^^^^^^^^^^^

These are the types that are valid in registers from CodeGen's perspective.

.. _t_integer:

Integer Type
""""""""""""

:Overview:

The integer type is a very simple type that simply specifies an
arbitrary bit width for the integer type desired. Any bit width from 1
bit to 2\ :sup:`23`\ -1 (about 8 million) can be specified.

:Syntax:

::

    iN

The number of bits the integer will occupy is specified by the ``N``
value.

:Examples:

.. list-table::

   * - ``i1``
     - a single-bit integer.

   * - ``i32``
     - a 32-bit integer.

   * - ``i1942652``
     - a really big integer of over 1 million bits.
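
As a small sketch of arbitrary bit widths in use (the function name and
the choice of ``i7`` are purely illustrative), the casts below move a
value between integer types of different widths:

.. code-block:: llvm

    define i64 @widths(i32 %x) {
    entry:
      %bit = trunc i32 %x to i1     ; keep only the lowest bit
      %odd = zext i1 %bit to i7     ; widen to an unusual 7-bit integer
      %wide = sext i7 %odd to i64   ; sign-extend to 64 bits
      ret i64 %wide
    }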

.. _t_floating:

Floating-Point Types
""""""""""""""""""""

.. list-table::
   :header-rows: 1

   * - Type
     - Description

   * - ``half``
     - 16-bit floating-point value

   * - ``float``
     - 32-bit floating-point value

   * - ``double``
     - 64-bit floating-point value

   * - ``fp128``
     - 128-bit floating-point value (112-bit mantissa)

   * - ``x86_fp80``
     - 80-bit floating-point value (X87)

   * - ``ppc_fp128``
     - 128-bit floating-point value (two 64-bits)

The binary formats of half, float, double, and fp128 correspond to the
IEEE-754-2008 specifications for binary16, binary32, binary64, and binary128
respectively.

X86_mmx Type
""""""""""""

:Overview:

The x86_mmx type represents a value held in an MMX register on an x86
machine. The operations allowed on it are quite limited: parameters and
return values, load and store, and bitcast. User-specified MMX
instructions are represented as intrinsic or asm calls with arguments
and/or results of this type. There are no arrays, vectors or constants
of this type.

:Syntax:

::

    x86_mmx

.. _t_pointer:

Pointer Type
""""""""""""

:Overview:

The pointer type is used to specify memory locations. Pointers are
commonly used to reference objects in memory.

Pointer types may have an optional address space attribute defining the
numbered address space where the pointed-to object resides. The default
address space is number zero. The semantics of non-zero address spaces
are target-specific.

Note that LLVM does not permit pointers to void (``void*``) nor does it
permit pointers to labels (``label*``). Use ``i8*`` instead.

:Syntax:

::

    <type> *

:Examples:

.. list-table::

   * - ``[4 x i32]*``
     - A :ref:`pointer <t_pointer>` to :ref:`array <t_array>` of four ``i32``
       values.

   * - ``i32 (i32*) *``
     - A :ref:`pointer <t_pointer>` to a :ref:`function <t_function>` that
       takes an ``i32*``, returning an ``i32``.

   * - ``i32 addrspace(5)*``
     - A :ref:`pointer <t_pointer>` to an ``i32`` value that resides in
       address space #5.
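
The following sketch shows pointers being used to access memory. The
global ``@g5`` and the function name are hypothetical, and address space
5 has no meaning here beyond matching the table above. The ``bitcast``
to ``i8*`` stands in for what would be a ``void*`` in C.

.. code-block:: llvm

    @g5 = addrspace(5) global i32 0

    define i32 @pointer_demo(i32* %p) {
    entry:
      ; Stack slot in the default address space (zero).
      %slot = alloca i32
      %v = load i32, i32* %p
      store i32 %v, i32* %slot
      ; Load through a pointer into address space 5.
      %w = load i32, i32 addrspace(5)* @g5
      ; There is no void*; use i8* for an untyped byte pointer.
      %raw = bitcast i32* %p to i8*
      %b = load i8, i8* %raw
      %bx = zext i8 %b to i32
      %sum = add i32 %w, %bx
      ret i32 %sum
    }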

.. _t_vector:

Vector Type
"""""""""""

:Overview:

A vector type is a simple derived type that represents a vector of
elements. Vector types are used when multiple primitive data are
operated in parallel using a single instruction (SIMD). A vector type
requires a size (number of elements) and an underlying primitive data
type. Vector types are considered :ref:`first class <t_firstclass>`.

:Syntax:

::

    < <# elements> x <elementtype> >

The number of elements is a constant integer value larger than 0;
elementtype may be any integer, floating-point or pointer type. Vectors
of size zero are not allowed.

:Examples:

.. list-table::

   * - ``<4 x i32>``
     - Vector of 4 32-bit integer values.

   * - ``<8 x float>``
     - Vector of 8 32-bit floating-point values.

   * - ``<2 x i64>``
     - Vector of 2 64-bit integer values.

   * - ``<4 x i64*>``
     - Vector of 4 pointers to 64-bit integer values.
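
A minimal sketch of a vector operation, using a hypothetical function
name: ordinary arithmetic instructions accept vector operands and apply
element-wise, and ``extractelement`` reads back a single lane.

.. code-block:: llvm

    define i32 @vector_demo(<4 x i32> %a, <4 x i32> %b) {
    entry:
      ; One instruction adds all four lanes.
      %sum = add <4 x i32> %a, %b
      ; Read lane 0 of the result.
      %e0 = extractelement <4 x i32> %sum, i32 0
      ret i32 %e0
    }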

.. _t_label:

Label Type
^^^^^^^^^^

:Overview:

The label type represents code labels.

:Syntax:

::

    label

.. _t_token:

Token Type
^^^^^^^^^^

:Overview:

The token type is used when a value is associated with an instruction
but all uses of the value must not attempt to introspect or obscure it.
As such, it is not appropriate to have a :ref:`phi <i_phi>` or
:ref:`select <i_select>` of type token.

:Syntax:

::

    token

.. _t_metadata:

Metadata Type
^^^^^^^^^^^^^

:Overview:

The metadata type represents embedded metadata. No derived types may be
created from metadata except for :ref:`function <t_function>` arguments.

:Syntax:

::

    metadata

.. _t_aggregate:

Aggregate Types
^^^^^^^^^^^^^^^

Aggregate Types are a subset of derived types that can contain multiple
member types. :ref:`Arrays <t_array>` and :ref:`structs <t_struct>` are
aggregate types. :ref:`Vectors <t_vector>` are not considered to be
aggregate types.

.. _t_array:

Array Type
""""""""""

:Overview:

The array type is a very simple derived type that arranges elements
sequentially in memory. The array type requires a size (number of
elements) and an underlying data type.

:Syntax:

::

    [<# elements> x <elementtype>]

The number of elements is a constant integer value; ``elementtype`` may
be any type with a size.

:Examples:

.. list-table::

   * - ``[40 x i32]``
     - Array of 40 32-bit integer values.

   * - ``[41 x i32]``
     - Array of 41 32-bit integer values.

   * - ``[4 x i8]``
     - Array of 4 8-bit integer values.

Here are some examples of multidimensional arrays:

.. list-table::

   * - ``[3 x [4 x i32]]``
     - 3x4 array of 32-bit integer values.

   * - ``[12 x [10 x float]]``
     - 12x10 array of single precision floating-point values.

   * - ``[2 x [3 x [4 x i16]]]``
     - 2x3x4 array of 16-bit integer values.

There is no restriction on indexing beyond the end of the array implied
by a static type (though there are restrictions on indexing beyond the
bounds of an allocated object in some cases). This means that
single-dimension 'variable sized array' addressing can be implemented in
LLVM with a zero length array type. An implementation of 'pascal style
arrays' in LLVM could use the type "``{ i32, [0 x float]}``", for
example.
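
The 'pascal style array' idea can be sketched as follows. The type name
``%pascal`` and the function name are hypothetical; the ``i32`` field is
assumed to hold the element count, and the caller is assumed to have
allocated enough storage after it. No bounds check is performed here.

.. code-block:: llvm

    %pascal = type { i32, [0 x float] }

    ; Returns element %i of the variable sized tail.
    define float @pascal_get(%pascal* %arr, i64 %i) {
    entry:
      ; Field 1 is the zero length array; index %i steps past its
      ; static bound, which the array type itself does not forbid.
      %elt = getelementptr %pascal, %pascal* %arr, i64 0, i32 1, i64 %i
      %v = load float, float* %elt
      ret float %v
    }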
2796
Sean Silvab084af42012-12-07 10:36:55 +00002797.. _t_struct:
2798
2799Structure Type
Rafael Espindola08013342013-12-07 19:34:20 +00002800""""""""""""""
Sean Silvab084af42012-12-07 10:36:55 +00002801
Rafael Espindola2f6d7b92013-12-10 14:53:22 +00002802:Overview:
Sean Silvab084af42012-12-07 10:36:55 +00002803
2804The structure type is used to represent a collection of data members
2805together in memory. The elements of a structure may be any type that has
2806a size.
2807
2808Structures in memory are accessed using '``load``' and '``store``' by
2809getting a pointer to a field with the '``getelementptr``' instruction.
2810Structures in registers are accessed using the '``extractvalue``' and
2811'``insertvalue``' instructions.
2812
Structures may optionally be "packed" structures, which indicates that
the alignment of the struct is one byte, and that there is no padding
between the elements. In non-packed structs, padding between field types
is inserted as defined by the DataLayout string in the module, which is
required to match what the underlying code generator expects.

Structures can either be "literal" or "identified". A literal structure
is defined inline with other types (e.g. ``{i32, i32}*``) whereas
identified types are always defined at the top level with a name.
Literal types are uniqued by their contents and can never be recursive
or opaque since there is no way to write one. Identified types can be
recursive, can be opaque, and are never uniqued.

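The distinction can be illustrated with a short sketch. The names
``%list_node``, ``@origin``, and ``@head`` are hypothetical:

.. code-block:: llvm

    ; Identified type: named at the top level, so it may refer to itself.
    %list_node = type { i32, %list_node* }

    ; Literal type: written inline wherever a type is expected.
    @origin = global { i32, i32 } { i32 0, i32 0 }

    ; A global of the identified type, with a null "next" pointer.
    @head = global %list_node { i32 1, %list_node* null }
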
:Syntax:

::

      %T1 = type { <type list> }     ; Identified normal struct type
      %T2 = type <{ <type list> }>   ; Identified packed struct type

:Examples:

+------------------------------+----------------------------------------------------------+
| ``{ i32, i32, i32 }``        | A triple of three ``i32`` values.                        |
+------------------------------+----------------------------------------------------------+
| ``{ float, i32 (i32) * }``   | A pair, where the first element is a ``float`` and the   |
|                              | second element is a :ref:`pointer <t_pointer>` to a      |
|                              | :ref:`function <t_function>` that takes an ``i32``,      |
|                              | returning an ``i32``.                                    |
+------------------------------+----------------------------------------------------------+
| ``<{ i8, i32 }>``            | A packed struct known to be 5 bytes in size.             |
+------------------------------+----------------------------------------------------------+

.. _t_opaque:

Opaque Structure Types
""""""""""""""""""""""

:Overview:

Opaque structure types are used to represent named structure types that
do not have a body specified. This corresponds (for example) to the C
notion of a forward declared structure.

:Syntax:

::

      %X = type opaque
      %52 = type opaque

:Examples:

+--------------+-------------------+
| ``opaque``   | An opaque type.   |
+--------------+-------------------+

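As a sketch of the forward-declaration analogy, an opaque type can still be
used behind a pointer. The names ``%struct.handle`` and ``@close_handle`` are
hypothetical:

.. code-block:: llvm

    ; Roughly what a C declaration such as "struct handle;" corresponds to.
    %struct.handle = type opaque

    ; The body is unknown, but pointers to the type can be passed around.
    declare i32 @close_handle(%struct.handle*)
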
.. _constants:

Constants
=========

LLVM has several different basic types of constants. This section
describes them all and their syntax.

Simple Constants
----------------

**Boolean constants**
    The two strings '``true``' and '``false``' are both valid constants
    of the ``i1`` type.
**Integer constants**
    Standard integers (such as '4') are constants of the
    :ref:`integer <t_integer>` type. Negative numbers may be used with
    integer types.
**Floating-point constants**
    Floating-point constants use standard decimal notation (e.g.
    123.421), exponential notation (e.g. 1.23421e+2), or a more precise
    hexadecimal notation (see below). The assembler requires the exact
    decimal value of a floating-point constant. For example, the
    assembler accepts 1.25 but rejects 1.3 because 1.3 is a repeating
    decimal in binary. Floating-point constants must have a
    :ref:`floating-point <t_floating>` type.
**Null pointer constants**
    The identifier '``null``' is recognized as a null pointer constant
    and must be of :ref:`pointer type <t_pointer>`.
**Token constants**
    The identifier '``none``' is recognized as an empty token constant
    and must be of :ref:`token type <t_token>`.

The one non-intuitive notation for constants is the hexadecimal form of
floating-point constants. For example, the form
'``double 0x432ff973cafa8000``' is equivalent to (but harder to read
than) '``double 4.5e+15``'. The only time hexadecimal floating-point
constants are required (and the only time that they are generated by the
disassembler) is when a floating-point constant must be emitted but it
cannot be represented as a decimal floating-point number in a reasonable
number of digits. For example, NaNs, infinities, and other special
values are represented in their IEEE hexadecimal format so that assembly
and disassembly do not cause any bits to change in the constants.

When using the hexadecimal form, constants of types half, float, and
double are represented using the 16-digit form shown above (which
matches the IEEE 754 representation for double); half and float values
must, however, be exactly representable as IEEE 754 half and single
precision, respectively. Hexadecimal format is always used for long
double, and there are three forms of long double. The 80-bit format used
by x86 is represented as ``0xK`` followed by 20 hexadecimal digits. The
128-bit format used by PowerPC (two adjacent doubles) is represented by
``0xM`` followed by 32 hexadecimal digits. The IEEE 128-bit format is
represented by ``0xL`` followed by 32 hexadecimal digits. Long doubles
will only work if they match the long double format on your target.
The IEEE 16-bit format (half precision) is represented by ``0xH``
followed by 4 hexadecimal digits. All hexadecimal formats are big-endian
(sign bit at the left).

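A few of these forms are sketched below. The global names are hypothetical,
and the ``x86_fp80`` line assumes a target whose long double is the x86
80-bit format:

.. code-block:: llvm

    ; 16-digit form; the same value as "double 4.5e+15".
    @d = global double 0x432ff973cafa8000

    ; A float written in the 16-digit double form. The value is the single
    ; precision number nearest 0.1, so it is exactly representable as float.
    @f = global float 0x3FB99999A0000000

    ; Positive infinity as a float, again in the double-width form.
    @inf = global float 0x7FF0000000000000

    ; 1.0 in IEEE half precision: 0xH followed by 4 digits.
    @h = global half 0xH3C00

    ; 1.0 in the x86 80-bit format: 0xK followed by 20 digits.
    @ld = global x86_fp80 0xK3FFF8000000000000000
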
There are no constants of type x86_mmx.

.. _complexconstants:

Complex Constants
-----------------

Complex constants are a (potentially recursive) combination of simple
constants and smaller complex constants.

**Structure constants**
    Structure constants are represented with notation similar to
    structure type definitions (a comma separated list of elements,
    surrounded by braces (``{}``)). For example:
    "``{ i32 4, float 17.0, i32* @G }``", where "``@G``" is declared as
    "``@G = external global i32``". Structure constants must have
    :ref:`structure type <t_struct>`, and the number and types of elements
    must match those specified by the type.
**Array constants**
    Array constants are represented with notation similar to array type
    definitions (a comma separated list of elements, surrounded by
    square brackets (``[]``)). For example:
    "``[ i32 42, i32 11, i32 74 ]``". Array constants must have
    :ref:`array type <t_array>`, and the number and types of elements must
    match those specified by the type. As a special case, character array
    constants may also be represented as a double-quoted string using the ``c``
    prefix. For example: "``c"Hello World\0A\00"``".
**Vector constants**
    Vector constants are represented with notation similar to vector
    type definitions (a comma separated list of elements, surrounded by
    less-than/greater-than signs (``<>``)). For example:
    "``< i32 42, i32 11, i32 74, i32 100 >``". Vector constants
    must have :ref:`vector type <t_vector>`, and the number and types of
    elements must match those specified by the type.
**Zero initialization**
    The string '``zeroinitializer``' can be used to zero-initialize a
    value of *any* type, including scalar and
    :ref:`aggregate <t_aggregate>` types. This is often used to avoid
    having to print large zero initializers (e.g. for large arrays) and
    is always exactly equivalent to using explicit zero initializers.
**Metadata node**
    A metadata node is a constant tuple without types. For example:
    "``!{!0, !{!2, !0}, !"test"}``". Metadata can reference constant values,
    for example: "``!{!0, i32 0, i8* @global, i64 (i64)* @function, !"str"}``".
    Unlike other typed constants that are meant to be interpreted as part of
    the instruction stream, metadata is a place to attach additional
    information such as debug info.

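The aggregate forms above can be combined in global initializers. The
following is a sketch with hypothetical global names (only ``@G`` comes from
the text):

.. code-block:: llvm

    @G = external global i32

    ; Structure constant: element count and types match the struct type.
    @s = global { i32, float, i32* } { i32 4, float 17.0, i32* @G }

    ; Array constant, and the equivalent c"..." form for character arrays.
    ; The string holds 11 characters plus a newline and a NUL: 13 bytes.
    @a = global [3 x i32] [ i32 42, i32 11, i32 74 ]
    @str = constant [13 x i8] c"Hello World\0A\00"

    ; Vector constant.
    @v = global <4 x i32> < i32 42, i32 11, i32 74, i32 100 >

    ; Zero initialization of a large aggregate.
    @z = global [1024 x i32] zeroinitializer
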
Global Variable and Function Addresses
--------------------------------------

The addresses of :ref:`global variables <globalvars>` and
:ref:`functions <functionstructure>` are always implicitly valid
(link-time) constants. These constants are explicitly referenced when
the :ref:`identifier for the global <identifiers>` is used and always have
:ref:`pointer <t_pointer>` type. For example, the following is a legal LLVM
file:

.. code-block:: llvm

    @X = global i32 17
    @Y = global i32 42
    @Z = global [2 x i32*] [ i32* @X, i32* @Y ]

.. _undefvalues:

Undefined Values
----------------

The string '``undef``' can be used anywhere a constant is expected, and
indicates that the user of the value may receive an unspecified
bit-pattern. Undefined values may be of any type (other than '``label``'
or '``void``') and be used anywhere a constant is permitted.

Undefined values are useful because they indicate to the compiler that
the program is well defined no matter what value is used. This gives the
compiler more freedom to optimize. Here are some examples of
(potentially surprising) transformations that are valid (in pseudo IR):

.. code-block:: llvm

      %A = add %X, undef
      %B = sub %X, undef
      %C = xor %X, undef
    Safe:
      %A = undef
      %B = undef
      %C = undef

This is safe because all of the output bits are affected by the undef
bits. Any output bit can have a zero or one depending on the input bits.

.. code-block:: llvm

      %A = or %X, undef
      %B = and %X, undef
    Safe:
      %A = -1
      %B = 0
    Safe:
      %A = %X  ;; By choosing undef as 0
      %B = %X  ;; By choosing undef as -1
    Unsafe:
      %A = undef
      %B = undef

These logical operations have bits that are not always affected by the
input. For example, if ``%X`` has a zero bit, then the output of the
'``and``' operation will always be a zero for that bit, no matter what
the corresponding bit from the '``undef``' is. As such, it is unsafe to
optimize or assume that the result of the '``and``' is '``undef``'.
However, it is safe to assume that all bits of the '``undef``' could be
0, and optimize the '``and``' to 0. Likewise, it is safe to assume that
all the bits of the '``undef``' operand to the '``or``' could be set,
allowing the '``or``' to be folded to -1.

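With explicit types, the same situation might look like the sketch below.
The function names are hypothetical, and the comments restate the folds
described above rather than what any particular pass does:

.. code-block:: llvm

    define i32 @and_with_undef(i32 %x) {
      %b = and i32 %x, undef   ; may be folded to 0 or to %x, but not to undef
      ret i32 %b
    }

    define i32 @or_with_undef(i32 %x) {
      %a = or i32 %x, undef    ; may be folded to -1 or to %x, but not to undef
      ret i32 %a
    }
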
.. code-block:: llvm

      %A = select undef, %X, %Y
      %B = select undef, 42, %Y
      %C = select %X, %Y, undef
    Safe:
      %A = %X     (or %Y)
      %B = 42     (or %Y)
      %C = %Y
    Unsafe:
      %A = undef
      %B = undef
      %C = undef

This set of examples shows that undefined '``select``' (and conditional
branch) conditions can go *either way*, but they have to come from one
of the two operands. In the ``%A`` example, if ``%X`` and ``%Y`` were
both known to have a clear low bit, then ``%A`` would have to have a
cleared low bit. However, in the ``%C`` example, the optimizer is
allowed to assume that the '``undef``' operand could be the same as
``%Y``, allowing the whole '``select``' to be eliminated.

.. code-block:: text

      %A = xor undef, undef

      %B = undef
      %C = xor %B, %B

      %D = undef
      %E = icmp slt %D, 4
      %F = icmp sge %D, 4

    Safe:
      %A = undef
      %B = undef
      %C = undef
      %D = undef
      %E = undef
      %F = undef

This example points out that two '``undef``' operands are not
necessarily the same. This can surprise people who assume that
"``X^X``" is always zero, even if ``X`` is undefined (the behavior also
matches C semantics). This isn't true for a number of reasons, but the
short answer is that an '``undef``' "variable" can arbitrarily change
its value over its "live range". This is true because the variable
doesn't actually *have a live range*. Instead, the value is logically
read from arbitrary registers that happen to be around when needed, so
the value is not necessarily consistent over time. In fact, ``%A`` and
``%C`` need to have the same semantics or the core LLVM "replace all
uses with" concept would not hold.

.. code-block:: llvm

      %A = sdiv undef, %X
      %B = sdiv %X, undef
    Safe:
      %A = 0
    b: unreachable

These examples show the crucial difference between an *undefined value*
and *undefined behavior*. An undefined value (like '``undef``') is
allowed to have an arbitrary bit-pattern. This means that the ``%A``
operation can be constant folded to '``0``', because the '``undef``'
could be zero, and zero divided by any value is zero.
However, in the second example, we can make a more aggressive
assumption: because the ``undef`` is allowed to be an arbitrary value,
we are allowed to assume that it could be zero. Since a divide by zero
has *undefined behavior*, we are allowed to assume that the operation
does not execute at all. This allows us to delete the divide and all
code after it. Because the undefined operation "can't happen", the
optimizer can assume that it occurs in dead code.

.. code-block:: text

    a: store undef -> %X
    b: store %X -> undef
    Safe:
    a: <deleted>
    b: unreachable

A store *of* an undefined value can be assumed to not have any effect;
we can assume that the value is overwritten with bits that happen to
match what was already there. However, a store *to* an undefined
location could clobber arbitrary memory; therefore, it has undefined
behavior.

.. _poisonvalues:

Poison Values
-------------

Poison values are similar to :ref:`undef values <undefvalues>`; however,
they also represent the fact that an instruction or constant expression
that cannot evoke side effects has nevertheless detected a condition
that results in undefined behavior.

There is currently no way of representing a poison value in the IR; they
only exist when produced by operations such as :ref:`add <i_add>` with
the ``nsw`` flag.

Poison value behavior is defined in terms of value *dependence*:

-  Values other than :ref:`phi <i_phi>` nodes depend on their operands.
-  :ref:`Phi <i_phi>` nodes depend on the operand corresponding to
   their dynamic predecessor basic block.
-  Function arguments depend on the corresponding actual argument values
   in the dynamic callers of their functions.
-  :ref:`Call <i_call>` instructions depend on the :ref:`ret <i_ret>`
   instructions that dynamically transfer control back to them.
-  :ref:`Invoke <i_invoke>` instructions depend on the
   :ref:`ret <i_ret>`, :ref:`resume <i_resume>`, or exception-throwing
   call instructions that dynamically transfer control back to them.
-  Non-volatile loads and stores depend on the most recent stores to all
   of the referenced memory addresses, following the order in the IR
   (including loads and stores implied by intrinsics such as
   :ref:`@llvm.memcpy <int_memcpy>`).
-  An instruction with externally visible side effects depends on the
   most recent preceding instruction with externally visible side
   effects, following the order in the IR. (This includes :ref:`volatile
   operations <volatile>`.)
-  An instruction *control-depends* on a :ref:`terminator
   instruction <terminators>` if the terminator instruction has
   multiple successors and the instruction is always executed when
   control transfers to one of the successors, and may not be executed
   when control is transferred to another.
-  Additionally, an instruction also *control-depends* on a terminator
   instruction if the set of instructions it otherwise depends on would
   be different if the terminator had transferred control to a different
   successor.
-  Dependence is transitive.

Poison values have the same behavior as :ref:`undef values <undefvalues>`,
with the additional effect that any instruction that has a *dependence*
on a poison value has undefined behavior.

Here are some examples:

.. code-block:: llvm

    entry:
      %poison = sub nuw i32 0, 1           ; Results in a poison value.
      %still_poison = and i32 %poison, 0   ; 0, but also poison.
      %poison_yet_again = getelementptr i32, i32* @h, i32 %still_poison
      store i32 0, i32* %poison_yet_again  ; memory at @h[0] is poisoned

      store i32 %poison, i32* @g           ; Poison value stored to memory.
      %poison2 = load i32, i32* @g         ; Poison value loaded back from memory.

      store volatile i32 %poison, i32* @g  ; External observation; undefined behavior.

      %narrowaddr = bitcast i32* @g to i16*
      %wideaddr = bitcast i32* @g to i64*
      %poison3 = load i16, i16* %narrowaddr ; Returns a poison value.
      %poison4 = load i64, i64* %wideaddr  ; Returns a poison value.

      %cmp = icmp slt i32 %poison, 0       ; Returns a poison value.
      br i1 %cmp, label %true, label %end  ; Branch to either destination.

    true:
      store volatile i32 0, i32* @g        ; This is control-dependent on %cmp, so
                                           ; it has undefined behavior.
      br label %end

    end:
      %p = phi i32 [ 0, %entry ], [ 1, %true ]
                                           ; Both edges into this PHI are
                                           ; control-dependent on %cmp, so this
                                           ; always results in a poison value.

      store volatile i32 0, i32* @g        ; This would depend on the store in %true
                                           ; if %cmp is true, or the store in %entry
                                           ; otherwise, so this is undefined behavior.

      br i1 %cmp, label %second_true, label %second_end
                                           ; The same branch again, but this time the
                                           ; true block doesn't have side effects.

    second_true:
      ; No side effects!
      ret void

    second_end:
      store volatile i32 0, i32* @g        ; This time, the instruction always depends
                                           ; on the store in %end. Also, it is
                                           ; control-equivalent to %end, so this is
                                           ; well-defined (ignoring earlier undefined
                                           ; behavior in this example).

.. _blockaddress:

Addresses of Basic Blocks
-------------------------

``blockaddress(@function, %block)``

The '``blockaddress``' constant computes the address of the specified
basic block in the specified function, and always has an ``i8*`` type.
Taking the address of the entry block is illegal.

This value only has defined behavior when used as an operand to the
':ref:`indirectbr <i_indirectbr>`' instruction, or for comparisons
against null. Pointer equality tests between label addresses result in
undefined behavior, though, again, comparison against null is OK, and
no label is equal to the null pointer. This may be passed around as an
opaque pointer-sized value as long as the bits are not inspected. This
allows ``ptrtoint`` and arithmetic to be performed on these values so
long as the original value is reconstituted before the ``indirectbr``
instruction.

Finally, some targets may provide defined semantics when using the value
as the operand to an inline assembly, but that is target specific.

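A typical use is a jump table consumed by ``indirectbr``. The sketch below
uses hypothetical names (``@targets``, ``@dispatch``, ``%bb0``, ``%bb1``) and
assumes ``%i`` is 0 or 1; neither table entry refers to the entry block:

.. code-block:: llvm

    @targets = constant [2 x i8*] [ i8* blockaddress(@dispatch, %bb0),
                                    i8* blockaddress(@dispatch, %bb1) ]

    define i32 @dispatch(i32 %i) {
    entry:
      ; Load the block address for index %i and branch to it.
      %slot = getelementptr [2 x i8*], [2 x i8*]* @targets, i32 0, i32 %i
      %dest = load i8*, i8** %slot
      indirectbr i8* %dest, [ label %bb0, label %bb1 ]

    bb0:
      ret i32 0

    bb1:
      ret i32 1
    }
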
.. _constantexprs:

Constant Expressions
--------------------

Constant expressions are used to allow expressions involving other
constants to be used as constants. Constant expressions may be of any
:ref:`first class <t_firstclass>` type and may involve any LLVM operation
that does not have side effects (e.g. load and call are not supported).
The following is the syntax for constant expressions:

``trunc (CST to TYPE)``
    Perform the :ref:`trunc operation <i_trunc>` on constants.
``zext (CST to TYPE)``
    Perform the :ref:`zext operation <i_zext>` on constants.
``sext (CST to TYPE)``
    Perform the :ref:`sext operation <i_sext>` on constants.
``fptrunc (CST to TYPE)``
    Truncate a floating-point constant to another floating-point type.
    The size of CST must be larger than the size of TYPE. Both types
    must be floating-point.
``fpext (CST to TYPE)``
    Floating-point extend a constant to another type. The size of CST
    must be smaller than or equal to the size of TYPE. Both types must be
    floating-point.
``fptoui (CST to TYPE)``
    Convert a floating-point constant to the corresponding unsigned
    integer constant. TYPE must be a scalar or vector integer type. CST
    must be of scalar or vector floating-point type. Both CST and TYPE
    must be scalars, or vectors of the same number of elements. If the
    value won't fit in the integer type, the result is a
    :ref:`poison value <poisonvalues>`.
``fptosi (CST to TYPE)``
    Convert a floating-point constant to the corresponding signed
    integer constant. TYPE must be a scalar or vector integer type. CST
    must be of scalar or vector floating-point type. Both CST and TYPE
    must be scalars, or vectors of the same number of elements. If the
    value won't fit in the integer type, the result is a
    :ref:`poison value <poisonvalues>`.
``uitofp (CST to TYPE)``
    Convert an unsigned integer constant to the corresponding
    floating-point constant. TYPE must be a scalar or vector floating-point
    type. CST must be of scalar or vector integer type. Both CST and TYPE must
    be scalars, or vectors of the same number of elements.
``sitofp (CST to TYPE)``
    Convert a signed integer constant to the corresponding floating-point
    constant. TYPE must be a scalar or vector floating-point type.
    CST must be of scalar or vector integer type. Both CST and TYPE must
    be scalars, or vectors of the same number of elements.
``ptrtoint (CST to TYPE)``
    Perform the :ref:`ptrtoint operation <i_ptrtoint>` on constants.
``inttoptr (CST to TYPE)``
    Perform the :ref:`inttoptr operation <i_inttoptr>` on constants.
    This one is *really* dangerous!
``bitcast (CST to TYPE)``
    Convert a constant, CST, to another TYPE.
    The constraints of the operands are the same as those for the
    :ref:`bitcast instruction <i_bitcast>`.
``addrspacecast (CST to TYPE)``
    Convert a constant pointer or constant vector of pointers, CST, to another
    TYPE in a different address space. The constraints of the operands are the
    same as those for the :ref:`addrspacecast instruction <i_addrspacecast>`.
``getelementptr (TY, CSTPTR, IDX0, IDX1, ...)``, ``getelementptr inbounds (TY, CSTPTR, IDX0, IDX1, ...)``
    Perform the :ref:`getelementptr operation <i_getelementptr>` on
    constants. As with the :ref:`getelementptr <i_getelementptr>`
    instruction, the index list may have one or more indexes, which are
    required to make sense for the type of "pointer to TY".
``select (COND, VAL1, VAL2)``
    Perform the :ref:`select operation <i_select>` on constants.
``icmp COND (VAL1, VAL2)``
    Perform the :ref:`icmp operation <i_icmp>` on constants.
``fcmp COND (VAL1, VAL2)``
    Perform the :ref:`fcmp operation <i_fcmp>` on constants.
``extractelement (VAL, IDX)``
    Perform the :ref:`extractelement operation <i_extractelement>` on
    constants.
``insertelement (VAL, ELT, IDX)``
    Perform the :ref:`insertelement operation <i_insertelement>` on
    constants.
``shufflevector (VEC1, VEC2, IDXMASK)``
    Perform the :ref:`shufflevector operation <i_shufflevector>` on
    constants.
``extractvalue (VAL, IDX0, IDX1, ...)``
    Perform the :ref:`extractvalue operation <i_extractvalue>` on
    constants. The index list is interpreted in a similar manner as
    indices in a ':ref:`getelementptr <i_getelementptr>`' operation. At
    least one index value must be specified.
``insertvalue (VAL, ELT, IDX0, IDX1, ...)``
    Perform the :ref:`insertvalue operation <i_insertvalue>` on constants.
    The index list is interpreted in a similar manner as indices in a
    ':ref:`getelementptr <i_getelementptr>`' operation. At least one index
    value must be specified.
``OPCODE (LHS, RHS)``
    Perform the specified operation of the LHS and RHS constants. OPCODE
    may be any of the :ref:`binary <binaryops>` or :ref:`bitwise
    binary <bitwiseops>` operations. The constraints on operands are
    the same as those for the corresponding instruction (e.g. no bitwise
    operations on floating-point values are allowed).

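Constant expressions most often appear in global initializers. The sketch
below shows a few of the forms listed above; all global names are
hypothetical:

.. code-block:: llvm

    @arr = global [4 x i32] [ i32 1, i32 2, i32 3, i32 4 ]

    ; getelementptr: the address of @arr[2].
    @third = global i32* getelementptr inbounds ([4 x i32], [4 x i32]* @arr, i64 0, i64 2)

    ; bitcast: the same address viewed as a byte pointer.
    @as_bytes = global i8* bitcast ([4 x i32]* @arr to i8*)

    ; ptrtoint nested inside a binary operation: the address of @arr plus 8.
    @addr_plus_8 = global i64 add (i64 ptrtoint ([4 x i32]* @arr to i64), i64 8)
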
Other Values
============

.. _inlineasmexprs:

Inline Assembler Expressions
----------------------------

LLVM supports inline assembler expressions (as opposed to :ref:`Module-Level
Inline Assembly <moduleasm>`) through the use of a special value. This value
represents the inline assembler as a template string (containing the
instructions to emit), a list of operand constraints (stored as a string), a
flag that indicates whether or not the inline asm expression has side effects,
and a flag indicating whether the function containing the asm needs to align its
stack conservatively.

The template string supports argument substitution of the operands using "``$``"
followed by a number, to indicate substitution of the given register/memory
location, as specified by the constraint string. "``${NUM:MODIFIER}``" may also
be used, where ``MODIFIER`` is a target-specific annotation for how to print the
operand (see :ref:`inline-asm-modifiers`).

A literal "``$``" may be included by using "``$$``" in the template. To include
other special characters into the output, the usual "``\XX``" escapes may be
used, just as in other strings. Note that after template substitution, the
resulting assembly string is parsed by LLVM's integrated assembler unless it is
disabled (even when emitting a ``.s`` file) and thus must contain assembly
syntax known to LLVM.

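As a sketch of the "``$$``" escape, the following assumes an x86 target using
AT&T syntax, where immediates are written with a leading ``$``. The function
name ``@increment`` is hypothetical; the constraint ``0`` ties the input to
output operand 0, and ``~{flags}`` records that the instruction clobbers the
flags register:

.. code-block:: llvm

    define i32 @increment(i32 %x) {
      ; "$$1" is emitted as the literal "$1"; "$0" is replaced by operand 0.
      %r = call i32 asm "addl $$1, $0", "=r,0,~{flags}"(i32 %x)
      ret i32 %r
    }
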
Reid Kleckner71cb1642017-02-06 18:08:45 +00003384LLVM also supports a few more substitions useful for writing inline assembly:
3385
3386- ``${:uid}``: Expands to a decimal integer unique to this inline assembly blob.
3387 This substitution is useful when declaring a local label. Many standard
3388 compiler optimizations, such as inlining, may duplicate an inline asm blob.
3389 Adding a blob-unique identifier ensures that the two labels will not conflict
3390 during assembly. This is used to implement `GCC's %= special format
3391 string <https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html>`_.
3392- ``${:comment}``: Expands to the comment character of the current target's
3393 assembly dialect. This is usually ``#``, but many targets use other strings,
3394 such as ``;``, ``//``, or ``!``.
3395- ``${:private}``: Expands to the assembler private label prefix. Labels with
3396 this prefix will not appear in the symbol table of the assembled object.
3397 Typically the prefix is ``L``, but targets may use other strings. ``.L`` is
3398 relatively popular.
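
For instance, a local loop label might be built from these substitutions. This
is only a sketch, assuming an x86 target with AT&T syntax; the label stem
``spin`` and the values ``%n`` and ``%left`` are hypothetical:

.. code-block:: llvm

    ; Counts the tied operand down to zero. On a target whose private prefix
    ; is ".L", the label expands to something like ".Lspin7", with a different
    ; number for each copy of the asm blob that LLVM emits.
    %left = call i32 asm sideeffect "${:private}spin${:uid}:\0A\09decl $0\0A\09jnz ${:private}spin${:uid}", "=r,0,~{flags}"(i32 %n)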
3399
LLVM's support for inline asm is modeled closely on the requirements of Clang's
GCC-compatible inline-asm support. Thus, the feature-set and the constraint and
modifier codes listed here are similar or identical to those in GCC's inline asm
support. However, to be clear, the syntax of the template and constraint strings
described here is *not* the same as the syntax accepted by GCC and Clang, and,
while most constraint letters are passed through as-is by Clang, some get
translated to other codes when converting from the C source to the LLVM
assembly.
3408
An example inline assembler expression is:

.. code-block:: llvm

    i32 (i32) asm "bswap $0", "=r,r"
3414
3415Inline assembler expressions may **only** be used as the callee operand
3416of a :ref:`call <i_call>` or an :ref:`invoke <i_invoke>` instruction.
3417Thus, typically we have:
3418
.. code-block:: llvm

    %X = call i32 asm "bswap $0", "=r,r"(i32 %Y)
3422
3423Inline asms with side effects not visible in the constraint list must be
3424marked as having side effects. This is done through the use of the
3425'``sideeffect``' keyword, like so:
3426
.. code-block:: llvm

    call void asm sideeffect "eieio", ""()
3430
3431In some cases inline asms will contain code that will not work unless
3432the stack is aligned in some way, such as calls or SSE instructions on
3433x86, yet will not contain code that does that alignment within the asm.
3434The compiler should make conservative assumptions about what the asm
3435might contain and should generate its usual stack alignment code in the
3436prologue if the '``alignstack``' keyword is present:
3437
.. code-block:: llvm

    call void asm alignstack "eieio", ""()
3441
3442Inline asms also support using non-standard assembly dialects. The
3443assumed dialect is ATT. When the '``inteldialect``' keyword is present,
3444the inline asm is using the Intel dialect. Currently, ATT and Intel are
3445the only supported dialects. An example is:
3446
.. code-block:: llvm

    call void asm inteldialect "eieio", ""()
3450
If multiple keywords appear, the '``sideeffect``' keyword must come
first, the '``alignstack``' keyword second, and the '``inteldialect``'
keyword last.
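
A sketch of the required ordering, using a placeholder ``nop`` template on an
assumed x86 target:

.. code-block:: llvm

    ; sideeffect first, then alignstack, then inteldialect.
    call void asm sideeffect alignstack inteldialect "nop", ""()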
3454
James Y Knightbc832ed2015-07-08 18:08:36 +00003455Inline Asm Constraint String
3456^^^^^^^^^^^^^^^^^^^^^^^^^^^^
3457
3458The constraint list is a comma-separated string, each element containing one or
3459more constraint codes.
3460
3461For each element in the constraint list an appropriate register or memory
3462operand will be chosen, and it will be made available to assembly template
3463string expansion as ``$0`` for the first constraint in the list, ``$1`` for the
3464second, etc.
3465
3466There are three different types of constraints, which are distinguished by a
3467prefix symbol in front of the constraint code: Output, Input, and Clobber. The
3468constraints must always be given in that order: outputs first, then inputs, then
3469clobbers. They cannot be intermingled.
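
The numbering and ordering rules can be seen together in a sketch such as the
following, which assumes an AArch64 target; ``%a``, ``%b`` and ``%sum`` are
hypothetical values:

.. code-block:: llvm

    ; "=r" is operand $0 (the output); the two "r" inputs are $1 and $2 and
    ; consume %a and %b. The "~{cc}" clobber comes last and is not referenced
    ; from the template; it is listed because ADDS writes the condition flags.
    %sum = call i64 asm "adds $0, $1, $2", "=r,r,r,~{cc}"(i64 %a, i64 %b)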
3470
3471There are also three different categories of constraint codes:
3472
3473- Register constraint. This is either a register class, or a fixed physical
3474 register. This kind of constraint will allocate a register, and if necessary,
3475 bitcast the argument or result to the appropriate type.
3476- Memory constraint. This kind of constraint is for use with an instruction
3477 taking a memory operand. Different constraints allow for different addressing
3478 modes used by the target.
3479- Immediate value constraint. This kind of constraint is for an integer or other
3480 immediate value which can be rendered directly into an instruction. The
3481 various target-specific constraints allow the selection of a value in the
3482 proper range for the instruction you wish to use it with.
3483
3484Output constraints
3485""""""""""""""""""
3486
3487Output constraints are specified by an "``=``" prefix (e.g. "``=r``"). This
3488indicates that the assembly will write to this operand, and the operand will
3489then be made available as a return value of the ``asm`` expression. Output
3490constraints do not consume an argument from the call instruction. (Except, see
3491below about indirect outputs).
3492
Normally, it is expected that no output locations are written to by the assembly
expression until *all* of the inputs have been read. As such, LLVM may assign
the same register to an output and an input. If this is not safe (e.g. if the
assembly contains two instructions, where the first writes to one output, and
the second reads an input and writes to a second output), then the "``&``"
modifier must be used (e.g. "``=&r``") to specify that the output is an
"early-clobber" output. Marking an output as "early-clobber" ensures that LLVM
will not use the same register for any inputs (other than an input tied to this
output).
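
A sketch of a case that needs early-clobber, assuming an x86 target with AT&T
syntax (the value names are hypothetical): the first instruction writes the
output before the second instruction reads input ``$2``.

.. code-block:: llvm

    ; Without "&", LLVM could place $0 and $2 in the same register, and the
    ; movl would overwrite %b before the addl reads it. The "~{flags}" clobber
    ; records that addl changes the condition flags.
    %sum = call i32 asm "movl $1, $0\0A\09addl $2, $0", "=&r,r,r,~{flags}"(i32 %a, i32 %b)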
3502
3503Input constraints
3504"""""""""""""""""
3505
3506Input constraints do not have a prefix -- just the constraint codes. Each input
3507constraint will consume one argument from the call instruction. It is not
3508permitted for the asm to write to any input register or memory location (unless
3509that input is tied to an output). Note also that multiple inputs may all be
3510assigned to the same register, if LLVM can determine that they necessarily all
3511contain the same value.
3512
3513Instead of providing a Constraint Code, input constraints may also "tie"
3514themselves to an output constraint, by providing an integer as the constraint
3515string. Tied inputs still consume an argument from the call instruction, and
3516take up a position in the asm template numbering as is usual -- they will simply
3517be constrained to always use the same register as the output they've been tied
3518to. For example, a constraint string of "``=r,0``" says to assign a register for
3519output, and use that register as an input as well (it being the 0'th
3520constraint).
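
Applied to the earlier ``bswap`` example, a tied input might look like this
(a sketch assuming an x86 target; ``%x`` and ``%swapped`` are hypothetical
values):

.. code-block:: llvm

    ; $0 is the output; the tied input "0" is operand $1 but shares $0's
    ; register, so BSWAP reverses the bytes of %x in place.
    %swapped = call i32 asm "bswap $0", "=r,0"(i32 %x)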
3521
3522It is permitted to tie an input to an "early-clobber" output. In that case, no
3523*other* input may share the same register as the input tied to the early-clobber
3524(even when the other input has the same value).
3525
3526You may only tie an input to an output which has a register constraint, not a
3527memory constraint. Only a single input may be tied to an output.
3528
3529There is also an "interesting" feature which deserves a bit of explanation: if a
3530register class constraint allocates a register which is too small for the value
3531type operand provided as input, the input value will be split into multiple
3532registers, and all of them passed to the inline asm.
3533
3534However, this feature is often not as useful as you might think.
3535
Firstly, the registers are *not* guaranteed to be consecutive. So, on those
architectures that have instructions which operate on multiple consecutive
registers, this is not an appropriate way to support them. (For example, the
32-bit SparcV8 has a 64-bit load instruction that takes a single 32-bit
register; the hardware then loads into both the named register and the next
register. This feature of inline asm would not be useful to support that.)
3542
3543A few of the targets provide a template string modifier allowing explicit access
3544to the second register of a two-register operand (e.g. MIPS ``L``, ``M``, and
3545``D``). On such an architecture, you can actually access the second allocated
3546register (yet, still, not any subsequent ones). But, in that case, you're still
3547probably better off simply splitting the value into two separate operands, for
clarity. (See, for example, the description of the ``A`` constraint on X86,
which, despite existing only for use with this feature, is not really a good
idea to use.)
3551
3552Indirect inputs and outputs
3553"""""""""""""""""""""""""""
3554
3555Indirect output or input constraints can be specified by the "``*``" modifier
3556(which goes after the "``=``" in case of an output). This indicates that the asm
3557will write to or read from the contents of an *address* provided as an input
3558argument. (Note that in this way, indirect outputs act more like an *input* than
3559an output: just like an input, they consume an argument of the call expression,
3560rather than producing a return value. An indirect output constraint is an
3561"output" only in that the asm is expected to write to the contents of the input
3562memory location, instead of just read from it).
3563
This is most typically used for a memory constraint, e.g. "``=*m``", to pass the
address of a variable as a value.
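
A sketch of an indirect memory output, assuming an x86 target with AT&T syntax
and the typed-pointer notation used in this revision of the manual; ``%p`` and
``%v`` are hypothetical values:

.. code-block:: llvm

    ; "=*m" consumes the pointer %p as a call argument. The asm stores %v
    ; through it, so the call itself produces no return value.
    call void asm "movl $1, $0", "=*m,r"(i32* %p, i32 %v)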
3566
3567It is also possible to use an indirect *register* constraint, but only on output
3568(e.g. "``=*r``"). This will cause LLVM to allocate a register for an output
3569value normally, and then, separately emit a store to the address provided as
input, after the provided inline asm. (It's not clear what value this
functionality provides, compared to writing the store explicitly after the asm
statement, and it can only produce worse code, since it bypasses many
optimization passes. Its use is not recommended.)
3574
3575
3576Clobber constraints
3577"""""""""""""""""""
3578
3579A clobber constraint is indicated by a "``~``" prefix. A clobber does not
3580consume an input operand, nor generate an output. Clobbers cannot use any of the
3581general constraint code letters -- they may use only explicit register
3582constraints, e.g. "``~{eax}``". The one exception is that a clobber string of
3583"``~{memory}``" indicates that the assembly writes to arbitrary undeclared
3584memory locations -- not only the memory pointed to by a declared indirect
3585output.
3586
Note that clobbering named registers that are also present in output
3588constraints is not legal.
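
Two sketches of clobber lists, assuming an x86 target with AT&T syntax:

.. code-block:: llvm

    ; An empty template with a memory clobber: emits no instructions, but
    ; tells LLVM that arbitrary memory may be written.
    call void asm sideeffect "", "~{memory}"()

    ; The template names %eax directly, so that register (and the flags that
    ; XOR modifies) must be listed as clobbered.
    call void asm sideeffect "xorl %eax, %eax", "~{eax},~{flags}"()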
3589
Constraint Codes
""""""""""""""""

After a potential prefix comes a constraint code, or codes.
3594
3595A Constraint Code is either a single letter (e.g. "``r``"), a "``^``" character
3596followed by two letters (e.g. "``^wc``"), or "``{``" register-name "``}``"
3597(e.g. "``{eax}``").
3598
3599The one and two letter constraint codes are typically chosen to be the same as
3600GCC's constraint codes.
3601
A single constraint may include one or more constraint codes, leaving
it up to LLVM to choose which one to use. This is included mainly for
compatibility with the translation of GCC inline asm coming from Clang.
3605
3606There are two ways to specify alternatives, and either or both may be used in an
3607inline asm constraint list:
3608
36091) Append the codes to each other, making a constraint code set. E.g. "``im``"
3610 or "``{eax}m``". This means "choose any of the options in the set". The
3611 choice of constraint is made independently for each constraint in the
3612 constraint list.
3613
36142) Use "``|``" between constraint code sets, creating alternatives. Every
3615 constraint in the constraint list must have the same number of alternative
3616 sets. With this syntax, the same alternative in *all* of the items in the
3617 constraint list will be chosen together.
3618
3619Putting those together, you might have a two operand constraint string like
3620``"rm|r,ri|rm"``. This indicates that if operand 0 is ``r`` or ``m``, then
3621operand 1 may be one of ``r`` or ``i``. If operand 0 is ``r``, then operand 1
3622may be one of ``r`` or ``m``. But, operand 0 and 1 cannot both be of type m.
3623
However, the use of either of the alternative features is *NOT* recommended, as
LLVM is not able to make an intelligent choice about which one to use. (At the
point it currently needs to choose, not enough information is available to do so
in a smart way.) Thus, it simply tries to make a choice that's most likely to
compile, not one that will give optimal performance. (For example, given
"``rm``", it will always choose to use memory, not registers.) And, if given
multiple registers, or multiple register classes, it will simply choose the
first one. (In fact, it doesn't currently even ensure explicitly specified
physical registers are unique, so specifying multiple physical registers as
alternatives, like ``{r11}{r12},{r11}{r12}``, will assign r11 to both operands,
not at all what was intended.)
3635
3636Supported Constraint Code List
3637""""""""""""""""""""""""""""""
3638
3639The constraint codes are, in general, expected to behave the same way they do in
3640GCC. LLVM's support is often implemented on an 'as-needed' basis, to support C
3641inline asm code which was supported by GCC. A mismatch in behavior between LLVM
3642and GCC likely indicates a bug in LLVM.
3643
3644Some constraint codes are typically supported by all targets:
3645
3646- ``r``: A register in the target's general purpose register class.
3647- ``m``: A memory address operand. It is target-specific what addressing modes
3648 are supported, typical examples are register, or register + register offset,
3649 or register + immediate offset (of some target-specific size).
3650- ``i``: An integer constant (of target-specific width). Allows either a simple
3651 immediate, or a relocatable value.
3652- ``n``: An integer constant -- *not* including relocatable values.
3653- ``s``: An integer constant, but allowing *only* relocatable values.
3654- ``X``: Allows an operand of any kind, no constraint whatsoever. Typically
3655 useful to pass a label for an asm branch or call.
3656
3657 .. FIXME: but that surely isn't actually okay to jump out of an asm
3658 block without telling llvm about the control transfer???)
3659
3660- ``{register-name}``: Requires exactly the named physical register.
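
As a sketch of the ``{register-name}`` form (assuming an x86 target; ``%tsc``,
``%lo`` and ``%hi`` are hypothetical names), ``RDTSC`` always writes EDX:EAX,
so both outputs are pinned to physical registers. When an asm has more than one
output, the call returns a struct holding them in constraint order:

.. code-block:: llvm

    %tsc = call { i32, i32 } asm sideeffect "rdtsc", "={eax},={edx}"()
    %lo = extractvalue { i32, i32 } %tsc, 0   ; value from EAX
    %hi = extractvalue { i32, i32 } %tsc, 1   ; value from EDX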
3661
3662Other constraints are target-specific:
3663
3664AArch64:
3665
3666- ``z``: An immediate integer 0. Outputs ``WZR`` or ``XZR``, as appropriate.
3667- ``I``: An immediate integer valid for an ``ADD`` or ``SUB`` instruction,
3668 i.e. 0 to 4095 with optional shift by 12.
3669- ``J``: An immediate integer that, when negated, is valid for an ``ADD`` or
3670 ``SUB`` instruction, i.e. -1 to -4095 with optional left shift by 12.
3671- ``K``: An immediate integer that is valid for the 'bitmask immediate 32' of a
3672 logical instruction like ``AND``, ``EOR``, or ``ORR`` with a 32-bit register.
3673- ``L``: An immediate integer that is valid for the 'bitmask immediate 64' of a
3674 logical instruction like ``AND``, ``EOR``, or ``ORR`` with a 64-bit register.
- ``M``: An immediate integer for use with the ``MOV`` assembly alias on a
  32-bit register. This is a superset of ``K``: in addition to the bitmask
  immediate, also allows immediate integers which can be loaded with a single
  ``MOVZ`` or ``MOVN`` instruction.
3679- ``N``: An immediate integer for use with the ``MOV`` assembly alias on a
3680 64-bit register. This is a superset of ``L``.
3681- ``Q``: Memory address operand must be in a single register (no
3682 offsets). (However, LLVM currently does this for the ``m`` constraint as
3683 well.)
3684- ``r``: A 32 or 64-bit integer register (W* or X*).
3685- ``w``: A 32, 64, or 128-bit floating-point/SIMD register.
3686- ``x``: A lower 128-bit floating-point/SIMD register (``V0`` to ``V15``).
3687
3688AMDGPU:
3689
3690- ``r``: A 32 or 64-bit integer register.
3691- ``[0-9]v``: The 32-bit VGPR register, number 0-9.
3692- ``[0-9]s``: The 32-bit SGPR register, number 0-9.
3693
3694
3695All ARM modes:
3696
3697- ``Q``, ``Um``, ``Un``, ``Uq``, ``Us``, ``Ut``, ``Uv``, ``Uy``: Memory address
3698 operand. Treated the same as operand ``m``, at the moment.
3699
3700ARM and ARM's Thumb2 mode:
3701
3702- ``j``: An immediate integer between 0 and 65535 (valid for ``MOVW``)
3703- ``I``: An immediate integer valid for a data-processing instruction.
3704- ``J``: An immediate integer between -4095 and 4095.
3705- ``K``: An immediate integer whose bitwise inverse is valid for a
3706 data-processing instruction. (Can be used with template modifier "``B``" to
3707 print the inverted value).
3708- ``L``: An immediate integer whose negation is valid for a data-processing
3709 instruction. (Can be used with template modifier "``n``" to print the negated
3710 value).
- ``M``: A power of two or an integer between 0 and 32.
3712- ``N``: Invalid immediate constraint.
3713- ``O``: Invalid immediate constraint.
3714- ``r``: A general-purpose 32-bit integer register (``r0-r15``).
3715- ``l``: In Thumb2 mode, low 32-bit GPR registers (``r0-r7``). In ARM mode, same
3716 as ``r``.
3717- ``h``: In Thumb2 mode, a high 32-bit GPR register (``r8-r15``). In ARM mode,
3718 invalid.
3719- ``w``: A 32, 64, or 128-bit floating-point/SIMD register: ``s0-s31``,
3720 ``d0-d31``, or ``q0-q15``.
3721- ``x``: A 32, 64, or 128-bit floating-point/SIMD register: ``s0-s15``,
3722 ``d0-d7``, or ``q0-q3``.
- ``t``: A low floating-point/SIMD register: ``s0-s31``, ``d0-d15``, or
  ``q0-q7``.

ARM's Thumb1 mode:
3727
3728- ``I``: An immediate integer between 0 and 255.
3729- ``J``: An immediate integer between -255 and -1.
3730- ``K``: An immediate integer between 0 and 255, with optional left-shift by
3731 some amount.
3732- ``L``: An immediate integer between -7 and 7.
3733- ``M``: An immediate integer which is a multiple of 4 between 0 and 1020.
3734- ``N``: An immediate integer between 0 and 31.
3735- ``O``: An immediate integer which is a multiple of 4 between -508 and 508.
3736- ``r``: A low 32-bit GPR register (``r0-r7``).
3737- ``l``: A low 32-bit GPR register (``r0-r7``).
- ``h``: A high GPR register (``r8-r15``).
3739- ``w``: A 32, 64, or 128-bit floating-point/SIMD register: ``s0-s31``,
3740 ``d0-d31``, or ``q0-q15``.
3741- ``x``: A 32, 64, or 128-bit floating-point/SIMD register: ``s0-s15``,
3742 ``d0-d7``, or ``q0-q3``.
- ``t``: A low floating-point/SIMD register: ``s0-s31``, ``d0-d15``, or
  ``q0-q7``.


Hexagon:
3748
3749- ``o``, ``v``: A memory address operand, treated the same as constraint ``m``,
3750 at the moment.
3751- ``r``: A 32 or 64-bit register.
3752
3753MSP430:
3754
3755- ``r``: An 8 or 16-bit register.
3756
3757MIPS:
3758
3759- ``I``: An immediate signed 16-bit integer.
3760- ``J``: An immediate integer zero.
3761- ``K``: An immediate unsigned 16-bit integer.
3762- ``L``: An immediate 32-bit integer, where the lower 16 bits are 0.
3763- ``N``: An immediate integer between -65535 and -1.
3764- ``O``: An immediate signed 15-bit integer.
3765- ``P``: An immediate integer between 1 and 65535.
3766- ``m``: A memory address operand. In MIPS-SE mode, allows a base address
3767 register plus 16-bit immediate offset. In MIPS mode, just a base register.
3768- ``R``: A memory address operand. In MIPS-SE mode, allows a base address
3769 register plus a 9-bit signed offset. In MIPS mode, the same as constraint
3770 ``m``.
3771- ``ZC``: A memory address operand, suitable for use in a ``pref``, ``ll``, or
3772 ``sc`` instruction on the given subtarget (details vary).
3773- ``r``, ``d``, ``y``: A 32 or 64-bit GPR register.
- ``f``: A 32 or 64-bit FPU register (``F0-F31``), or a 128-bit MSA register
  (``W0-W31``). In the case of MSA registers, it is recommended to use the ``w``
  argument modifier for compatibility with GCC.
- ``c``: A 32-bit or 64-bit GPR register suitable for indirect jump (always
  ``25``).
3779- ``l``: The ``lo`` register, 32 or 64-bit.
3780- ``x``: Invalid.
3781
3782NVPTX:
3783
3784- ``b``: A 1-bit integer register.
3785- ``c`` or ``h``: A 16-bit integer register.
3786- ``r``: A 32-bit integer register.
3787- ``l`` or ``N``: A 64-bit integer register.
3788- ``f``: A 32-bit float register.
3789- ``d``: A 64-bit float register.
3790
3791
3792PowerPC:
3793
3794- ``I``: An immediate signed 16-bit integer.
3795- ``J``: An immediate unsigned 16-bit integer, shifted left 16 bits.
3796- ``K``: An immediate unsigned 16-bit integer.
3797- ``L``: An immediate signed 16-bit integer, shifted left 16 bits.
3798- ``M``: An immediate integer greater than 31.
3799- ``N``: An immediate integer that is an exact power of 2.
3800- ``O``: The immediate integer constant 0.
3801- ``P``: An immediate integer constant whose negation is a signed 16-bit
3802 constant.
3803- ``es``, ``o``, ``Q``, ``Z``, ``Zy``: A memory address operand, currently
3804 treated the same as ``m``.
3805- ``r``: A 32 or 64-bit integer register.
3806- ``b``: A 32 or 64-bit integer register, excluding ``R0`` (that is:
3807 ``R1-R31``).
3808- ``f``: A 32 or 64-bit float register (``F0-F31``), or when QPX is enabled, a
3809 128 or 256-bit QPX register (``Q0-Q31``; aliases the ``F`` registers).
3810- ``v``: For ``4 x f32`` or ``4 x f64`` types, when QPX is enabled, a
3811 128 or 256-bit QPX register (``Q0-Q31``), otherwise a 128-bit
3812 altivec vector register (``V0-V31``).
3813
3814 .. FIXME: is this a bug that v accepts QPX registers? I think this
3815 is supposed to only use the altivec vector registers?
3816
3817- ``y``: Condition register (``CR0-CR7``).
3818- ``wc``: An individual CR bit in a CR register.
3819- ``wa``, ``wd``, ``wf``: Any 128-bit VSX vector register, from the full VSX
3820 register set (overlapping both the floating-point and vector register files).
- ``ws``: A 32 or 64-bit floating-point register, from the full VSX register
  set.
3823
3824Sparc:
3825
3826- ``I``: An immediate 13-bit signed integer.
3827- ``r``: A 32-bit integer register.
- ``f``: Any floating-point register on SparcV8, or a floating-point
  register in the "low" half of the registers on SparcV9.
- ``e``: Any floating-point register. (Same as ``f`` on SparcV8.)

SystemZ:
3833
3834- ``I``: An immediate unsigned 8-bit integer.
3835- ``J``: An immediate unsigned 12-bit integer.
3836- ``K``: An immediate signed 16-bit integer.
3837- ``L``: An immediate signed 20-bit integer.
3838- ``M``: An immediate integer 0x7fffffff.
- ``Q``: A memory address operand with a base address and a 12-bit immediate
  unsigned displacement.
3841- ``R``: A memory address operand with a base address, a 12-bit immediate
3842 unsigned displacement, and an index register.
3843- ``S``: A memory address operand with a base address and a 20-bit immediate
3844 signed displacement.
3845- ``T``: A memory address operand with a base address, a 20-bit immediate
3846 signed displacement, and an index register.
- ``r`` or ``d``: A 32, 64, or 128-bit integer register.
3848- ``a``: A 32, 64, or 128-bit integer address register (excludes R0, which in an
3849 address context evaluates as zero).
- ``h``: A 32-bit value in the high part of a 64-bit data register
  (LLVM-specific).
- ``f``: A 32, 64, or 128-bit floating-point register.

X86:
3855
3856- ``I``: An immediate integer between 0 and 31.
- ``J``: An immediate integer between 0 and 63.
3858- ``K``: An immediate signed 8-bit integer.
3859- ``L``: An immediate integer, 0xff or 0xffff or (in 64-bit mode only)
3860 0xffffffff.
3861- ``M``: An immediate integer between 0 and 3.
3862- ``N``: An immediate unsigned 8-bit integer.
3863- ``O``: An immediate integer between 0 and 127.
3864- ``e``: An immediate 32-bit signed integer.
3865- ``Z``: An immediate 32-bit unsigned integer.
3866- ``o``, ``v``: Treated the same as ``m``, at the moment.
3867- ``q``: An 8, 16, 32, or 64-bit register which can be accessed as an 8-bit
3868 ``l`` integer register. On X86-32, this is the ``a``, ``b``, ``c``, and ``d``
3869 registers, and on X86-64, it is all of the integer registers.
3870- ``Q``: An 8, 16, 32, or 64-bit register which can be accessed as an 8-bit
3871 ``h`` integer register. This is the ``a``, ``b``, ``c``, and ``d`` registers.
3872- ``r`` or ``l``: An 8, 16, 32, or 64-bit integer register.
3873- ``R``: An 8, 16, 32, or 64-bit "legacy" integer register -- one which has
3874 existed since i386, and can be accessed without the REX prefix.
3875- ``f``: A 32, 64, or 80-bit '387 FPU stack pseudo-register.
3876- ``y``: A 64-bit MMX register, if MMX is enabled.
- ``x``: If SSE is enabled: a 32 or 64-bit scalar operand, or 128-bit vector
  operand in an SSE register. If AVX is also enabled, can also be a 256-bit
  vector operand in an AVX register. If AVX-512 is also enabled, can also be a
  512-bit vector operand in an AVX512 register. Otherwise, an error.
3881- ``Y``: The same as ``x``, if *SSE2* is enabled, otherwise an error.
3882- ``A``: Special case: allocates EAX first, then EDX, for a single operand (in
3883 32-bit mode, a 64-bit integer operand will get split into two registers). It
3884 is not recommended to use this constraint, as in 64-bit mode, the 64-bit
3885 operand will get allocated only to RAX -- if two 32-bit operands are needed,
3886 you're better off splitting it yourself, before passing it to the asm
3887 statement.
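
A sketch of an X86 immediate-range constraint in use, assuming AT&T syntax;
``%x`` and ``%shifted`` are hypothetical values:

.. code-block:: llvm

    ; "I" accepts only a constant in the range 0-31, a valid 32-bit shift
    ; count. $1 prints as "$3" in AT&T syntax; the input tied to "0"
    ; (operand $2) carries %x in and the shifted result out.
    %shifted = call i32 asm "shll $1, $0", "=r,I,0,~{flags}"(i32 3, i32 %x)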
3888
3889XCore:
3890
3891- ``r``: A 32-bit integer register.
3892
3893
3894.. _inline-asm-modifiers:
3895
3896Asm template argument modifiers
3897^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
3898
3899In the asm template string, modifiers can be used on the operand reference, like
3900"``${0:n}``".
3901
3902The modifiers are, in general, expected to behave the same way they do in
3903GCC. LLVM's support is often implemented on an 'as-needed' basis, to support C
3904inline asm code which was supported by GCC. A mismatch in behavior between LLVM
3905and GCC likely indicates a bug in LLVM.
3906
3907Target-independent:
3908
- ``c``: Print an immediate integer constant unadorned, without
  the target-specific immediate punctuation (e.g. no ``$`` prefix).
3911- ``n``: Negate and print immediate integer constant unadorned, without the
3912 target-specific immediate punctuation (e.g. no ``$`` prefix).
3913- ``l``: Print as an unadorned label, without the target-specific label
3914 punctuation (e.g. no ``$`` prefix).
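
A sketch of the ``c`` and ``n`` modifiers, assuming an x86 target with AT&T
syntax, where a plain ``$0`` for an immediate would print with a ``$`` prefix:

.. code-block:: llvm

    ; ${0:c} prints the constant without the "$" prefix, which is what an
    ; assembler directive such as .long expects.
    call void asm sideeffect ".long ${0:c}", "i"(i32 42)

    ; ${0:n} prints the negated constant, again without the "$" prefix.
    call void asm sideeffect ".long ${0:n}", "i"(i32 42)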
3915
3916AArch64:
3917
3918- ``w``: Print a GPR register with a ``w*`` name instead of ``x*`` name. E.g.,
3919 instead of ``x30``, print ``w30``.
- ``x``: Print a GPR register with an ``x*`` name. (This is the default.)
3921- ``b``, ``h``, ``s``, ``d``, ``q``: Print a floating-point/SIMD register with a
3922 ``b*``, ``h*``, ``s*``, ``d*``, or ``q*`` name, rather than the default of
3923 ``v*``.
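
A sketch of the ``w`` modifier on an assumed AArch64 target; ``%a``, ``%b`` and
``%sum`` are hypothetical values:

.. code-block:: llvm

    ; Operands print with x* names by default; ":w" selects the 32-bit view,
    ; so this emits something like "add w0, w1, w2".
    %sum = call i32 asm "add ${0:w}, ${1:w}, ${2:w}", "=r,r,r"(i32 %a, i32 %b)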
3924
3925AMDGPU:
3926
3927- ``r``: No effect.
3928
3929ARM:
3930
3931- ``a``: Print an operand as an address (with ``[`` and ``]`` surrounding a
3932 register).
3933- ``P``: No effect.
3934- ``q``: No effect.
3935- ``y``: Print a VFP single-precision register as an indexed double (e.g. print
3936 as ``d4[1]`` instead of ``s9``)
3937- ``B``: Bitwise invert and print an immediate integer constant without ``#``
3938 prefix.
- ``L``: Print the low 16 bits of an immediate integer constant.
3940- ``M``: Print as a register set suitable for ldm/stm. Also prints *all*
3941 register operands subsequent to the specified one (!), so use carefully.
3942- ``Q``: Print the low-order register of a register-pair, or the low-order
3943 register of a two-register operand.
3944- ``R``: Print the high-order register of a register-pair, or the high-order
3945 register of a two-register operand.
- ``H``: Print the second register of a register-pair. (On a big-endian system,
  ``H`` is equivalent to ``Q``, and on a little-endian system, ``H`` is
  equivalent to ``R``.)
3949
3950 .. FIXME: H doesn't currently support printing the second register
3951 of a two-register operand.
3952
3953- ``e``: Print the low doubleword register of a NEON quad register.
3954- ``f``: Print the high doubleword register of a NEON quad register.
3955- ``m``: Print the base register of a memory operand without the ``[`` and ``]``
3956 adornment.
3957
3958Hexagon:
3959
3960- ``L``: Print the second register of a two-register operand. Requires that it
3961 has been allocated consecutively to the first.
3962
3963 .. FIXME: why is it restricted to consecutive ones? And there's
3964 nothing that ensures that happens, is there?
3965
3966- ``I``: Print the letter 'i' if the operand is an integer constant, otherwise
3967 nothing. Used to print 'addi' vs 'add' instructions.
3968
3969MSP430:
3970
3971No additional modifiers.
3972
3973MIPS:
3974
- ``X``: Print an immediate integer as hexadecimal.
- ``x``: Print the low 16 bits of an immediate integer as hexadecimal.
- ``d``: Print an immediate integer as decimal.
- ``m``: Subtract one and print an immediate integer as decimal.
- ``z``: Print ``$0`` if an immediate zero, otherwise print normally.
3980- ``L``: Print the low-order register of a two-register operand, or prints the
3981 address of the low-order word of a double-word memory operand.
3982
3983 .. FIXME: L seems to be missing memory operand support.
3984
3985- ``M``: Print the high-order register of a two-register operand, or prints the
3986 address of the high-order word of a double-word memory operand.
3987
3988 .. FIXME: M seems to be missing memory operand support.
3989
- ``D``: Print the second register of a two-register operand, or prints the
  second word of a double-word memory operand. (On a big-endian system, ``D`` is
  equivalent to ``L``, and on a little-endian system, ``D`` is equivalent to
  ``M``.)
- ``w``: No effect. Provided for compatibility with GCC which requires this
  modifier in order to print MSA registers (``W0-W31``) with the ``f``
  constraint.

NVPTX:
3999
4000- ``r``: No effect.
4001
4002PowerPC:
4003
4004- ``L``: Print the second register of a two-register operand. Requires that it
4005 has been allocated consecutively to the first.
4006
4007 .. FIXME: why is it restricted to consecutive ones? And there's
4008 nothing that ensures that happens, is there?
4009
4010- ``I``: Print the letter 'i' if the operand is an integer constant, otherwise
4011 nothing. Used to print 'addi' vs 'add' instructions.
- ``y``: For a memory operand, prints the operand formatted for a two-register
  X-form instruction. (Currently always prints ``r0,OPERAND``).
4014- ``U``: Prints 'u' if the memory operand is an update form, and nothing
4015 otherwise. (NOTE: LLVM does not support update form, so this will currently
4016 always print nothing)
4017- ``X``: Prints 'x' if the memory operand is an indexed form. (NOTE: LLVM does
4018 not support indexed form, so this will currently always print nothing)
4019
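The ``I`` modifier lets one template string serve both the register and the
immediate form of an instruction. The sketch below is illustrative only: the
function name and operand values are hypothetical, and the two calls differ
only in the constraint chosen for operand 2:

.. code-block:: llvm

    define i32 @ppc_modifier_example(i32 %a, i32 %b) {
      ; Operand 2 is a register, so "${2:I}" prints nothing and the
      ; mnemonic is "add".
      %x = call i32 asm "add${2:I} $0, $1, $2", "=r,r,r"(i32 %a, i32 %b)
      ; Operand 2 is an integer constant, so "${2:I}" prints "i" and the
      ; mnemonic becomes "addi".
      %y = call i32 asm "add${2:I} $0, $1, $2", "=r,r,I"(i32 %x, i32 42)
      ret i32 %y
    }
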
Sparc:

- ``r``: No effect.

SystemZ:

SystemZ implements only ``n``, and does *not* support any of the other
target-independent modifiers.

X86:

- ``c``: Print an unadorned integer or symbol name. (The latter is
  target-specific behavior for this typically target-independent modifier).
- ``A``: Print a register name with a '``*``' before it.
- ``b``: Print an 8-bit register name (e.g. ``al``); do nothing on a memory
  operand.
- ``h``: Print the upper 8-bit register name (e.g. ``ah``); do nothing on a
  memory operand.
- ``w``: Print the 16-bit register name (e.g. ``ax``); do nothing on a memory
  operand.
- ``k``: Print the 32-bit register name (e.g. ``eax``); do nothing on a memory
  operand.
- ``q``: Print the 64-bit register name (e.g. ``rax``), if 64-bit registers are
  available, otherwise the 32-bit register name; do nothing on a memory operand.
- ``n``: Negate and print an unadorned integer, or, for operands other than an
  immediate integer (e.g. a relocatable symbol expression), print a '-' before
  the operand. (The behavior for relocatable symbol expressions is a
  target-specific behavior for this typically target-independent modifier)
- ``H``: Print a memory reference with additional offset +8.
- ``P``: Print a memory reference or operand for use as the argument of a call
  instruction. (E.g. omit ``(rip)``, even though it's PC-relative.)

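The register-width modifiers select which name of an allocated register is
printed. In the sketch below, the function name and the ``q`` constraint are
illustrative choices; ``q`` is used so that the chosen register has an 8-bit
low part on both 32-bit and 64-bit targets:

.. code-block:: llvm

    define i32 @x86_modifier_example(i32 %x) {
      ; "${1:b}" prints the 8-bit name of whichever register holds %x, for
      ; example "al" if %x was assigned to eax. "$0" prints the full
      ; 32-bit name of the result register.
      %r = call i32 asm "movzbl ${1:b}, $0", "=r,q"(i32 %x)
      ret i32 %r
    }
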
XCore:

No additional modifiers.


Inline Asm Metadata
^^^^^^^^^^^^^^^^^^^

The call instructions that wrap inline asm nodes may have a
"``!srcloc``" MDNode attached to them that contains a list of constant
integers. If present, the code generator will use the integer as the
location cookie value when reporting errors through the ``LLVMContext``
error reporting mechanisms. This allows a front-end to correlate backend
errors that occur with inline asm back to the source code that produced
it. For example:

.. code-block:: llvm

    call void asm sideeffect "something bad", ""(), !srcloc !42
    ...
    !42 = !{ i32 1234567 }

It is up to the front-end to make sense of the magic numbers it places
in the IR. If the MDNode contains multiple constants, the code generator
will use the one that corresponds to the line of the asm that the error
occurs on.

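A sketch of the multiple-constant case, with hypothetical cookie values and a
placeholder asm string, might look like this:

.. code-block:: llvm

    ; The asm string has two lines; "\0A" encodes the newline between them.
    call void asm sideeffect "first line\0Asecond line", ""(), !srcloc !7
    ...
    ; One cookie per asm line. An error on the second line would be
    ; reported with the cookie 1002.
    !7 = !{ i32 1001, i32 1002 }
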
.. _metadata:

Metadata
========

LLVM IR allows metadata to be attached to instructions in the program
that can convey extra information about the code to the optimizers and
code generator. One example application of metadata is source-level
debug information. There are two metadata primitives: strings and nodes.

Metadata does not have a type, and is not a value. If referenced from a
``call`` instruction, it uses the ``metadata`` type.

All metadata are identified in syntax by an exclamation point ('``!``').

.. _metadata-string:

Metadata Nodes and Metadata Strings
-----------------------------------

A metadata string is a string surrounded by double quotes. It can
contain any character by escaping non-printable characters with
"``\xx``" where "``xx``" is the two digit hex code. For example:
"``!"test\00"``".

Metadata nodes are represented with notation similar to structure
constants (a comma separated list of elements, surrounded by braces and
preceded by an exclamation point). Metadata nodes can have any values as
their operands. For example:

.. code-block:: llvm

    !{ !"test\00", i32 10}

Metadata nodes that aren't uniqued use the ``distinct`` keyword. For example:

.. code-block:: text

    !0 = distinct !{!"test\00", i32 10}

``distinct`` nodes are useful when nodes shouldn't be merged based on their
content. They can also occur when transformations cause uniquing collisions
when metadata operands change.

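To illustrate the difference, the following sketch (node numbers are arbitrary)
places a uniqued node next to two ``distinct`` nodes with identical content:

.. code-block:: text

    ; Uniqued: nodes with the same content may be merged into one.
    !0 = !{!"test\00", i32 10}
    ; Distinct: each node keeps its own identity despite equal content.
    !1 = distinct !{!"test\00", i32 10}
    !2 = distinct !{!"test\00", i32 10}
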
A :ref:`named metadata <namedmetadatastructure>` is a collection of
metadata nodes, which can be looked up in the module symbol table. For
example:

.. code-block:: llvm

    !foo = !{!4, !3}

Metadata can be used as function arguments. Here the ``llvm.dbg.value``
intrinsic is using three metadata arguments:

.. code-block:: llvm

    call void @llvm.dbg.value(metadata !24, metadata !25, metadata !26)

Metadata can be attached to an instruction. Here metadata ``!21`` is attached
to the ``add`` instruction using the ``!dbg`` identifier:

.. code-block:: llvm

    %indvar.next = add i64 %indvar, 1, !dbg !21

Metadata can also be attached to a function or a global variable. Here metadata
``!22`` is attached to the ``f1`` and ``f2`` functions, and the globals ``g1``
and ``g2`` using the ``!dbg`` identifier:

.. code-block:: llvm

    declare !dbg !22 void @f1()
    define void @f2() !dbg !22 {
      ret void
    }

    @g1 = global i32 0, !dbg !22
    @g2 = external global i32, !dbg !22

A transformation is required to drop any metadata attachment that it does not
know or knows it can't preserve. Currently there is an exception for metadata
attachments to globals for ``!type`` and ``!absolute_symbol``, which can't be
unconditionally dropped unless the global is itself deleted.

Metadata attached to a module using named metadata may not be dropped, with
the exception of debug metadata (named metadata with the name ``!llvm.dbg.*``).

More information about specific metadata nodes recognized by the
optimizers and code generator is found below.

.. _specialized-metadata:

Specialized Metadata Nodes
^^^^^^^^^^^^^^^^^^^^^^^^^^

Specialized metadata nodes are custom data structures in metadata (as opposed
to generic tuples). Their fields are labelled, and can be specified in any
order.

These aren't inherently debug info centric, but currently all the specialized
metadata nodes are related to debug info.

.. _DICompileUnit:

DICompileUnit
"""""""""""""

``DICompileUnit`` nodes represent a compile unit. The ``enums:``,
``retainedTypes:``, ``globals:``, ``imports:`` and ``macros:`` fields are tuples
containing the debug info to be emitted along with the compile unit, regardless
of code optimizations (some nodes are only emitted if there are references to
them from instructions). The ``debugInfoForProfiling:`` field is a boolean
indicating whether or not line-table discriminators are updated to provide
more-accurate debug info for profiling results.

.. code-block:: text

    !0 = !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang",
                        isOptimized: true, flags: "-O2", runtimeVersion: 2,
                        splitDebugFilename: "abc.debug", emissionKind: FullDebug,
                        enums: !2, retainedTypes: !3, globals: !4, imports: !5,
                        macros: !6, dwoId: 0x0abcd)

Compile unit descriptors provide the root scope for objects declared in a
specific compilation unit. File descriptors are defined using this scope. These
descriptors are collected by a named metadata node ``!llvm.dbg.cu``. They keep
track of global variables, type information, and imported entities (declarations
and namespaces).

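As a minimal sketch of that collection, a module with a single compile unit
could list it as follows. The file name and directory are placeholders, and
only a few of the compile unit's fields are shown:

.. code-block:: text

    ; The named node lists every compile unit in the module.
    !llvm.dbg.cu = !{!0}

    !0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1,
                                 producer: "clang", emissionKind: FullDebug)
    !1 = !DIFile(filename: "a.c", directory: "/path/to/dir")
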
.. _DIFile:

DIFile
""""""

``DIFile`` nodes represent files. The ``filename:`` can include slashes.

.. code-block:: none

    !0 = !DIFile(filename: "path/to/file", directory: "/path/to/dir",
                 checksumkind: CSK_MD5,
                 checksum: "000102030405060708090a0b0c0d0e0f")

Files are sometimes used in ``scope:`` fields, and are the only valid target
for ``file:`` fields.

Valid values for the ``checksumkind:`` field are ``CSK_None``, ``CSK_MD5``, and
``CSK_SHA1``.

.. _DIBasicType:

DIBasicType
"""""""""""

``DIBasicType`` nodes represent primitive types, such as ``int``, ``bool`` and
``float``. ``tag:`` defaults to ``DW_TAG_base_type``.

.. code-block:: text

    !0 = !DIBasicType(name: "unsigned char", size: 8, align: 8,
                      encoding: DW_ATE_unsigned_char)
    !1 = !DIBasicType(tag: DW_TAG_unspecified_type, name: "decltype(nullptr)")

The ``encoding:`` describes the details of the type. Usually it's one of the
following:

.. code-block:: text

    DW_ATE_address       = 1
    DW_ATE_boolean       = 2
    DW_ATE_float         = 4
    DW_ATE_signed        = 5
    DW_ATE_signed_char   = 6
    DW_ATE_unsigned      = 7
    DW_ATE_unsigned_char = 8

.. _DISubroutineType:

DISubroutineType
""""""""""""""""

``DISubroutineType`` nodes represent subroutine types. Their ``types:`` field
refers to a tuple; the first operand is the return type, while the rest are the
types of the formal arguments in order. If the first operand is ``null``, that
represents a function with no return value (such as ``void foo() {}`` in C++).

.. code-block:: text

    !0 = !DIBasicType(name: "int", size: 32, align: 32, encoding: DW_ATE_signed)
    !1 = !DIBasicType(name: "char", size: 8, align: 8,
                      encoding: DW_ATE_signed_char)
    !2 = !DISubroutineType(types: !{null, !0, !1}) ; void (int, char)

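For comparison, a sketch of a subroutine type that does return a value, using
hypothetical type nodes, places the return type in the first ``types:``
operand instead of ``null``:

.. code-block:: text

    !0 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !1 = !DIBasicType(name: "char", size: 8, encoding: DW_ATE_signed_char)
    ; First operand is the return type; the rest are the parameters.
    !2 = !DISubroutineType(types: !{!0, !1, !1}) ; int (char, char)
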
.. _DIDerivedType:

DIDerivedType
"""""""""""""

``DIDerivedType`` nodes represent types derived from other types, such as
qualified types.

.. code-block:: text

    !0 = !DIBasicType(name: "unsigned char", size: 8, align: 8,
                      encoding: DW_ATE_unsigned_char)
    !1 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !0, size: 32,
                        align: 32)

The following ``tag:`` values are valid:

.. code-block:: text

    DW_TAG_member             = 13
    DW_TAG_pointer_type       = 15
    DW_TAG_reference_type     = 16
    DW_TAG_typedef            = 22
    DW_TAG_inheritance        = 28
    DW_TAG_ptr_to_member_type = 31
    DW_TAG_const_type         = 38
    DW_TAG_friend             = 42
    DW_TAG_volatile_type      = 53
    DW_TAG_restrict_type      = 55
    DW_TAG_atomic_type        = 71

.. _DIDerivedTypeMember:

``DW_TAG_member`` is used to define a member of a :ref:`composite type
<DICompositeType>`. The type of the member is the ``baseType:``. The
``offset:`` is the member's bit offset. If the composite type has an ODR
``identifier:`` and does not set ``flags: DIFlagFwdDecl``, then the member is
uniqued based only on its ``name:`` and ``scope:``.

``DW_TAG_inheritance`` and ``DW_TAG_friend`` are used in the ``elements:``
field of :ref:`composite types <DICompositeType>` to describe parents and
friends.

``DW_TAG_typedef`` is used to provide a name for the ``baseType:``.

``DW_TAG_pointer_type``, ``DW_TAG_reference_type``, ``DW_TAG_const_type``,
``DW_TAG_volatile_type``, ``DW_TAG_restrict_type`` and ``DW_TAG_atomic_type``
are used to qualify the ``baseType:``.

Note that the ``void *`` type is expressed as a pointer type whose
``baseType:`` is ``null``.

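The following sketch shows a few of these tags chained together. The type
names, sizes, file node, and line number are hypothetical and chosen only for
illustration:

.. code-block:: text

    !0 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    ; const int
    !1 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !0)
    ; typedef const int cint;
    !2 = !DIDerivedType(tag: DW_TAG_typedef, name: "cint", file: !4, line: 3,
                        baseType: !1)
    ; void *: a pointer type with a null base type
    !3 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: null, size: 64)
    !4 = !DIFile(filename: "types.c", directory: "/path/to/dir")
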
.. _DICompositeType:

DICompositeType
"""""""""""""""

``DICompositeType`` nodes represent types composed of other types, like
structures and unions. ``elements:`` points to a tuple of the composed types.

If the source language supports ODR, the ``identifier:`` field gives the unique
identifier used for type merging between modules. When specified,
:ref:`subprogram declarations <DISubprogramDeclaration>` and :ref:`member
derived types <DIDerivedTypeMember>` that reference the ODR-type in their
``scope:`` change uniquing rules.

For a given ``identifier:``, there should only be a single composite type that
does not have ``flags: DIFlagFwdDecl`` set. LLVM tools that link modules
together will unique such definitions at parse time via the ``identifier:``
field, even if the nodes are ``distinct``.

.. code-block:: text

    !0 = !DIEnumerator(name: "SixKind", value: 6)
    !1 = !DIEnumerator(name: "SevenKind", value: 7)
    !2 = !DIEnumerator(name: "NegEightKind", value: -8)
    !3 = !DICompositeType(tag: DW_TAG_enumeration_type, name: "Enum", file: !12,
                          line: 2, size: 32, align: 32, identifier: "_M4Enum",
                          elements: !{!0, !1, !2})

The following ``tag:`` values are valid:

.. code-block:: text

    DW_TAG_array_type       = 1
    DW_TAG_class_type       = 2
    DW_TAG_enumeration_type = 4
    DW_TAG_structure_type   = 19
    DW_TAG_union_type       = 23

For ``DW_TAG_array_type``, the ``elements:`` should be :ref:`subrange
descriptors <DISubrange>`, each representing the range of subscripts at that
level of indexing. The ``DIFlagVector`` flag to ``flags:`` indicates that an
array type is a native packed vector.

For ``DW_TAG_enumeration_type``, the ``elements:`` should be :ref:`enumerator
descriptors <DIEnumerator>`, each representing the definition of an enumeration
value for the set. All enumeration type descriptors are collected in the
``enums:`` field of the :ref:`compile unit <DICompileUnit>`.

For ``DW_TAG_structure_type``, ``DW_TAG_class_type``, and
``DW_TAG_union_type``, the ``elements:`` should be :ref:`derived types
<DIDerivedType>` with ``tag: DW_TAG_member``, ``tag: DW_TAG_inheritance``, or
``tag: DW_TAG_friend``; or :ref:`subprograms <DISubprogram>` with
``isDefinition: false``.

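As an illustrative sketch of the structure and array cases, the nodes below
describe a hypothetical ``struct Point { int x; int y; }`` and an ``int[5]``
array. All names, sizes, and line numbers are assumptions made for the example:

.. code-block:: text

    !0 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !1 = !DIFile(filename: "point.c", directory: "/path/to/dir")

    ; struct Point: the elements are DW_TAG_member derived types.
    !2 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "Point",
                                   file: !1, line: 1, size: 64,
                                   elements: !{!3, !4})
    !3 = !DIDerivedType(tag: DW_TAG_member, name: "x", scope: !2, file: !1,
                        line: 1, baseType: !0, size: 32)
    !4 = !DIDerivedType(tag: DW_TAG_member, name: "y", scope: !2, file: !1,
                        line: 1, baseType: !0, size: 32, offset: 32)

    ; int[5]: the single element is a subrange for the one array dimension.
    !5 = !DICompositeType(tag: DW_TAG_array_type, baseType: !0, size: 160,
                          elements: !{!6})
    !6 = !DISubrange(count: 5, lowerBound: 0)
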
.. _DISubrange:

DISubrange
""""""""""

``DISubrange`` nodes are the elements for ``DW_TAG_array_type`` variants of
:ref:`DICompositeType`.

- ``count: -1`` indicates an empty array.
- ``count: !10`` describes the count with a :ref:`DILocalVariable`.
- ``count: !12`` describes the count with a :ref:`DIGlobalVariable`.

.. code-block:: llvm

    !0 = !DISubrange(count: 5, lowerBound: 0) ; array counting from 0
    !1 = !DISubrange(count: 5, lowerBound: 1) ; array counting from 1
    !2 = !DISubrange(count: -1) ; empty array.

    ; Scopes used in rest of example
    !6 = !DIFile(filename: "vla.c", directory: "/path/to/file")
    !7 = distinct !DICompileUnit(language: DW_LANG_C99, ...
    !8 = distinct !DISubprogram(name: "foo", scope: !7, file: !6, line: 5, ...

    ; Use of local variable as count value
    !9 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !10 = !DILocalVariable(name: "count", scope: !8, file: !6, line: 42, type: !9)
    !11 = !DISubrange(count: !10, lowerBound: 0)

    ; Use of global variable as count value
    !12 = !DIGlobalVariable(name: "count", scope: !8, file: !6, line: 22, type: !9)
    !13 = !DISubrange(count: !12, lowerBound: 0)

.. _DIEnumerator:

DIEnumerator
""""""""""""

``DIEnumerator`` nodes are the elements for ``DW_TAG_enumeration_type``
variants of :ref:`DICompositeType`.

.. code-block:: llvm

    !0 = !DIEnumerator(name: "SixKind", value: 6)
    !1 = !DIEnumerator(name: "SevenKind", value: 7)
    !2 = !DIEnumerator(name: "NegEightKind", value: -8)

DITemplateTypeParameter
"""""""""""""""""""""""

``DITemplateTypeParameter`` nodes represent type parameters to generic source
language constructs. They are used (optionally) in :ref:`DICompositeType` and
:ref:`DISubprogram` ``templateParams:`` fields.

.. code-block:: llvm

    !0 = !DITemplateTypeParameter(name: "Ty", type: !1)

DITemplateValueParameter
""""""""""""""""""""""""

``DITemplateValueParameter`` nodes represent value parameters to generic source
language constructs. ``tag:`` defaults to ``DW_TAG_template_value_parameter``,
but if specified can also be set to ``DW_TAG_GNU_template_template_param`` or
``DW_TAG_GNU_template_param_pack``. They are used (optionally) in
:ref:`DICompositeType` and :ref:`DISubprogram` ``templateParams:`` fields.

.. code-block:: llvm

    !0 = !DITemplateValueParameter(name: "Ty", type: !1, value: i32 7)

DINamespace
"""""""""""

``DINamespace`` nodes represent namespaces in the source language.

.. code-block:: llvm

    !0 = !DINamespace(name: "myawesomeproject", scope: !1, file: !2, line: 7)

.. _DIGlobalVariable:

DIGlobalVariable
""""""""""""""""

``DIGlobalVariable`` nodes represent global variables in the source language.

.. code-block:: llvm

    !0 = !DIGlobalVariable(name: "foo", linkageName: "foo", scope: !1,
                           file: !2, line: 7, type: !3, isLocal: true,
                           isDefinition: false, variable: i32* @foo,
                           declaration: !4)

All global variables should be referenced by the ``globals:`` field of a
:ref:`compile unit <DICompileUnit>`.

.. _DISubprogram:

DISubprogram
""""""""""""

``DISubprogram`` nodes represent functions from the source language. A
``DISubprogram`` may be attached to a function definition using ``!dbg``
metadata. The ``variables:`` field points at :ref:`variables <DILocalVariable>`
that must be retained, even if their IR counterparts are optimized out of
the IR. The ``type:`` field must point at a :ref:`DISubroutineType`.

.. _DISubprogramDeclaration:

When ``isDefinition: false``, subprograms describe a declaration in the type
tree as opposed to a definition of a function. If the scope is a composite
type with an ODR ``identifier:`` that does not set ``flags: DIFlagFwdDecl``,
then the subprogram declaration is uniqued based only on its ``linkageName:``
and ``scope:``.

.. code-block:: text

    define void @_Z3foov() !dbg !0 {
      ...
    }

    !0 = distinct !DISubprogram(name: "foo", linkageName: "_Z3foov", scope: !1,
                                file: !2, line: 7, type: !3, isLocal: true,
                                isDefinition: true, scopeLine: 8,
                                containingType: !4,
                                virtuality: DW_VIRTUALITY_pure_virtual,
                                virtualIndex: 10, flags: DIFlagPrototyped,
                                isOptimized: true, unit: !5, templateParams: !6,
                                declaration: !7, variables: !8, thrownTypes: !9)

.. _DILexicalBlock:

DILexicalBlock
""""""""""""""

``DILexicalBlock`` nodes describe nested blocks within a :ref:`subprogram
<DISubprogram>`. The line number and column numbers are used to distinguish
two lexical blocks at the same depth. They are valid targets for ``scope:``
fields.

.. code-block:: text

    !0 = distinct !DILexicalBlock(scope: !1, file: !2, line: 7, column: 35)

Usually lexical blocks are ``distinct`` to prevent node merging based on
operands.

Duncan P. N. Exon Smitha9308c42015-04-29 16:38:44 +00004521.. _DILexicalBlockFile:
Duncan P. N. Exon Smithe2741802015-03-03 17:24:31 +00004522
Duncan P. N. Exon Smitha9308c42015-04-29 16:38:44 +00004523DILexicalBlockFile
Duncan P. N. Exon Smithe2741802015-03-03 17:24:31 +00004524""""""""""""""""""
4525
Duncan P. N. Exon Smitha9308c42015-04-29 16:38:44 +00004526``DILexicalBlockFile`` nodes are used to discriminate between sections of a
Sean Silvaa1190322015-08-06 22:56:48 +00004527:ref:`lexical block <DILexicalBlock>`. The ``file:`` field can be changed to
Duncan P. N. Exon Smithe2741802015-03-03 17:24:31 +00004528indicate textual inclusion, or the ``discriminator:`` field can be used to
4529discriminate between control flow within a single block in the source language.
4530
4531.. code-block:: llvm
4532
Duncan P. N. Exon Smitha9308c42015-04-29 16:38:44 +00004533 !0 = !DILexicalBlock(scope: !3, file: !4, line: 7, column: 35)
4534 !1 = !DILexicalBlockFile(scope: !0, file: !4, discriminator: 0)
4535 !2 = !DILexicalBlockFile(scope: !0, file: !4, discriminator: 1)

.. _DILocation:

DILocation
""""""""""

``DILocation`` nodes represent source debug locations. The ``scope:`` field is
mandatory, and points at a :ref:`DILexicalBlockFile`, a
:ref:`DILexicalBlock`, or a :ref:`DISubprogram`.

.. code-block:: llvm

    !0 = !DILocation(line: 2900, column: 42, scope: !1, inlinedAt: !2)
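
As an illustrative sketch, a location is attached to an instruction through
``!dbg``. For an instruction that was inlined from another function,
``inlinedAt:`` refers to the location of the call site. The value names and
metadata numbers below are hypothetical, and the scope nodes ``!20`` and
``!21`` are assumed to be :ref:`DISubprogram` nodes defined elsewhere:

.. code-block:: llvm

    ; Originally at line 3, column 12 of the callee (!20); inlined at the
    ; call on line 42, column 5 of the caller (!21).
    %sum = add i32 %a, %b, !dbg !10
    ...
    !10 = !DILocation(line: 3, column: 12, scope: !20, inlinedAt: !11)
    !11 = !DILocation(line: 42, column: 5, scope: !21)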

.. _DILocalVariable:

DILocalVariable
"""""""""""""""

``DILocalVariable`` nodes represent local variables in the source language. If
the ``arg:`` field is set to non-zero, then this variable is a subprogram
parameter, and it will be included in the ``variables:`` field of its
:ref:`DISubprogram`.

.. code-block:: text

    !0 = !DILocalVariable(name: "this", arg: 1, scope: !3, file: !2, line: 7,
                          type: !3, flags: DIFlagArtificial)
    !1 = !DILocalVariable(name: "x", arg: 2, scope: !4, file: !2, line: 7,
                          type: !3)
    !2 = !DILocalVariable(name: "y", scope: !5, file: !2, line: 7, type: !3)

DIExpression
""""""""""""

``DIExpression`` nodes represent expressions that are inspired by the DWARF
expression language. They are used in :ref:`debug intrinsics<dbg_intrinsics>`
(such as ``llvm.dbg.declare`` and ``llvm.dbg.value``) to describe how the
referenced LLVM variable relates to the source language variable.

The currently supported vocabulary is limited:

- ``DW_OP_deref`` dereferences the top of the expression stack.
- ``DW_OP_plus`` pops the last two entries from the expression stack, adds
  them together and appends the result to the expression stack.
- ``DW_OP_minus`` pops the last two entries from the expression stack, subtracts
  the last entry from the second last entry and appends the result to the
  expression stack.
- ``DW_OP_plus_uconst, 93`` adds ``93`` to the working expression.
- ``DW_OP_LLVM_fragment, 16, 8`` specifies the offset and size (``16`` and ``8``
  here, respectively) of the variable fragment from the working expression. Note
  that contrary to ``DW_OP_bit_piece``, the offset describes the location
  within the described source variable.
- ``DW_OP_swap`` swaps the top two stack entries.
- ``DW_OP_xderef`` provides an extended dereference mechanism. The entry at the
  top of the stack is treated as an address. The second stack entry is treated
  as an address space identifier.
- ``DW_OP_stack_value`` marks a constant value.

DWARF specifies three kinds of simple location descriptions: register, memory,
and implicit location descriptions. Register and memory location descriptions
describe the *location* of a source variable (in the sense that a debugger might
modify its value), whereas implicit locations describe merely the *value* of a
source variable. DIExpressions also follow this model: a DIExpression that
doesn't have a trailing ``DW_OP_stack_value`` will describe an *address* when
combined with a concrete location.

.. code-block:: text

    !0 = !DIExpression(DW_OP_deref)
    !1 = !DIExpression(DW_OP_plus_uconst, 3)
    !2 = !DIExpression(DW_OP_constu, 3, DW_OP_plus)
    !3 = !DIExpression(DW_OP_bit_piece, 3, 7)
    !4 = !DIExpression(DW_OP_deref, DW_OP_constu, 3, DW_OP_plus, DW_OP_LLVM_fragment, 3, 7)
    !5 = !DIExpression(DW_OP_constu, 2, DW_OP_swap, DW_OP_xderef)
    !6 = !DIExpression(DW_OP_constu, 42, DW_OP_stack_value)
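
The following sketch shows how such expressions might appear in calls to the
debug intrinsics. The value names, the variable nodes ``!10`` to ``!12`` and
the location ``!13`` are hypothetical. The three-operand form of the
intrinsics is assumed, as is a fragment offset and size measured in bits:

.. code-block:: llvm

    ; "a" lives in memory at the address held in %a.addr.
    call void @llvm.dbg.declare(metadata i32* %a.addr, metadata !10,
                                metadata !DIExpression()), !dbg !13
    ; The value of "b" is %x + 1; DW_OP_stack_value marks the result as a
    ; value rather than an address.
    call void @llvm.dbg.value(metadata i32 %x, metadata !11,
                              metadata !DIExpression(DW_OP_plus_uconst, 1, DW_OP_stack_value)),
                              !dbg !13
    ; %lo holds bits [0, 32) of the 64-bit variable "c".
    call void @llvm.dbg.value(metadata i32 %lo, metadata !12,
                              metadata !DIExpression(DW_OP_LLVM_fragment, 0, 32)), !dbg !13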

DIObjCProperty
""""""""""""""

``DIObjCProperty`` nodes represent Objective-C property nodes.

.. code-block:: llvm

    !3 = !DIObjCProperty(name: "foo", file: !1, line: 7, setter: "setFoo",
                         getter: "getFoo", attributes: 7, type: !2)

DIImportedEntity
""""""""""""""""

``DIImportedEntity`` nodes represent entities (such as modules) imported into a
compile unit.

.. code-block:: text

    !2 = !DIImportedEntity(tag: DW_TAG_imported_module, name: "foo", scope: !0,
                           entity: !1, line: 7)

DIMacro
"""""""

``DIMacro`` nodes represent the definition or undefinition of a macro
identifier. The ``name:`` field is the macro identifier, followed by macro
parameters when defining a function-like macro, and the ``value:`` field is the
token-string used to expand the macro identifier.

.. code-block:: text

    !2 = !DIMacro(macinfo: DW_MACINFO_define, line: 7, name: "foo(x)",
                  value: "((x) + 1)")
    !3 = !DIMacro(macinfo: DW_MACINFO_undef, line: 30, name: "foo")

DIMacroFile
"""""""""""

``DIMacroFile`` nodes represent the inclusion of source files.
The ``nodes:`` field is a list of ``DIMacro`` and ``DIMacroFile`` nodes that
appear in the included source file.

.. code-block:: text

    !2 = !DIMacroFile(macinfo: DW_MACINFO_start_file, line: 7, file: !2,
                      nodes: !3)

'``tbaa``' Metadata
^^^^^^^^^^^^^^^^^^^

In LLVM IR, memory does not have types, so LLVM's own type system is not
suitable for doing type based alias analysis (TBAA). Instead, metadata is
added to the IR to describe a type system of a higher level language. This
can be used to implement C/C++ strict type aliasing rules, but it can also
be used to implement custom alias analysis behavior for other languages.

This description of LLVM's TBAA system is broken into two parts:
:ref:`Semantics<tbaa_node_semantics>` talks about high level issues, and
:ref:`Representation<tbaa_node_representation>` talks about the metadata
encoding of various entities.

It is always possible to trace any TBAA node to a "root" TBAA node (details
in the :ref:`Representation<tbaa_node_representation>` section). TBAA
nodes with different roots have an unknown aliasing relationship, and LLVM
conservatively infers ``MayAlias`` between them. The rules mentioned in
this section only pertain to TBAA nodes living under the same root.

.. _tbaa_node_semantics:

Semantics
"""""""""

The TBAA metadata system, referred to as "struct path TBAA" (not to be
confused with ``tbaa.struct``), consists of the following high level
concepts: *Type Descriptors*, further subdivided into scalar type
descriptors and struct type descriptors; and *Access Tags*.

**Type descriptors** describe the type system of the higher level language
being compiled. **Scalar type descriptors** describe types that do not
contain other types. Each scalar type has a parent type, which must also
be a scalar type or the TBAA root. Via this parent relation, scalar types
within a TBAA root form a tree. **Struct type descriptors** denote types
that contain a sequence of other type descriptors, at known offsets. These
contained type descriptors can either be struct type descriptors themselves
or scalar type descriptors.

**Access tags** are metadata nodes attached to load and store instructions.
Access tags use type descriptors to describe the *location* being accessed
in terms of the type system of the higher level language. Access tags are
tuples consisting of a base type, an access type and an offset. The base
type is a scalar type descriptor or a struct type descriptor, the access
type is a scalar type descriptor, and the offset is a constant integer.

The access tag ``(BaseTy, AccessTy, Offset)`` can describe one of two
things:

 * If ``BaseTy`` is a struct type, the tag describes a memory access (load
   or store) of a value of type ``AccessTy`` contained in the struct type
   ``BaseTy`` at offset ``Offset``.

 * If ``BaseTy`` is a scalar type, ``Offset`` must be 0 and ``BaseTy`` and
   ``AccessTy`` must be the same; and the access tag describes a scalar
   access with scalar type ``AccessTy``.

We first define an ``ImmediateParent`` relation on ``(BaseTy, Offset)``
tuples this way:

 * If ``BaseTy`` is a scalar type then ``ImmediateParent(BaseTy, 0)`` is
   ``(ParentTy, 0)`` where ``ParentTy`` is the parent of the scalar type as
   described in the TBAA metadata. ``ImmediateParent(BaseTy, Offset)`` is
   undefined if ``Offset`` is non-zero.

 * If ``BaseTy`` is a struct type then ``ImmediateParent(BaseTy, Offset)``
   is ``(NewTy, NewOffset)`` where ``NewTy`` is the type contained in
   ``BaseTy`` at offset ``Offset`` and ``NewOffset`` is ``Offset`` adjusted
   to be relative within that inner type.

A memory access with an access tag ``(BaseTy1, AccessTy1, Offset1)``
aliases a memory access with an access tag ``(BaseTy2, AccessTy2,
Offset2)`` if either ``(BaseTy1, Offset1)`` is reachable from ``(BaseTy2,
Offset2)`` via the ``ImmediateParent`` relation or vice versa.

As a concrete example, the type descriptor graph for the following program

.. code-block:: c

    struct Inner {
      int i;    // offset 0
      float f;  // offset 4
    };

    struct Outer {
      float f;  // offset 0
      double d;  // offset 4
      struct Inner inner_a;  // offset 12
    };

    void f(struct Outer* outer, struct Inner* inner, float* f, int* i, char* c) {
      outer->f = 0;            // tag0: (OuterStructTy, FloatScalarTy, 0)
      outer->inner_a.i = 0;    // tag1: (OuterStructTy, IntScalarTy, 12)
      outer->inner_a.f = 0.0;  // tag2: (OuterStructTy, FloatScalarTy, 16)
      *f = 0.0;                // tag3: (FloatScalarTy, FloatScalarTy, 0)
    }

is (note that in C and C++, ``char`` can be used to access any arbitrary
type):

.. code-block:: text

    Root = "TBAA Root"
    CharScalarTy = ("char", Root, 0)
    FloatScalarTy = ("float", CharScalarTy, 0)
    DoubleScalarTy = ("double", CharScalarTy, 0)
    IntScalarTy = ("int", CharScalarTy, 0)
    InnerStructTy = {"Inner", (IntScalarTy, 0), (FloatScalarTy, 4)}
    OuterStructTy = {"Outer", (FloatScalarTy, 0), (DoubleScalarTy, 4),
                     (InnerStructTy, 12)}


with (e.g.) ``ImmediateParent(OuterStructTy, 12)`` = ``(InnerStructTy,
0)``, ``ImmediateParent(InnerStructTy, 0)`` = ``(IntScalarTy, 0)``, and
``ImmediateParent(IntScalarTy, 0)`` = ``(CharScalarTy, 0)``.
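
As a worked sketch in the same informal notation, the aliasing rule can be
applied to the access tags of the example program. Here ``->`` denotes one
``ImmediateParent`` step, and the chains are followed only as far as
``CharScalarTy``:

.. code-block:: text

    tag1: (OuterStructTy, 12) -> (InnerStructTy, 0) -> (IntScalarTy, 0)
                              -> (CharScalarTy, 0)
    tag2: (OuterStructTy, 16) -> (InnerStructTy, 4) -> (FloatScalarTy, 0)
                              -> (CharScalarTy, 0)
    tag3: (FloatScalarTy, 0)  -> (CharScalarTy, 0)

    tag2 vs. tag3: (FloatScalarTy, 0) is reachable from (OuterStructTy, 16),
                   so the two accesses alias.
    tag1 vs. tag3: neither starting tuple lies on the other's chain,
                   so the two accesses do not alias.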

.. _tbaa_node_representation:

Representation
""""""""""""""

The root node of a TBAA type hierarchy is an ``MDNode`` with 0 operands or
with exactly one ``MDString`` operand.

Scalar type descriptors are represented as ``MDNode`` s with two
operands. The first operand is an ``MDString`` denoting the name of the
scalar type. LLVM does not assign meaning to the value of this operand; it
only cares about it being an ``MDString``. The second operand is an
``MDNode`` which points to the parent for said scalar type descriptor,
which is either another scalar type descriptor or the TBAA root. Scalar
type descriptors can have an optional third operand, but that must be the
constant integer zero.

Struct type descriptors are represented as ``MDNode`` s with an odd number
of operands greater than 1. The first operand is an ``MDString`` denoting
the name of the struct type. As with scalar type descriptors, the actual
value of this name operand is irrelevant to LLVM. After the name operand,
the struct type descriptors have a sequence of alternating ``MDNode`` and
``ConstantInt`` operands. With N starting from 1, the 2N - 1 th operand,
an ``MDNode``, denotes a contained field, and the 2N th operand, a
``ConstantInt``, is the offset of that contained field. The offsets
must be in non-decreasing order.

Access tags are represented as ``MDNode`` s with either 3 or 4 operands.
The first operand is an ``MDNode`` pointing to the node representing the
base type. The second operand is an ``MDNode`` pointing to the node
representing the access type. The third operand is a ``ConstantInt`` that
states the offset of the access. If a fourth operand is present, it must be
a ``ConstantInt`` valued at 0 or 1. If it is 1 then the access tag states
that the location being accessed is "constant" (meaning
``pointsToConstantMemory`` should return true; see `other useful
AliasAnalysis methods <AliasAnalysis.html#OtherItfs>`_). The TBAA root of
the access type and the base type of an access tag must be the same, and
that is the TBAA root of the access tag.
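
As a sketch, the type descriptor graph from the
:ref:`Semantics<tbaa_node_semantics>` example could be encoded as follows. The
metadata numbering and the pointer names in the stores are arbitrary choices
made for illustration:

.. code-block:: llvm

    ; Stores carrying tag2 and tag3:
    store float 0.0, float* %inner_a.f, align 4, !tbaa !9
    store float 0.0, float* %f, align 4, !tbaa !10
    ...
    !0 = !{!"TBAA Root"}                                ; root
    !1 = !{!"char", !0, i64 0}                          ; CharScalarTy
    !2 = !{!"float", !1, i64 0}                         ; FloatScalarTy
    !3 = !{!"double", !1, i64 0}                        ; DoubleScalarTy
    !4 = !{!"int", !1, i64 0}                           ; IntScalarTy
    !5 = !{!"Inner", !4, i64 0, !2, i64 4}              ; InnerStructTy
    !6 = !{!"Outer", !2, i64 0, !3, i64 4, !5, i64 12}  ; OuterStructTy

    ; Access tags: (base type, access type, offset)
    !7 = !{!6, !2, i64 0}    ; tag0
    !8 = !{!6, !4, i64 12}   ; tag1
    !9 = !{!6, !2, i64 16}   ; tag2
    !10 = !{!2, !2, i64 0}   ; tag3

Each scalar descriptor here uses the optional third operand (zero), and the
struct descriptors have 5 and 7 operands respectively, matching the
odd-operand-count rule above.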

'``tbaa.struct``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^

The :ref:`llvm.memcpy <int_memcpy>` intrinsic is often used to implement
aggregate assignment operations in C and similar languages. However, it
is defined to copy a contiguous region of memory, which is more than
strictly necessary for aggregate types that contain holes due to
padding. Also, it doesn't contain any TBAA information about the fields
of the aggregate.

``!tbaa.struct`` metadata can describe which memory subregions in a
memcpy are padding and what the TBAA tags of the struct are.

The current metadata format is very simple. ``!tbaa.struct`` metadata
nodes are a list of operands which are in conceptual groups of three.
For each group of three, the first operand gives the offset of a
field in bytes, the second gives its size in bytes, and the third gives
its TBAA tag. For example:

.. code-block:: llvm

    !4 = !{ i64 0, i64 4, !1, i64 8, i64 4, !2 }

This describes a struct with two fields. The first is at offset 0 bytes
with size 4 bytes, and has TBAA tag ``!1``. The second is at offset 8 bytes,
has size 4 bytes, and has TBAA tag ``!2``.

Note that the fields need not be contiguous. In this example, there is a
4 byte gap between the two fields. This gap represents padding which
does not carry useful data and need not be preserved.
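
As a sketch, a 12-byte aggregate copy described by the ``!4`` node above
might look like this. The pointer names are hypothetical, and the form of
``llvm.memcpy`` that takes alignment as parameter attributes is assumed; older
forms of the intrinsic take an explicit alignment operand instead:

.. code-block:: llvm

    call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %dst, i8* align 4 %src,
                                         i64 12, i1 false), !tbaa.struct !4
    ...
    !4 = !{ i64 0, i64 4, !1, i64 8, i64 4, !2 }

Here bytes 4 to 7 of the copied region are the padding gap, so the copy need
not preserve them.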

'``noalias``' and '``alias.scope``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``noalias`` and ``alias.scope`` metadata provide the ability to specify generic
noalias memory-access sets. This means that some collection of memory access
instructions (loads, stores, memory-accessing calls, etc.) that carry
``noalias`` metadata can be declared not to alias with some other
collection of memory access instructions that carry ``alias.scope`` metadata.
Each type of metadata specifies a list of scopes where each scope has an id and
a domain.

When evaluating an aliasing query, if for some domain, the set
of scopes with that domain in one instruction's ``alias.scope`` list is a
subset of (or equal to) the set of scopes for that domain in another
instruction's ``noalias`` list, then the two memory accesses are assumed not to
alias.

Because scopes in one domain don't affect scopes in other domains, separate
domains can be used to compose multiple independent noalias sets. This is
used, for example, during inlining. As the noalias function parameters are
turned into noalias scope metadata, a new domain is used every time the
function is inlined.

The metadata identifying each domain is itself a list containing one or two
entries. The first entry is the name of the domain. Note that if the name is a
string then it can be combined across functions and translation units. A
self-reference can be used to create globally unique domain names. A
descriptive string may optionally be provided as a second list entry.

The metadata identifying each scope is also itself a list containing two or
three entries. The first entry is the name of the scope. Note that if the name
is a string then it can be combined across functions and translation units. A
self-reference can be used to create globally unique scope names. A metadata
reference to the scope's domain is the second entry. A descriptive string may
optionally be provided as a third list entry.

For example,

.. code-block:: llvm

    ; Two scope domains:
    !0 = !{!0}
    !1 = !{!1}

    ; Some scopes in these domains:
    !2 = !{!2, !0}
    !3 = !{!3, !0}
    !4 = !{!4, !1}

    ; Some scope lists:
    !5 = !{!4} ; A list containing only scope !4
    !6 = !{!4, !3, !2}
    !7 = !{!3}

    ; These two instructions don't alias:
    %0 = load float, float* %c, align 4, !alias.scope !5
    store float %0, float* %arrayidx.i, align 4, !noalias !5

    ; These two instructions also don't alias (for domain !1, the set of scopes
    ; in the !alias.scope equals that in the !noalias list):
    %2 = load float, float* %c, align 4, !alias.scope !5
    store float %2, float* %arrayidx.i2, align 4, !noalias !6

    ; These two instructions may alias (for domain !0, the set of scopes in
    ; the !noalias list is not a superset of, or equal to, the scopes in the
    ; !alias.scope list):
    %2 = load float, float* %c, align 4, !alias.scope !6
    store float %0, float* %arrayidx.i, align 4, !noalias !7

'``fpmath``' Metadata
^^^^^^^^^^^^^^^^^^^^^

``fpmath`` metadata may be attached to any instruction of floating-point
type. It can be used to express the maximum acceptable error in the
result of that instruction, in ULPs, thus potentially allowing the
compiler to use a more efficient but less accurate method of computing
it. ULP is defined as follows:

    If ``x`` is a real number that lies between two finite consecutive
    floating-point numbers ``a`` and ``b``, without being equal to one
    of them, then ``ulp(x) = |b - a|``, otherwise ``ulp(x)`` is the
    distance between the two non-equal finite floating-point numbers
    nearest ``x``. Moreover, ``ulp(NaN)`` is ``NaN``.

The metadata node shall consist of a single positive number of type
``float`` representing the maximum relative error, for example:

.. code-block:: llvm

    !0 = !{ float 2.5 } ; maximum acceptable inaccuracy is 2.5 ULPs
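
For instance, such a node could be attached to a division as follows (a
sketch; the value names are hypothetical):

.. code-block:: llvm

    %q = fdiv float %a, %b, !fpmath !0 ; result may be off by up to 2.5 ULPs
    ...
    !0 = !{ float 2.5 }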

.. _range-metadata:

'``range``' Metadata
^^^^^^^^^^^^^^^^^^^^

``range`` metadata may be attached only to ``load``, ``call`` and ``invoke`` of
integer types. It expresses the possible ranges that the loaded value, or the
value returned by the called function at this call site, lies in. The ranges
are represented with a flattened list of integers. The loaded value or the
value returned is known to be in the union of the ranges defined by each
consecutive pair. Each pair has the following properties:

- The type must match the type loaded by the instruction.
- The pair ``a,b`` represents the range ``[a,b)``.
- Both ``a`` and ``b`` are constants.
- The range is allowed to wrap.
- The range should not represent the full or empty set. That is,
  ``a!=b``.

In addition, the pairs must be in signed order of the lower bound and
they must be non-contiguous.

Examples:

.. code-block:: llvm

    %a = load i8, i8* %x, align 1, !range !0 ; Can only be 0 or 1
    %b = load i8, i8* %y, align 1, !range !1 ; Can only be 255 (-1), 0 or 1
    %c = call i8 @foo(),       !range !2 ; Can only be 0, 1, 3, 4 or 5
    %d = invoke i8 @bar() to label %cont
           unwind label %lpad, !range !3 ; Can only be -2, -1, 3, 4 or 5
    ...
    !0 = !{ i8 0, i8 2 }
    !1 = !{ i8 255, i8 2 }
    !2 = !{ i8 0, i8 2, i8 3, i8 6 }
    !3 = !{ i8 -2, i8 0, i8 3, i8 6 }

'``absolute_symbol``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``absolute_symbol`` metadata may be attached to a global variable
declaration. It marks the declaration as a reference to an absolute symbol,
which causes the backend to use absolute relocations for the symbol even
in position independent code, and expresses the possible ranges that the
global variable's *address* (not its value) is in, in the same format as
``range`` metadata, with the extension that the pair ``all-ones,all-ones``
may be used to represent the full set.

Example (assuming 64-bit pointers):

.. code-block:: llvm

    @a = external global i8, !absolute_symbol !0 ; Absolute symbol in range [0,256)
    @b = external global i8, !absolute_symbol !1 ; Absolute symbol in range [0,2^64)

    ...
    !0 = !{ i64 0, i64 256 }
    !1 = !{ i64 -1, i64 -1 }

'``callees``' Metadata
^^^^^^^^^^^^^^^^^^^^^^

``callees`` metadata may be attached to indirect call sites. If ``callees``
metadata is attached to a call site, and any callee is not among the set of
functions provided by the metadata, the behavior is undefined. The intent of
this metadata is to facilitate optimizations such as indirect-call promotion.
For example, in the code below, the call instruction may only target the
``add`` or ``sub`` functions:

.. code-block:: llvm

    %result = call i64 %binop(i64 %x, i64 %y), !callees !0

    ...
    !0 = !{i64 (i64, i64)* @add, i64 (i64, i64)* @sub}

'``unpredictable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``unpredictable`` metadata may be attached to any branch or switch
instruction. It can be used to express the unpredictability of control
flow. Similar to the ``llvm.expect`` intrinsic, it may be used to alter
optimizations related to compare and branch instructions. The metadata
is treated as a boolean value; if it exists, it signals that the branch
or switch that it is attached to is completely unpredictable.
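
Because only the presence of the metadata matters, an empty node is
sufficient. A minimal sketch, with hypothetical value and label names:

.. code-block:: llvm

    br i1 %cmp, label %if.then, label %if.else, !unpredictable !0
    ...
    !0 = !{}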

'``llvm.loop``'
^^^^^^^^^^^^^^^

It is sometimes useful to attach information to loop constructs. Currently,
loop metadata is implemented as metadata attached to the branch instruction
in the loop latch block. This type of metadata refers to a metadata node that
is guaranteed to be separate for each loop. The loop identifier metadata is
specified with the name ``llvm.loop``.

The loop identifier metadata is implemented using a metadata that refers to
itself to avoid merging it with any other identifier metadata, e.g.,
during module linkage or function inlining. That is, each loop should refer
to its own identification metadata, even if the loops reside in separate
functions. The following example contains loop identifier metadata for two
separate loop constructs:

.. code-block:: llvm

    !0 = !{!0}
    !1 = !{!1}

The loop identifier metadata can be used to specify additional
per-loop metadata. Any operands after the first operand can be treated
as user-defined metadata. For example, the ``llvm.loop.unroll.count``
metadata suggests an unroll factor to the loop unroller:

.. code-block:: llvm

      br i1 %exitcond, label %._crit_edge, label %.lr.ph, !llvm.loop !0
    ...
    !0 = !{!0, !1}
    !1 = !{!"llvm.loop.unroll.count", i32 4}
Mark Heffernan893752a2014-07-18 19:24:51 +00005055
'``llvm.loop.vectorize``' and '``llvm.loop.interleave``'
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Metadata prefixed with ``llvm.loop.vectorize`` or ``llvm.loop.interleave`` are
used to control per-loop vectorization and interleaving parameters such as
vectorization width and interleave count. These metadata should be used in
conjunction with ``llvm.loop`` loop identification metadata. The
``llvm.loop.vectorize`` and ``llvm.loop.interleave`` metadata are only
optimization hints and the optimizer will only interleave and vectorize loops if
it believes it is safe to do so. The ``llvm.mem.parallel_loop_access`` metadata,
which contains information about loop-carried memory dependencies, can be helpful
in determining the safety of these transformations.

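As a minimal sketch of how these hints are attached (the branch operands
``%exitcond``, ``%for.end`` and ``%for.body`` are placeholders chosen for
illustration), the hint nodes are listed as operands of the loop identifier
after its self-reference:

.. code-block:: llvm

      br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
    ...
    ; !0 is the loop identifier; !1 and !2 are per-loop hints.
    !0 = !{!0, !1, !2}
    !1 = !{!"llvm.loop.vectorize.width", i32 4}
    !2 = !{!"llvm.loop.interleave.count", i32 2}
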
'``llvm.loop.interleave.count``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata suggests an interleave count to the loop interleaver.
The first operand is the string ``llvm.loop.interleave.count`` and the
second operand is an integer specifying the interleave count. For
example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.interleave.count", i32 4}

Note that setting ``llvm.loop.interleave.count`` to 1 disables interleaving
multiple iterations of the loop. If ``llvm.loop.interleave.count`` is set to 0
then the interleave count will be determined automatically.

'``llvm.loop.vectorize.enable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata selectively enables or disables vectorization for the loop. The
first operand is the string ``llvm.loop.vectorize.enable`` and the second operand
is a bit. If the bit operand value is 1, vectorization is enabled. A value of
0 disables vectorization:

.. code-block:: llvm

    !0 = !{!"llvm.loop.vectorize.enable", i1 0}
    !1 = !{!"llvm.loop.vectorize.enable", i1 1}

'``llvm.loop.vectorize.width``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata sets the target width of the vectorizer. The first
operand is the string ``llvm.loop.vectorize.width`` and the second
operand is an integer specifying the width. For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.vectorize.width", i32 4}

Note that setting ``llvm.loop.vectorize.width`` to 1 disables
vectorization of the loop. If ``llvm.loop.vectorize.width`` is set to
0, or if the loop does not have this metadata, the width will be
determined automatically.

'``llvm.loop.unroll``'
^^^^^^^^^^^^^^^^^^^^^^

Metadata prefixed with ``llvm.loop.unroll`` are loop unrolling
optimization hints such as the unroll factor. ``llvm.loop.unroll``
metadata should be used in conjunction with ``llvm.loop`` loop
identification metadata. The ``llvm.loop.unroll`` metadata are only
optimization hints and the unrolling will only be performed if the
optimizer believes it is safe to do so.

'``llvm.loop.unroll.count``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata suggests an unroll factor to the loop unroller. The
first operand is the string ``llvm.loop.unroll.count`` and the second
operand is a positive integer specifying the unroll factor. For
example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll.count", i32 4}

If the trip count of the loop is less than the unroll count, the loop
will be partially unrolled.

'``llvm.loop.unroll.disable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata disables loop unrolling. The metadata has a single operand
which is the string ``llvm.loop.unroll.disable``. For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll.disable"}

'``llvm.loop.unroll.runtime.disable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata disables runtime loop unrolling. The metadata has a single
operand which is the string ``llvm.loop.unroll.runtime.disable``. For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll.runtime.disable"}

'``llvm.loop.unroll.enable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata suggests that the loop should be fully unrolled if the trip count
is known at compile time and partially unrolled if the trip count is not known
at compile time. The metadata has a single operand which is the string
``llvm.loop.unroll.enable``. For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll.enable"}

'``llvm.loop.unroll.full``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata suggests that the loop should be unrolled fully. The
metadata has a single operand which is the string ``llvm.loop.unroll.full``.
For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll.full"}

'``llvm.loop.licm_versioning.disable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata indicates that the loop should not be versioned for the purpose
of enabling loop-invariant code motion (LICM). The metadata has a single operand
which is the string ``llvm.loop.licm_versioning.disable``. For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.licm_versioning.disable"}

'``llvm.loop.distribute.enable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Loop distribution allows splitting a loop into multiple loops. Currently,
this is only performed if the entire loop cannot be vectorized due to unsafe
memory dependencies. The transformation will attempt to isolate the unsafe
dependencies into their own loop.

This metadata can be used to selectively enable or disable distribution of the
loop. The first operand is the string ``llvm.loop.distribute.enable`` and the
second operand is a bit. If the bit operand value is 1, distribution is
enabled. A value of 0 disables distribution:

.. code-block:: llvm

    !0 = !{!"llvm.loop.distribute.enable", i1 0}
    !1 = !{!"llvm.loop.distribute.enable", i1 1}

This metadata should be used in conjunction with ``llvm.loop`` loop
identification metadata.

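For instance, a loop might request distribution as sketched below. The branch
instruction and its operands are placeholders chosen for illustration:

.. code-block:: llvm

      br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
    ...
    ; The loop identifier !0 carries the distribution hint !1.
    !0 = !{!0, !1}
    !1 = !{!"llvm.loop.distribute.enable", i1 1}
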
'``llvm.mem``'
^^^^^^^^^^^^^^^

Metadata types used to annotate memory accesses with information helpful
for optimizations are prefixed with ``llvm.mem``.

'``llvm.mem.parallel_loop_access``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The ``llvm.mem.parallel_loop_access`` metadata refers to a loop identifier,
or metadata containing a list of loop identifiers for nested loops.
The metadata is attached to memory accessing instructions and denotes that
no loop-carried memory dependence exists between it and other instructions denoted
with the same loop identifier. The metadata on memory reads also implies that
if-conversion (i.e. speculative execution within a loop iteration) is safe.

Precisely, given two instructions ``m1`` and ``m2`` that both have the
``llvm.mem.parallel_loop_access`` metadata, with ``L1`` and ``L2`` being the
set of loops associated with that metadata, respectively, then there is no
loop-carried dependence between ``m1`` and ``m2`` for loops in both ``L1`` and
``L2``.

As a special case, if all memory accessing instructions in a loop have
``llvm.mem.parallel_loop_access`` metadata that refers to that loop, then the
loop has no loop-carried memory dependences and is considered to be a parallel
loop.

Note that if not all memory access instructions have such metadata referring to
the loop, then the loop is not considered trivially parallel. Additional
memory dependence analysis is required to make that determination. As a
fail-safe mechanism, this causes loops that were originally parallel to be
considered sequential (if optimization passes that are unaware of the parallel
semantics insert new memory instructions into the loop body).

Example of a loop that is considered parallel due to its correct use of
both ``llvm.loop`` and ``llvm.mem.parallel_loop_access``
metadata types that refer to the same loop identifier metadata.

.. code-block:: llvm

    for.body:
      ...
      %val0 = load i32, i32* %arrayidx, !llvm.mem.parallel_loop_access !0
      ...
      store i32 %val0, i32* %arrayidx1, !llvm.mem.parallel_loop_access !0
      ...
      br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0

    for.end:
    ...
    !0 = !{!0}

It is also possible to have nested parallel loops. In that case the
memory accesses refer to a list of loop identifier metadata nodes instead of
the loop identifier metadata node directly:

.. code-block:: llvm

    outer.for.body:
      ...
      %val1 = load i32, i32* %arrayidx3, !llvm.mem.parallel_loop_access !2
      ...
      br label %inner.for.body

    inner.for.body:
      ...
      %val0 = load i32, i32* %arrayidx1, !llvm.mem.parallel_loop_access !0
      ...
      store i32 %val0, i32* %arrayidx2, !llvm.mem.parallel_loop_access !0
      ...
      br i1 %exitcond, label %inner.for.end, label %inner.for.body, !llvm.loop !1

    inner.for.end:
      ...
      store i32 %val1, i32* %arrayidx4, !llvm.mem.parallel_loop_access !2
      ...
      br i1 %exitcond, label %outer.for.end, label %outer.for.body, !llvm.loop !2

    outer.for.end:                                          ; preds = %for.body
    ...
    !0 = !{!1, !2} ; a list of loop identifiers
    !1 = !{!1} ; an identifier for the inner loop
    !2 = !{!2} ; an identifier for the outer loop

'``irr_loop``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^

``irr_loop`` metadata may be attached to the terminator instruction of a basic
block that's an irreducible loop header (note that an irreducible loop has more
than one header basic block). If ``irr_loop`` metadata is attached to the
terminator instruction of a basic block that is not really an irreducible loop
header, the behavior is undefined. The intent of this metadata is to improve the
accuracy of the block frequency propagation. For example, in the code below, the
block ``header0`` may have a loop header weight (relative to the other headers of
the irreducible loop) of 100:

.. code-block:: llvm

    header0:
    ...
    br i1 %cmp, label %t1, label %t2, !irr_loop !0

    ...
    !0 = !{!"loop_header_weight", i64 100}

Irreducible loop header weights are typically based on profile data.

'``invariant.group``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The experimental ``invariant.group`` metadata may be attached to
``load``/``store`` instructions referencing a single metadata with no entries.
The existence of the ``invariant.group`` metadata on the instruction tells
the optimizer that every ``load`` and ``store`` to the same pointer operand
can be assumed to load or store the same
value (but see the ``llvm.launder.invariant.group`` intrinsic which affects
when two pointers are considered the same). Pointers returned by bitcast or
getelementptr with only zero indices are considered the same.

Examples:

.. code-block:: llvm

    @unknownPtr = external global i8
    ...
    %ptr = alloca i8
    store i8 42, i8* %ptr, !invariant.group !0
    call void @foo(i8* %ptr)

    %a = load i8, i8* %ptr, !invariant.group !0 ; Can assume that value under %ptr didn't change
    call void @foo(i8* %ptr)

    %newPtr = call i8* @getPointer(i8* %ptr)
    %c = load i8, i8* %newPtr, !invariant.group !0 ; Can't assume anything, because we only have information about %ptr

    %unknownValue = load i8, i8* @unknownPtr
    store i8 %unknownValue, i8* %ptr, !invariant.group !0 ; Can assume that %unknownValue == 42

    call void @foo(i8* %ptr)
    %newPtr2 = call i8* @llvm.launder.invariant.group(i8* %ptr)
    %d = load i8, i8* %newPtr2, !invariant.group !0 ; Can't step through launder.invariant.group to get value of %ptr

    ...
    declare void @foo(i8*)
    declare i8* @getPointer(i8*)
    declare i8* @llvm.launder.invariant.group(i8*)

    !0 = !{}

The ``invariant.group`` metadata must be dropped when replacing one pointer by
another based on aliasing information. This is because ``invariant.group`` is
tied to the SSA value of the pointer operand.

.. code-block:: llvm

    %v = load i8, i8* %x, !invariant.group !0
    ; if %x mustalias %y then we can replace the above instruction with
    %v = load i8, i8* %y

Note that this is an experimental feature, which means that its semantics might
change in the future.

'``type``' Metadata
^^^^^^^^^^^^^^^^^^^

See :doc:`TypeMetadata`.

'``associated``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^

The ``associated`` metadata may be attached to a global object
declaration with a single argument that references another global object.

This metadata prevents discarding of the global object in linker GC
unless the referenced object is also discarded. The linker support for
this feature is spotty. For best compatibility, globals carrying this
metadata may also:

- Be in a comdat with the referenced global.
- Be in ``@llvm.compiler.used``.
- Have an explicit section with a name which is a valid C identifier.

It does not have any effect on non-ELF targets.

Example:

.. code-block:: text

    $a = comdat any
    @a = global i32 1, comdat $a
    @b = internal global i32 2, comdat $a, section "abc", !associated !0
    !0 = !{i32* @a}

'``prof``' Metadata
^^^^^^^^^^^^^^^^^^^

The ``prof`` metadata is used to record profile data in the IR.
The first operand of the metadata node indicates the profile metadata
type. There are currently 3 types:
:ref:`branch_weights<prof_node_branch_weights>`,
:ref:`function_entry_count<prof_node_function_entry_count>`, and
:ref:`VP<prof_node_VP>`.

.. _prof_node_branch_weights:

branch_weights
""""""""""""""

Branch weight metadata attached to a branch, select, switch or call instruction
represents the likelihood of the associated branch being taken.
For more information, see :doc:`BranchWeightMetadata`.

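A small sketch of a conditional branch carrying branch weights follows. The
weights and label names are made up for illustration; see
:doc:`BranchWeightMetadata` for the authoritative operand layout of each
instruction kind.

.. code-block:: llvm

      br i1 %cond, label %likely, label %unlikely, !prof !0
    ...
    ; One weight per successor, in successor order.
    !0 = !{!"branch_weights", i32 64, i32 4}
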
.. _prof_node_function_entry_count:

function_entry_count
""""""""""""""""""""

Function entry count metadata can be attached to function definitions
to record the number of times the function is called. Used with BFI
information, it is also used to derive the basic block profile count.
For more information, see :doc:`BranchWeightMetadata`.

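As an illustration, the sketch below attaches an entry count to a function
definition. The function ``@f`` and the count of 100 are hypothetical:

.. code-block:: llvm

    define void @f() !prof !0 {
      ret void
    }
    ; @f was entered 100 times according to the (made-up) profile.
    !0 = !{!"function_entry_count", i64 100}
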
.. _prof_node_VP:

VP
""

VP (value profile) metadata can be attached to instructions that have
value profile information. Currently this is indirect calls (where it
records the hottest callees) and calls to memory intrinsics such as memcpy,
memmove, and memset (where it records the hottest byte lengths).

Each VP metadata node contains the string "VP", then a uint32_t value for the
value profiling kind, a uint64_t value for the total number of times the
instruction is executed, followed by pairs of a uint64_t value and its
execution count.
The value profiling kind is 0 for indirect call targets and 1 for memory
operations. For indirect call targets, each profile value is a hash
of the callee function name, and for memory operations each value is the
byte length.

Note that the value counts do not need to add up to the total count
listed in the third operand (in practice only the top hottest values
are tracked and reported).

Indirect call example:

.. code-block:: llvm

    call void %f(), !prof !1
    !1 = !{!"VP", i32 0, i64 1600, i64 7651369219802541373, i64 1030, i64 -4377547752858689819, i64 410}

Note that the VP type is 0 (the second operand), which indicates that this is
indirect call value profile data. The third operand indicates that the
indirect call executed 1600 times. The 4th and 6th operands give the
hashes of the 2 hottest target functions' names (this is the same hash used
to represent function names in the profile database), and the 5th and 7th
operands give the number of times each of the respective target functions
was called.

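A corresponding sketch for a memory intrinsic might look as follows. The
counts are made up, the operands ``%dst``, ``%src`` and ``%n`` are
placeholders, and the ``llvm.memcpy`` signature shown is assumed to match the
form used by this version of the IR:

.. code-block:: llvm

    call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dst, i8* %src, i64 %n, i1 false), !prof !1
    !1 = !{!"VP", i32 1, i64 1600, i64 8, i64 1030, i64 16, i64 410}

Here the VP kind is 1 (memory operation), the call executed 1600 times, and
the two hottest byte lengths were 8 (seen 1030 times) and 16 (seen 410 times).
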
Module Flags Metadata
=====================

Information about the module as a whole is difficult to convey to LLVM's
subsystems. The LLVM IR isn't sufficient to transmit this information.
The ``llvm.module.flags`` named metadata exists in order to facilitate
this. These flags are in the form of key / value pairs --- much like a
dictionary --- making it easy for any subsystem that cares about a flag to
look it up.

The ``llvm.module.flags`` metadata contains a list of metadata triplets.
Each triplet has the following form:

- The first element is a *behavior* flag, which specifies the behavior
  when two (or more) modules are merged together and two (or more)
  metadata with the same ID are encountered. The supported behaviors are
  described below.
- The second element is a metadata string that is a unique ID for the
  metadata. Each module may only have one flag entry for each unique ID (not
  including entries with the **Require** behavior).
- The third element is the value of the flag.

When two (or more) modules are merged together, the resulting
``llvm.module.flags`` metadata is the union of the modules' flags. That is, for
each unique metadata ID string, there will be exactly one entry in the merged
module's ``llvm.module.flags`` metadata table, and the value for that entry will
be determined by the merge behavior flag, as described below. The only exception
is that entries with the *Require* behavior are always preserved.

The following behaviors are supported:

.. list-table::
   :header-rows: 1
   :widths: 10 90

   * - Value
     - Behavior

   * - 1
     - **Error**
           Emits an error if two values disagree, otherwise the resulting value
           is that of the operands.

   * - 2
     - **Warning**
           Emits a warning if two values disagree. The result value will be the
           operand for the flag from the first module being linked.

   * - 3
     - **Require**
           Adds a requirement that another module flag be present and have a
           specified value after linking is performed. The value must be a
           metadata pair, where the first element of the pair is the ID of the
           module flag to be restricted, and the second element of the pair is
           the value the module flag should be restricted to. This behavior can
           be used to restrict the allowable results (via triggering of an
           error) of linking IDs with the **Override** behavior.

   * - 4
     - **Override**
           Uses the specified value, regardless of the behavior or value of the
           other module. If both modules specify **Override**, but the values
           differ, an error will be emitted.

   * - 5
     - **Append**
           Appends the two values, which are required to be metadata nodes.

   * - 6
     - **AppendUnique**
           Appends the two values, which are required to be metadata
           nodes. However, duplicate entries in the second list are dropped
           during the append operation.

   * - 7
     - **Max**
           Takes the max of the two values, which are required to be integers.

It is an error for a particular unique flag ID to have multiple behaviors,
except in the case of **Require** (which adds restrictions on another metadata
value) or **Override**.

An example of module flags:

.. code-block:: llvm

    !0 = !{ i32 1, !"foo", i32 1 }
    !1 = !{ i32 4, !"bar", i32 37 }
    !2 = !{ i32 2, !"qux", i32 42 }
    !3 = !{ i32 3, !"qux",
      !{
        !"foo", i32 1
      }
    }
    !llvm.module.flags = !{ !0, !1, !2, !3 }

- Metadata ``!0`` has the ID ``!"foo"`` and the value '1'. The behavior
  if two or more ``!"foo"`` flags are seen is to emit an error if their
  values are not equal.

- Metadata ``!1`` has the ID ``!"bar"`` and the value '37'. The
  behavior if two or more ``!"bar"`` flags are seen is to use the value
  '37'.

- Metadata ``!2`` has the ID ``!"qux"`` and the value '42'. The
  behavior if two or more ``!"qux"`` flags are seen is to emit a
  warning if their values are not equal.

- Metadata ``!3`` has the ID ``!"qux"`` and the value:

  ::

      !{ !"foo", i32 1 }

  The behavior is to emit an error if the ``llvm.module.flags`` does not
  contain a flag with the ID ``!"foo"`` that has the value '1' after linking is
  performed.

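To illustrate how a merge behavior resolves differing values, here is a sketch
using the **Max** behavior. The flag ID ``!"example_level"`` is hypothetical
and is not a flag that LLVM defines; the three parts below stand for three
separate modules, not one file:

.. code-block:: llvm

    ; Module A
    !llvm.module.flags = !{!0}
    !0 = !{ i32 7, !"example_level", i32 2 }

    ; Module B
    !llvm.module.flags = !{!0}
    !0 = !{ i32 7, !"example_level", i32 5 }

    ; Result of linking A and B: the larger of the two values is kept.
    !llvm.module.flags = !{!0}
    !0 = !{ i32 7, !"example_level", i32 5 }
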
Objective-C Garbage Collection Module Flags Metadata
----------------------------------------------------

On the Mach-O platform, Objective-C stores metadata about garbage
collection in a special section called "image info". The metadata
consists of a version number and a bitmask specifying what types of
garbage collection are supported (if any) by the file. If two or more
modules are linked together their garbage collection metadata needs to
be merged rather than appended together.

The Objective-C garbage collection module flags metadata consists of the
following key-value pairs:

.. list-table::
   :header-rows: 1
   :widths: 30 70

   * - Key
     - Value

   * - ``Objective-C Version``
     - **[Required]** --- The Objective-C ABI version. Valid values are 1 and 2.

   * - ``Objective-C Image Info Version``
     - **[Required]** --- The version of the image info section. Currently
       always 0.

   * - ``Objective-C Image Info Section``
     - **[Required]** --- The section to place the metadata. Valid values are
       ``"__OBJC, __image_info, regular"`` for Objective-C ABI version 1, and
       ``"__DATA,__objc_imageinfo, regular, no_dead_strip"`` for
       Objective-C ABI version 2.

   * - ``Objective-C Garbage Collection``
     - **[Required]** --- Specifies whether garbage collection is supported or
       not. Valid values are 0, for no garbage collection, and 2, for garbage
       collection supported.

   * - ``Objective-C GC Only``
     - **[Optional]** --- Specifies that only garbage collection is supported.
       If present, its value must be 6. This flag requires that the
       ``Objective-C Garbage Collection`` flag have the value 2.

Some important flag interactions:

- If a module with ``Objective-C Garbage Collection`` set to 0 is
  merged with a module with ``Objective-C Garbage Collection`` set to
  2, then the resulting module has the
  ``Objective-C Garbage Collection`` flag set to 0.
- A module with ``Objective-C Garbage Collection`` set to 0 cannot be
  merged with a module with ``Objective-C GC Only`` set to 6.

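As a sketch, a module built for Objective-C ABI version 2 without garbage
collection might carry flags like the following. The merge behavior values in
the first operand of each flag (1 and 4) are an assumption made for
illustration; this section does not specify them.

.. code-block:: llvm

    !llvm.module.flags = !{ !0, !1, !2, !3 }
    !0 = !{ i32 1, !"Objective-C Version", i32 2 }
    !1 = !{ i32 1, !"Objective-C Image Info Version", i32 0 }
    !2 = !{ i32 1, !"Objective-C Image Info Section",
            !"__DATA,__objc_imageinfo, regular, no_dead_strip" }
    ; Value 0: garbage collection is not supported.
    !3 = !{ i32 4, !"Objective-C Garbage Collection", i32 0 }
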
C type width Module Flags Metadata
----------------------------------

The ARM backend emits a section into each generated object file describing the
options that it was compiled with (in a compiler-independent way) to prevent
linking incompatible objects, and to allow automatic library selection. Some
of these options are not visible at the IR level, namely wchar_t width and enum
width.

To pass this information to the backend, these options are encoded in module
flags metadata, using the following key-value pairs:

.. list-table::
   :header-rows: 1
   :widths: 30 70

   * - Key
     - Value

   * - short_wchar
     - * 0 --- sizeof(wchar_t) == 4
       * 1 --- sizeof(wchar_t) == 2

   * - short_enum
     - * 0 --- Enums are at least as large as an ``int``.
       * 1 --- Enums are stored in the smallest integer type which can
         represent all of its values.

For example, the following metadata section specifies that the module was
compiled with a ``wchar_t`` width of 4 bytes, and the underlying type of an
enum is the smallest type which can represent all of its values::

    !llvm.module.flags = !{!0, !1}
    !0 = !{i32 1, !"short_wchar", i32 0}
    !1 = !{i32 1, !"short_enum", i32 1}

Automatic Linker Flags Named Metadata
=====================================

Some targets support embedding flags to the linker inside individual object
files. Typically this is used in conjunction with language extensions which
allow source files to explicitly declare the libraries they depend on, and have
these automatically be transmitted to the linker via object files.

These flags are encoded in the IR using named metadata with the name
``!llvm.linker.options``. Each operand is expected to be a metadata node
containing a list of metadata strings that together define one set of linker
options.

For example, the following metadata section specifies two separate sets of
linker options, presumably to link against ``libz`` and the ``Cocoa``
framework::

    !0 = !{ !"-lz" }
    !1 = !{ !"-framework", !"Cocoa" }
    !llvm.linker.options = !{ !0, !1 }

The metadata encoding as lists of lists of options, as opposed to a collapsed
list of options, is chosen so that the IR encoding can use multiple option
strings to specify e.g., a single library, while still having that specifier be
preserved as an atomic element that can be recognized by a target specific
assembly writer or object file emitter.

Each individual option is required to be either a valid option for the target's
linker, or an option that is reserved by the target specific assembly writer or
object file emitter. No other aspect of these options is defined by the IR.

.. _summary:

ThinLTO Summary
===============

Compiling with `ThinLTO <https://clang.llvm.org/docs/ThinLTO.html>`_
causes the building of a compact summary of the module that is emitted into
the bitcode. The summary is emitted into the LLVM assembly and identified
in syntax by a caret ('``^``').

*Note that the summary entries are temporarily skipped when parsing the
assembly, although the parsing support is actively being implemented. The
following describes when the summary entries will be parsed once that support
is implemented.*
The summary will be parsed into a ModuleSummaryIndex object under the
same conditions where a summary index is currently built from bitcode.
Specifically, this happens in tools that test the Thin Link portion of a
ThinLTO compile (i.e. llvm-lto and llvm-lto2), or when parsing a combined index
for a distributed ThinLTO backend via clang's "``-fthinlto-index=<>``" flag.
Additionally, it will be parsed into a bitcode output, along with the Module
IR, via the "``llvm-as``" tool. Tools that parse the Module IR for the purposes
of optimization (e.g. "``clang -x ir``" and "``opt``") will ignore the
summary entries (just as they currently ignore summary entries in a bitcode
input file).

There are currently 3 types of summary entries in the LLVM assembly:
:ref:`module paths<module_path_summary>`,
:ref:`global values<gv_summary>`, and
:ref:`type identifiers<typeid_summary>`.

.. _module_path_summary:

Module Path Summary Entry
-------------------------

Each module path summary entry lists a module containing global values included
in the summary. For a single IR module there will be one such entry, but
in a combined summary index produced during the thin link, there will be
one module path entry per linked module with summary.

Example:

.. code-block:: llvm

    ^0 = module: (path: "/path/to/file.o", hash: (2468601609, 1329373163, 1565878005, 638838075, 3148790418))

The ``path`` field is a string path to the bitcode file, and the ``hash``
field is the 160-bit SHA-1 hash of the IR bitcode contents, used for
incremental builds and caching.

.. _gv_summary:

Global Value Summary Entry
--------------------------

Each global value summary entry corresponds to a global value defined or
referenced by a summarized module.

Example:

.. code-block:: llvm

    ^4 = gv: (name: "f"[, summaries: (Summary)[, (Summary)]*]?) ; guid = 14740650423002898831

For declarations, there will not be a summary list. For definitions, a
global value will contain a list of summaries, one per module containing
a definition. There can be multiple entries in a combined summary index
for symbols with weak linkage.

Each ``Summary`` format will depend on whether the global value is a
:ref:`function<function_summary>`, :ref:`variable<variable_summary>`, or
:ref:`alias<alias_summary>`.

.. _function_summary:

Function Summary
^^^^^^^^^^^^^^^^

If the global value is a function, the ``Summary`` entry will look like:

.. code-block:: llvm

    function: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 2[, FuncFlags]?[, Calls]?[, TypeIdInfo]?[, Refs]?)

The ``module`` field includes the summary entry id for the module containing
this definition, and the ``flags`` field contains information such as
the linkage type, a flag indicating whether it is legal to import the
definition, whether it is globally live and whether the linker resolved it
to a local definition (the latter two are populated during the thin link).
The ``insts`` field contains the number of IR instructions in the function.
Finally, there are several optional fields: :ref:`FuncFlags<funcflags_summary>`,
:ref:`Calls<calls_summary>`, :ref:`TypeIdInfo<typeidinfo_summary>`,
:ref:`Refs<refs_summary>`.

.. _variable_summary:

Global Variable Summary
^^^^^^^^^^^^^^^^^^^^^^^

If the global value is a variable, the ``Summary`` entry will look like:

.. code-block:: llvm

    variable: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0)[, Refs]?)

The variable entry contains a subset of the fields in a
:ref:`function summary <function_summary>`; see the descriptions there.

.. _alias_summary:

Alias Summary
^^^^^^^^^^^^^

If the global value is an alias, the ``Summary`` entry will look like:

.. code-block:: llvm

    alias: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), aliasee: ^2)

The ``module`` and ``flags`` fields are as described for a
:ref:`function summary <function_summary>`. The ``aliasee`` field
contains a reference to the global value summary entry of the aliasee.

.. _funcflags_summary:

Function Flags
^^^^^^^^^^^^^^

The optional ``FuncFlags`` field looks like:

.. code-block:: llvm

    funcFlags: (readNone: 0, readOnly: 0, noRecurse: 0, returnDoesNotAlias: 0)

If unspecified, flags are assumed to hold the conservative ``false`` value of
``0``.

.. _calls_summary:

Calls
^^^^^

The optional ``Calls`` field looks like:

.. code-block:: llvm

    calls: ((Callee)[, (Callee)]*)

where each ``Callee`` looks like:

.. code-block:: llvm

    callee: ^1[, hotness: None]?[, relbf: 0]?

The ``callee`` refers to the summary entry id of the callee. At most one
of ``hotness`` (which can take the values ``Unknown``, ``Cold``, ``None``,
``Hot``, and ``Critical``), and ``relbf`` (which holds the integer
branch frequency relative to the entry frequency, scaled down by 2^8)
may be specified. The defaults are ``Unknown`` and ``0``, respectively.

.. _refs_summary:

Refs
^^^^

The optional ``Refs`` field looks like:

.. code-block:: llvm

    refs: ((Ref)[, (Ref)]*)

where each ``Ref`` contains a reference to the summary id of the referenced
value (e.g. ``^1``).

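As an illustration of how the module path, global value, ``Calls`` and
``Refs`` fields described above might combine, the following is a hypothetical
per-module summary sketch. The path, the all-zero hash, the global names ``f``,
``g`` and ``h``, the entry ids, and the exact nesting of parentheses are
assumptions made for this example and are not output copied from a real
compile. The ``guid`` comments that normally follow each ``gv`` entry are
omitted.

.. code-block:: llvm

    ^0 = module: (path: "example.o", hash: (0, 0, 0, 0, 0))
    ; Declaration only: no summary list.
    ^1 = gv: (name: "h")
    ; Variable definition in module ^0.
    ^2 = gv: (name: "g", summaries: (variable: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0))))
    ; Function definition in module ^0 that calls h and references g.
    ^3 = gv: (name: "f", summaries: (function: (module: ^0, flags: (linkage: external, notEligibleToImport: 0, live: 0, dsoLocal: 0), insts: 2, calls: ((callee: ^1)), refs: (^2))))

In this sketch every cross reference is a caret id: the ``module`` fields point
at ``^0``, the callee at ``^1``, and the single ref at ``^2``.
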
.. _typeidinfo_summary:

TypeIdInfo
^^^^^^^^^^

The optional ``TypeIdInfo`` field, used for
`Control Flow Integrity <http://clang.llvm.org/docs/ControlFlowIntegrity.html>`_,
looks like:

.. code-block:: llvm

    typeIdInfo: [(TypeTests)]?[, (TypeTestAssumeVCalls)]?[, (TypeCheckedLoadVCalls)]?[, (TypeTestAssumeConstVCalls)]?[, (TypeCheckedLoadConstVCalls)]?

These optional fields have the following forms:

TypeTests
"""""""""

.. code-block:: llvm

    typeTests: (TypeIdRef[, TypeIdRef]*)

Where each ``TypeIdRef`` refers to a :ref:`type id<typeid_summary>`
by summary id or ``GUID``.

TypeTestAssumeVCalls
""""""""""""""""""""

.. code-block:: llvm

    typeTestAssumeVCalls: (VFuncId[, VFuncId]*)

Where each VFuncId has the format:

.. code-block:: llvm

    vFuncId: (TypeIdRef, offset: 16)

Where each ``TypeIdRef`` refers to a :ref:`type id<typeid_summary>`
by summary id or ``GUID`` preceded by a ``guid:`` tag.

TypeCheckedLoadVCalls
"""""""""""""""""""""

.. code-block:: llvm

    typeCheckedLoadVCalls: (VFuncId[, VFuncId]*)

Where each VFuncId has the format described for ``TypeTestAssumeVCalls``.

TypeTestAssumeConstVCalls
"""""""""""""""""""""""""

.. code-block:: llvm

    typeTestAssumeConstVCalls: (ConstVCall[, ConstVCall]*)

Where each ConstVCall has the format:

.. code-block:: llvm

    VFuncId, args: (Arg[, Arg]*)

and where each VFuncId has the format described for ``TypeTestAssumeVCalls``,
and each Arg is an integer argument number.

TypeCheckedLoadConstVCalls
""""""""""""""""""""""""""

.. code-block:: llvm

    typeCheckedLoadConstVCalls: (ConstVCall[, ConstVCall]*)

Where each ConstVCall has the format described for
``TypeTestAssumeConstVCalls``.

.. _typeid_summary:

Type ID Summary Entry
---------------------

Each type id summary entry corresponds to a type identifier resolution
which is generated during the LTO link portion of the compile when building
with `Control Flow Integrity <http://clang.llvm.org/docs/ControlFlowIntegrity.html>`_,
so these are only present in a combined summary index.

Example:

.. code-block:: llvm

    ^4 = typeid: (name: "_ZTS1A", summary: (typeTestRes: (kind: allOnes, sizeM1BitWidth: 7[, alignLog2: 0]?[, sizeM1: 0]?[, bitMask: 0]?[, inlineBits: 0]?)[, WpdResolutions]?)) ; guid = 7004155349499253778

The ``typeTestRes`` gives the type test resolution ``kind`` (which may
be ``unsat``, ``byteArray``, ``inline``, ``single``, or ``allOnes``), and
the ``size-1`` bit width. It is followed by optional flags, which default to 0,
and an optional WpdResolutions (whole program devirtualization resolution)
field that looks like:

.. code-block:: llvm

    wpdResolutions: ((offset: 0, WpdRes)[, (offset: 1, WpdRes)]*)

where each entry is a mapping from the given byte offset to the whole-program
devirtualization resolution WpdRes, which has one of the following formats:

.. code-block:: llvm

    wpdRes: (kind: branchFunnel)
    wpdRes: (kind: singleImpl, singleImplName: "_ZN1A1nEi")
    wpdRes: (kind: indir)

Additionally, each wpdRes has an optional ``resByArg`` field, which
describes the resolutions for calls with all constant integer arguments:

.. code-block:: llvm

    resByArg: (ResByArg[, ResByArg]*)

where ResByArg is:

.. code-block:: llvm

    args: (Arg[, Arg]*), byArg: (kind: UniformRetVal[, info: 0][, byte: 0][, bit: 0])

Where the ``kind`` can be ``Indir``, ``UniformRetVal``, ``UniqueRetVal``
or ``VirtualConstProp``. The ``info`` field is only used if the kind
is ``UniformRetVal`` (indicates the uniform return value), or
``UniqueRetVal`` (holds the return value associated with the unique vtable
(0 or 1)). The ``byte`` and ``bit`` fields are only used if the target does
not support the use of absolute symbols to store constants.

.. _intrinsicglobalvariables:

Intrinsic Global Variables
==========================

LLVM has a number of "magic" global variables that contain data that
affect code generation or other IR semantics. These are documented here.
All globals of this sort should have a section specified as
"``llvm.metadata``". This section and all globals that start with
"``llvm.``" are reserved for use by LLVM.

.. _gv_llvmused:

The '``llvm.used``' Global Variable
-----------------------------------

The ``@llvm.used`` global is an array which has
:ref:`appending linkage <linkage_appending>`. This array contains a list of
pointers to named global variables, functions and aliases which may optionally
have a pointer cast formed of bitcast or getelementptr. For example, a legal
use of it is:

.. code-block:: llvm

    @X = global i8 4
    @Y = global i32 123

    @llvm.used = appending global [2 x i8*] [
       i8* @X,
       i8* bitcast (i32* @Y to i8*)
    ], section "llvm.metadata"

If a symbol appears in the ``@llvm.used`` list, then the compiler, assembler,
and linker are required to treat the symbol as if there is a reference to the
symbol that it cannot see (which is why the symbols have to be named). For
example, if a variable has internal linkage and no references other than that
from the ``@llvm.used`` list, it cannot be deleted. This is commonly used to
represent references from inline asms and other things the compiler cannot
"see", and corresponds to "``__attribute__((used))``" in GNU C.

On some targets, the code generator must emit a directive to the
assembler or object file to prevent the assembler and linker from
modifying or removing the symbol.

.. _gv_llvmcompilerused:

The '``llvm.compiler.used``' Global Variable
--------------------------------------------

The ``@llvm.compiler.used`` directive is the same as the ``@llvm.used``
directive, except that it only prevents the compiler from touching the
symbol. On targets that support it, this allows an intelligent linker to
optimize references to the symbol without being impeded as it would be
by ``@llvm.used``.

This construct should only be used in rare circumstances, and should not be
exposed to source languages.

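As a minimal sketch, assuming the same array layout shown for ``@llvm.used``,
a module could keep an otherwise unreferenced internal variable alive through
compiler optimizations as follows. The name ``@Z`` is hypothetical and chosen
only for this example.

.. code-block:: llvm

    ; An internal variable with no other references in the module.
    @Z = internal global i32 0

    ; Listing @Z here asks the compiler (but not the linker) to keep it.
    @llvm.compiler.used = appending global [1 x i8*] [
       i8* bitcast (i32* @Z to i8*)
    ], section "llvm.metadata"
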
.. _gv_llvmglobalctors:

The '``llvm.global_ctors``' Global Variable
-------------------------------------------

.. code-block:: llvm

    %0 = type { i32, void ()*, i8* }
    @llvm.global_ctors = appending global [1 x %0] [%0 { i32 65535, void ()* @ctor, i8* @data }]

The ``@llvm.global_ctors`` array contains a list of constructor
functions, priorities, and an optional associated global or function.
The functions referenced by this array will be called in ascending order
of priority (i.e. lowest first) when the module is loaded. The order of
functions with the same priority is not defined.

If the third field is present, non-null, and points to a global variable
or function, the initializer function will only run if the associated
data from the current module is not discarded.

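To make the ordering rule concrete, the following sketch registers two
constructors with different priorities and no associated data. The function
names ``@init_early`` and ``@init_late`` and the priority value 101 are
assumptions chosen for illustration only.

.. code-block:: llvm

    %0 = type { i32, void ()*, i8* }

    ; Two entries: priority 101 is lower than 65535, so @init_early
    ; is called before @init_late when the module is loaded.
    @llvm.global_ctors = appending global [2 x %0] [
      %0 { i32 101,   void ()* @init_early, i8* null },
      %0 { i32 65535, void ()* @init_late,  i8* null }
    ]

    define internal void @init_early() {
      ret void
    }

    define internal void @init_late() {
      ret void
    }

A null third field means neither constructor is tied to a piece of associated
data, so both run unconditionally.
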
.. _llvmglobaldtors:

The '``llvm.global_dtors``' Global Variable
-------------------------------------------

.. code-block:: llvm

    %0 = type { i32, void ()*, i8* }
    @llvm.global_dtors = appending global [1 x %0] [%0 { i32 65535, void ()* @dtor, i8* @data }]

The ``@llvm.global_dtors`` array contains a list of destructor
functions, priorities, and an optional associated global or function.
The functions referenced by this array will be called in descending
order of priority (i.e. highest first) when the module is unloaded. The
order of functions with the same priority is not defined.

If the third field is present, non-null, and points to a global variable
or function, the destructor function will only run if the associated
data from the current module is not discarded.

Instruction Reference
=====================

The LLVM instruction set consists of several different classifications
of instructions: :ref:`terminator instructions <terminators>`, :ref:`binary
instructions <binaryops>`, :ref:`bitwise binary
instructions <bitwiseops>`, :ref:`memory instructions <memoryops>`, and
:ref:`other instructions <otherops>`.

.. _terminators:

Terminator Instructions
-----------------------

As mentioned :ref:`previously <functionstructure>`, every basic block in a
program ends with a "Terminator" instruction, which indicates which
block should be executed after the current block is finished. These
terminator instructions typically yield a '``void``' value: they produce
control flow, not values (the one exception being the
':ref:`invoke <i_invoke>`' instruction).

The terminator instructions are: ':ref:`ret <i_ret>`',
':ref:`br <i_br>`', ':ref:`switch <i_switch>`',
':ref:`indirectbr <i_indirectbr>`', ':ref:`invoke <i_invoke>`',
':ref:`resume <i_resume>`', ':ref:`catchswitch <i_catchswitch>`',
':ref:`catchret <i_catchret>`',
':ref:`cleanupret <i_cleanupret>`',
and ':ref:`unreachable <i_unreachable>`'.

.. _i_ret:

'``ret``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      ret <type> <value>       ; Return a value from a non-void function
      ret void                 ; Return from void function

Overview:
"""""""""

The '``ret``' instruction is used to return control flow (and optionally
a value) from a function back to the caller.

There are two forms of the '``ret``' instruction: one that returns a
value and then causes control flow, and one that just causes control
flow to occur.

Arguments:
""""""""""

The '``ret``' instruction optionally accepts a single argument, the
return value. The type of the return value must be a ':ref:`first
class <t_firstclass>`' type.

A function is not :ref:`well formed <wellformed>` if it has a non-void
return type and contains a '``ret``' instruction with no return value or
a return value with a type that does not match its type, or if it has a
void return type and contains a '``ret``' instruction with a return
value.

Semantics:
""""""""""

When the '``ret``' instruction is executed, control flow returns back to
the calling function's context. If the caller is a
":ref:`call <i_call>`" instruction, execution continues at the
instruction after the call. If the caller was an
":ref:`invoke <i_invoke>`" instruction, execution continues at the
beginning of the "normal" destination block. If the instruction returns
a value, that value shall set the call or invoke instruction's return
value.

Example:
""""""""

.. code-block:: llvm

      ret i32 5                       ; Return an integer value of 5
      ret void                        ; Return from a void function
      ret { i32, i8 } { i32 4, i8 2 } ; Return a struct of values 4 and 2

.. _i_br:

'``br``' Instruction
^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      br i1 <cond>, label <iftrue>, label <iffalse>
      br label <dest>          ; Unconditional branch

Overview:
"""""""""

The '``br``' instruction is used to cause control flow to transfer to a
different basic block in the current function. There are two forms of
this instruction, corresponding to a conditional branch and an
unconditional branch.

Arguments:
""""""""""

The conditional branch form of the '``br``' instruction takes a single
'``i1``' value and two '``label``' values. The unconditional form of the
'``br``' instruction takes a single '``label``' value as a target.

Semantics:
""""""""""

Upon execution of a conditional '``br``' instruction, the '``i1``'
argument is evaluated. If the value is ``true``, control flows to the
'``iftrue``' ``label`` argument. If "cond" is ``false``, control flows
to the '``iffalse``' ``label`` argument.

Example:
""""""""

.. code-block:: llvm

    Test:
      %cond = icmp eq i32 %a, %b
      br i1 %cond, label %IfEqual, label %IfUnequal
    IfEqual:
      ret i32 1
    IfUnequal:
      ret i32 0

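The two forms of '``br``' are often used together. The following sketch of a
complete function uses an unconditional branch to enter a loop and a
conditional branch as the loop back-edge. The function name ``@count_to`` and
its block and value names are hypothetical.

.. code-block:: llvm

    define i32 @count_to(i32 %n) {
    entry:
      br label %loop                          ; Unconditional form
    loop:
      %i = phi i32 [ 0, %entry ], [ %next, %loop ]
      %next = add i32 %i, 1
      %done = icmp sge i32 %next, %n
      br i1 %done, label %exit, label %loop   ; Conditional form
    exit:
      ret i32 %next
    }
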
.. _i_switch:

'``switch``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      switch <intty> <value>, label <defaultdest> [ <intty> <val>, label <dest> ... ]

Overview:
"""""""""

The '``switch``' instruction is used to transfer control flow to one of
several different places. It is a generalization of the '``br``'
instruction, allowing a branch to occur to one of many possible
destinations.

Arguments:
""""""""""

The '``switch``' instruction uses three parameters: an integer
comparison value '``value``', a default '``label``' destination, and an
array of pairs of comparison value constants and '``label``'s. The table
is not allowed to contain duplicate constant entries.

Semantics:
""""""""""

The ``switch`` instruction specifies a table of values and destinations.
When the '``switch``' instruction is executed, this table is searched
for the given value. If the value is found, control flow is transferred
to the corresponding destination; otherwise, control flow is transferred
to the default destination.

Implementation:
"""""""""""""""

Depending on properties of the target machine and the particular
``switch`` instruction, this instruction may be code generated in
different ways. For example, it could be generated as a series of
chained conditional branches or with a lookup table.

Example:
""""""""

.. code-block:: llvm

     ; Emulate a conditional br instruction
     %Val = zext i1 %value to i32
     switch i32 %Val, label %truedest [ i32 0, label %falsedest ]

     ; Emulate an unconditional br instruction
     switch i32 0, label %dest [ ]

     ; Implement a jump table:
     switch i32 %val, label %otherwise [ i32 0, label %onzero
                                         i32 1, label %onone
                                         i32 2, label %ontwo ]

.. _i_indirectbr:

'``indirectbr``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      indirectbr <somety>* <address>, [ label <dest1>, label <dest2>, ... ]

Overview:
"""""""""

The '``indirectbr``' instruction implements an indirect branch to a
label within the current function, whose address is specified by
"``address``". Address must be derived from a
:ref:`blockaddress <blockaddress>` constant.

Arguments:
""""""""""

The '``address``' argument is the address of the label to jump to. The
rest of the arguments indicate the full set of possible destinations
that the address may point to. Blocks are allowed to occur multiple
times in the destination list, though this isn't particularly useful.

This destination list is required so that dataflow analysis has an
accurate understanding of the CFG.

Semantics:
""""""""""

Control transfers to the block specified in the address argument. All
possible destination blocks must be listed in the label list, otherwise
this instruction has undefined behavior. This implies that jumps to
labels defined in other functions have undefined behavior as well.

Implementation:
"""""""""""""""

This is typically implemented with a jump through a register.

Example:
""""""""

.. code-block:: llvm

     indirectbr i8* %Addr, [ label %bb1, label %bb2, label %bb3 ]

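The requirement that the address derive from a ``blockaddress`` constant can
be sketched with a complete function. Here the address is selected between
two ``blockaddress`` constants, and both possible targets appear in the
destination list. The function name ``@dispatch`` and the block names are
hypothetical.

.. code-block:: llvm

    define i32 @dispatch(i1 %pick) {
    entry:
      ; Choose one of two block addresses within this function.
      %addr = select i1 %pick, i8* blockaddress(@dispatch, %bb1),
                               i8* blockaddress(@dispatch, %bb2)
      ; Every block that %addr may point to is listed as a destination.
      indirectbr i8* %addr, [ label %bb1, label %bb2 ]
    bb1:
      ret i32 1
    bb2:
      ret i32 2
    }
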
.. _i_invoke:

'``invoke``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = invoke [cconv] [ret attrs] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs]
                    [operand bundles] to label <normal label> unwind label <exception label>

Overview:
"""""""""

The '``invoke``' instruction causes control to transfer to a specified
function, with the possibility of control flow transfer to either the
'``normal``' label or the '``exception``' label. If the callee function
returns with the "``ret``" instruction, control flow will return to the
"normal" label. If the callee (or any indirect callees) returns via the
":ref:`resume <i_resume>`" instruction or other exception handling
mechanism, control is interrupted and continued at the dynamically
nearest "exception" label.

The '``exception``' label is a `landing
pad <ExceptionHandling.html#overview>`_ for the exception. As such, the
'``exception``' label is required to have the
":ref:`landingpad <i_landingpad>`" instruction, which contains the
information about the behavior of the program after unwinding happens,
as its first non-PHI instruction. The restrictions on the
"``landingpad``" instruction tightly couple it to the "``invoke``"
instruction, so that the important information contained within the
"``landingpad``" instruction can't be lost through normal code motion.

Arguments:
""""""""""

This instruction requires several arguments:

#. The optional "cconv" marker indicates which :ref:`calling
   convention <callingconv>` the call should use. If none is
   specified, the call defaults to using C calling conventions.
#. The optional :ref:`Parameter Attributes <paramattrs>` list for return
   values. Only '``zeroext``', '``signext``', and '``inreg``' attributes
   are valid here.
#. '``ty``': the type of the call instruction itself which is also the
   type of the return value. Functions that return no value are marked
   ``void``.
#. '``fnty``': shall be the signature of the function being invoked. The
   argument types must match the types implied by this signature. This
   type can be omitted if the function is not varargs.
#. '``fnptrval``': An LLVM value containing a pointer to a function to
   be invoked. In most cases, this is a direct function invocation, but
   indirect ``invoke``'s are just as possible, calling an arbitrary pointer
   to function value.
#. '``function args``': argument list whose types match the function
   signature argument types and parameter attributes. All arguments must
   be of :ref:`first class <t_firstclass>` type. If the function signature
   indicates the function accepts a variable number of arguments, the
   extra arguments can be specified.
#. '``normal label``': the label reached when the called function
   executes a '``ret``' instruction.
#. '``exception label``': the label reached when a callee returns via
   the :ref:`resume <i_resume>` instruction or other exception handling
   mechanism.
#. The optional :ref:`function attributes <fnattrs>` list.
#. The optional :ref:`operand bundles <opbundles>` list.

Semantics:
""""""""""

This instruction is designed to operate as a standard '``call``'
instruction in most regards. The primary difference is that it
establishes an association with a label, which is used by the runtime
library to unwind the stack.

This instruction is used in languages with destructors to ensure that
proper cleanup is performed in the case of either a ``longjmp`` or a
thrown exception. Additionally, this is important for implementation of
'``catch``' clauses in high-level languages that support them.

For the purposes of the SSA form, the definition of the value returned
by the '``invoke``' instruction is deemed to occur on the edge from the
current block to the "normal" label. If the callee unwinds then no
return value is available.

Example:
""""""""

.. code-block:: llvm

      %retval = invoke i32 @Test(i32 15) to label %Continue
                  unwind label %TestCleanup              ; i32:retval set
      %retval = invoke coldcc i32 %Testfnptr(i32 15) to label %Continue
                  unwind label %TestCleanup              ; i32:retval set

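The coupling between '``invoke``' and '``landingpad``' is easier to see in a
complete function. The following is a minimal sketch that assumes the Itanium
C++ personality routine ``@__gxx_personality_v0``. The callee ``@may_throw``
and the function ``@caller`` are hypothetical names used only for this
example.

.. code-block:: llvm

    declare i32 @__gxx_personality_v0(...)
    declare i32 @may_throw(i32)

    define i32 @caller() personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
    entry:
      %r = invoke i32 @may_throw(i32 15)
              to label %cont unwind label %lpad
    cont:
      ; %r is only available on the normal edge.
      ret i32 %r
    lpad:
      ; The landingpad is the first non-PHI instruction of the unwind block.
      %exn = landingpad { i8*, i32 }
              cleanup
      ; No cleanup work is modeled here, so unwinding simply continues.
      resume { i8*, i32 } %exn
    }
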
.. _i_resume:

'``resume``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      resume <type> <value>

Overview:
"""""""""

The '``resume``' instruction is a terminator instruction that has no
successors.

Arguments:
""""""""""

The '``resume``' instruction requires one argument, which must have the
same type as the result of any '``landingpad``' instruction in the same
function.

Semantics:
""""""""""

The '``resume``' instruction resumes propagation of an existing
(in-flight) exception whose unwinding was interrupted with a
:ref:`landingpad <i_landingpad>` instruction.

Example:
""""""""

.. code-block:: llvm

      resume { i8*, i32 } %exn

.. _i_catchswitch:

'``catchswitch``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <resultval> = catchswitch within <parent> [ label <handler1>, label <handler2>, ... ] unwind to caller
      <resultval> = catchswitch within <parent> [ label <handler1>, label <handler2>, ... ] unwind label <default>

Overview:
"""""""""

The '``catchswitch``' instruction is used by `LLVM's exception handling system
<ExceptionHandling.html#overview>`_ to describe the set of possible catch handlers
that may be executed by the :ref:`EH personality routine <personalityfn>`.

Arguments:
""""""""""

The ``parent`` argument is the token of the funclet that contains the
``catchswitch`` instruction. If the ``catchswitch`` is not inside a funclet,
this operand may be the token ``none``.

The ``default`` argument is the label of another basic block beginning with
either a ``cleanuppad`` or ``catchswitch`` instruction. This unwind destination
must be a legal target with respect to the ``parent`` links, as described in
the `exception handling documentation\ <ExceptionHandling.html#wineh-constraints>`_.

The ``handlers`` are a nonempty list of successor blocks that each begin with a
:ref:`catchpad <i_catchpad>` instruction.

Semantics:
""""""""""

Executing this instruction transfers control to one of the successors in
``handlers``, if appropriate, or continues to unwind via the unwind label if
present.

The ``catchswitch`` is both a terminator and a "pad" instruction, meaning that
it must be both the first non-phi instruction and last instruction in the basic
block. Therefore, it must be the only non-phi instruction in the block.

Example:
""""""""

.. code-block:: text
Renato Golin124f2592016-07-20 12:16:38 +00006546.. code-block:: text
David Majnemer8a1c45d2015-12-12 05:38:55 +00006547
6548 dispatch1:
6549 %cs1 = catchswitch within none [label %handler0, label %handler1] unwind to caller
6550 dispatch2:
6551 %cs2 = catchswitch within %parenthandler [label %handler0] unwind label %cleanup
6552
.. _i_catchret:

'``catchret``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      catchret from <token> to label <normal>

Overview:
"""""""""

The '``catchret``' instruction is a terminator instruction that has a
single successor.

Arguments:
""""""""""

The first argument to a '``catchret``' indicates which ``catchpad`` it
exits. It must be a :ref:`catchpad <i_catchpad>`.
The second argument to a '``catchret``' specifies where control will
transfer to next.

Semantics:
""""""""""

The '``catchret``' instruction ends an existing (in-flight) exception whose
unwinding was interrupted with a :ref:`catchpad <i_catchpad>` instruction. The
:ref:`personality function <personalityfn>` gets a chance to execute arbitrary
code to, for example, destroy the active exception. Control then transfers to
``normal``.

The ``token`` argument must be a token produced by a ``catchpad`` instruction.
If the specified ``catchpad`` is not the most-recently-entered not-yet-exited
funclet pad (as described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
the ``catchret``'s behavior is undefined.

Example:
""""""""

.. code-block:: text

      catchret from %catch to label %continue

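The following sketch shows how '``catchswitch``', '``catchpad``' and
'``catchret``' are typically chained in one function. It assumes the MSVC C++
personality ``@__CxxFrameHandler3`` and uses the catch-all argument list that
personality expects; the function names ``@may_throw`` and ``@f`` are
hypothetical.

.. code-block:: llvm

      declare void @may_throw()
      declare i32 @__CxxFrameHandler3(...)

      define void @f() personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
      entry:
        invoke void @may_throw()
                to label %done unwind label %dispatch

      dispatch:
        ; The catchswitch is the only non-phi instruction in its block.
        %cs = catchswitch within none [label %handler] unwind to caller

      handler:
        ; The catchpad's parent is the catchswitch token.
        %cp = catchpad within %cs [i8* null, i32 64, i8* null]
        ; Leave the catch funclet and continue at %done.
        catchret from %cp to label %done

      done:
        ret void
      }

Note that the ``catchret`` names the ``catchpad`` token ``%cp``, not the
``catchswitch`` token ``%cs``.
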
.. _i_cleanupret:

'``cleanupret``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      cleanupret from <value> unwind label <continue>
      cleanupret from <value> unwind to caller

Overview:
"""""""""

The '``cleanupret``' instruction is a terminator instruction that has
an optional successor.

Arguments:
""""""""""

The '``cleanupret``' instruction requires one argument, which indicates
which ``cleanuppad`` it exits, and must be a :ref:`cleanuppad <i_cleanuppad>`.
If the specified ``cleanuppad`` is not the most-recently-entered not-yet-exited
funclet pad (as described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
the ``cleanupret``'s behavior is undefined.

The '``cleanupret``' instruction also has an optional successor, ``continue``,
which must be the label of another basic block beginning with either a
``cleanuppad`` or ``catchswitch`` instruction. This unwind destination must
be a legal target with respect to the ``parent`` links, as described in the
`exception handling documentation\ <ExceptionHandling.html#wineh-constraints>`_.

Semantics:
""""""""""

The '``cleanupret``' instruction indicates to the
:ref:`personality function <personalityfn>` that one
:ref:`cleanuppad <i_cleanuppad>` it transferred control to has ended.
It transfers control to ``continue`` or unwinds out of the function.

Example:
""""""""

.. code-block:: text

      cleanupret from %cleanup unwind to caller
      cleanupret from %cleanup unwind label %continue

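For context, a cleanup funclet might look like the sketch below, which runs a
destructor-like call while an exception unwinds through the frame. The names
``@may_throw``, ``@destroy`` and ``@g`` are hypothetical, and the
``@__CxxFrameHandler3`` personality is an assumption about the target.

.. code-block:: llvm

      declare void @may_throw()
      declare void @destroy(i8*)
      declare i32 @__CxxFrameHandler3(...)

      define void @g(i8* %obj) personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
      entry:
        invoke void @may_throw()
                to label %normal unwind label %cleanup

      normal:
        call void @destroy(i8* %obj)
        ret void

      cleanup:
        %cp = cleanuppad within none []
        ; Calls inside the funclet carry a "funclet" bundle naming the pad.
        call void @destroy(i8* %obj) [ "funclet"(token %cp) ]
        ; No enclosing handler in this function: keep unwinding in the caller.
        cleanupret from %cp unwind to caller
      }
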
.. _i_unreachable:

'``unreachable``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      unreachable

Overview:
"""""""""

The '``unreachable``' instruction has no defined semantics. This
instruction is used to inform the optimizer that a particular portion of
the code is not reachable. This can be used to indicate that the code
after a no-return function cannot be reached, and other facts.

Semantics:
""""""""""

The '``unreachable``' instruction has no defined semantics.

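A common use, sketched here with a hypothetical function ``@checked``, is to
terminate the block that follows a call to a function that never returns
(``@abort`` is declared ``noreturn`` for this illustration):

.. code-block:: llvm

      declare void @abort() noreturn

      define i32 @checked(i32 %x) {
      entry:
        %neg = icmp slt i32 %x, 0
        br i1 %neg, label %fail, label %ok

      fail:
        call void @abort()
        ; Every block needs a terminator; control never gets here.
        unreachable

      ok:
        ret i32 %x
      }
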
.. _binaryops:

Binary Operations
-----------------

Binary operators are used to do most of the computation in a program.
They require two operands of the same type, execute an operation on
them, and produce a single value. The operands might represent multiple
data, as is the case with the :ref:`vector <t_vector>` data type. The
result value has the same type as its operands.

There are several different binary operators:

.. _i_add:

'``add``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = add <ty> <op1>, <op2>          ; yields ty:result
      <result> = add nuw <ty> <op1>, <op2>      ; yields ty:result
      <result> = add nsw <ty> <op1>, <op2>      ; yields ty:result
      <result> = add nuw nsw <ty> <op1>, <op2>  ; yields ty:result

Overview:
"""""""""

The '``add``' instruction returns the sum of its two operands.

Arguments:
""""""""""

The two arguments to the '``add``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The value produced is the integer sum of the two operands.

If the sum has unsigned overflow, the result returned is the
mathematical result modulo 2\ :sup:`n`\ , where n is the bit width of
the result.

Because LLVM integers use a two's complement representation, this
instruction is appropriate for both signed and unsigned integers.

``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
result value of the ``add`` is a :ref:`poison value <poisonvalues>` if
unsigned and/or signed overflow, respectively, occurs.

Example:
""""""""

.. code-block:: text

      <result> = add i32 4, %var          ; yields i32:result = 4 + %var

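To make the wrapping and poison rules concrete, the sketch below applies the
three variants to ``i8`` operands. The function name ``@add_flags`` and the
input values mentioned in the comments are illustrative assumptions.

.. code-block:: llvm

      define i8 @add_flags(i8 %a, i8 %b) {
        ; Plain add wraps: with %a = 200 and %b = 100 the result is
        ; 300 mod 2^8 = 44.
        %wrap = add i8 %a, %b
        ; nuw: the same inputs overflow as unsigned values, so %u is poison.
        %u = add nuw i8 %a, %b
        ; nsw: with %a = 100 and %b = 100 the signed sum 200 exceeds 127,
        ; so %s is poison for those inputs.
        %s = add nsw i8 %a, %b
        ret i8 %wrap
      }
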
.. _i_fadd:

'``fadd``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = fadd [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``fadd``' instruction returns the sum of its two operands.

Arguments:
""""""""""

The two arguments to the '``fadd``' instruction must be
:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
floating-point values. Both arguments must have identical types.

Semantics:
""""""""""

The value produced is the floating-point sum of the two operands.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.

Example:
""""""""

.. code-block:: text

      <result> = fadd float 4.0, %var          ; yields float:result = 4.0 + %var

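As an illustration of the fast-math flags, the hypothetical function
``@sum3`` below computes the same three-way sum twice, once strictly and once
with the ``fast`` flag:

.. code-block:: llvm

      define float @sum3(float %a, float %b, float %c) {
        ; Without flags, (%a + %b) + %c must be evaluated as written.
        %t0 = fadd float %a, %b
        %strict = fadd float %t0, %c
        ; With 'fast', the optimizer may, for example, reassociate the
        ; two additions.
        %t1 = fadd fast float %a, %b
        %relaxed = fadd fast float %t1, %c
        ret float %relaxed
      }
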
'``sub``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = sub <ty> <op1>, <op2>          ; yields ty:result
      <result> = sub nuw <ty> <op1>, <op2>      ; yields ty:result
      <result> = sub nsw <ty> <op1>, <op2>      ; yields ty:result
      <result> = sub nuw nsw <ty> <op1>, <op2>  ; yields ty:result

Overview:
"""""""""

The '``sub``' instruction returns the difference of its two operands.

Note that the '``sub``' instruction is used to represent the '``neg``'
instruction present in most other intermediate representations.

Arguments:
""""""""""

The two arguments to the '``sub``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The value produced is the integer difference of the two operands.

If the difference has unsigned overflow, the result returned is the
mathematical result modulo 2\ :sup:`n`\ , where n is the bit width of
the result.

Because LLVM integers use a two's complement representation, this
instruction is appropriate for both signed and unsigned integers.

``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
result value of the ``sub`` is a :ref:`poison value <poisonvalues>` if
unsigned and/or signed overflow, respectively, occurs.

Example:
""""""""

.. code-block:: text

      <result> = sub i32 4, %var          ; yields i32:result = 4 - %var
      <result> = sub i32 0, %var          ; yields i32:result = -%var

.. _i_fsub:

'``fsub``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = fsub [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``fsub``' instruction returns the difference of its two operands.

Note that the '``fsub``' instruction is used to represent the '``fneg``'
instruction present in most other intermediate representations.

Arguments:
""""""""""

The two arguments to the '``fsub``' instruction must be
:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
floating-point values. Both arguments must have identical types.

Semantics:
""""""""""

The value produced is the floating-point difference of the two operands.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.

Example:
""""""""

.. code-block:: text

      <result> = fsub float 4.0, %var           ; yields float:result = 4.0 - %var
      <result> = fsub float -0.0, %var          ; yields float:result = -%var

'``mul``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = mul <ty> <op1>, <op2>          ; yields ty:result
      <result> = mul nuw <ty> <op1>, <op2>      ; yields ty:result
      <result> = mul nsw <ty> <op1>, <op2>      ; yields ty:result
      <result> = mul nuw nsw <ty> <op1>, <op2>  ; yields ty:result

Overview:
"""""""""

The '``mul``' instruction returns the product of its two operands.

Arguments:
""""""""""

The two arguments to the '``mul``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The value produced is the integer product of the two operands.

If the result of the multiplication has unsigned overflow, the result
returned is the mathematical result modulo 2\ :sup:`n`\ , where n is the
bit width of the result.

Because LLVM integers use a two's complement representation, and the
result is the same width as the operands, this instruction returns the
correct result for both signed and unsigned integers. If a full product
(e.g. ``i32`` * ``i32`` -> ``i64``) is needed, the operands should be
sign-extended or zero-extended as appropriate to the width of the full
product.

``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
result value of the ``mul`` is a :ref:`poison value <poisonvalues>` if
unsigned and/or signed overflow, respectively, occurs.

Example:
""""""""

.. code-block:: text

      <result> = mul i32 4, %var          ; yields i32:result = 4 * %var

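The full-product advice above can be sketched as follows. The function names
are hypothetical. Because both operands are extended from ``i32``, the
``i64`` product cannot overflow, so the ``nuw`` and ``nsw`` flags are safe
here.

.. code-block:: llvm

      ; Unsigned 32 x 32 -> 64 bit product.
      define i64 @umul_full(i32 %a, i32 %b) {
        %a.wide = zext i32 %a to i64
        %b.wide = zext i32 %b to i64
        %p = mul nuw i64 %a.wide, %b.wide
        ret i64 %p
      }

      ; Signed 32 x 32 -> 64 bit product.
      define i64 @smul_full(i32 %a, i32 %b) {
        %a.wide = sext i32 %a to i64
        %b.wide = sext i32 %b to i64
        %p = mul nsw i64 %a.wide, %b.wide
        ret i64 %p
      }
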
.. _i_fmul:

'``fmul``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = fmul [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``fmul``' instruction returns the product of its two operands.

Arguments:
""""""""""

The two arguments to the '``fmul``' instruction must be
:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
floating-point values. Both arguments must have identical types.

Semantics:
""""""""""

The value produced is the floating-point product of the two operands.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.

Example:
""""""""

.. code-block:: text

      <result> = fmul float 4.0, %var          ; yields float:result = 4.0 * %var

'``udiv``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = udiv <ty> <op1>, <op2>         ; yields ty:result
      <result> = udiv exact <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``udiv``' instruction returns the quotient of its two operands.

Arguments:
""""""""""

The two arguments to the '``udiv``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The value produced is the unsigned integer quotient of the two operands.

Note that unsigned integer division and signed integer division are
distinct operations; for signed integer division, use '``sdiv``'.

Division by zero is undefined behavior. For vectors, if any element
of the divisor is zero, the operation has undefined behavior.

If the ``exact`` keyword is present, the result value of the ``udiv`` is
a :ref:`poison value <poisonvalues>` if ``op1`` is not a multiple of ``op2``
(as such, "((a udiv exact b) mul b) == a").

Example:
""""""""

.. code-block:: text

      <result> = udiv i32 4, %var          ; yields i32:result = 4 / %var

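A short sketch of ``exact`` in use: converting a byte count to a count of
4-byte elements when the producer is assumed to guarantee divisibility. The
function name ``@element_count`` and that guarantee are assumptions made for
the example.

.. code-block:: llvm

      define i32 @element_count(i32 %bytes) {
        ; Assumes the caller guarantees %bytes is a multiple of 4;
        ; otherwise %n is poison.
        %n = udiv exact i32 %bytes, 4
        ret i32 %n
      }
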
'``sdiv``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = sdiv <ty> <op1>, <op2>         ; yields ty:result
      <result> = sdiv exact <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``sdiv``' instruction returns the quotient of its two operands.

Arguments:
""""""""""

The two arguments to the '``sdiv``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The value produced is the signed integer quotient of the two operands
rounded towards zero.

Note that signed integer division and unsigned integer division are
distinct operations; for unsigned integer division, use '``udiv``'.

Division by zero is undefined behavior. For vectors, if any element
of the divisor is zero, the operation has undefined behavior.
Overflow also leads to undefined behavior; this is a rare case, but can
occur, for example, by doing a 32-bit division of -2147483648 by -1.

If the ``exact`` keyword is present, the result value of the ``sdiv`` is
a :ref:`poison value <poisonvalues>` if the result would be rounded.

Example:
""""""""

.. code-block:: text

      <result> = sdiv i32 4, %var          ; yields i32:result = 4 / %var

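Because both a zero divisor and the ``-2147483648 / -1`` case are undefined
behavior, a front end for a language with defined semantics in those cases
has to test for them before dividing. A minimal sketch follows; the name
``@safe_sdiv`` and the fallback value of 0 are arbitrary choices for
illustration.

.. code-block:: llvm

      define i32 @safe_sdiv(i32 %a, i32 %b) {
      entry:
        %is_zero = icmp eq i32 %b, 0
        %is_min = icmp eq i32 %a, -2147483648
        %is_m1 = icmp eq i32 %b, -1
        %ovf = and i1 %is_min, %is_m1
        %bad = or i1 %is_zero, %ovf
        br i1 %bad, label %fallback, label %divide

      divide:
        ; Only reached when the division is well defined.
        %q = sdiv i32 %a, %b
        ret i32 %q

      fallback:
        ret i32 0
      }
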
.. _i_fdiv:

'``fdiv``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = fdiv [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``fdiv``' instruction returns the quotient of its two operands.

Arguments:
""""""""""

The two arguments to the '``fdiv``' instruction must be
:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
floating-point values. Both arguments must have identical types.

Semantics:
""""""""""

The value produced is the floating-point quotient of the two operands.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.

Example:
""""""""

.. code-block:: text

      <result> = fdiv float 4.0, %var          ; yields float:result = 4.0 / %var

'``urem``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = urem <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``urem``' instruction returns the remainder from the unsigned
division of its two arguments.

Arguments:
""""""""""

The two arguments to the '``urem``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

This instruction returns the unsigned integer *remainder* of a division.
This instruction always performs an unsigned division to get the
remainder.

Note that unsigned integer remainder and signed integer remainder are
distinct operations; for signed integer remainder, use '``srem``'.

Taking the remainder of a division by zero is undefined behavior.
For vectors, if any element of the divisor is zero, the operation has
undefined behavior.

Example:
""""""""

.. code-block:: text

      <result> = urem i32 4, %var          ; yields i32:result = 4 % %var

'``srem``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = srem <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``srem``' instruction returns the remainder from the signed
division of its two operands. This instruction can also take
:ref:`vector <t_vector>` versions of the values in which case the elements
must be integers.

Arguments:
""""""""""

The two arguments to the '``srem``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

This instruction returns the *remainder* of a division (where the result
is either zero or has the same sign as the dividend, ``op1``), not the
*modulo* operator (where the result is either zero or has the same sign
as the divisor, ``op2``) of a value. For more information about the
difference, see `The Math
Forum <http://mathforum.org/dr.math/problems/anne.4.28.99.html>`_. For a
table of how this is implemented in various languages, please see
`Wikipedia: modulo
operation <http://en.wikipedia.org/wiki/Modulo_operation>`_.

Note that signed integer remainder and unsigned integer remainder are
distinct operations; for unsigned integer remainder, use '``urem``'.

Taking the remainder of a division by zero is undefined behavior.
For vectors, if any element of the divisor is zero, the operation has
undefined behavior.
Overflow also leads to undefined behavior; this is a rare case, but can
occur, for example, by taking the remainder of a 32-bit division of
-2147483648 by -1. (The remainder doesn't actually overflow, but this
rule lets ``srem`` be implemented using instructions that return both the
result of the division and the remainder.)

Example:
""""""""

.. code-block:: text

      <result> = srem i32 4, %var          ; yields i32:result = 4 % %var

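When a source language needs the *modulo* operator instead (result takes the
sign of the divisor), it can be derived from ``srem``. The sketch below is
one possible lowering, with the hypothetical name ``@floor_mod``. It assumes
the caller has already excluded a zero divisor and the ``-2147483648`` by
``-1`` case.

.. code-block:: llvm

      define i32 @floor_mod(i32 %a, i32 %b) {
        %r = srem i32 %a, %b
        ; The signs of %r and %b differ exactly when their xor is negative.
        %x = xor i32 %r, %b
        %differ = icmp slt i32 %x, 0
        %nonzero = icmp ne i32 %r, 0
        %fix = and i1 %differ, %nonzero
        ; Shift a non-zero remainder into the divisor's sign range.
        %adj = add i32 %r, %b
        %m = select i1 %fix, i32 %adj, i32 %r
        ret i32 %m
      }

For example, with ``%a = -7`` and ``%b = 3`` the ``srem`` yields -1, and the
adjusted result is 2.
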
.. _i_frem:

'``frem``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = frem [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``frem``' instruction returns the remainder from the division of
its two operands.

Arguments:
""""""""""

The two arguments to the '``frem``' instruction must be
:ref:`floating-point <t_floating>` or :ref:`vector <t_vector>` of
floating-point values. Both arguments must have identical types.

Semantics:
""""""""""

The value produced is the floating-point remainder of the two operands.
This is the same output as a libm '``fmod``' function, but without any
possibility of setting ``errno``. The remainder has the same sign as the
dividend.
This instruction is assumed to execute in the default :ref:`floating-point
environment <floatenv>`.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating-point optimizations.

Example:
""""""""

.. code-block:: text

      <result> = frem float 4.0, %var          ; yields float:result = 4.0 % %var

.. _bitwiseops:

Bitwise Binary Operations
-------------------------

Bitwise binary operators are used to do various forms of bit-twiddling
in a program. They are generally very efficient instructions and can
commonly be strength reduced from other instructions. They require two
operands of the same type, execute an operation on them, and produce a
single value. The resulting value is the same type as its operands.

7265'``shl``' Instruction
7266^^^^^^^^^^^^^^^^^^^^^
7267
7268Syntax:
7269"""""""
7270
7271::
7272
Tim Northover675a0962014-06-13 14:24:23 +00007273 <result> = shl <ty> <op1>, <op2> ; yields ty:result
7274 <result> = shl nuw <ty> <op1>, <op2> ; yields ty:result
7275 <result> = shl nsw <ty> <op1>, <op2> ; yields ty:result
7276 <result> = shl nuw nsw <ty> <op1>, <op2> ; yields ty:result
Sean Silvab084af42012-12-07 10:36:55 +00007277
7278Overview:
7279"""""""""
7280
7281The '``shl``' instruction returns the first operand shifted to the left
7282a specified number of bits.
7283
7284Arguments:
7285""""""""""
7286
7287Both arguments to the '``shl``' instruction must be the same
7288:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
7289'``op2``' is treated as an unsigned value.
7290
7291Semantics:
7292""""""""""
7293
7294The value produced is ``op1`` \* 2\ :sup:`op2` mod 2\ :sup:`n`,
7295where ``n`` is the width of the result. If ``op2`` is (statically or
Sean Silvab8a108c2015-04-17 21:58:55 +00007296dynamically) equal to or larger than the number of bits in
Nuno Lopesb2781fb2017-06-06 08:28:17 +00007297``op1``, this instruction returns a :ref:`poison value <poisonvalues>`.
7298If the arguments are vectors, each vector element of ``op1`` is shifted
7299by the corresponding shift amount in ``op2``.
Sean Silvab084af42012-12-07 10:36:55 +00007300
Nuno Lopesb2781fb2017-06-06 08:28:17 +00007301If the ``nuw`` keyword is present, then the shift produces a poison
7302value if it shifts out any non-zero bits.
7303If the ``nsw`` keyword is present, then the shift produces a poison
Sanjay Patel2896c772018-06-01 15:21:14 +00007304value if it shifts out any bits that disagree with the resultant sign bit.

Example:
""""""""

.. code-block:: text

      <result> = shl i32 4, %var   ; yields i32: 4 << %var
      <result> = shl i32 4, 2      ; yields i32: 16
      <result> = shl i32 1, 10     ; yields i32: 1024
      <result> = shl i32 1, 32     ; poison value (shift amount equals the bit width)
      <result> = shl <2 x i32> < i32 1, i32 1>, < i32 1, i32 2>   ; yields: result=<2 x i32> < i32 2, i32 4>
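
The effect of the ``nuw`` and ``nsw`` keywords is easiest to see on small
constants. The following lines are an illustrative sketch on ``i8`` values,
not part of the normative text; the value names are arbitrary and each result
follows from the semantics stated above:

.. code-block:: llvm

      %a = shl nuw i8 64, 1     ; yields i8 -128 (0x80): the bit shifted out is zero
      %b = shl nuw i8 -128, 1   ; poison: a non-zero bit is shifted out
      %c = shl nsw i8 32, 1     ; yields i8 64: shifted-out bit (0) matches the result's sign bit (0)
      %d = shl nsw i8 64, 1     ; poison: shifted-out bit (0) disagrees with the result's sign bit (1)
      %e = shl nsw i8 -64, 1    ; yields i8 -128: shifted-out bit (1) matches the result's sign bit (1)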

'``lshr``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = lshr <ty> <op1>, <op2>         ; yields ty:result
      <result> = lshr exact <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``lshr``' instruction (logical shift right) returns the first
operand shifted to the right a specified number of bits with zero fill.

Arguments:
""""""""""

Both arguments to the '``lshr``' instruction must be the same
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
'``op2``' is treated as an unsigned value.

Semantics:
""""""""""

This instruction always performs a logical shift right operation. The
most significant bits of the result will be filled with zero bits after
the shift. If ``op2`` is (statically or dynamically) equal to or larger
than the number of bits in ``op1``, this instruction returns a :ref:`poison
value <poisonvalues>`. If the arguments are vectors, each vector element
of ``op1`` is shifted by the corresponding shift amount in ``op2``.

If the ``exact`` keyword is present, the result value of the ``lshr`` is
a poison value if any of the bits shifted out are non-zero.

Example:
""""""""

.. code-block:: text

      <result> = lshr i32 4, 1   ; yields i32:result = 2
      <result> = lshr i32 4, 2   ; yields i32:result = 1
      <result> = lshr i8  4, 3   ; yields i8:result = 0
      <result> = lshr i8 -2, 1   ; yields i8:result = 0x7F
      <result> = lshr i32 1, 32  ; poison value (shift amount equals the bit width)
      <result> = lshr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 2>   ; yields: result=<2 x i32> < i32 0x7FFFFFFF, i32 1>
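
As a small sketch of the ``exact`` keyword (the value names are arbitrary and
the results are derived from the semantics above rather than quoted from a
tool):

.. code-block:: llvm

      %a = lshr exact i8 4, 2   ; yields i8 1: both bits shifted out are zero
      %b = lshr exact i8 5, 1   ; poison: a non-zero bit is shifted out
      %c = lshr i8 5, 1         ; yields i8 2: without exact, the shifted-out bit is discarded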

'``ashr``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = ashr <ty> <op1>, <op2>         ; yields ty:result
      <result> = ashr exact <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``ashr``' instruction (arithmetic shift right) returns the first
operand shifted to the right a specified number of bits with sign
extension.

Arguments:
""""""""""

Both arguments to the '``ashr``' instruction must be the same
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
'``op2``' is treated as an unsigned value.

Semantics:
""""""""""

This instruction always performs an arithmetic shift right operation.
The most significant bits of the result will be filled with the sign bit
of ``op1``. If ``op2`` is (statically or dynamically) equal to or larger
than the number of bits in ``op1``, this instruction returns a :ref:`poison
value <poisonvalues>`. If the arguments are vectors, each vector element
of ``op1`` is shifted by the corresponding shift amount in ``op2``.

If the ``exact`` keyword is present, the result value of the ``ashr`` is
a poison value if any of the bits shifted out are non-zero.

Example:
""""""""

.. code-block:: text

      <result> = ashr i32 4, 1   ; yields i32:result = 2
      <result> = ashr i32 4, 2   ; yields i32:result = 1
      <result> = ashr i8  4, 3   ; yields i8:result = 0
      <result> = ashr i8 -2, 1   ; yields i8:result = -1
      <result> = ashr i32 1, 32  ; poison value (shift amount equals the bit width)
      <result> = ashr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 3>   ; yields: result=<2 x i32> < i32 -1, i32 0>
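
The sign fill and the ``exact`` keyword can be sketched on negative ``i8``
constants. This is an illustration only; the value names are arbitrary:

.. code-block:: llvm

      %a = ashr exact i8 -4, 2   ; yields i8 -1: both bits shifted out are zero
      %b = ashr exact i8 -3, 1   ; poison: a non-zero bit is shifted out
      %c = ashr i8 -3, 1         ; yields i8 -2: the vacated top bit is filled with the sign bit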

'``and``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = and <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``and``' instruction returns the bitwise logical and of its two
operands.

Arguments:
""""""""""

The two arguments to the '``and``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The truth table used for the '``and``' instruction is:

+-----+-----+-----+
| In0 | In1 | Out |
+-----+-----+-----+
|   0 |   0 |   0 |
+-----+-----+-----+
|   0 |   1 |   0 |
+-----+-----+-----+
|   1 |   0 |   0 |
+-----+-----+-----+
|   1 |   1 |   1 |
+-----+-----+-----+

Example:
""""""""

.. code-block:: text

      <result> = and i32 4, %var         ; yields i32:result = 4 & %var
      <result> = and i32 15, 40          ; yields i32:result = 8
      <result> = and i32 4, 8            ; yields i32:result = 0
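
A common use of ``and`` is masking. The sketch below assumes an ``i32`` value
``%x`` defined elsewhere and also shows the element-wise behavior on vectors;
the value names are illustrative:

.. code-block:: llvm

      %low  = and i32 %x, 255                                    ; keeps the low 8 bits of %x
      %even = and i32 %x, -2                                     ; clears bit 0 of %x
      %v    = and <2 x i32> <i32 12, i32 10>, <i32 10, i32 12>   ; yields <2 x i32> <i32 8, i32 8>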

'``or``' Instruction
^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = or <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``or``' instruction returns the bitwise logical inclusive or of its
two operands.

Arguments:
""""""""""

The two arguments to the '``or``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The truth table used for the '``or``' instruction is:

+-----+-----+-----+
| In0 | In1 | Out |
+-----+-----+-----+
|   0 |   0 |   0 |
+-----+-----+-----+
|   0 |   1 |   1 |
+-----+-----+-----+
|   1 |   0 |   1 |
+-----+-----+-----+
|   1 |   1 |   1 |
+-----+-----+-----+

Example:
""""""""

.. code-block:: text

      <result> = or i32 4, %var         ; yields i32:result = 4 | %var
      <result> = or i32 15, 40          ; yields i32:result = 47
      <result> = or i32 4, 8            ; yields i32:result = 12
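
As a brief sketch, ``or`` is often used to set bits or to combine fields that
occupy disjoint bit ranges. ``%x``, ``%hi`` and ``%lo`` are assumed to be
``i32`` values defined elsewhere:

.. code-block:: llvm

      %flagged = or i32 %x, 1                                     ; sets bit 0 of %x
      %hi.shl  = shl i32 %hi, 16                                  ; moves %hi into the upper half
      %packed  = or i32 %hi.shl, %lo                              ; combines the halves if %lo < 65536
      %v       = or <2 x i32> <i32 12, i32 10>, <i32 10, i32 12>  ; yields <2 x i32> <i32 14, i32 14>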

'``xor``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = xor <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``xor``' instruction returns the bitwise logical exclusive or of
its two operands. The ``xor`` is used to implement the "one's
complement" operation, which is the "~" operator in C.

Arguments:
""""""""""

The two arguments to the '``xor``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The truth table used for the '``xor``' instruction is:

+-----+-----+-----+
| In0 | In1 | Out |
+-----+-----+-----+
|   0 |   0 |   0 |
+-----+-----+-----+
|   0 |   1 |   1 |
+-----+-----+-----+
|   1 |   0 |   1 |
+-----+-----+-----+
|   1 |   1 |   0 |
+-----+-----+-----+

Example:
""""""""

.. code-block:: text

      <result> = xor i32 4, %var         ; yields i32:result = 4 ^ %var
      <result> = xor i32 15, 40          ; yields i32:result = 39
      <result> = xor i32 4, 8            ; yields i32:result = 12
      <result> = xor i32 %V, -1          ; yields i32:result = ~%V
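
The complement idiom extends to vectors by using an all-ones vector constant.
A short sketch, assuming ``%x`` is an ``i32`` and ``%vec`` a ``<2 x i32>``
defined elsewhere:

.. code-block:: llvm

      %flipped = xor i32 %x, -2147483648                           ; toggles only the sign bit of %x
      %not     = xor <2 x i32> %vec, <i32 -1, i32 -1>              ; element-wise ~%vec
      %w       = xor <2 x i32> <i32 12, i32 10>, <i32 10, i32 12>  ; yields <2 x i32> <i32 6, i32 6>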

Vector Operations
-----------------

LLVM supports several instructions to represent vector operations in a
target-independent manner. These instructions cover the element-access
and vector-specific operations needed to process vectors effectively.
While LLVM does directly support these vector operations, many
sophisticated algorithms will want to use target-specific intrinsics to
take full advantage of a specific target.

.. _i_extractelement:

'``extractelement``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = extractelement <n x <ty>> <val>, <ty2> <idx>  ; yields <ty>

Overview:
"""""""""

The '``extractelement``' instruction extracts a single scalar element
from a vector at a specified index.

Arguments:
""""""""""

The first operand of an '``extractelement``' instruction is a value of
:ref:`vector <t_vector>` type. The second operand is an index indicating
the position from which to extract the element. The index may be a
variable of any integer type.

Semantics:
""""""""""

The result is a scalar of the same type as the element type of ``val``.
Its value is the value at position ``idx`` of ``val``. If ``idx`` is
out of range, that is, equal to or greater than the length of ``val``,
the result is a :ref:`poison value <poisonvalues>`.

Example:
""""""""

.. code-block:: text

      <result> = extractelement <4 x i32> %vec, i32 0    ; yields i32
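
With a constant vector the extracted value can be read off directly. The
following sketch also shows that the index type need not be ``i32``; ``%vec``
and ``%idx`` are assumed to be defined elsewhere:

.. code-block:: llvm

      %e0 = extractelement <4 x i32> <i32 10, i32 20, i32 30, i32 40>, i32 0   ; yields i32 10
      %e2 = extractelement <4 x i32> <i32 10, i32 20, i32 30, i32 40>, i64 2   ; yields i32 30
      %ev = extractelement <4 x float> %vec, i8 %idx                           ; yields float; poison if %idx is out of range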

.. _i_insertelement:

'``insertelement``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = insertelement <n x <ty>> <val>, <ty> <elt>, <ty2> <idx>    ; yields <n x <ty>>

Overview:
"""""""""

The '``insertelement``' instruction inserts a scalar element into a
vector at a specified index.

Arguments:
""""""""""

The first operand of an '``insertelement``' instruction is a value of
:ref:`vector <t_vector>` type. The second operand is a scalar value whose
type must equal the element type of the first operand. The third operand
is an index indicating the position at which to insert the value. The
index may be a variable of any integer type.

Semantics:
""""""""""

The result is a vector of the same type as ``val``. Its element values
are those of ``val`` except at position ``idx``, where it gets the value
``elt``. If ``idx`` is out of range, that is, equal to or greater than
the length of ``val``, the result is a :ref:`poison value <poisonvalues>`.

Example:
""""""""

.. code-block:: text

      <result> = insertelement <4 x i32> %vec, i32 1, i32 0    ; yields <4 x i32>
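
A vector is typically built up one element at a time by chaining
``insertelement`` instructions, starting from ``undef``. A minimal sketch
with arbitrary value names:

.. code-block:: llvm

      %v0 = insertelement <2 x float> undef, float 1.0, i32 0   ; yields <2 x float> <float 1.0, float undef>
      %v1 = insertelement <2 x float> %v0, float 2.0, i32 1     ; yields <2 x float> <float 1.0, float 2.0>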

.. _i_shufflevector:

'``shufflevector``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <m x i32> <mask>    ; yields <m x <ty>>

Overview:
"""""""""

The '``shufflevector``' instruction constructs a permutation of elements
from two input vectors, returning a vector with the same element type as
the inputs and a length equal to that of the shuffle mask.

Arguments:
""""""""""

The first two operands of a '``shufflevector``' instruction are vectors
with the same type. The third argument is a shuffle mask whose element
type is always ``i32``. The result of the instruction is a vector whose
length is the same as the shuffle mask and whose element type is the
same as the element type of the first two operands.

The shuffle mask operand is required to be a constant vector with either
constant integer or undef values.

Semantics:
""""""""""

The elements of the two input vectors are numbered from left to right
across both of the vectors. The shuffle mask operand specifies, for each
element of the result vector, which element of the two input vectors the
result element gets. If the shuffle mask is undef, the result vector is
undef. If any element of the mask operand is undef, that element of the
result is undef. If the shuffle mask selects an undef element from one
of the input vectors, the resulting element is undef.

Example:
""""""""

.. code-block:: text

      <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
                              <4 x i32> <i32 0, i32 4, i32 1, i32 5>  ; yields <4 x i32>
      <result> = shufflevector <4 x i32> %v1, <4 x i32> undef,
                              <4 x i32> <i32 0, i32 1, i32 2, i32 3>  ; yields <4 x i32> - Identity shuffle.
      <result> = shufflevector <8 x i32> %v1, <8 x i32> undef,
                              <4 x i32> <i32 0, i32 1, i32 2, i32 3>  ; yields <4 x i32>
      <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
                              <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7 >  ; yields <8 x i32>
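
To make the element numbering concrete, the sketch below assumes that ``%v1``
holds ``<a0, a1, a2, a3>`` (mask indices 0 to 3) and ``%v2`` holds
``<b0, b1, b2, b3>`` (mask indices 4 to 7). The names ``a0`` ... ``b3`` are
placeholders used only for this illustration:

.. code-block:: llvm

      %rev   = shufflevector <4 x i32> %v1, <4 x i32> undef,
                             <4 x i32> <i32 3, i32 2, i32 1, i32 0>          ; yields <a3, a2, a1, a0>
      %splat = shufflevector <4 x i32> %v1, <4 x i32> undef,
                             <4 x i32> zeroinitializer                       ; yields <a0, a0, a0, a0>
      %mix   = shufflevector <4 x i32> %v1, <4 x i32> %v2,
                             <2 x i32> <i32 3, i32 4>                        ; yields <a3, b0>
      %part  = shufflevector <4 x i32> %v1, <4 x i32> %v2,
                             <4 x i32> <i32 0, i32 undef, i32 5, i32 undef>  ; yields <a0, undef, b1, undef>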

Aggregate Operations
--------------------

LLVM supports several instructions for working with
:ref:`aggregate <t_aggregate>` values.

.. _i_extractvalue:

'``extractvalue``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = extractvalue <aggregate type> <val>, <idx>{, <idx>}*

Overview:
"""""""""

The '``extractvalue``' instruction extracts the value of a member field
from an :ref:`aggregate <t_aggregate>` value.

Arguments:
""""""""""

The first operand of an '``extractvalue``' instruction is a value of
:ref:`struct <t_struct>` or :ref:`array <t_array>` type. The other operands are
constant indices to specify which value to extract in a similar manner
as indices in a '``getelementptr``' instruction.

The major differences to ``getelementptr`` indexing are:

-  Since the value being indexed is not a pointer, the first index is
   omitted and assumed to be zero.
-  At least one index must be specified.
-  Not only struct indices but also array indices must be in bounds.

Semantics:
""""""""""

The result is the value at the position in the aggregate specified by
the index operands.

Example:
""""""""

.. code-block:: text

      <result> = extractvalue {i32, float} %agg, 0    ; yields i32
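
Multiple indices descend through nested aggregates, one level per index. The
following sketch assumes ``%agg`` is a value of the (hypothetical) type
``{i32, {float, [4 x i8]}}``:

.. code-block:: llvm

      %inner = extractvalue {i32, {float, [4 x i8]}} %agg, 1         ; yields {float, [4 x i8]}
      %f     = extractvalue {i32, {float, [4 x i8]}} %agg, 1, 0      ; yields float
      %b     = extractvalue {i32, {float, [4 x i8]}} %agg, 1, 1, 3   ; yields i8 (last array element)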

.. _i_insertvalue:

'``insertvalue``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = insertvalue <aggregate type> <val>, <ty> <elt>, <idx>{, <idx>}*    ; yields <aggregate type>

Overview:
"""""""""

The '``insertvalue``' instruction inserts a value into a member field in
an :ref:`aggregate <t_aggregate>` value.

Arguments:
""""""""""

The first operand of an '``insertvalue``' instruction is a value of
:ref:`struct <t_struct>` or :ref:`array <t_array>` type. The second operand is
a first-class value to insert. The following operands are constant
indices indicating the position at which to insert the value in a
similar manner as indices in an '``extractvalue``' instruction. The value
to insert must have the same type as the value identified by the
indices.

Semantics:
""""""""""

The result is an aggregate of the same type as ``val``. Its value is
that of ``val`` except that the value at the position specified by the
indices is that of ``elt``.

Example:
""""""""

.. code-block:: llvm

      %agg1 = insertvalue {i32, float} undef, i32 1, 0              ; yields {i32 1, float undef}
      %agg2 = insertvalue {i32, float} %agg1, float %val, 1         ; yields {i32 1, float %val}
      %agg3 = insertvalue {i32, {float}} undef, float %val, 1, 0    ; yields {i32 undef, {float %val}}

.. _memoryops:

Memory Access and Addressing Operations
---------------------------------------

A key design point of an SSA-based representation is how it represents
memory. In LLVM, no memory locations are in SSA form, which makes things
very simple. This section describes how to read, write, and allocate
memory in LLVM.

.. _i_alloca:

'``alloca``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = alloca [inalloca] <type> [, <ty> <NumElements>] [, align <alignment>] [, addrspace(<num>)]     ; yields type addrspace(num)*:result

Overview:
"""""""""

The '``alloca``' instruction allocates memory on the stack frame of the
currently executing function, to be automatically released when this
function returns to its caller. The object is always allocated in the
address space for allocas indicated in the datalayout.

Arguments:
""""""""""

The '``alloca``' instruction allocates ``sizeof(<type>)*NumElements``
bytes of memory on the runtime stack, returning a pointer of the
appropriate type to the program. If ``NumElements`` is specified, it is
the number of elements allocated; otherwise ``NumElements`` defaults
to one. If a constant alignment is specified, the result of the
allocation is guaranteed to be aligned to at least that boundary. The
alignment may not be greater than ``1 << 29``. If not specified, or if
zero, the target can choose to align the allocation on any convenient
boundary compatible with the type.

'``type``' may be any sized type.

Semantics:
""""""""""

Memory is allocated; a pointer is returned. The operation is undefined
if there is insufficient stack space for the allocation. The
'``alloca``' instruction is commonly used to represent automatic
variables that must have an address available. '``alloca``'d memory is
automatically reclaimed when the function returns (either with the
``ret`` or ``resume`` instructions). Allocating zero bytes is legal, but
the returned pointer may not be unique. The order in which memory is
allocated (i.e., which way the stack grows) is not specified.

Example:
""""""""

.. code-block:: llvm

      %ptr = alloca i32                             ; yields i32*:ptr
      %ptr = alloca i32, i32 4                      ; yields i32*:ptr
      %ptr = alloca i32, i32 4, align 1024          ; yields i32*:ptr
      %ptr = alloca i32, align 1024                 ; yields i32*:ptr
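
In context, an ``alloca`` usually backs a local variable whose address is
taken. The function below is a minimal sketch of that pattern, written in the
typed-pointer syntax used throughout this document; the function name
``@sum_first_two`` and all value names are illustrative assumptions:

.. code-block:: llvm

      define i32 @sum_first_two() {
      entry:
        %buf = alloca [4 x i32], align 16     ; yields [4 x i32]*; lives until the function returns
        %p0  = getelementptr inbounds [4 x i32], [4 x i32]* %buf, i64 0, i64 0
        %p1  = getelementptr inbounds [4 x i32], [4 x i32]* %buf, i64 0, i64 1
        store i32 1, i32* %p0
        store i32 2, i32* %p1
        %a   = load i32, i32* %p0
        %b   = load i32, i32* %p1
        %sum = add i32 %a, %b                 ; 1 + 2
        ret i32 %sum                          ; %buf is reclaimed here
      }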

.. _i_load:

'``load``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = load [volatile] <ty>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<index>][, !invariant.load !<index>][, !invariant.group !<index>][, !nonnull !<index>][, !dereferenceable !<deref_bytes_node>][, !dereferenceable_or_null !<deref_bytes_node>][, !align !<align_node>]
      <result> = load atomic [volatile] <ty>, <ty>* <pointer> [syncscope("<target-scope>")] <ordering>, align <alignment> [, !invariant.group !<index>]
      !<index> = !{ i32 1 }
      !<deref_bytes_node> = !{i64 <dereferenceable_bytes>}
      !<align_node> = !{ i64 <value_alignment> }

Overview:
"""""""""

The '``load``' instruction is used to read from memory.

Arguments:
""""""""""

The argument to the ``load`` instruction specifies the memory address from which
to load. The type specified must be a :ref:`first class <t_firstclass>` type of
known size (i.e. not containing an :ref:`opaque structural type <t_opaque>`). If
the ``load`` is marked as ``volatile``, then the optimizer is not allowed to
modify the number or order of execution of this ``load`` with other
:ref:`volatile operations <volatile>`.

If the ``load`` is marked as ``atomic``, it takes an extra :ref:`ordering
<ordering>` and optional ``syncscope("<target-scope>")`` argument. The
``release`` and ``acq_rel`` orderings are not valid on ``load`` instructions.
Atomic loads produce :ref:`defined <memmodel>` results when they may see
multiple atomic stores. The type of the pointee must be an integer, pointer, or
floating-point type whose bit width is a power of two greater than or equal to
eight and less than or equal to a target-specific size limit. ``align`` must be
explicitly specified on atomic loads, and the load has undefined behavior if the
alignment is not set to a value which is at least the size in bytes of the
pointee. ``!nontemporal`` does not have any defined semantics for atomic loads.
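
As a sketch of the atomic form, the loads below use orderings that are valid
for ``load`` and an explicit alignment equal to the size of the pointee.
``%ptr`` and ``%ptr64`` are assumed to be suitably aligned pointers defined
elsewhere:

.. code-block:: llvm

      %v = load atomic i32, i32* %ptr acquire, align 4
      %w = load atomic volatile i64, i64* %ptr64 syncscope("singlethread") monotonic, align 8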

The optional constant ``align`` argument specifies the alignment of the
operation (that is, the alignment of the memory address). A value of 0
or an omitted ``align`` argument means that the operation has the ABI
alignment for the target. It is the responsibility of the code emitter
to ensure that the alignment information is correct. Overestimating the
alignment results in undefined behavior. Underestimating the alignment
may produce less efficient code. An alignment of 1 is always safe. The
maximum possible alignment is ``1 << 29``. An alignment value higher
than the size of the loaded type implies that memory up to the alignment
value in bytes can be safely loaded without trapping in the default
address space. Accessing the high bytes can interfere with debugging
tools, so they should not be accessed if the function has the
``sanitize_thread`` or ``sanitize_address`` attributes.

The optional ``!nontemporal`` metadata must reference a single
metadata name ``<index>`` corresponding to a metadata node with one
``i32`` entry of value 1. The existence of the ``!nontemporal``
metadata on the instruction tells the optimizer and code generator
that this load is not expected to be reused in the cache. The code
generator may select special instructions to save cache bandwidth, such
as the ``MOVNT`` instruction on x86.

The optional ``!invariant.load`` metadata must reference a single
metadata name ``<index>`` corresponding to a metadata node with no
entries. If a load instruction tagged with the ``!invariant.load``
metadata is executed, the optimizer may assume the memory location
referenced by the load contains the same value at all points in the
program where the memory location is known to be dereferenceable.

The optional ``!invariant.group`` metadata must reference a single metadata name
``<index>`` corresponding to a metadata node with no entries.
See ``invariant.group`` metadata.

The optional ``!nonnull`` metadata must reference a single
metadata name ``<index>`` corresponding to a metadata node with no
entries. The existence of the ``!nonnull`` metadata on the
instruction tells the optimizer that the value loaded is known to
never be null. This is analogous to the ``nonnull`` attribute
on parameters and return values. This metadata can only be applied
to loads of a pointer type.

The optional ``!dereferenceable`` metadata must reference a single metadata
name ``<deref_bytes_node>`` corresponding to a metadata node with one ``i64``
entry. The existence of the ``!dereferenceable`` metadata on the instruction
tells the optimizer that the value loaded is known to be dereferenceable.
The number of bytes known to be dereferenceable is specified by the integer
value in the metadata node. This is analogous to the ``dereferenceable``
attribute on parameters and return values. This metadata can only be applied
to loads of a pointer type.

The optional ``!dereferenceable_or_null`` metadata must reference a single
metadata name ``<deref_bytes_node>`` corresponding to a metadata node with one
``i64`` entry. The existence of the ``!dereferenceable_or_null`` metadata on the
instruction tells the optimizer that the value loaded is known to be either
dereferenceable or null.
The number of bytes known to be dereferenceable is specified by the integer
value in the metadata node. This is analogous to the ``dereferenceable_or_null``
attribute on parameters and return values. This metadata can only be applied
to loads of a pointer type.

The optional ``!align`` metadata must reference a single metadata name
``<align_node>`` corresponding to a metadata node with one ``i64`` entry.
The existence of the ``!align`` metadata on the instruction tells the
optimizer that the value loaded is known to be aligned to a boundary specified
by the integer value in the metadata node. The alignment must be a power of 2.
This is analogous to the ``align`` attribute on parameters and return values.
This metadata can only be applied to loads of a pointer type.
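
The metadata attachments described above can be sketched as follows. The
metadata node numbers and the pointers ``%pp`` and ``%q`` are assumptions
made for the illustration; the node shapes follow the requirements stated in
the preceding paragraphs:

.. code-block:: llvm

      %p = load i32*, i32** %pp, align 8, !nonnull !0, !dereferenceable !1, !align !2
      %c = load i32, i32* %q, align 4, !invariant.load !0
      %t = load i32, i32* %q, align 4, !nontemporal !3

      !0 = !{}          ; empty node, as required by !nonnull and !invariant.load
      !1 = !{i64 4}     ; %p points to at least 4 dereferenceable bytes
      !2 = !{i64 4}     ; %p is aligned to a 4-byte boundary
      !3 = !{i32 1}     ; single i32 entry of value 1, as required by !nontemporal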

Semantics:
""""""""""

The location of memory pointed to is loaded. If the value being loaded
is of scalar type then the number of bytes read does not exceed the
minimum number of bytes needed to hold all bits of the type. For
example, loading an ``i24`` reads at most three bytes. When loading a
value of a type like ``i20`` with a size that is not an integral number
of bytes, the result is undefined if the value was not originally
written using a store of the same type.

Examples:
"""""""""

.. code-block:: llvm

      %ptr = alloca i32                               ; yields i32*:ptr
      store i32 3, i32* %ptr                          ; yields void
      %val = load i32, i32* %ptr                      ; yields i32:val = i32 3

.. _i_store:

'``store``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      store [volatile] <ty> <value>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<index>][, !invariant.group !<index>]        ; yields void
      store atomic [volatile] <ty> <value>, <ty>* <pointer> [syncscope("<target-scope>")] <ordering>, align <alignment> [, !invariant.group !<index>] ; yields void

Overview:
"""""""""

The '``store``' instruction is used to write to memory.

Arguments:
""""""""""

There are two arguments to the ``store`` instruction: a value to store and an
address at which to store it. The type of the ``<pointer>`` operand must be a
pointer to the :ref:`first class <t_firstclass>` type of the ``<value>``
operand. If the ``store`` is marked as ``volatile``, then the optimizer is not
allowed to modify the number or order of execution of this ``store`` with other
:ref:`volatile operations <volatile>`. Only values of :ref:`first class
<t_firstclass>` types of known size (i.e. not containing an :ref:`opaque
structural type <t_opaque>`) can be stored.

If the ``store`` is marked as ``atomic``, it takes an extra :ref:`ordering
<ordering>` and optional ``syncscope("<target-scope>")`` argument. The
``acquire`` and ``acq_rel`` orderings are not valid on ``store`` instructions.
Atomic loads produce :ref:`defined <memmodel>` results when they may see
multiple atomic stores. The type of the pointee must be an integer, pointer, or
floating-point type whose bit width is a power of two greater than or equal to
eight and less than or equal to a target-specific size limit. ``align`` must be
explicitly specified on atomic stores, and the store has undefined behavior if
the alignment is not set to a value which is at least the size in bytes of the
pointee. ``!nontemporal`` does not have any defined semantics for atomic stores.
Sean Silvab084af42012-12-07 10:36:55 +00008049
The optional constant ``align`` argument specifies the alignment of the
operation (that is, the alignment of the memory address). A value of 0
or an omitted ``align`` argument means that the operation has the ABI
alignment for the target. It is the responsibility of the code emitter
to ensure that the alignment information is correct. Overestimating the
alignment results in undefined behavior. Underestimating the
alignment may produce less efficient code. An alignment of 1 is always
safe. The maximum possible alignment is ``1 << 29``. An alignment
value higher than the size of the stored type implies memory up to the
alignment value bytes can be stored to without trapping in the default
address space. Storing to the higher bytes however may result in data
races if another thread can access the same address. Introducing a
data race is not allowed. Storing to the extra bytes is not allowed
even in situations where a data race is known to not exist if the
function has the ``sanitize_address`` attribute.

The optional ``!nontemporal`` metadata must reference a single metadata
name ``<index>`` corresponding to a metadata node with one ``i32`` entry of
value 1. The existence of the ``!nontemporal`` metadata on the instruction
tells the optimizer and code generator that this store is not expected to
be reused in the cache. The code generator may select special
instructions to save cache bandwidth, such as the ``MOVNT`` instruction on
x86.

The optional ``!invariant.group`` metadata must reference a
single metadata name ``<index>``. See ``invariant.group`` metadata.

Semantics:
""""""""""

The contents of memory are updated to contain ``<value>`` at the
location specified by the ``<pointer>`` operand. If ``<value>`` is
of scalar type then the number of bytes written does not exceed the
minimum number of bytes needed to hold all bits of the type. For
example, storing an ``i24`` writes at most three bytes. When writing a
value of a type like ``i20`` with a size that is not an integral number
of bytes, it is unspecified what happens to the extra bits that do not
belong to the type, but they will typically be overwritten.

Example:
""""""""

.. code-block:: llvm

    %ptr = alloca i32                               ; yields i32*:ptr
    store i32 3, i32* %ptr                          ; yields void
    %val = load i32, i32* %ptr                      ; yields i32:val = i32 3

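As a further illustration, the sketch below exercises the optional parts of
the syntax in a single function. The function name ``@store_variants``, its
parameters, and the metadata node ``!0`` are invented for this example, and
the caller is assumed to pass a ``%p`` that really is 4-byte aligned.

.. code-block:: llvm

    define void @store_variants(i32* %p, i8* %q, i32 %v) {
    entry:
      ; Plain store with an explicit alignment.
      store i32 %v, i32* %p, align 4
      ; Volatile store: not reordered with or merged into other volatile operations.
      store volatile i8 0, i8* %q, align 1
      ; Atomic store: the ordering and the alignment are both mandatory.
      store atomic i32 %v, i32* %p release, align 4
      ; Atomic store restricted to a synchronization scope.
      store atomic i32 %v, i32* %p syncscope("singlethread") monotonic, align 4
      ; Non-temporal hint: !0 is a node holding a single i32 of value 1.
      store i32 %v, i32* %p, align 4, !nontemporal !0
      ret void
    }

    !0 = !{i32 1}
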
.. _i_fence:

'``fence``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    fence [syncscope("<target-scope>")] <ordering>  ; yields void

Overview:
"""""""""

The '``fence``' instruction is used to introduce happens-before edges
between operations.

Arguments:
""""""""""

'``fence``' instructions take an :ref:`ordering <ordering>` argument which
defines what *synchronizes-with* edges they add. They can only be given
``acquire``, ``release``, ``acq_rel``, and ``seq_cst`` orderings.

Semantics:
""""""""""

A fence A which has (at least) ``release`` ordering semantics
*synchronizes with* a fence B with (at least) ``acquire`` ordering
semantics if and only if there exist atomic operations X and Y, both
operating on some atomic object M, such that A is sequenced before X, X
modifies M (either directly or through some side effect of a sequence
headed by X), Y is sequenced before B, and Y observes M. This provides a
*happens-before* dependency between A and B. Rather than an explicit
``fence``, one (but not both) of the atomic operations X or Y might
provide a ``release`` or ``acquire`` (resp.) ordering constraint and
still *synchronize-with* the explicit ``fence`` and establish the
*happens-before* edge.

A ``fence`` which has ``seq_cst`` ordering, in addition to having both
``acquire`` and ``release`` semantics specified above, participates in
the global program order of other ``seq_cst`` operations and/or fences.

A ``fence`` instruction can also take an optional
":ref:`syncscope <syncscope>`" argument.

Example:
""""""""

.. code-block:: text

    fence acquire                                        ; yields void
    fence syncscope("singlethread") seq_cst              ; yields void
    fence syncscope("agent") seq_cst                     ; yields void

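The fence-to-fence rule in the Semantics section can be sketched as a
message-passing pattern. The globals ``@data`` and ``@flag`` and the functions
``@producer`` and ``@consumer`` are hypothetical names chosen for this
illustration, and the two functions are assumed to run on different threads.
The comments label the operations A, X, Y and B as in the text above.

.. code-block:: llvm

    @data = global i32 0
    @flag = global i32 0

    define void @producer() {
    entry:
      store i32 42, i32* @data, align 4                  ; ordinary, non-atomic write
      fence release                                      ; fence A
      store atomic i32 1, i32* @flag monotonic, align 4  ; X: modifies M = @flag
      ret void
    }

    define i32 @consumer() {
    entry:
      br label %spin

    spin:
      %f = load atomic i32, i32* @flag monotonic, align 4  ; Y: observes M = @flag
      %ready = icmp eq i32 %f, 1
      br i1 %ready, label %done, label %spin

    done:
      fence acquire                                      ; fence B
      %v = load i32, i32* @data, align 4                 ; ordered after the write to @data
      ret i32 %v
    }

Once the consumer has read the value 1 from ``@flag``, fence A synchronizes
with fence B, so the plain load of ``@data`` does not race with the plain
store and is expected to observe 42.
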
.. _i_cmpxchg:

'``cmpxchg``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    cmpxchg [weak] [volatile] <ty>* <pointer>, <ty> <cmp>, <ty> <new> [syncscope("<target-scope>")] <success ordering> <failure ordering> ; yields  { ty, i1 }

Overview:
"""""""""

The '``cmpxchg``' instruction is used to atomically modify memory. It
loads a value in memory and compares it to a given value. If they are
equal, it tries to store a new value into the memory.

Arguments:
""""""""""

There are three arguments to the '``cmpxchg``' instruction: an address
to operate on, a value to compare to the value currently at that
address, and a new value to place at that address if the compared values
are equal. The type of '<cmp>' must be an integer or pointer type whose
bit width is a power of two greater than or equal to eight and less
than or equal to a target-specific size limit. '<cmp>' and '<new>' must
have the same type, and the type of '<pointer>' must be a pointer to
that type. If the ``cmpxchg`` is marked as ``volatile``, then the
optimizer is not allowed to modify the number or order of execution of
this ``cmpxchg`` with other :ref:`volatile operations <volatile>`.

The success and failure :ref:`ordering <ordering>` arguments specify how this
``cmpxchg`` synchronizes with other atomic operations. Both ordering parameters
must be at least ``monotonic``, the ordering constraint on failure must be no
stronger than that on success, and the failure ordering cannot be either
``release`` or ``acq_rel``.

A ``cmpxchg`` instruction can also take an optional
":ref:`syncscope <syncscope>`" argument.

The pointer passed into ``cmpxchg`` must have alignment greater than or
equal to the size in memory of the operand.

Semantics:
""""""""""

The contents of memory at the location specified by the '``<pointer>``' operand
is read and compared to '``<cmp>``'; if the values are equal, '``<new>``' is
written to the location. The original value at the location is returned,
together with a flag indicating success (true) or failure (false).

If the ``cmpxchg`` operation is marked as ``weak`` then a spurious failure is
permitted: the operation may not write ``<new>`` even if the comparison
matched.

If the ``cmpxchg`` operation is strong (the default), the ``i1`` value is 1 if
and only if the value loaded equals ``cmp``.

A successful ``cmpxchg`` is a read-modify-write instruction for the purpose of
identifying release sequences. A failed ``cmpxchg`` is equivalent to an atomic
load with an ordering parameter determined by the second ordering parameter.

Example:
""""""""

.. code-block:: llvm

    entry:
      %orig = load atomic i32, i32* %ptr unordered, align 4                      ; yields i32
      br label %loop

    loop:
      %cmp = phi i32 [ %orig, %entry ], [%value_loaded, %loop]
      %squared = mul i32 %cmp, %cmp
      %val_success = cmpxchg i32* %ptr, i32 %cmp, i32 %squared acq_rel monotonic ; yields  { i32, i1 }
      %value_loaded = extractvalue { i32, i1 } %val_success, 0
      %success = extractvalue { i32, i1 } %val_success, 1
      br i1 %success, label %done, label %loop

    done:
      ...

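A single strong ``cmpxchg`` without a retry loop is enough when a failed
comparison is itself a meaningful answer. The sketch below shows a
hypothetical ``@try_lock`` helper (the name and the 0 = free, 1 = held
convention are assumptions of this example) that attempts to move a lock word
from 0 to 1. The ordering pair satisfies the rules above: the failure ordering
``monotonic`` is no stronger than the success ordering ``acquire`` and is
neither ``release`` nor ``acq_rel``.

.. code-block:: llvm

    define i1 @try_lock(i32* %lock) {
    entry:
      ; Strong cmpxchg: fails only if the loaded value differs from 0.
      %pair = cmpxchg i32* %lock, i32 0, i32 1 acquire monotonic
      ; Field 1 of the result is the success flag; field 0 is the value loaded.
      %acquired = extractvalue { i32, i1 } %pair, 1
      ret i1 %acquired
    }
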
.. _i_atomicrmw:

'``atomicrmw``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    atomicrmw [volatile] <operation> <ty>* <pointer>, <ty> <value> [syncscope("<target-scope>")] <ordering>                   ; yields ty

Overview:
"""""""""

The '``atomicrmw``' instruction is used to atomically modify memory.

Arguments:
""""""""""

There are three arguments to the '``atomicrmw``' instruction: an
operation to apply, an address whose value to modify, and an argument to
the operation. The operation must be one of the following keywords:

-  xchg
-  add
-  sub
-  and
-  nand
-  or
-  xor
-  max
-  min
-  umax
-  umin

The type of '<value>' must be an integer type whose bit width is a power
of two greater than or equal to eight and less than or equal to a
target-specific size limit. The type of the '``<pointer>``' operand must
be a pointer to that type. If the ``atomicrmw`` is marked as
``volatile``, then the optimizer is not allowed to modify the number or
order of execution of this ``atomicrmw`` with other :ref:`volatile
operations <volatile>`.

An ``atomicrmw`` instruction can also take an optional
":ref:`syncscope <syncscope>`" argument.

Semantics:
""""""""""

The contents of memory at the location specified by the '``<pointer>``'
operand are atomically read, modified, and written back. The original
value at the location is returned. The modification is specified by the
operation argument:

-  xchg: ``*ptr = val``
-  add: ``*ptr = *ptr + val``
-  sub: ``*ptr = *ptr - val``
-  and: ``*ptr = *ptr & val``
-  nand: ``*ptr = ~(*ptr & val)``
-  or: ``*ptr = *ptr | val``
-  xor: ``*ptr = *ptr ^ val``
-  max: ``*ptr = *ptr > val ? *ptr : val`` (using a signed comparison)
-  min: ``*ptr = *ptr < val ? *ptr : val`` (using a signed comparison)
-  umax: ``*ptr = *ptr > val ? *ptr : val`` (using an unsigned
   comparison)
-  umin: ``*ptr = *ptr < val ? *ptr : val`` (using an unsigned
   comparison)

Example:
""""""""

.. code-block:: llvm

    %old = atomicrmw add i32* %ptr, i32 1 acquire                        ; yields i32

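Because the instruction returns the value that was in memory before the
update, the result can be compared against the operand to learn whether the
update changed anything. The following sketch uses a hypothetical
``@record_sample`` function (its name and parameters are invented for
illustration) that bumps a counter and maintains a signed running maximum.

.. code-block:: llvm

    define i1 @record_sample(i32* %count, i32* %peak, i32 %sample) {
    entry:
      ; *count = *count + 1; the old count is returned but not needed here.
      %old_count = atomicrmw add i32* %count, i32 1 monotonic
      ; *peak = (*peak > sample) ? *peak : sample, using a signed comparison.
      %old_peak = atomicrmw max i32* %peak, i32 %sample monotonic
      ; The sample replaced the stored peak exactly when it exceeded the old value.
      %is_new_peak = icmp sgt i32 %sample, %old_peak
      ret i1 %is_new_peak
    }
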
.. _i_getelementptr:

'``getelementptr``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = getelementptr <ty>, <ty>* <ptrval>{, [inrange] <ty> <idx>}*
    <result> = getelementptr inbounds <ty>, <ty>* <ptrval>{, [inrange] <ty> <idx>}*
    <result> = getelementptr <ty>, <ptr vector> <ptrval>, [inrange] <vector index type> <idx>

Overview:
"""""""""

The '``getelementptr``' instruction is used to get the address of a
subelement of an :ref:`aggregate <t_aggregate>` data structure. It performs
address calculation only and does not access memory. The instruction can also
be used to calculate a vector of such addresses.

Arguments:
""""""""""

The first argument is always a type used as the basis for the calculations.
The second argument is always a pointer or a vector of pointers, and is the
base address to start from. The remaining arguments are indices
that indicate which of the elements of the aggregate object are indexed.
The interpretation of each index is dependent on the type being indexed
into. The first index always indexes the pointer value given as the
second argument, the second index indexes a value of the type pointed to
(not necessarily the value directly pointed to, since the first index
can be non-zero), etc. The first type indexed into must be a pointer
value; subsequent types can be arrays, vectors, and structs. Note that
subsequent types being indexed into can never be pointers, since that
would require loading the pointer before continuing calculation.

The type of each index argument depends on the type it is indexing into.
When indexing into an (optionally packed) structure, only ``i32`` integer
**constants** are allowed (when using a vector of indices they must all
be the **same** ``i32`` integer constant). When indexing into an array,
pointer or vector, integers of any width are allowed, and they are not
required to be constant. These integers are treated as signed values
where relevant.

For example, let's consider a C code fragment and how it gets compiled
to LLVM:

.. code-block:: c

    struct RT {
      char A;
      int B[10][20];
      char C;
    };
    struct ST {
      int X;
      double Y;
      struct RT Z;
    };

    int *foo(struct ST *s) {
      return &s[1].Z.B[5][13];
    }

The LLVM code generated by Clang is:

.. code-block:: llvm

    %struct.RT = type { i8, [10 x [20 x i32]], i8 }
    %struct.ST = type { i32, double, %struct.RT }

    define i32* @foo(%struct.ST* %s) nounwind uwtable readnone optsize ssp {
    entry:
      %arrayidx = getelementptr inbounds %struct.ST, %struct.ST* %s, i64 1, i32 2, i32 1, i64 5, i64 13
      ret i32* %arrayidx
    }

Semantics:
""""""""""

In the example above, the first index is indexing into the
'``%struct.ST*``' type, which is a pointer, yielding a '``%struct.ST``'
= '``{ i32, double, %struct.RT }``' type, a structure. The second index
indexes into the third element of the structure, yielding a
'``%struct.RT``' = '``{ i8 , [10 x [20 x i32]], i8 }``' type, another
structure. The third index indexes into the second element of the
structure, yielding a '``[10 x [20 x i32]]``' type, an array. The two
dimensions of the array are subscripted into, yielding an '``i32``'
type. The '``getelementptr``' instruction returns a pointer to this
element, thus computing a value of '``i32*``' type.

Note that it is perfectly legal to index partially through a structure,
returning a pointer to an inner element. Because of this, the LLVM code
for the given testcase is equivalent to:

.. code-block:: llvm

    define i32* @foo(%struct.ST* %s) {
      %t1 = getelementptr %struct.ST, %struct.ST* %s, i32 1                        ; yields %struct.ST*:%t1
      %t2 = getelementptr %struct.ST, %struct.ST* %t1, i32 0, i32 2                ; yields %struct.RT*:%t2
      %t3 = getelementptr %struct.RT, %struct.RT* %t2, i32 0, i32 1                ; yields [10 x [20 x i32]]*:%t3
      %t4 = getelementptr [10 x [20 x i32]], [10 x [20 x i32]]* %t3, i32 0, i32 5  ; yields [20 x i32]*:%t4
      %t5 = getelementptr [20 x i32], [20 x i32]* %t4, i32 0, i32 13               ; yields i32*:%t5
      ret i32* %t5
    }

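Since ``getelementptr`` only performs address arithmetic, the whole index
chain collapses to one byte offset once a data layout is fixed. The sketch
below assumes a typical x86-64 layout (``i32`` aligned to 4 bytes, ``double``
to 8 bytes); the ``target datalayout`` string and the function name
``@foo_bytes`` are assumptions of this example, and other targets may give
different sizes and offsets.

.. code-block:: llvm

    target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"

    %struct.RT = type { i8, [10 x [20 x i32]], i8 }
    %struct.ST = type { i32, double, %struct.RT }

    ; Under the assumed layout:
    ;   sizeof(%struct.ST)        = 824   (first index:  1 * 824)
    ;   offset of field 2 (Z)     = 16    (second index)
    ;   offset of field 1 (B)     = 4     (third index, within %struct.RT)
    ;   sizeof([20 x i32])        = 80    (fourth index: 5 * 80)
    ;   sizeof(i32)               = 4     (fifth index: 13 * 4)
    ; Total: 824 + 16 + 4 + 400 + 52 = 1296 bytes.
    define i32* @foo_bytes(%struct.ST* %s) {
    entry:
      %base = bitcast %struct.ST* %s to i8*
      %addr = getelementptr inbounds i8, i8* %base, i64 1296
      %res = bitcast i8* %addr to i32*
      ret i32* %res
    }
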
If the ``inbounds`` keyword is present, the result value of the
``getelementptr`` is a :ref:`poison value <poisonvalues>` if the base
pointer is not an *in bounds* address of an allocated object, or if any
of the addresses that would be formed by successive addition of the
offsets implied by the indices to the base address with infinitely
precise signed arithmetic are not an *in bounds* address of that
allocated object. The *in bounds* addresses for an allocated object are
all the addresses that point into the object, plus the address one byte
past the end. The only *in bounds* address for a null pointer in the
default address-space is the null pointer itself. In cases where the
base is a vector of pointers the ``inbounds`` keyword applies to each
of the computations element-wise.

If the ``inbounds`` keyword is not present, the offsets are added to the
base address with silently-wrapping two's complement arithmetic. If the
offsets have a different width from the pointer, they are sign-extended
or truncated to the width of the pointer. The result value of the
``getelementptr`` may be outside the object pointed to by the base
pointer. The result value may not necessarily be used to access memory
though, even if it happens to point into allocated storage. See the
:ref:`Pointer Aliasing Rules <pointeraliasing>` section for more
information.

If the ``inrange`` keyword is present before any index, loading from or
storing to any pointer derived from the ``getelementptr`` has undefined
behavior if the load or store would access memory outside of the bounds of
the element selected by the index marked as ``inrange``. The result of a
pointer comparison or ``ptrtoint`` (including ``ptrtoint``-like operations
involving memory) involving a pointer derived from a ``getelementptr`` with
the ``inrange`` keyword is undefined, with the exception of comparisons
in the case where both operands are in the range of the element selected
by the ``inrange`` keyword, inclusive of the address one past the end of
that element. Note that the ``inrange`` keyword is currently only allowed
in constant ``getelementptr`` expressions.

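As a rough sketch of where ``inrange`` can appear, the constant expression
below marks the second field of a two-field aggregate as the selected element.
The globals ``@table`` and ``@second_half`` are hypothetical names, and the
placement of the keyword follows the ``[inrange] <ty> <idx>`` form from the
Syntax section.

.. code-block:: llvm

    @table = constant { [2 x i32], [2 x i32] } { [2 x i32] [i32 1, i32 2],
                                                 [2 x i32] [i32 3, i32 4] }

    ; Points at the first i32 of field 1. Loads and stores through pointers
    ; derived from @second_half are only defined within field 1, not field 0.
    @second_half = global i32* getelementptr inbounds (
        { [2 x i32], [2 x i32] }, { [2 x i32], [2 x i32] }* @table,
        i32 0, inrange i32 1, i32 0)
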
The ``getelementptr`` instruction is often confusing. For some more insight
into how it works, see :doc:`the getelementptr FAQ <GetElementPtr>`.

Example:
""""""""

.. code-block:: llvm

    ; yields [12 x i8]*:aptr
    %aptr = getelementptr {i32, [12 x i8]}, {i32, [12 x i8]}* %saptr, i64 0, i32 1
    ; yields i8*:vptr
    %vptr = getelementptr {i32, <2 x i8>}, {i32, <2 x i8>}* %svptr, i64 0, i32 1, i32 1
    ; yields i8*:eptr
    %eptr = getelementptr [12 x i8], [12 x i8]* %aptr, i64 0, i32 1
    ; yields i32*:iptr
    %iptr = getelementptr [10 x i32], [10 x i32]* @arr, i16 0, i16 0

Vector of pointers:
"""""""""""""""""""

The ``getelementptr`` returns a vector of pointers, instead of a single address,
when one or more of its arguments is a vector. In such cases, all vector
arguments should have the same number of elements, and every scalar argument
will be effectively broadcast into a vector during address calculation.

.. code-block:: llvm

    ; All arguments are vectors:
    ;   A[i] = ptrs[i] + offsets[i]*sizeof(i8)
    %A = getelementptr i8, <4 x i8*> %ptrs, <4 x i64> %offsets

    ; Add the same scalar offset to each pointer of a vector:
    ;   A[i] = ptrs[i] + offset*sizeof(i8)
    %A = getelementptr i8, <4 x i8*> %ptrs, i64 %offset

    ; Add distinct offsets to the same pointer:
    ;   A[i] = ptr + offsets[i]*sizeof(i8)
    %A = getelementptr i8, i8* %ptr, <4 x i64> %offsets

    ; In all cases described above the type of the result is <4 x i8*>

The two following instructions are equivalent:

.. code-block:: llvm

    getelementptr  %struct.ST, <4 x %struct.ST*> %s, <4 x i64> %ind1,
      <4 x i32> <i32 2, i32 2, i32 2, i32 2>,
      <4 x i32> <i32 1, i32 1, i32 1, i32 1>,
      <4 x i32> %ind4,
      <4 x i64> <i64 13, i64 13, i64 13, i64 13>

    getelementptr  %struct.ST, <4 x %struct.ST*> %s, <4 x i64> %ind1,
      i32 2, i32 1, <4 x i32> %ind4, i64 13

Let's look at the C code, where the vector version of ``getelementptr``
makes sense:

.. code-block:: c

    // Let's assume that we vectorize the following loop:
    double *A, *B; int *C;
    for (int i = 0; i < size; ++i) {
      A[i] = B[C[i]];
    }

.. code-block:: llvm

    ; get pointers for 8 elements from array B
    %ptrs = getelementptr double, double* %B, <8 x i32> %C
    ; load 8 elements from array B into A
    %A = call <8 x double> @llvm.masked.gather.v8f64.v8p0f64(<8 x double*> %ptrs,
         i32 8, <8 x i1> %mask, <8 x double> %passthru)

Conversion Operations
---------------------

The instructions in this category are the conversion instructions
(casting) which all take a single operand and a type. They perform
various bit conversions on the operand.

.. _i_trunc:

'``trunc .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = trunc <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``trunc``' instruction truncates its operand to the type ``ty2``.

Arguments:
""""""""""

The '``trunc``' instruction takes a value to truncate and a type to
truncate it to. Both types must be :ref:`integer <t_integer>` types, or
vectors of the same number of integers. The bit size of the ``value`` must
be larger than the bit size of the destination type, ``ty2``. Equal sized
types are not allowed.

Semantics:
""""""""""

The '``trunc``' instruction truncates the high order bits in ``value``
and converts the remaining bits to ``ty2``. Since the source size must
be larger than the destination size, ``trunc`` cannot be a *no-op cast*.
It will always truncate bits.

Example:
""""""""

.. code-block:: llvm

    %X = trunc i32 257 to i8                        ; yields i8:1
    %Y = trunc i32 123 to i1                        ; yields i1:true
    %Z = trunc i32 122 to i1                        ; yields i1:false
    %W = trunc <2 x i16> <i16 8, i16 7> to <2 x i8> ; yields <i8 8, i8 7>

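Because only the low-order bits survive, ``trunc`` is commonly paired with a
shift to pick out a particular field of a wider value. The helper names
``@low_byte`` and ``@second_byte`` below are invented for this sketch.

.. code-block:: llvm

    define i8 @low_byte(i32 %x) {
    entry:
      %lo = trunc i32 %x to i8            ; keeps bits 0..7 of %x
      ret i8 %lo
    }

    define i8 @second_byte(i32 %x) {
    entry:
      %shifted = lshr i32 %x, 8           ; move bits 8..15 down to bits 0..7
      %b1 = trunc i32 %shifted to i8      ; discard the remaining high bits
      ret i8 %b1
    }

For an input with the bit pattern ``0x12345678``, ``@low_byte`` returns
``0x78`` and ``@second_byte`` returns ``0x56``.
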
.. _i_zext:

'``zext .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = zext <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``zext``' instruction zero extends its operand to type ``ty2``.

Arguments:
""""""""""

The '``zext``' instruction takes a value to cast, and a type to cast it
to. Both types must be :ref:`integer <t_integer>` types, or vectors of
the same number of integers. The bit size of the ``value`` must be
smaller than the bit size of the destination type, ``ty2``.

Semantics:
""""""""""

The ``zext`` fills the high order bits of the ``value`` with zero bits
until it reaches the size of the destination type, ``ty2``.

When zero extending from ``i1``, the result will always be either 0 or 1.

Example:
""""""""

.. code-block:: llvm

    %X = zext i32 257 to i64              ; yields i64:257
    %Y = zext i1 true to i32              ; yields i32:1
    %Z = zext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>

.. _i_sext:

'``sext .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = sext <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``sext``' instruction sign extends ``value`` to the type ``ty2``.

Arguments:
""""""""""

The '``sext``' instruction takes a value to cast, and a type to cast it
to. Both types must be :ref:`integer <t_integer>` types, or vectors of
the same number of integers. The bit size of the ``value`` must be
smaller than the bit size of the destination type, ``ty2``.

Semantics:
""""""""""

The '``sext``' instruction performs a sign extension by copying the sign
bit (highest order bit) of the ``value`` until it reaches the bit size
of the type ``ty2``.

When sign extending from ``i1``, the extension always results in -1 or 0.

Example:
""""""""

.. code-block:: llvm

    %X = sext i8  -1 to i16              ; yields i16:65535
    %Y = sext i1 true to i32             ; yields i32:-1
    %Z = sext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>

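The difference between ``zext`` and ``sext`` only shows when the top bit of
the source value is set. The two hypothetical functions below (names chosen
for this sketch) widen the same ``i8`` argument in both ways.

.. code-block:: llvm

    define i32 @widen_unsigned(i8 %x) {
    entry:
      %r = zext i8 %x to i32              ; upper 24 bits are filled with zeros
      ret i32 %r
    }

    define i32 @widen_signed(i8 %x) {
    entry:
      %r = sext i8 %x to i32              ; upper 24 bits copy bit 7 of %x
      ret i32 %r
    }

Passing ``i8 -1`` (bit pattern ``0xFF``) gives ``i32 255`` from
``@widen_unsigned`` and ``i32 -1`` (bit pattern ``0xFFFFFFFF``) from
``@widen_signed``. For inputs whose top bit is clear the two results agree.
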
'``fptrunc .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = fptrunc <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``fptrunc``' instruction truncates ``value`` to type ``ty2``.

Arguments:
""""""""""

The '``fptrunc``' instruction takes a :ref:`floating-point <t_floating>`
value to cast and a :ref:`floating-point <t_floating>` type to cast it to.
The size of ``value`` must be larger than the size of ``ty2``. This
implies that ``fptrunc`` cannot be used to make a *no-op cast*.

Semantics:
""""""""""

The '``fptrunc``' instruction casts a ``value`` from a larger
:ref:`floating-point <t_floating>` type to a smaller :ref:`floating-point
<t_floating>` type. This instruction is assumed to execute in the default
:ref:`floating-point environment <floatenv>`.

Example:
""""""""

.. code-block:: llvm

    %X = fptrunc double 16777217.0 to float    ; yields float:16777216.0
    %Y = fptrunc double 1.0E+300 to half       ; yields half:+infinity

'``fpext .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = fpext <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``fpext``' instruction extends a floating-point ``value`` to a larger
floating-point value.

Arguments:
""""""""""

The '``fpext``' instruction takes a :ref:`floating-point <t_floating>`
``value`` to cast, and a :ref:`floating-point <t_floating>` type to cast it
to. The source type must be smaller than the destination type.

Semantics:
""""""""""

The '``fpext``' instruction extends the ``value`` from a smaller
:ref:`floating-point <t_floating>` type to a larger :ref:`floating-point
<t_floating>` type. The ``fpext`` cannot be used to make a
*no-op cast* because it always changes bits. Use ``bitcast`` to make a
*no-op cast* for a floating-point cast.

Example:
""""""""

.. code-block:: llvm

    %X = fpext float 3.125 to double         ; yields double:3.125000e+00
    %Y = fpext double %X to fp128            ; yields fp128:0xL00000000000000004000900000000000

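Narrowing with ``fptrunc`` and widening again with ``fpext`` does not, in
general, recover the original value, because the narrower type may have
rounded it. The function name ``@round_trip`` is an assumption of this
sketch; the rounding shown relies on the default floating-point environment
that ``fptrunc`` is assumed to execute in.

.. code-block:: llvm

    define double @round_trip(double %x) {
    entry:
      %narrow = fptrunc double %x to float      ; may round to the nearest float
      %wide = fpext float %narrow to double     ; exact: every float is a double
      ret double %wide
    }

Calling ``@round_trip`` with ``16777217.0`` returns ``16777216.0``, matching
the ``fptrunc`` example for that constant, while a value such as ``3.125``
that is exactly representable as ``float`` comes back unchanged.
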
'``fptoui .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = fptoui <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``fptoui``' instruction converts a floating-point ``value`` to its
unsigned integer equivalent of type ``ty2``.

Arguments:
""""""""""

The '``fptoui``' instruction takes a value to cast, which must be a
scalar or vector :ref:`floating-point <t_floating>` value, and a type to
cast it to, ``ty2``, which must be an :ref:`integer <t_integer>` type. If
``ty`` is a vector floating-point type, ``ty2`` must be a vector integer
type with the same number of elements as ``ty``.

Semantics:
""""""""""

The '``fptoui``' instruction converts its :ref:`floating-point
<t_floating>` operand into the nearest (rounding towards zero)
unsigned integer value. If the value cannot fit in ``ty2``, the result
is a :ref:`poison value <poisonvalues>`.

Example:
""""""""

.. code-block:: llvm

      %X = fptoui double 123.0 to i32         ; yields i32:123
      %Y = fptoui double 1.0E+300 to i1       ; value does not fit in i1: yields poison
      %Z = fptoui double 1.04E+17 to i8       ; value does not fit in i8: yields poison

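The rounding and range rules can be seen in a small sketch. The function
names ``@to_byte`` and ``@to_words`` are hypothetical; the values mentioned
in the comments are illustrative inputs, not part of the code.

.. code-block:: llvm

      define i8 @to_byte(float %f) {
      entry:
        ; 3.75 converts to 3 and 255.5 converts to 255 (rounding towards zero).
        ; 256.0 and -1.0 do not fit in i8, so the result would be poison.
        %r = fptoui float %f to i8
        ret i8 %r
      }

      define <4 x i32> @to_words(<4 x float> %v) {
      entry:
        ; Vector form: source and result have the same number of elements.
        %r = fptoui <4 x float> %v to <4 x i32>
        ret <4 x i32> %r
      }
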
'``fptosi .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = fptosi <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``fptosi``' instruction converts a :ref:`floating-point <t_floating>`
``value`` to the signed integer type ``ty2``.

Arguments:
""""""""""

The '``fptosi``' instruction takes a value to cast, which must be a
scalar or vector :ref:`floating-point <t_floating>` value, and a type to
cast it to, ``ty2``, which must be an :ref:`integer <t_integer>` type. If
``ty`` is a vector floating-point type, ``ty2`` must be a vector integer
type with the same number of elements as ``ty``.

Semantics:
""""""""""

The '``fptosi``' instruction converts its :ref:`floating-point
<t_floating>` operand into the nearest (rounding towards zero)
signed integer value. If the value cannot fit in ``ty2``, the result
is a :ref:`poison value <poisonvalues>`.

Example:
""""""""

.. code-block:: llvm

      %X = fptosi double -123.0 to i32        ; yields i32:-123
      %Y = fptosi double 1.0E-247 to i1       ; rounds towards zero: yields i1:0
      %Z = fptosi double 1.04E+17 to i8       ; value does not fit in i8: yields poison

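A minimal sketch of the truncating behaviour follows. The function name
``@trunc_to_i32`` is hypothetical, and the values in the comment are
illustrative inputs only.

.. code-block:: llvm

      define i32 @trunc_to_i32(double %d) {
      entry:
        ; -2.75 converts to -2 and 2.75 converts to 2: the fraction is
        ; discarded rather than rounded to the nearest integer.
        %r = fptosi double %d to i32
        ret i32 %r
      }
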
'``uitofp .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = uitofp <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``uitofp``' instruction regards ``value`` as an unsigned integer
and converts that value to the ``ty2`` type.

Arguments:
""""""""""

The '``uitofp``' instruction takes a value to cast, which must be a
scalar or vector :ref:`integer <t_integer>` value, and a type to cast it to,
``ty2``, which must be a :ref:`floating-point <t_floating>` type. If
``ty`` is a vector integer type, ``ty2`` must be a vector floating-point
type with the same number of elements as ``ty``.

Semantics:
""""""""""

The '``uitofp``' instruction interprets its operand as an unsigned
integer quantity and converts it to the corresponding floating-point
value. If the value cannot be exactly represented, it is rounded using
the default rounding mode.

Example:
""""""""

.. code-block:: llvm

      %X = uitofp i32 257 to float         ; yields float:257.0
      %Y = uitofp i8 -1 to double          ; yields double:255.0

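The rounding case can be sketched as follows, assuming the default rounding
mode is round-to-nearest with ties to even. The function name
``@u32_to_float`` is hypothetical.

.. code-block:: llvm

      define float @u32_to_float(i32 %x) {
      entry:
        ; float has a 24-bit significand, so 16777217 (2^24 + 1) has no exact
        ; representation. Under the assumed rounding mode it would become
        ; 16777216.0.
        %f = uitofp i32 %x to float
        ret float %f
      }
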
'``sitofp .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = sitofp <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``sitofp``' instruction regards ``value`` as a signed integer and
converts that value to the ``ty2`` type.

Arguments:
""""""""""

The '``sitofp``' instruction takes a value to cast, which must be a
scalar or vector :ref:`integer <t_integer>` value, and a type to cast it to,
``ty2``, which must be a :ref:`floating-point <t_floating>` type. If
``ty`` is a vector integer type, ``ty2`` must be a vector floating-point
type with the same number of elements as ``ty``.

Semantics:
""""""""""

The '``sitofp``' instruction interprets its operand as a signed integer
quantity and converts it to the corresponding floating-point value. If the
value cannot be exactly represented, it is rounded using the default rounding
mode.

Example:
""""""""

.. code-block:: llvm

      %X = sitofp i32 257 to float         ; yields float:257.0
      %Y = sitofp i8 -1 to double          ; yields double:-1.0

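Because integers carry no sign in their type, the choice between ``sitofp``
and ``uitofp`` decides how the same bits are read. The sketch below converts
one ``i8`` both ways; the function name ``@both_views`` is hypothetical.

.. code-block:: llvm

      define <2 x double> @both_views(i8 %bits) {
      entry:
        ; For the bit pattern 0xFF, the signed view is -1.0 and the
        ; unsigned view is 255.0.
        %s = sitofp i8 %bits to double
        %u = uitofp i8 %bits to double
        %v0 = insertelement <2 x double> undef, double %s, i32 0
        %v1 = insertelement <2 x double> %v0, double %u, i32 1
        ret <2 x double> %v1
      }
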
.. _i_ptrtoint:

'``ptrtoint .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = ptrtoint <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``ptrtoint``' instruction converts the pointer or a vector of
pointers ``value`` to the integer (or vector of integers) type ``ty2``.

Arguments:
""""""""""

The '``ptrtoint``' instruction takes a ``value`` to cast, which must be
a value of type :ref:`pointer <t_pointer>` or a vector of pointers, and a
type to cast it to, ``ty2``, which must be an :ref:`integer <t_integer>` or
a vector of integers type.

Semantics:
""""""""""

The '``ptrtoint``' instruction converts ``value`` to integer type
``ty2`` by interpreting the pointer value as an integer and either
truncating or zero extending that value to the size of the integer type.
If ``value`` is smaller than ``ty2`` then a zero extension is done. If
``value`` is larger than ``ty2`` then a truncation is done. If they are
the same size, then nothing is done (*no-op cast*) other than a type
change.

Example:
""""""""

.. code-block:: llvm

      %X = ptrtoint i32* %P to i8                         ; yields truncation on 32-bit architecture
      %Y = ptrtoint i32* %P to i64                        ; yields zero extension on 32-bit architecture
      %Z = ptrtoint <4 x i32*> %P to <4 x i64>            ; yields vector zero extension for a vector of addresses on 32-bit architecture

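One common use is integer arithmetic on addresses. The sketch below computes
the distance in bytes between two pointers; it assumes pointers are no wider
than 64 bits, and the function name ``@byte_distance`` is hypothetical.

.. code-block:: llvm

      define i64 @byte_distance(i8* %begin, i8* %end) {
      entry:
        %b = ptrtoint i8* %begin to i64
        %e = ptrtoint i8* %end to i64
        %d = sub i64 %e, %b
        ret i64 %d
      }
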
.. _i_inttoptr:

'``inttoptr .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = inttoptr <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``inttoptr``' instruction converts an integer ``value`` to a
pointer type, ``ty2``.

Arguments:
""""""""""

The '``inttoptr``' instruction takes an :ref:`integer <t_integer>` value to
cast, and a type to cast it to, which must be a :ref:`pointer <t_pointer>`
type.

Semantics:
""""""""""

The '``inttoptr``' instruction converts ``value`` to type ``ty2`` by
applying either a zero extension or a truncation depending on the size
of the integer ``value``. If ``value`` is larger than the size of a
pointer then a truncation is done. If ``value`` is smaller than the size
of a pointer then a zero extension is done. If they are the same size,
nothing is done (*no-op cast*).

Example:
""""""""

.. code-block:: llvm

      %X = inttoptr i32 255 to i32*                ; yields zero extension on 64-bit architecture
      %Y = inttoptr i32 255 to i32*                ; yields no-op on 32-bit architecture
      %Z = inttoptr i64 0 to i32*                  ; yields truncation on 32-bit architecture
      %W = inttoptr <4 x i32> %G to <4 x i8*>      ; yields truncation of vector G to four pointers

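``inttoptr`` is often paired with ``ptrtoint`` to adjust an address with
integer operations. The following sketch rounds a pointer down to a 16-byte
boundary, assuming 64-bit pointers; the function name ``@align_down_16`` is
hypothetical.

.. code-block:: llvm

      define i8* @align_down_16(i8* %p) {
      entry:
        %addr = ptrtoint i8* %p to i64
        ; -16 is the mask 0xFFFFFFFFFFFFFFF0, which clears the low four bits.
        %masked = and i64 %addr, -16
        %q = inttoptr i64 %masked to i8*
        ret i8* %q
      }
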
.. _i_bitcast:

'``bitcast .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = bitcast <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``bitcast``' instruction converts ``value`` to type ``ty2`` without
changing any bits.

Arguments:
""""""""""

The '``bitcast``' instruction takes a value to cast, which must be a
non-aggregate first class value, and a type to cast it to, which must
also be a non-aggregate :ref:`first class <t_firstclass>` type. The
bit sizes of ``value`` and the destination type, ``ty2``, must be
identical. If the source type is a pointer, the destination type must
also be a pointer of the same size. This instruction supports bitwise
conversion of vectors to integers and to vectors of other types (as
long as they have the same size).

Semantics:
""""""""""

The '``bitcast``' instruction converts ``value`` to type ``ty2``. It
is always a *no-op cast* because no bits change with this
conversion. The conversion is done as if the ``value`` had been stored
to memory and read back as type ``ty2``. Pointer (or vector of
pointers) types may only be converted to other pointer (or vector of
pointers) types with the same address space through this instruction.
To convert pointers to other types, use the :ref:`inttoptr <i_inttoptr>`
or :ref:`ptrtoint <i_ptrtoint>` instructions first.

Example:
""""""""

.. code-block:: text

      %X = bitcast i8 255 to i8                     ; yields i8 :-1
      %Y = bitcast i32* %x to i16*                  ; yields i16*:%x
      %Z = bitcast <2 x i32> %V to i64              ; yields i64: %V
      %W = bitcast <2 x i32*> %V to <2 x i64*>      ; yields <2 x i64*>

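A typical use is inspecting the bit pattern of a floating-point value. The
function name ``@float_bits`` in this sketch is hypothetical.

.. code-block:: llvm

      define i32 @float_bits(float %f) {
      entry:
        ; float and i32 are both 32 bits wide, so the cast is legal.
        ; For the input 1.0 the result would be 0x3F800000.
        %bits = bitcast float %f to i32
        ret i32 %bits
      }
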
.. _i_addrspacecast:

'``addrspacecast .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = addrspacecast <pty> <ptrval> to <pty2>       ; yields pty2

Overview:
"""""""""

The '``addrspacecast``' instruction converts ``ptrval`` from ``pty`` in
address space ``n`` to type ``pty2`` in address space ``m``.

Arguments:
""""""""""

The '``addrspacecast``' instruction takes a pointer or vector of pointers
value to cast and a pointer type to cast it to, which must have a different
address space.

Semantics:
""""""""""

The '``addrspacecast``' instruction converts the pointer value
``ptrval`` to type ``pty2``. It can be a *no-op cast* or a complex
value modification, depending on the target and the address space
pair. Pointer conversions within the same address space must be
performed with the ``bitcast`` instruction. Note that if the address space
conversion is legal then both result and operand refer to the same memory
location.

Example:
""""""""

.. code-block:: llvm

      %X = addrspacecast i32* %x to i32 addrspace(1)*                 ; yields i32 addrspace(1)*:%x
      %Y = addrspacecast i32 addrspace(1)* %y to i64 addrspace(2)*    ; yields i64 addrspace(2)*:%y
      %Z = addrspacecast <4 x i32*> %z to <4 x float addrspace(3)*>   ; yields <4 x float addrspace(3)*>:%z

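The sketch below casts a pointer from address space 1 to the default address
space before loading through it. What address space 1 means, and whether
this cast is legal, depends on the target; the function name
``@load_via_default`` is hypothetical.

.. code-block:: llvm

      define i32 @load_via_default(i32 addrspace(1)* %p) {
      entry:
        ; If the conversion is legal, %g refers to the same memory as %p.
        %g = addrspacecast i32 addrspace(1)* %p to i32*
        %v = load i32, i32* %g
        ret i32 %v
      }
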
.. _otherops:

Other Operations
----------------

The instructions in this category are the "miscellaneous" instructions,
which defy better classification.

.. _i_icmp:

'``icmp``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = icmp <cond> <ty> <op1>, <op2>   ; yields i1 or <N x i1>:result

Overview:
"""""""""

The '``icmp``' instruction returns a boolean value or a vector of
boolean values based on comparison of its two integer, integer vector,
pointer, or pointer vector operands.

Arguments:
""""""""""

The '``icmp``' instruction takes three operands. The first operand is
the condition code indicating the kind of comparison to perform. It is
not a value, just a keyword. The possible condition codes are:

#. ``eq``: equal
#. ``ne``: not equal
#. ``ugt``: unsigned greater than
#. ``uge``: unsigned greater or equal
#. ``ult``: unsigned less than
#. ``ule``: unsigned less or equal
#. ``sgt``: signed greater than
#. ``sge``: signed greater or equal
#. ``slt``: signed less than
#. ``sle``: signed less or equal

The remaining two arguments must be of :ref:`integer <t_integer>` type,
:ref:`pointer <t_pointer>` type, or a :ref:`vector <t_vector>` of integers
or pointers. They must also have identical types.

Semantics:
""""""""""

The '``icmp``' instruction compares ``op1`` and ``op2`` according to the
condition code given as ``cond``. The comparison performed always yields
either an :ref:`i1 <t_integer>` or vector of ``i1`` result, as follows:

#. ``eq``: yields ``true`` if the operands are equal, ``false``
   otherwise. No sign interpretation is necessary or performed.
#. ``ne``: yields ``true`` if the operands are unequal, ``false``
   otherwise. No sign interpretation is necessary or performed.
#. ``ugt``: interprets the operands as unsigned values and yields
   ``true`` if ``op1`` is greater than ``op2``.
#. ``uge``: interprets the operands as unsigned values and yields
   ``true`` if ``op1`` is greater than or equal to ``op2``.
#. ``ult``: interprets the operands as unsigned values and yields
   ``true`` if ``op1`` is less than ``op2``.
#. ``ule``: interprets the operands as unsigned values and yields
   ``true`` if ``op1`` is less than or equal to ``op2``.
#. ``sgt``: interprets the operands as signed values and yields ``true``
   if ``op1`` is greater than ``op2``.
#. ``sge``: interprets the operands as signed values and yields ``true``
   if ``op1`` is greater than or equal to ``op2``.
#. ``slt``: interprets the operands as signed values and yields ``true``
   if ``op1`` is less than ``op2``.
#. ``sle``: interprets the operands as signed values and yields ``true``
   if ``op1`` is less than or equal to ``op2``.

If the operands are :ref:`pointer <t_pointer>` typed, the pointer values
are compared as if they were integers.

If the operands are integer vectors, then they are compared element by
element. The result is an ``i1`` vector with the same number of elements
as the values being compared. Otherwise, the result is an ``i1``.

Example:
""""""""

.. code-block:: text

      <result> = icmp eq i32 4, 5          ; yields: result=false
      <result> = icmp ne float* %X, %X     ; yields: result=false
      <result> = icmp ult i16  4, 5        ; yields: result=true
      <result> = icmp sgt i16  4, 5        ; yields: result=false
      <result> = icmp ule i16 -4, 5        ; yields: result=false
      <result> = icmp sge i16  4, 5        ; yields: result=false

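The difference between the signed and unsigned condition codes can be put to
use in a bounds check. In this sketch the function names ``@in_bounds`` and
``@lanes_negative`` are hypothetical, and ``%len`` is assumed to be
non-negative when read as a signed value.

.. code-block:: llvm

      define i1 @in_bounds(i32 %idx, i32 %len) {
      entry:
        ; Read as unsigned, a negative %idx is a very large number, so one
        ; compare rejects both %idx < 0 and %idx >= %len.
        %ok = icmp ult i32 %idx, %len
        ret i1 %ok
      }

      define <4 x i1> @lanes_negative(<4 x i32> %v) {
      entry:
        ; Vector form: one i1 result per element.
        %neg = icmp slt <4 x i32> %v, zeroinitializer
        ret <4 x i1> %neg
      }
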
.. _i_fcmp:

'``fcmp``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = fcmp [fast-math flags]* <cond> <ty> <op1>, <op2>     ; yields i1 or <N x i1>:result

Overview:
"""""""""

The '``fcmp``' instruction returns a boolean value or vector of boolean
values based on comparison of its operands.

If the operands are floating-point scalars, then the result type is a
boolean (:ref:`i1 <t_integer>`).

If the operands are floating-point vectors, then the result type is a
vector of boolean with the same number of elements as the operands being
compared.

Arguments:
""""""""""

The '``fcmp``' instruction takes three operands. The first operand is
the condition code indicating the kind of comparison to perform. It is
not a value, just a keyword. The possible condition codes are:

#. ``false``: no comparison, always returns false
#. ``oeq``: ordered and equal
#. ``ogt``: ordered and greater than
#. ``oge``: ordered and greater than or equal
#. ``olt``: ordered and less than
#. ``ole``: ordered and less than or equal
#. ``one``: ordered and not equal
#. ``ord``: ordered (no NaNs)
#. ``ueq``: unordered or equal
#. ``ugt``: unordered or greater than
#. ``uge``: unordered or greater than or equal
#. ``ult``: unordered or less than
#. ``ule``: unordered or less than or equal
#. ``une``: unordered or not equal
#. ``uno``: unordered (either operand is a NaN)
#. ``true``: no comparison, always returns true

*Ordered* means that neither operand is a QNAN while *unordered* means
that either operand may be a QNAN.

Each of the ``op1`` and ``op2`` arguments must be either a
:ref:`floating-point <t_floating>` type or a :ref:`vector <t_vector>` of
floating-point type. They must have identical types.

Semantics:
""""""""""

The '``fcmp``' instruction compares ``op1`` and ``op2`` according to the
condition code given as ``cond``. If the operands are vectors, then the
vectors are compared element by element. Each comparison performed
always yields an :ref:`i1 <t_integer>` result, as follows:

#. ``false``: always yields ``false``, regardless of operands.
#. ``oeq``: yields ``true`` if both operands are not a QNAN and ``op1``
   is equal to ``op2``.
#. ``ogt``: yields ``true`` if both operands are not a QNAN and ``op1``
   is greater than ``op2``.
#. ``oge``: yields ``true`` if both operands are not a QNAN and ``op1``
   is greater than or equal to ``op2``.
#. ``olt``: yields ``true`` if both operands are not a QNAN and ``op1``
   is less than ``op2``.
#. ``ole``: yields ``true`` if both operands are not a QNAN and ``op1``
   is less than or equal to ``op2``.
#. ``one``: yields ``true`` if both operands are not a QNAN and ``op1``
   is not equal to ``op2``.
#. ``ord``: yields ``true`` if both operands are not a QNAN.
#. ``ueq``: yields ``true`` if either operand is a QNAN or ``op1`` is
   equal to ``op2``.
#. ``ugt``: yields ``true`` if either operand is a QNAN or ``op1`` is
   greater than ``op2``.
#. ``uge``: yields ``true`` if either operand is a QNAN or ``op1`` is
   greater than or equal to ``op2``.
#. ``ult``: yields ``true`` if either operand is a QNAN or ``op1`` is
   less than ``op2``.
#. ``ule``: yields ``true`` if either operand is a QNAN or ``op1`` is
   less than or equal to ``op2``.
#. ``une``: yields ``true`` if either operand is a QNAN or ``op1`` is
   not equal to ``op2``.
#. ``uno``: yields ``true`` if either operand is a QNAN.
#. ``true``: always yields ``true``, regardless of operands.

The ``fcmp`` instruction can also optionally take any number of
:ref:`fast-math flags <fastmath>`, which are optimization hints to enable
otherwise unsafe floating-point optimizations.

Any set of fast-math flags is legal on an ``fcmp`` instruction, but the
only flags that have any effect on its semantics are those that allow
assumptions to be made about the values of input arguments; namely
``nnan``, ``ninf``, and ``nsz``. See :ref:`fastmath` for more information.

Example:
""""""""

.. code-block:: text

      <result> = fcmp oeq float 4.0, 5.0    ; yields: result=false
      <result> = fcmp one float 4.0, 5.0    ; yields: result=true
      <result> = fcmp olt float 4.0, 5.0    ; yields: result=true
      <result> = fcmp ueq double 1.0, 2.0   ; yields: result=false

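The ordered and unordered predicates differ only when a NaN is involved,
which the following sketch exploits. The function names ``@is_nan`` and
``@not_less`` are hypothetical.

.. code-block:: llvm

      define i1 @is_nan(float %x) {
      entry:
        ; Comparing a value with itself is unordered only if it is a NaN.
        %nan = fcmp uno float %x, %x
        ret i1 %nan
      }

      define i1 @not_less(double %a, double %b) {
      entry:
        ; uge is the exact negation of olt: it is true when %a >= %b and
        ; also when either operand is a NaN.
        %r = fcmp uge double %a, %b
        ret i1 %r
      }
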
.. _i_phi:

'``phi``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = phi <ty> [ <val0>, <label0>], ...

Overview:
"""""""""

The '``phi``' instruction is used to implement the φ node in the SSA
graph representing the function.

Arguments:
""""""""""

The type of the incoming values is specified with the first type field.
After this, the '``phi``' instruction takes a list of pairs as
arguments, with one pair for each predecessor basic block of the current
block. Only values of :ref:`first class <t_firstclass>` type may be used as
the value arguments to the PHI node. Only labels may be used as the
label arguments.

There must be no non-phi instructions between the start of a basic block
and the PHI instructions: i.e. PHI instructions must be first in a basic
block.

For the purposes of the SSA form, the use of each incoming value is
deemed to occur on the edge from the corresponding predecessor block to
the current block (but after any definition of an '``invoke``'
instruction's return value on the same edge).

Semantics:
""""""""""

At runtime, the '``phi``' instruction logically takes on the value
specified by the pair corresponding to the predecessor basic block that
executed just prior to the current block.

Example:
""""""""

.. code-block:: llvm

      Loop:       ; Infinite loop that counts from 0 on up...
        %indvar = phi i32 [ 0, %LoopHeader ], [ %nextindvar, %Loop ]
        %nextindvar = add i32 %indvar, 1
        br label %Loop

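A complete function makes the one-pair-per-predecessor rule easier to see.
The sketch below sums the integers below ``%n``; the function name
``@sum_below`` and the block names are hypothetical. The ``loop`` block has
two predecessors, ``entry`` and ``body``, so each ``phi`` lists exactly two
pairs.

.. code-block:: llvm

      define i32 @sum_below(i32 %n) {
      entry:
        br label %loop

      loop:
        ; Both phi instructions come first in the block.
        %i = phi i32 [ 0, %entry ], [ %i.next, %body ]
        %acc = phi i32 [ 0, %entry ], [ %acc.next, %body ]
        %more = icmp slt i32 %i, %n
        br i1 %more, label %body, label %exit

      body:
        %acc.next = add i32 %acc, %i
        %i.next = add i32 %i, 1
        br label %loop

      exit:
        ret i32 %acc
      }
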
.. _i_select:

'``select``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = select selty <cond>, <ty> <val1>, <ty> <val2>             ; yields ty

      selty is either i1 or {<N x i1>}

Overview:
"""""""""

The '``select``' instruction is used to choose one value based on a
condition, without IR-level branching.

Arguments:
""""""""""

The '``select``' instruction requires an ``i1`` value or a vector of ``i1``
values indicating the condition, and two values of the same :ref:`first
class <t_firstclass>` type.

Semantics:
""""""""""

If the condition is an ``i1`` and it evaluates to 1, the instruction returns
the first value argument; otherwise, it returns the second value
argument.

If the condition is a vector of ``i1``, then the value arguments must be
vectors of the same size, and the selection is done element by element.

If the condition is an ``i1`` and the value arguments are vectors of the
same size, then an entire vector is selected.

Example:
""""""""

.. code-block:: llvm

      %X = select i1 true, i8 17, i8 42          ; yields i8:17

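Combined with a comparison, ``select`` expresses a branch-free maximum. The
sketch below shows the scalar form and the element-by-element vector form;
the function names ``@smax`` and ``@vmax`` are hypothetical.

.. code-block:: llvm

      define i32 @smax(i32 %a, i32 %b) {
      entry:
        %gt = icmp sgt i32 %a, %b
        %m = select i1 %gt, i32 %a, i32 %b
        ret i32 %m
      }

      define <4 x i32> @vmax(<4 x i32> %a, <4 x i32> %b) {
      entry:
        ; A <4 x i1> condition picks each element independently.
        %gt = icmp sgt <4 x i32> %a, %b
        %m = select <4 x i1> %gt, <4 x i32> %a, <4 x i32> %b
        ret <4 x i32> %m
      }
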
.. _i_call:

'``call``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = [tail | musttail | notail ] call [fast-math flags] [cconv] [ret attrs] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs]
                   [ operand bundles ]

Overview:
"""""""""

The '``call``' instruction represents a simple function call.

Arguments:
""""""""""

This instruction requires several arguments:

#. The optional ``tail`` and ``musttail`` markers indicate that the optimizers
   should perform tail call optimization. The ``tail`` marker is a hint that
   `can be ignored <CodeGenerator.html#sibcallopt>`_. The ``musttail`` marker
   means that the call must be tail call optimized in order for the program to
   be correct. The ``musttail`` marker provides these guarantees:

   #. The call will not cause unbounded stack growth if it is part of a
      recursive cycle in the call graph.
   #. Arguments with the :ref:`inalloca <attr_inalloca>` attribute are
      forwarded in place.

   Both markers imply that the callee does not access allocas from the caller.
   The ``tail`` marker additionally implies that the callee does not access
   varargs from the caller, while ``musttail`` implies that varargs from the
   caller are passed to the callee. Calls marked ``musttail`` must obey the
   following additional rules:

   - The call must immediately precede a :ref:`ret <i_ret>` instruction,
     or a pointer bitcast followed by a ret instruction.
   - The ret instruction must return the (possibly bitcasted) value
     produced by the call or void.
   - The caller and callee prototypes must match. Pointer types of
     parameters or return types may differ in pointee type, but not
     in address space.
   - The calling conventions of the caller and callee must match.
   - All ABI-impacting function attributes, such as sret, byval, inreg,
     returned, and inalloca, must match.
   - The callee must be varargs iff the caller is varargs. Bitcasting a
     non-varargs function to the appropriate varargs type is legal so
     long as the non-varargs prefixes obey the other rules.

   Tail call optimization for calls marked ``tail`` is guaranteed to occur if
   the following conditions are met:

   - Caller and callee both have the calling convention ``fastcc``.
   - The call is in tail position (ret immediately follows call and ret
     uses value of call or is void).
   - Option ``-tailcallopt`` is enabled, or
     ``llvm::GuaranteedTailCallOpt`` is ``true``.
   - `Platform-specific constraints are
     met. <CodeGenerator.html#tailcallopt>`_

#. The optional ``notail`` marker indicates that the optimizers should not add
   ``tail`` or ``musttail`` markers to the call. It is used to prevent tail
   call optimization from being performed on the call.

#. The optional ``fast-math flags`` marker indicates that the call has one or more
   :ref:`fast-math flags <fastmath>`, which are optimization hints to enable
   otherwise unsafe floating-point optimizations. Fast-math flags are only valid
   for calls that return a floating-point scalar or vector type.

#. The optional ``cconv`` marker indicates which :ref:`calling
   convention <callingconv>` the call should use. If none is
   specified, the call defaults to using C calling conventions. The
   calling convention of the call must match the calling convention of
   the target function, or else the behavior is undefined.
#. The optional :ref:`Parameter Attributes <paramattrs>` list for return
   values. Only '``zeroext``', '``signext``', and '``inreg``' attributes
   are valid here.
#. '``ty``': the type of the call instruction itself which is also the
   type of the return value. Functions that return no value are marked
   ``void``.
#. '``fnty``': shall be the signature of the function being called. The
   argument types must match the types implied by this signature. This
   type can be omitted if the function is not varargs.
#. '``fnptrval``': An LLVM value containing a pointer to a function to
   be called. In most cases, this is a direct function call, but
   indirect calls are just as possible, calling an arbitrary pointer
   to function value.
#. '``function args``': argument list whose types match the function
   signature argument types and parameter attributes. All arguments must
   be of :ref:`first class <t_firstclass>` type. If the function signature
   indicates the function accepts a variable number of arguments, the
   extra arguments can be specified.
#. The optional :ref:`function attributes <fnattrs>` list.
#. The optional :ref:`operand bundles <opbundles>` list.

Semantics:
""""""""""

The '``call``' instruction is used to cause control flow to transfer to
a specified function, with its incoming arguments bound to the specified
values. Upon a '``ret``' instruction in the called function, control
flow continues with the instruction after the function call, and the
return value of the function is bound to the result argument.

9519Example:
9520""""""""
9521
9522.. code-block:: llvm
9523
9524 %retval = call i32 @test(i32 %argc)
9525 call i32 (i8*, ...)* @printf(i8* %msg, i32 12, i8 42) ; yields i32
9526 %X = tail call i32 @foo() ; yields i32
9527 %Y = tail call fastcc i32 @foo() ; yields i32
9528 call void %foo(i8 97 signext)
9529
9530 %struct.A = type { i32, i8 }
Tim Northover675a0962014-06-13 14:24:23 +00009531 %r = call %struct.A @foo() ; yields { i32, i8 }
Sean Silvab084af42012-12-07 10:36:55 +00009532 %gr = extractvalue %struct.A %r, 0 ; yields i32
9533 %gr1 = extractvalue %struct.A %r, 1 ; yields i8
9534 %Z = call void @foo() noreturn ; indicates that %foo never returns normally
9535 %ZZ = call zeroext i32 @bar() ; Return value is %zero extended
9536
9537llvm treats calls to some functions with names and arguments that match
9538the standard C99 library as being the C99 library functions, and may
9539perform optimizations or generate code for them under that assumption.
9540This is something we'd like to change in the future to provide better
9541support for freestanding environments and non-C-based languages.
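
The direct and indirect forms of ``call`` described above can be sketched
as a small module. This is an illustration only: the function names
``@inc``, ``@apply`` and ``@main`` are hypothetical and do not come from
this manual.

.. code-block:: llvm

      ; Hypothetical callee used as the target of the indirect call.
      define i32 @inc(i32 %v) {
        %r = add i32 %v, 1
        ret i32 %r
      }

      ; Indirect call: the callee is an arbitrary pointer-to-function value.
      define i32 @apply(i32 (i32)* %fn, i32 %x) {
        %r = call i32 %fn(i32 %x)
        ret i32 %r
      }

      ; Direct call: the callee is named explicitly.
      define i32 @main() {
        %r = call i32 @apply(i32 (i32)* @inc, i32 41)
        ret i32 %r
      }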

.. _i_va_arg:

'``va_arg``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <resultval> = va_arg <va_list*> <arglist>, <argty>

Overview:
"""""""""

The '``va_arg``' instruction is used to access arguments passed through
the "variable argument" area of a function call. It is used to implement
the ``va_arg`` macro in C.

Arguments:
""""""""""

This instruction takes a ``va_list*`` value and the type of the
argument. It returns a value of the specified argument type and
increments the ``va_list`` to point to the next argument. The actual
type of ``va_list`` is target specific.

Semantics:
""""""""""

The '``va_arg``' instruction loads an argument of the specified type
from the specified ``va_list`` and causes the ``va_list`` to point to
the next argument. For more information, see the variable argument
handling :ref:`Intrinsic Functions <int_varargs>`.

It is legal for this instruction to be called in a function which does
not take a variable number of arguments, for example, the ``vfprintf``
function.

``va_arg`` is an LLVM instruction instead of an :ref:`intrinsic
function <intrinsics>` because it takes a type as an argument.

Example:
""""""""

See the :ref:`variable argument processing <int_varargs>` section.

Note that the code generator does not yet fully support va\_arg on many
targets. Also, it does not currently support va\_arg with aggregate
types on any target.

.. _i_landingpad:

'``landingpad``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <resultval> = landingpad <resultty> <clause>+
      <resultval> = landingpad <resultty> cleanup <clause>*

      <clause> := catch <type> <value>
      <clause> := filter <array constant type> <array constant>

Overview:
"""""""""

The '``landingpad``' instruction is used by `LLVM's exception handling
system <ExceptionHandling.html#overview>`_ to specify that a basic block
is a landing pad --- one where the exception lands, and corresponds to the
code found in the ``catch`` portion of a ``try``/``catch`` sequence. It
defines values supplied by the :ref:`personality function <personalityfn>` upon
re-entry to the function. The ``resultval`` has the type ``resultty``.

Arguments:
""""""""""

The optional
``cleanup`` flag indicates that the landing pad block is a cleanup.

A ``clause`` begins with the clause type --- ``catch`` or ``filter`` --- and
contains the global variable representing the "type" that may be caught
or filtered respectively. Unlike the ``catch`` clause, the ``filter``
clause takes an array constant as its argument. Use
"``[0 x i8**] undef``" for a filter which cannot throw. The
'``landingpad``' instruction must contain *at least* one ``clause`` or
the ``cleanup`` flag.

Semantics:
""""""""""

The '``landingpad``' instruction defines the values which are set by the
:ref:`personality function <personalityfn>` upon re-entry to the function, and
therefore the "result type" of the ``landingpad`` instruction. As with
calling conventions, how the personality function results are
represented in LLVM IR is target specific.

The clauses are applied in order from top to bottom. If two
``landingpad`` instructions are merged together through inlining, the
clauses from the calling function are appended to the list of clauses.
When the call stack is being unwound due to an exception being thrown,
the exception is compared against each ``clause`` in turn. If it doesn't
match any of the clauses, and the ``cleanup`` flag is not set, then
unwinding continues further up the call stack.

The ``landingpad`` instruction has several restrictions:

-  A landing pad block is a basic block which is the unwind destination
   of an '``invoke``' instruction.
-  A landing pad block must have a '``landingpad``' instruction as its
   first non-PHI instruction.
-  There can be only one '``landingpad``' instruction within the landing
   pad block.
-  A basic block that is not a landing pad block may not include a
   '``landingpad``' instruction.

Example:
""""""""

.. code-block:: llvm

      ;; A landing pad which can catch an integer.
      %res = landingpad { i8*, i32 }
               catch i8** @_ZTIi
      ;; A landing pad that is a cleanup.
      %res = landingpad { i8*, i32 }
               cleanup
      ;; A landing pad which can catch an integer and can only throw a double.
      %res = landingpad { i8*, i32 }
               catch i8** @_ZTIi
               filter [1 x i8**] [i8** @_ZTId]
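
To show how the restrictions above fit together, here is a minimal sketch
of a complete function whose ``invoke`` unwinds to a landing pad block.
It assumes an Itanium-style personality routine
(``@__gxx_personality_v0``); the names ``@may_throw`` and ``@try_call``
are hypothetical. A real C++ frontend would also emit calls into its
runtime (for example to begin and end the catch), which are omitted here.

.. code-block:: llvm

      declare void @may_throw()               ; hypothetical callee
      declare i32 @__gxx_personality_v0(...)

      @_ZTIi = external constant i8*          ; typeinfo for int

      define i32 @try_call() personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
      entry:
        invoke void @may_throw()
                to label %cont unwind label %lpad

      cont:
        ret i32 0

      lpad:
        ; The landingpad is the first non-PHI instruction of the block,
        ; and the block is the unwind destination of the invoke above.
        %exn = landingpad { i8*, i32 }
                 catch i8** @_ZTIi
        ret i32 1
      }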

.. _i_catchpad:

'``catchpad``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <resultval> = catchpad within <catchswitch> [<args>*]

Overview:
"""""""""

The '``catchpad``' instruction is used by `LLVM's exception handling
system <ExceptionHandling.html#overview>`_ to specify that a basic block
begins a catch handler --- one where a personality routine attempts to transfer
control to catch an exception.

Arguments:
""""""""""

The ``catchswitch`` operand must always be a token produced by a
:ref:`catchswitch <i_catchswitch>` instruction in a predecessor block. This
ensures that each ``catchpad`` has exactly one predecessor block, and it always
terminates in a ``catchswitch``.

The ``args`` correspond to whatever information the personality routine
requires to know if this is an appropriate handler for the exception. Control
will transfer to the ``catchpad`` if this is the first appropriate handler for
the exception.

The ``resultval`` has the type :ref:`token <t_token>` and is used to match the
``catchpad`` to corresponding :ref:`catchrets <i_catchret>` and other nested EH
pads.

Semantics:
""""""""""

When the call stack is being unwound due to an exception being thrown, the
exception is compared against the ``args``. If it doesn't match, control will
not reach the ``catchpad`` instruction. The representation of ``args`` is
entirely target and personality function-specific.

Like the :ref:`landingpad <i_landingpad>` instruction, the ``catchpad``
instruction must be the first non-phi instruction of its parent basic block.

The meaning of the tokens produced and consumed by ``catchpad`` and other "pad"
instructions is described in the
`Windows exception handling documentation\ <ExceptionHandling.html#wineh>`_.

When a ``catchpad`` has been "entered" but not yet "exited" (as
described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
it is undefined behavior to execute a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
that does not carry an appropriate :ref:`"funclet" bundle <ob_funclet>`.

Example:
""""""""

.. code-block:: text

    dispatch:
      %cs = catchswitch within none [label %handler0] unwind to caller
      ;; A catch block which can catch an integer.
    handler0:
      %tok = catchpad within %cs [i8** @_ZTIi]

.. _i_cleanuppad:

'``cleanuppad``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <resultval> = cleanuppad within <parent> [<args>*]

Overview:
"""""""""

The '``cleanuppad``' instruction is used by `LLVM's exception handling
system <ExceptionHandling.html#overview>`_ to specify that a basic block
is a cleanup block --- one where a personality routine attempts to
transfer control to run cleanup actions.
The ``args`` correspond to whatever additional
information the :ref:`personality function <personalityfn>` requires to
execute the cleanup.
The ``resultval`` has the type :ref:`token <t_token>` and is used to
match the ``cleanuppad`` to corresponding :ref:`cleanuprets <i_cleanupret>`.
The ``parent`` argument is the token of the funclet that contains the
``cleanuppad`` instruction. If the ``cleanuppad`` is not inside a funclet,
this operand may be the token ``none``.

Arguments:
""""""""""

The instruction takes a list of arbitrary values which are interpreted
by the :ref:`personality function <personalityfn>`.

Semantics:
""""""""""

When the call stack is being unwound due to an exception being thrown,
the :ref:`personality function <personalityfn>` transfers control to the
``cleanuppad`` with the aid of the personality-specific arguments.
As with calling conventions, how the personality function results are
represented in LLVM IR is target specific.

The ``cleanuppad`` instruction has several restrictions:

-  A cleanup block is a basic block which is the unwind destination of
   an exceptional instruction.
-  A cleanup block must have a '``cleanuppad``' instruction as its
   first non-PHI instruction.
-  There can be only one '``cleanuppad``' instruction within the
   cleanup block.
-  A basic block that is not a cleanup block may not include a
   '``cleanuppad``' instruction.

When a ``cleanuppad`` has been "entered" but not yet "exited" (as
described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
it is undefined behavior to execute a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
that does not carry an appropriate :ref:`"funclet" bundle <ob_funclet>`.

Example:
""""""""

.. code-block:: text

      %tok = cleanuppad within %cs []
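
For context, the following is a hedged sketch of a whole function
containing a top-level cleanup funclet. It assumes the MSVC C++
personality (``@__CxxFrameHandler3``); ``@may_throw`` and ``@run_dtor``
are hypothetical names. Because the cleanup is not nested inside another
funclet, its parent is ``none``, and the call made while the pad is
entered carries a ``"funclet"`` bundle naming the pad's token.

.. code-block:: llvm

      declare void @may_throw()               ; hypothetical
      declare void @run_dtor()                ; hypothetical cleanup action
      declare i32 @__CxxFrameHandler3(...)

      define void @f() personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
      entry:
        invoke void @may_throw()
                to label %done unwind label %cleanup

      done:
        ret void

      cleanup:
        ; First non-PHI instruction of the cleanup block.
        %tok = cleanuppad within none []
        call void @run_dtor() [ "funclet"(token %tok) ]
        ; Exit the funclet and continue unwinding into the caller.
        cleanupret from %tok unwind to caller
      }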

.. _intrinsics:

Intrinsic Functions
===================

LLVM supports the notion of an "intrinsic function". These functions
have well known names and semantics and are required to follow certain
restrictions. Overall, these intrinsics represent an extension mechanism
for the LLVM language that does not require changing all of the
transformations in LLVM when adding to the language (or the bitcode
reader/writer, the parser, etc.).

Intrinsic function names must all start with an "``llvm.``" prefix. This
prefix is reserved in LLVM for intrinsic names; thus, function names may
not begin with this prefix. Intrinsic functions must always be external
functions: you cannot define the body of intrinsic functions. Intrinsic
functions may only be used in call or invoke instructions: it is illegal
to take the address of an intrinsic function. Additionally, because
intrinsic functions are part of the LLVM language, any intrinsic that
is added must be documented here.

Some intrinsic functions can be overloaded, i.e., the intrinsic
represents a family of functions that perform the same operation but on
different data types. Because LLVM can represent over 8 million
different integer types, overloading is used commonly to allow an
intrinsic function to operate on any integer type. One or more of the
argument types or the result type can be overloaded to accept any
integer type. Argument types may also be defined as exactly matching a
previous argument's type or the result type. This allows an intrinsic
function which accepts multiple arguments, but needs all of them to be
of the same type, to only be overloaded with respect to a single
argument or the result.

An overloaded intrinsic has the names of its overloaded argument
types encoded into its function name, each preceded by a period. Only
those types which are overloaded result in a name suffix. Arguments
whose type is matched against another type do not. For example, the
``llvm.ctpop`` function can take an integer of any width and returns an
integer of exactly the same integer width. This leads to a family of
functions such as ``i8 @llvm.ctpop.i8(i8 %val)`` and
``i29 @llvm.ctpop.i29(i29 %val)``. Only one type, the return type, is
overloaded, and only one type suffix is required. Because the argument's
type is matched against the return type, it does not require its own
name suffix.
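
As a short sketch of the naming rule above, the two ``llvm.ctpop``
instances mentioned in the text can be declared and used as follows. The
wrapper names ``@popcount8`` and ``@popcount29`` are hypothetical.

.. code-block:: llvm

      ; One declaration per instantiated type; the suffix names the
      ; single overloaded type.
      declare i8 @llvm.ctpop.i8(i8)
      declare i29 @llvm.ctpop.i29(i29)

      define i8 @popcount8(i8 %val) {
        %n = call i8 @llvm.ctpop.i8(i8 %val)
        ret i8 %n
      }

      define i29 @popcount29(i29 %val) {
        %n = call i29 @llvm.ctpop.i29(i29 %val)
        ret i29 %n
      }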

To learn how to add an intrinsic function, please see the `Extending
LLVM Guide <ExtendingLLVM.html>`_.

.. _int_varargs:

Variable Argument Handling Intrinsics
-------------------------------------

Variable argument support is defined in LLVM with the
:ref:`va_arg <i_va_arg>` instruction and these three intrinsic
functions. These functions are related to the similarly named macros
defined in the ``<stdarg.h>`` header file.

All of these functions operate on arguments that use a target-specific
value type "``va_list``". The LLVM assembly language reference manual
does not define what this type is, so all transformations should be
prepared to handle these functions regardless of the type used.

This example shows how the :ref:`va_arg <i_va_arg>` instruction and the
variable argument handling intrinsic functions are used.

.. code-block:: llvm

    ; This struct is different for every platform. For most platforms,
    ; it is merely an i8*.
    %struct.va_list = type { i8* }

    ; For Unix x86_64 platforms, va_list is the following struct:
    ; %struct.va_list = type { i32, i32, i8*, i8* }

    define i32 @test(i32 %X, ...) {
      ; Initialize variable argument processing
      %ap = alloca %struct.va_list
      %ap2 = bitcast %struct.va_list* %ap to i8*
      call void @llvm.va_start(i8* %ap2)

      ; Read a single integer argument
      %tmp = va_arg i8* %ap2, i32

      ; Demonstrate usage of llvm.va_copy and llvm.va_end
      %aq = alloca i8*
      %aq2 = bitcast i8** %aq to i8*
      call void @llvm.va_copy(i8* %aq2, i8* %ap2)
      call void @llvm.va_end(i8* %aq2)

      ; Stop processing of arguments.
      call void @llvm.va_end(i8* %ap2)
      ret i32 %tmp
    }

    declare void @llvm.va_start(i8*)
    declare void @llvm.va_copy(i8*, i8*)
    declare void @llvm.va_end(i8*)

.. _int_va_start:

'``llvm.va_start``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.va_start(i8* <arglist>)

Overview:
"""""""""

The '``llvm.va_start``' intrinsic initializes ``*<arglist>`` for
subsequent use by ``va_arg``.

Arguments:
""""""""""

The argument is a pointer to a ``va_list`` element to initialize.

Semantics:
""""""""""

The '``llvm.va_start``' intrinsic works just like the ``va_start`` macro
available in C. In a target-dependent way, it initializes the
``va_list`` element to which the argument points, so that the next call
to ``va_arg`` will produce the first variable argument passed to the
function. Unlike the C ``va_start`` macro, this intrinsic does not need
to know the last argument of the function as the compiler can figure
that out.

'``llvm.va_end``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.va_end(i8* <arglist>)

Overview:
"""""""""

The '``llvm.va_end``' intrinsic destroys ``*<arglist>``, which has been
initialized previously with ``llvm.va_start`` or ``llvm.va_copy``.

Arguments:
""""""""""

The argument is a pointer to a ``va_list`` to destroy.

Semantics:
""""""""""

The '``llvm.va_end``' intrinsic works just like the ``va_end`` macro
available in C. In a target-dependent way, it destroys the ``va_list``
element to which the argument points. Calls to
:ref:`llvm.va_start <int_va_start>` and
:ref:`llvm.va_copy <int_va_copy>` must be matched exactly with calls to
``llvm.va_end``.

.. _int_va_copy:

'``llvm.va_copy``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.va_copy(i8* <destarglist>, i8* <srcarglist>)

Overview:
"""""""""

The '``llvm.va_copy``' intrinsic copies the current argument position
from the source argument list to the destination argument list.

Arguments:
""""""""""

The first argument is a pointer to a ``va_list`` element to initialize.
The second argument is a pointer to a ``va_list`` element to copy from.

Semantics:
""""""""""

The '``llvm.va_copy``' intrinsic works just like the ``va_copy`` macro
available in C. In a target-dependent way, it copies the source
``va_list`` element into the destination ``va_list`` element. This
intrinsic is necessary because the ``llvm.va_start`` intrinsic may be
arbitrarily complex and require, for example, memory allocation.

Accurate Garbage Collection Intrinsics
--------------------------------------

LLVM's support for `Accurate Garbage Collection <GarbageCollection.html>`_
(GC) requires the frontend to generate code containing appropriate intrinsic
calls and select an appropriate GC strategy which knows how to lower these
intrinsics in a manner which is appropriate for the target collector.

These intrinsics allow identification of :ref:`GC roots on the
stack <int_gcroot>`, as well as garbage collector implementations that
require :ref:`read <int_gcread>` and :ref:`write <int_gcwrite>` barriers.
Frontends for type-safe garbage collected languages should generate
these intrinsics to make use of the LLVM garbage collectors. For more
details, see `Garbage Collection with LLVM <GarbageCollection.html>`_.

Experimental Statepoint Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

LLVM provides a second, experimental set of intrinsics for describing garbage
collection safepoints in compiled code. These intrinsics are an alternative
to the ``llvm.gcroot`` intrinsics, but are compatible with the ones for
:ref:`read <int_gcread>` and :ref:`write <int_gcwrite>` barriers. The
differences in approach are covered in the `Garbage Collection with LLVM
<GarbageCollection.html>`_ documentation. The intrinsics themselves are
described in :doc:`Statepoints`.

.. _int_gcroot:

'``llvm.gcroot``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.gcroot(i8** %ptrloc, i8* %metadata)

Overview:
"""""""""

The '``llvm.gcroot``' intrinsic declares the existence of a GC root to
the code generator, and allows some metadata to be associated with it.

Arguments:
""""""""""

The first argument specifies the address of a stack object that contains
the root pointer. The second argument (which must be either a constant or
a global value address) contains the metadata to be associated with the
root.

Semantics:
""""""""""

At runtime, a call to this intrinsic stores a null pointer into the
"ptrloc" location. At compile-time, the code generator generates
information to allow the runtime to find the pointer at GC safe points.
The '``llvm.gcroot``' intrinsic may only be used in a function which
:ref:`specifies a GC algorithm <gc>`.
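
A minimal sketch of registering a stack root, assuming the built-in
``"shadow-stack"`` GC strategy and a hypothetical allocator
``@allocate_object``. No metadata is associated with the root, so the
second argument is ``null``.

.. code-block:: llvm

      declare void @llvm.gcroot(i8** %ptrloc, i8* %metadata)
      declare i8* @allocate_object()          ; hypothetical runtime allocator

      define void @keep_alive() gc "shadow-stack" {
      entry:
        ; The root lives in a stack slot and is registered in the entry block.
        %root = alloca i8*
        call void @llvm.gcroot(i8** %root, i8* null)

        ; Storing a reference into the slot makes it visible to the collector.
        %obj = call i8* @allocate_object()
        store i8* %obj, i8** %root
        ret void
      }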

.. _int_gcread:

'``llvm.gcread``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.gcread(i8* %ObjPtr, i8** %Ptr)

Overview:
"""""""""

The '``llvm.gcread``' intrinsic identifies reads of references from heap
locations, allowing garbage collector implementations that require read
barriers.

Arguments:
""""""""""

The second argument is the address to read from, which should be an
address allocated from the garbage collector. The first argument is a
pointer to the start of the referenced object, if needed by the language
runtime (otherwise null).

Semantics:
""""""""""

The '``llvm.gcread``' intrinsic has the same semantics as a load
instruction, but may be replaced with substantially more complex code by
the garbage collector runtime, as needed. The '``llvm.gcread``'
intrinsic may only be used in a function which :ref:`specifies a GC
algorithm <gc>`.

.. _int_gcwrite:

'``llvm.gcwrite``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.gcwrite(i8* %P1, i8* %Obj, i8** %P2)

Overview:
"""""""""

The '``llvm.gcwrite``' intrinsic identifies writes of references to heap
locations, allowing garbage collector implementations that require write
barriers (such as generational or reference counting collectors).

Arguments:
""""""""""

The first argument is the reference to store, the second is the start of
the object to store it to, and the third is the address of the field of
``Obj`` to store to. If the runtime does not require a pointer to the
object, ``Obj`` may be null.

Semantics:
""""""""""

The '``llvm.gcwrite``' intrinsic has the same semantics as a store
instruction, but may be replaced with substantially more complex code by
the garbage collector runtime, as needed. The '``llvm.gcwrite``'
intrinsic may only be used in a function which :ref:`specifies a GC
algorithm <gc>`.
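
The two barrier intrinsics can be sketched together as a field swap. This
is an illustration under assumptions: ``@swap_field`` and its parameter
names are hypothetical, and the ``"shadow-stack"`` strategy is chosen
only so that the function specifies a GC algorithm.

.. code-block:: llvm

      declare i8* @llvm.gcread(i8* %ObjPtr, i8** %Ptr)
      declare void @llvm.gcwrite(i8* %P1, i8* %Obj, i8** %P2)

      ; Replace the reference stored in %field (a field of %obj) with %new
      ; and return the previous reference.
      define i8* @swap_field(i8* %obj, i8** %field, i8* %new) gc "shadow-stack" {
      entry:
        ; Behaves like a load from %field, routed through the read barrier.
        %old = call i8* @llvm.gcread(i8* %obj, i8** %field)
        ; Behaves like a store of %new to %field, routed through the write barrier.
        call void @llvm.gcwrite(i8* %new, i8* %obj, i8** %field)
        ret i8* %old
      }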

Code Generator Intrinsics
-------------------------

These intrinsics are provided by LLVM to expose special features that
may only be implemented with code generator support.

'``llvm.returnaddress``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.returnaddress(i32 <level>)

Overview:
"""""""""

The '``llvm.returnaddress``' intrinsic attempts to compute a
target-specific value indicating the return address of the current
function or one of its callers.

Arguments:
""""""""""

The argument to this intrinsic indicates which function to return the
address for. Zero indicates the calling function, one indicates its
caller, etc. The argument is **required** to be a constant integer
value.

Semantics:
""""""""""

The '``llvm.returnaddress``' intrinsic either returns a pointer
indicating the return address of the specified call frame, or zero if it
cannot be identified. The value returned by this intrinsic is likely to
be incorrect or 0 for arguments other than zero, so it should only be
used for debugging purposes.

Note that calling this intrinsic does not prevent function inlining or
other aggressive transformations, so the value returned may not be that
of the obvious source-language caller.
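
A minimal usage sketch, with a hypothetical wrapper name
``@my_return_address``. The level is the constant ``0``, the only value
for which the result is likely to be meaningful.

.. code-block:: llvm

      declare i8* @llvm.returnaddress(i32)

      define i8* @my_return_address() {
        ; The level operand must be a constant integer.
        %ra = call i8* @llvm.returnaddress(i32 0)
        ret i8* %ra
      }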

'``llvm.addressofreturnaddress``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.addressofreturnaddress()

Overview:
"""""""""

The '``llvm.addressofreturnaddress``' intrinsic returns a target-specific
pointer to the place in the stack frame where the return address of the
current function is stored.

Semantics:
""""""""""

Note that calling this intrinsic does not prevent function inlining or
other aggressive transformations, so the value returned may not be that
of the obvious source-language caller.

This intrinsic is only implemented for x86.

'``llvm.frameaddress``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.frameaddress(i32 <level>)

Overview:
"""""""""

The '``llvm.frameaddress``' intrinsic attempts to return the
target-specific frame pointer value for the specified stack frame.

Arguments:
""""""""""

The argument to this intrinsic indicates which function to return the
frame pointer for. Zero indicates the calling function, one indicates
its caller, etc. The argument is **required** to be a constant integer
value.

Semantics:
""""""""""

The '``llvm.frameaddress``' intrinsic either returns a pointer
indicating the frame address of the specified call frame, or zero if it
cannot be identified. The value returned by this intrinsic is likely to
be incorrect or 0 for arguments other than zero, so it should only be
used for debugging purposes.

Note that calling this intrinsic does not prevent function inlining or
other aggressive transformations, so the value returned may not be that
of the obvious source-language caller.
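
A minimal usage sketch, with a hypothetical wrapper name
``@my_frame_address``, requesting the frame pointer of the function that
contains the call:

.. code-block:: llvm

      declare i8* @llvm.frameaddress(i32)

      define i8* @my_frame_address() {
        ; Level 0 selects the current function's frame.
        %fp = call i8* @llvm.frameaddress(i32 0)
        ret i8* %fp
      }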
10248
Reid Kleckner60381792015-07-07 22:25:32 +000010249'``llvm.localescape``' and '``llvm.localrecover``' Intrinsics
Reid Klecknere9b89312015-01-13 00:48:10 +000010250^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10251
10252Syntax:
10253"""""""
10254
10255::
10256
Reid Kleckner60381792015-07-07 22:25:32 +000010257 declare void @llvm.localescape(...)
10258 declare i8* @llvm.localrecover(i8* %func, i8* %fp, i32 %idx)
Reid Klecknere9b89312015-01-13 00:48:10 +000010259
10260Overview:
10261"""""""""
10262
Reid Kleckner60381792015-07-07 22:25:32 +000010263The '``llvm.localescape``' intrinsic escapes offsets of a collection of static
10264allocas, and the '``llvm.localrecover``' intrinsic applies those offsets to a
Reid Klecknercfb9ce52015-03-05 18:26:34 +000010265live frame pointer to recover the address of the allocation. The offset is
computed during frame layout of the caller of ``llvm.localescape``.

Arguments:
""""""""""

All arguments to '``llvm.localescape``' must be pointers to static allocas or
casts of static allocas. Each function can only call '``llvm.localescape``'
once, and it can only do so from the entry block.

The ``func`` argument to '``llvm.localrecover``' must be a constant
bitcasted pointer to a function defined in the current module. The code
generator cannot determine the frame allocation offset of functions defined in
other modules.

The ``fp`` argument to '``llvm.localrecover``' must be a frame pointer of a
call frame that is currently live. The return value of '``llvm.localaddress``'
is one way to produce such a value, but various runtimes also expose a suitable
pointer in platform-specific ways.

The ``idx`` argument to '``llvm.localrecover``' indicates which alloca passed to
'``llvm.localescape``' to recover. It is zero-indexed.

Semantics:
""""""""""

These intrinsics allow a group of functions to share access to a set of local
stack allocations of one parent function. The parent function may call the
'``llvm.localescape``' intrinsic once from the function entry block, and the
child functions can use '``llvm.localrecover``' to access the escaped allocas.
The '``llvm.localescape``' intrinsic blocks inlining, as inlining changes where
the escaped allocas are allocated, which would break attempts to use
'``llvm.localrecover``'.
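
The following is a minimal sketch of how a parent and a child function might
share two allocas. The function names ``@parent`` and ``@child`` are
hypothetical, and the declarations assume the typed-pointer signatures of these
intrinsics.

.. code-block:: llvm

      declare void @llvm.localescape(...)
      declare i8* @llvm.localrecover(i8*, i8*, i32)
      declare i8* @llvm.localaddress()

      define void @parent() {
      entry:
        %x = alloca i32
        %y = alloca i32
        ; Escape both static allocas, once, from the entry block.
        call void (...) @llvm.localescape(i32* %x, i32* %y)
        ; Pass the parent's frame address to the child.
        %fp = call i8* @llvm.localaddress()
        call void @child(i8* %fp)
        ret void
      }

      define internal void @child(i8* %fp) {
      entry:
        ; Index 1 selects the second escaped alloca (%y in @parent).
        %y.i8 = call i8* @llvm.localrecover(i8* bitcast (void ()* @parent to i8*),
                                            i8* %fp, i32 1)
        %y = bitcast i8* %y.i8 to i32*
        store i32 42, i32* %y
        ret void
      }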

.. _int_read_register:
.. _int_write_register:

'``llvm.read_register``' and '``llvm.write_register``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.read_register.i32(metadata)
      declare i64 @llvm.read_register.i64(metadata)
      declare void @llvm.write_register.i32(metadata, i32 %value)
      declare void @llvm.write_register.i64(metadata, i64 %value)
      !0 = !{!"sp\00"}

Overview:
"""""""""

The '``llvm.read_register``' and '``llvm.write_register``' intrinsics
provide access to the named register. The register must be valid on
the architecture being compiled to. The type needs to be compatible
with the register being read.

Semantics:
""""""""""

The '``llvm.read_register``' intrinsic returns the current value of the
register, where possible. The '``llvm.write_register``' intrinsic sets
the current value of the register, where possible.

This is useful to implement named register global variables that need
to always be mapped to a specific register, as is common practice on
bare-metal programs including OS kernels.

The compiler doesn't check for register availability or use of the used
register in surrounding code, including inline assembly. Because of that,
allocatable registers are not supported.

Warning: So far it only works with the stack pointer on selected
architectures (ARM, AArch64, PowerPC and x86_64). A significant amount of
work is needed to support other registers and even more so, allocatable
registers.
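
As an illustration, reading the stack pointer might look like the sketch
below. The function name ``@get_stack_pointer`` is hypothetical, and the
register name ``sp`` is taken from the syntax block above; the accepted name
depends on the target.

.. code-block:: llvm

      declare i64 @llvm.read_register.i64(metadata)

      define i64 @get_stack_pointer() {
      entry:
        ; The metadata operand names the register to read.
        %sp = call i64 @llvm.read_register.i64(metadata !0)
        ret i64 %sp
      }

      !0 = !{!"sp\00"}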

.. _int_stacksave:

'``llvm.stacksave``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.stacksave()

Overview:
"""""""""

The '``llvm.stacksave``' intrinsic is used to remember the current state
of the function stack, for use with
:ref:`llvm.stackrestore <int_stackrestore>`. This is useful for
implementing language features like scoped automatic variable-sized
arrays in C99.

Semantics:
""""""""""

This intrinsic returns an opaque pointer value that can be passed to
:ref:`llvm.stackrestore <int_stackrestore>`. When an
``llvm.stackrestore`` intrinsic is executed with a value saved from
``llvm.stacksave``, it effectively restores the state of the stack to
the state it was in when the ``llvm.stacksave`` intrinsic executed. In
practice, this pops any :ref:`alloca <i_alloca>` blocks from the stack that
were allocated after the ``llvm.stacksave`` was executed.

.. _int_stackrestore:

'``llvm.stackrestore``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.stackrestore(i8* %ptr)

Overview:
"""""""""

The '``llvm.stackrestore``' intrinsic is used to restore the state of
the function stack to the state it was in when the corresponding
:ref:`llvm.stacksave <int_stacksave>` intrinsic executed. This is
useful for implementing language features like scoped automatic
variable-sized arrays in C99.

Semantics:
""""""""""

See the description for :ref:`llvm.stacksave <int_stacksave>`.
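
A minimal sketch of the save/restore pairing around a scoped variable-sized
array follows. The function name ``@scoped_vla`` and its parameter are
hypothetical.

.. code-block:: llvm

      declare i8* @llvm.stacksave()
      declare void @llvm.stackrestore(i8*)

      define void @scoped_vla(i32 %n) {
      entry:
        ; Remember the stack state before the dynamic allocation.
        %saved = call i8* @llvm.stacksave()
        %vla = alloca i32, i32 %n
        store i32 0, i32* %vla
        ; Leaving the scope: pop everything allocated since the save.
        call void @llvm.stackrestore(i8* %saved)
        ret void
      }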

.. _int_get_dynamic_area_offset:

'``llvm.get.dynamic.area.offset``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.get.dynamic.area.offset.i32()
      declare i64 @llvm.get.dynamic.area.offset.i64()

Overview:
"""""""""

The '``llvm.get.dynamic.area.offset.*``' intrinsic family is used to
get the offset from the native stack pointer to the address of the most
recent dynamic alloca on the caller's stack. These intrinsics are
intended for use in combination with
:ref:`llvm.stacksave <int_stacksave>` to get a
pointer to the most recent dynamic alloca. This is useful, for example,
for AddressSanitizer's stack unpoisoning routines.

Semantics:
""""""""""

These intrinsics return a non-negative integer value that can be used to
get the address of the most recent dynamic alloca, allocated by
:ref:`alloca <i_alloca>` on the caller's stack. In particular, for targets
where the stack grows downwards, adding this offset to the native stack
pointer would get the address of the most recent dynamic alloca. For targets
where the stack grows upwards, the situation is a bit more complicated,
because subtracting this value from the stack pointer would get the address
one past the end of the most recent dynamic alloca.

Although for most targets
:ref:`llvm.get.dynamic.area.offset <int_get_dynamic_area_offset>`
returns just a zero, for others, such as PowerPC and PowerPC64, it returns a
compile-time-known constant value.

The return value type of
:ref:`llvm.get.dynamic.area.offset <int_get_dynamic_area_offset>`
must match the target's default address space's (address space 0) pointer type.
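
The combination with ``llvm.stacksave`` described above could be sketched as
follows. This assumes a target with 64-bit pointers and a downward-growing
stack; the function name ``@most_recent_dynamic_alloca`` is hypothetical.

.. code-block:: llvm

      declare i8* @llvm.stacksave()
      declare i64 @llvm.get.dynamic.area.offset.i64()

      define i8* @most_recent_dynamic_alloca(i64 %n) {
      entry:
        %buf = alloca i8, i64 %n
        ; Native stack pointer after the dynamic allocation.
        %sp = call i8* @llvm.stacksave()
        ; Offset from the stack pointer to the most recent dynamic alloca.
        %off = call i64 @llvm.get.dynamic.area.offset.i64()
        %addr = getelementptr i8, i8* %sp, i64 %off
        ret i8* %addr
      }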

'``llvm.prefetch``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.prefetch(i8* <address>, i32 <rw>, i32 <locality>, i32 <cache type>)

Overview:
"""""""""

The '``llvm.prefetch``' intrinsic is a hint to the code generator to
insert a prefetch instruction if supported; otherwise, it is a noop.
Prefetches have no effect on the behavior of the program but can change
its performance characteristics.

Arguments:
""""""""""

``address`` is the address to be prefetched, ``rw`` is the specifier
determining if the fetch should be for a read (0) or write (1), and
``locality`` is a temporal locality specifier ranging from (0) - no
locality, to (3) - extremely local keep in cache. The ``cache type``
specifies whether the prefetch is performed on the data (1) or
instruction (0) cache. The ``rw``, ``locality`` and ``cache type``
arguments must be constant integers.

Semantics:
""""""""""

This intrinsic does not modify the behavior of the program. In
particular, prefetches cannot trap and do not produce a value. On
targets that support this intrinsic, the prefetch can provide hints to
the processor cache for better performance.
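
For illustration, a read prefetch into the data cache with the highest
locality could be written as in this sketch. The function name
``@prefetch_for_read`` is hypothetical.

.. code-block:: llvm

      declare void @llvm.prefetch(i8*, i32, i32, i32)

      define void @prefetch_for_read(i8* %p) {
      entry:
        ; rw = 0 (read), locality = 3 (keep in cache), cache type = 1 (data).
        call void @llvm.prefetch(i8* %p, i32 0, i32 3, i32 1)
        ret void
      }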

'``llvm.pcmarker``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.pcmarker(i32 <id>)

Overview:
"""""""""

The '``llvm.pcmarker``' intrinsic is a method to export a Program
Counter (PC) in a region of code to simulators and other tools. The
method is target specific, but it is expected that the marker will use
exported symbols to transmit the PC of the marker. The marker makes no
guarantees that it will remain with any specific instruction after
optimizations. It is possible that the presence of a marker will inhibit
optimizations. The intended use is to be inserted after optimizations to
allow correlations of simulation runs.

Arguments:
""""""""""

``id`` is a numerical id identifying the marker.

Semantics:
""""""""""

This intrinsic does not modify the behavior of the program. Backends
that do not support this intrinsic may ignore it.

'``llvm.readcyclecounter``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i64 @llvm.readcyclecounter()

Overview:
"""""""""

The '``llvm.readcyclecounter``' intrinsic provides access to the cycle
counter register (or similar low latency, high accuracy clocks) on those
targets that support it. On X86, it should map to RDTSC. On Alpha, it
should map to RPCC. As the backing counters overflow quickly (on the
order of 9 seconds on Alpha), this should only be used for small
timings.

Semantics:
""""""""""

When directly supported, reading the cycle counter should not modify any
memory. Implementations are allowed to either return an application-specific
value or a system-wide value. On backends without support, this
is lowered to a constant 0.

Note that runtime support may be conditional on the privilege level the code
is running at and on the host platform.
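
A small timing measurement might be sketched as below. The function names
``@elapsed_cycles`` and ``@work`` are hypothetical.

.. code-block:: llvm

      declare i64 @llvm.readcyclecounter()
      declare void @work()

      define i64 @elapsed_cycles() {
      entry:
        %start = call i64 @llvm.readcyclecounter()
        call void @work()
        %end = call i64 @llvm.readcyclecounter()
        ; Only meaningful for short intervals, since the counter may wrap.
        %delta = sub i64 %end, %start
        ret i64 %delta
      }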

'``llvm.clear_cache``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.clear_cache(i8*, i8*)

Overview:
"""""""""

The '``llvm.clear_cache``' intrinsic ensures visibility of modifications
in the specified range to the execution unit of the processor. On
targets with non-unified instruction and data cache, the implementation
flushes the instruction cache.

Semantics:
""""""""""

On platforms with coherent instruction and data caches (e.g. x86), this
intrinsic is a nop. On platforms with non-coherent instruction and data
cache (e.g. ARM, MIPS), the intrinsic is lowered either to appropriate
instructions or a system call, if cache flushing requires special
privileges.

The default behavior is to emit a call to ``__clear_cache`` from the runtime
library.

This intrinsic does *not* empty the instruction pipeline. Modifications
of the current function are outside the scope of the intrinsic.
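
A sketch of a call after writing code into a buffer follows. It assumes, by
analogy with ``__clear_cache``, that the two pointers are the start and end
of the modified range; the function name ``@publish_code`` is hypothetical.

.. code-block:: llvm

      declare void @llvm.clear_cache(i8*, i8*)

      define void @publish_code(i8* %begin, i8* %end) {
      entry:
        ; Make the bytes written to the range visible to instruction fetch.
        call void @llvm.clear_cache(i8* %begin, i8* %end)
        ret void
      }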

'``llvm.instrprof.increment``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.instrprof.increment(i8* <name>, i64 <hash>,
                                             i32 <num-counters>, i32 <index>)

Overview:
"""""""""

The '``llvm.instrprof.increment``' intrinsic can be emitted by a
frontend for use with instrumentation based profiling. It is
lowered by the ``-instrprof`` pass to generate execution counts of a
program at runtime.

Arguments:
""""""""""

The first argument is a pointer to a global variable containing the
name of the entity being instrumented. This should generally be the
(mangled) function name for a set of counters.

The second argument is a hash value that can be used by the consumer
of the profile data to detect changes to the instrumented source, and
the third is the number of counters associated with ``name``. It is an
error if ``hash`` or ``num-counters`` differ between two instances of
``instrprof.increment`` that refer to the same name.

The last argument refers to which of the counters for ``name`` should
be incremented. It should be at least 0 and less than ``num-counters``.

Semantics:
""""""""""

This intrinsic represents an increment of a profiling counter. It will
cause the ``-instrprof`` pass to generate the appropriate data
structures and the code to increment the appropriate value, in a
format that can be written out by a compiler runtime and consumed via
the ``llvm-profdata`` tool.
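
As a sketch, a frontend instrumenting a function with a single counter might
emit something like the following. The names ``@__profn_foo`` and ``@foo``
and the hash value ``0`` are placeholders chosen for illustration.

.. code-block:: llvm

      @__profn_foo = private constant [3 x i8] c"foo"

      declare void @llvm.instrprof.increment(i8*, i64, i32, i32)

      define void @foo() {
      entry:
        ; name = "foo", hash = 0, one counter in total, increment counter 0.
        call void @llvm.instrprof.increment(
            i8* getelementptr inbounds ([3 x i8], [3 x i8]* @__profn_foo, i32 0, i32 0),
            i64 0, i32 1, i32 0)
        ret void
      }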

'``llvm.instrprof.increment.step``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.instrprof.increment.step(i8* <name>, i64 <hash>,
                                                  i32 <num-counters>,
                                                  i32 <index>, i64 <step>)

Overview:
"""""""""

The '``llvm.instrprof.increment.step``' intrinsic is an extension to
the '``llvm.instrprof.increment``' intrinsic with an additional fifth
argument to specify the step of the increment.

Arguments:
""""""""""

The first four arguments are the same as those of the
'``llvm.instrprof.increment``' intrinsic.

The last argument specifies the value of the increment of the counter variable.

Semantics:
""""""""""

See the description of the '``llvm.instrprof.increment``' intrinsic.

'``llvm.instrprof.value.profile``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.instrprof.value.profile(i8* <name>, i64 <hash>,
                                                 i64 <value>, i32 <value_kind>,
                                                 i32 <index>)

Overview:
"""""""""

The '``llvm.instrprof.value.profile``' intrinsic can be emitted by a
frontend for use with instrumentation based profiling. It is
lowered by the ``-instrprof`` pass to find out the target values that
instrumented expressions take in a program at runtime.

Arguments:
""""""""""

The first argument is a pointer to a global variable containing the
name of the entity being instrumented. ``name`` should generally be the
(mangled) function name for a set of counters.

The second argument is a hash value that can be used by the consumer
of the profile data to detect changes to the instrumented source. It
is an error if ``hash`` differs between two instances of
``llvm.instrprof.*`` that refer to the same name.

The third argument is the value of the expression being profiled. The profiled
expression's value should be representable as an unsigned 64-bit value. The
fourth argument represents the kind of value profiling that is being done. The
supported value profiling kinds are enumerated through the
``InstrProfValueKind`` type declared in the
``<include/llvm/ProfileData/InstrProf.h>`` header file. The last argument is the
index of the instrumented expression within ``name``. It should be >= 0.

Semantics:
""""""""""

This intrinsic represents the point where a call to a runtime routine
should be inserted for value profiling of target expressions. The ``-instrprof``
pass will generate the appropriate data structures and replace the
``llvm.instrprof.value.profile`` intrinsic with a call to the profile
runtime library with proper arguments.

'``llvm.thread.pointer``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.thread.pointer()

Overview:
"""""""""

The '``llvm.thread.pointer``' intrinsic returns the value of the thread
pointer.

Semantics:
""""""""""

The '``llvm.thread.pointer``' intrinsic returns a pointer to the TLS area
for the current thread. The exact semantics of this value are target
specific: it may point to the start of TLS area, to the end, or somewhere
in the middle. Depending on the target, this intrinsic may read a register,
call a helper function, read from an alternate memory space, or perform
other operations necessary to locate the TLS area. Not all targets support
this intrinsic.

Standard C Library Intrinsics
-----------------------------

LLVM provides intrinsics for a few important standard C library
functions. These intrinsics allow source-language front-ends to pass
information about the alignment of the pointer arguments to the code
generator, providing opportunity for more efficient code generation.

.. _int_memcpy:

'``llvm.memcpy``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.memcpy`` on any
integer bit width and for different address spaces. Not all targets
support all bit widths however.

::

      declare void @llvm.memcpy.p0i8.p0i8.i32(i8* <dest>, i8* <src>,
                                              i32 <len>, i1 <isvolatile>)
      declare void @llvm.memcpy.p0i8.p0i8.i64(i8* <dest>, i8* <src>,
                                              i64 <len>, i1 <isvolatile>)

Overview:
"""""""""

The '``llvm.memcpy.*``' intrinsics copy a block of memory from the
source location to the destination location.

Note that, unlike the standard libc function, the ``llvm.memcpy.*``
intrinsics do not return a value, take an extra ``isvolatile``
argument, and the pointers can be in specified address spaces.

Arguments:
""""""""""

The first argument is a pointer to the destination, the second is a
pointer to the source. The third argument is an integer argument
specifying the number of bytes to copy, and the fourth is a
boolean indicating a volatile access.

The :ref:`align <attr_align>` parameter attribute can be provided
for the first and second arguments.

If the ``isvolatile`` parameter is ``true``, the ``llvm.memcpy`` call is
a :ref:`volatile operation <volatile>`. The detailed access behavior is not
very cleanly specified and it is unwise to depend on it.

Semantics:
""""""""""

The '``llvm.memcpy.*``' intrinsics copy a block of memory from the
source location to the destination location, which are not allowed to
overlap. They copy ``len`` bytes of memory. If a pointer argument is known
to be aligned to some boundary, this can be specified with the
:ref:`align <attr_align>` parameter attribute on that argument.
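
A non-volatile 16-byte copy between two 4-byte-aligned pointers could be
sketched as follows. The function name ``@copy16`` is hypothetical, and the
alignment is given with the ``align`` parameter attribute at the call site.

.. code-block:: llvm

      declare void @llvm.memcpy.p0i8.p0i8.i64(i8*, i8*, i64, i1)

      define void @copy16(i8* %dst, i8* %src) {
      entry:
        ; %dst and %src must not overlap.
        call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %dst, i8* align 4 %src,
                                             i64 16, i1 false)
        ret void
      }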

.. _int_memmove:

'``llvm.memmove``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.memmove`` on any integer
bit width and for different address spaces. Not all targets support all
bit widths however.

::

      declare void @llvm.memmove.p0i8.p0i8.i32(i8* <dest>, i8* <src>,
                                               i32 <len>, i1 <isvolatile>)
      declare void @llvm.memmove.p0i8.p0i8.i64(i8* <dest>, i8* <src>,
                                               i64 <len>, i1 <isvolatile>)

Overview:
"""""""""

The '``llvm.memmove.*``' intrinsics move a block of memory from the
source location to the destination location. It is similar to the
'``llvm.memcpy``' intrinsic but allows the two memory locations to
overlap.

Note that, unlike the standard libc function, the ``llvm.memmove.*``
intrinsics do not return a value, take an extra ``isvolatile``
argument, and the pointers can be in specified address spaces.

Arguments:
""""""""""

The first argument is a pointer to the destination, the second is a
pointer to the source. The third argument is an integer argument
specifying the number of bytes to copy, and the fourth is a
boolean indicating a volatile access.

The :ref:`align <attr_align>` parameter attribute can be provided
for the first and second arguments.

If the ``isvolatile`` parameter is ``true``, the ``llvm.memmove`` call
is a :ref:`volatile operation <volatile>`. The detailed access behavior is
not very cleanly specified and it is unwise to depend on it.

Semantics:
""""""""""

The '``llvm.memmove.*``' intrinsics copy a block of memory from the
source location to the destination location, which may overlap. They
copy ``len`` bytes of memory. If a pointer argument is known to be
aligned to some boundary, this can be specified with the
:ref:`align <attr_align>` parameter attribute on that argument.
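
The overlapping case can be illustrated with a sketch that shifts part of a
buffer down by one byte. The function name ``@shift_left_by_one`` is
hypothetical.

.. code-block:: llvm

      declare void @llvm.memmove.p0i8.p0i8.i32(i8*, i8*, i32, i1)

      define void @shift_left_by_one(i8* %buf) {
      entry:
        %src = getelementptr inbounds i8, i8* %buf, i32 1
        ; Moves bytes 1..8 of %buf to bytes 0..7; the two ranges overlap.
        call void @llvm.memmove.p0i8.p0i8.i32(i8* %buf, i8* %src, i32 8, i1 false)
        ret void
      }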

.. _int_memset:

'``llvm.memset.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.memset`` on any integer
bit width and for different address spaces. However, not all targets
support all bit widths.

::

      declare void @llvm.memset.p0i8.i32(i8* <dest>, i8 <val>,
                                         i32 <len>, i1 <isvolatile>)
      declare void @llvm.memset.p0i8.i64(i8* <dest>, i8 <val>,
                                         i64 <len>, i1 <isvolatile>)

Overview:
"""""""""

The '``llvm.memset.*``' intrinsics fill a block of memory with a
particular byte value.

Note that, unlike the standard libc function, the ``llvm.memset``
intrinsic does not return a value and takes an extra ``isvolatile``
argument. Also, the destination can be in an arbitrary address space.

Arguments:
""""""""""

The first argument is a pointer to the destination to fill, the second
is the byte value with which to fill it, the third argument is an
integer argument specifying the number of bytes to fill, and the fourth
is a boolean indicating a volatile access.

The :ref:`align <attr_align>` parameter attribute can be provided
for the first argument.

If the ``isvolatile`` parameter is ``true``, the ``llvm.memset`` call is
a :ref:`volatile operation <volatile>`. The detailed access behavior is not
very cleanly specified and it is unwise to depend on it.

Semantics:
""""""""""

The '``llvm.memset.*``' intrinsics fill ``len`` bytes of memory starting
at the destination location.
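
Zeroing a 32-byte, 8-byte-aligned buffer might look like this sketch. The
function name ``@zero32`` is hypothetical.

.. code-block:: llvm

      declare void @llvm.memset.p0i8.i64(i8*, i8, i64, i1)

      define void @zero32(i8* %dst) {
      entry:
        ; Fill 32 bytes starting at %dst with the byte value 0.
        call void @llvm.memset.p0i8.i64(i8* align 8 %dst, i8 0, i64 32, i1 false)
        ret void
      }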

'``llvm.sqrt.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.sqrt`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.sqrt.f32(float %Val)
      declare double    @llvm.sqrt.f64(double %Val)
      declare x86_fp80  @llvm.sqrt.f80(x86_fp80 %Val)
      declare fp128     @llvm.sqrt.f128(fp128 %Val)
      declare ppc_fp128 @llvm.sqrt.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.sqrt``' intrinsics return the square root of the specified value.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``sqrt``' function but without
trapping or setting ``errno``. For types specified by IEEE-754, the result
matches a conforming libm implementation.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.
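
The sketch below contrasts a plain call with one that carries the 'afn'
flag, and also shows a vector overload. The function names are hypothetical,
and the ``v4f32`` suffix assumes the usual overload name mangling for vector
types.

.. code-block:: llvm

      declare float @llvm.sqrt.f32(float)
      declare <4 x float> @llvm.sqrt.v4f32(<4 x float>)

      define float @exact_sqrt(float %x) {
      entry:
        %r = call float @llvm.sqrt.f32(float %x)
        ret float %r
      }

      define <4 x float> @approx_sqrt(<4 x float> %v) {
      entry:
        ; 'afn' permits a less accurate approximation of the result.
        %r = call afn <4 x float> @llvm.sqrt.v4f32(<4 x float> %v)
        ret <4 x float> %r
      }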

'``llvm.powi.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.powi`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float     @llvm.powi.f32(float %Val, i32 %power)
      declare double    @llvm.powi.f64(double %Val, i32 %power)
      declare x86_fp80  @llvm.powi.f80(x86_fp80 %Val, i32 %power)
      declare fp128     @llvm.powi.f128(fp128 %Val, i32 %power)
      declare ppc_fp128 @llvm.powi.ppcf128(ppc_fp128 %Val, i32 %power)

Overview:
"""""""""

The '``llvm.powi.*``' intrinsics return the first operand raised to the
specified (positive or negative) power. The order of evaluation of
multiplications is not defined. When a vector of floating-point type is
used, the second argument remains a scalar integer value.

Arguments:
""""""""""

The second argument is an integer power, and the first is a value to
raise to that power.

Semantics:
""""""""""

This function returns the first value raised to the second power with an
unspecified sequence of rounding operations.
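
A scalar and a vector use might be sketched as follows. The function names
are hypothetical, and the ``v4f32`` overload name is an assumption based on
the mangling used for the scalar declarations above. Note that the power
stays a scalar ``i32`` in the vector case.

.. code-block:: llvm

      declare double @llvm.powi.f64(double, i32)
      declare <4 x float> @llvm.powi.v4f32(<4 x float>, i32)

      define double @cube(double %x) {
      entry:
        %r = call double @llvm.powi.f64(double %x, i32 3)
        ret double %r
      }

      define <4 x float> @inverse_square(<4 x float> %v) {
      entry:
        ; A negative power is allowed; each element is raised to -2.
        %r = call <4 x float> @llvm.powi.v4f32(<4 x float> %v, i32 -2)
        ret <4 x float> %r
      }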
10970
10971'``llvm.sin.*``' Intrinsic
10972^^^^^^^^^^^^^^^^^^^^^^^^^^
10973
10974Syntax:
10975"""""""
10976
10977This is an overloaded intrinsic. You can use ``llvm.sin`` on any
Sanjay Patel629c4112017-11-06 16:27:15 +000010978floating-point or vector of floating-point type. Not all targets support
Sean Silvab084af42012-12-07 10:36:55 +000010979all types however.
10980
10981::
10982
10983 declare float @llvm.sin.f32(float %Val)
10984 declare double @llvm.sin.f64(double %Val)
10985 declare x86_fp80 @llvm.sin.f80(x86_fp80 %Val)
10986 declare fp128 @llvm.sin.f128(fp128 %Val)
10987 declare ppc_fp128 @llvm.sin.ppcf128(ppc_fp128 %Val)
10988
10989Overview:
10990"""""""""
10991
10992The '``llvm.sin.*``' intrinsics return the sine of the operand.
10993
10994Arguments:
10995""""""""""
10996
Sanjay Patel629c4112017-11-06 16:27:15 +000010997The argument and return value are floating-point numbers of the same type.
Sean Silvab084af42012-12-07 10:36:55 +000010998
10999Semantics:
11000""""""""""
11001
Sanjay Patel629c4112017-11-06 16:27:15 +000011002Return the same value as a corresponding libm '``sin``' function but without
11003trapping or setting ``errno``.
11004
Elena Demikhovsky945b7e52018-02-14 06:58:08 +000011005When specified with the fast-math-flag 'afn', the result may be approximated
Sanjay Patel629c4112017-11-06 16:27:15 +000011006using a less accurate calculation.

'``llvm.cos.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.cos`` on any
floating-point or vector of floating-point type. Not all targets support
all types, however.

::

      declare float @llvm.cos.f32(float %Val)
      declare double @llvm.cos.f64(double %Val)
      declare x86_fp80 @llvm.cos.f80(x86_fp80 %Val)
      declare fp128 @llvm.cos.f128(fp128 %Val)
      declare ppc_fp128 @llvm.cos.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.cos.*``' intrinsics return the cosine of the operand.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``cos``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.
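
Example:
""""""""

A minimal sketch contrasting a plain call with one that carries the ``afn``
fast-math flag. The function names ``@cos_precise`` and ``@cos_approx`` are
hypothetical and exist only for illustration.

.. code-block:: llvm

      declare float @llvm.cos.f32(float)

      ; Same value as the corresponding libm cos function.
      define float @cos_precise(float %x) {
        %r = call float @llvm.cos.f32(float %x)
        ret float %r
      }

      ; With 'afn', a less accurate approximation may be substituted.
      define float @cos_approx(float %x) {
        %r = call afn float @llvm.cos.f32(float %x)
        ret float %r
      }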

'``llvm.pow.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.pow`` on any
floating-point or vector of floating-point type. Not all targets support
all types, however.

::

      declare float @llvm.pow.f32(float %Val, float %Power)
      declare double @llvm.pow.f64(double %Val, double %Power)
      declare x86_fp80 @llvm.pow.f80(x86_fp80 %Val, x86_fp80 %Power)
      declare fp128 @llvm.pow.f128(fp128 %Val, fp128 %Power)
      declare ppc_fp128 @llvm.pow.ppcf128(ppc_fp128 %Val, ppc_fp128 %Power)

Overview:
"""""""""

The '``llvm.pow.*``' intrinsics return the first operand raised to the
specified (positive or negative) power.

Arguments:
""""""""""

The arguments and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``pow``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.
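
Example:
""""""""

A short sketch of a positive and a negative power. The helper names
``@cube`` and ``@reciprocal`` are hypothetical.

.. code-block:: llvm

      declare double @llvm.pow.f64(double, double)

      ; Raises %x to the third power.
      define double @cube(double %x) {
        %r = call double @llvm.pow.f64(double %x, double 3.0)
        ret double %r
      }

      ; A negative power: %x raised to -1, i.e. 1 / %x.
      define double @reciprocal(double %x) {
        %r = call double @llvm.pow.f64(double %x, double -1.0)
        ret double %r
      }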

'``llvm.exp.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.exp`` on any
floating-point or vector of floating-point type. Not all targets support
all types, however.

::

      declare float @llvm.exp.f32(float %Val)
      declare double @llvm.exp.f64(double %Val)
      declare x86_fp80 @llvm.exp.f80(x86_fp80 %Val)
      declare fp128 @llvm.exp.f128(fp128 %Val)
      declare ppc_fp128 @llvm.exp.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.exp.*``' intrinsics compute the base-e exponential of the specified
value.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``exp``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.
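
Example:
""""""""

The Syntax section lists only scalar declarations, so the following sketch
shows what a vector overload could look like. The ``v4f32`` suffix is assumed
by analogy with ``llvm.ctpop.v2i32``, and the function name ``@exp_lanes`` is
hypothetical.

.. code-block:: llvm

      declare <4 x float> @llvm.exp.v4f32(<4 x float>)

      ; Computes the base-e exponential of each of the four lanes.
      define <4 x float> @exp_lanes(<4 x float> %v) {
        %r = call <4 x float> @llvm.exp.v4f32(<4 x float> %v)
        ret <4 x float> %r
      }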

'``llvm.exp2.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.exp2`` on any
floating-point or vector of floating-point type. Not all targets support
all types, however.

::

      declare float @llvm.exp2.f32(float %Val)
      declare double @llvm.exp2.f64(double %Val)
      declare x86_fp80 @llvm.exp2.f80(x86_fp80 %Val)
      declare fp128 @llvm.exp2.f128(fp128 %Val)
      declare ppc_fp128 @llvm.exp2.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.exp2.*``' intrinsics compute the base-2 exponential of the
specified value.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``exp2``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.log.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.log`` on any
floating-point or vector of floating-point type. Not all targets support
all types, however.

::

      declare float @llvm.log.f32(float %Val)
      declare double @llvm.log.f64(double %Val)
      declare x86_fp80 @llvm.log.f80(x86_fp80 %Val)
      declare fp128 @llvm.log.f128(fp128 %Val)
      declare ppc_fp128 @llvm.log.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.log.*``' intrinsics compute the base-e logarithm of the specified
value.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``log``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.log10.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.log10`` on any
floating-point or vector of floating-point type. Not all targets support
all types, however.

::

      declare float @llvm.log10.f32(float %Val)
      declare double @llvm.log10.f64(double %Val)
      declare x86_fp80 @llvm.log10.f80(x86_fp80 %Val)
      declare fp128 @llvm.log10.f128(fp128 %Val)
      declare ppc_fp128 @llvm.log10.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.log10.*``' intrinsics compute the base-10 logarithm of the
specified value.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``log10``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.log2.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.log2`` on any
floating-point or vector of floating-point type. Not all targets support
all types, however.

::

      declare float @llvm.log2.f32(float %Val)
      declare double @llvm.log2.f64(double %Val)
      declare x86_fp80 @llvm.log2.f80(x86_fp80 %Val)
      declare fp128 @llvm.log2.f128(fp128 %Val)
      declare ppc_fp128 @llvm.log2.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.log2.*``' intrinsics compute the base-2 logarithm of the specified
value.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``log2``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.fma.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.fma`` on any
floating-point or vector of floating-point type. Not all targets support
all types, however.

::

      declare float @llvm.fma.f32(float %a, float %b, float %c)
      declare double @llvm.fma.f64(double %a, double %b, double %c)
      declare x86_fp80 @llvm.fma.f80(x86_fp80 %a, x86_fp80 %b, x86_fp80 %c)
      declare fp128 @llvm.fma.f128(fp128 %a, fp128 %b, fp128 %c)
      declare ppc_fp128 @llvm.fma.ppcf128(ppc_fp128 %a, ppc_fp128 %b, ppc_fp128 %c)

Overview:
"""""""""

The '``llvm.fma.*``' intrinsics perform the fused multiply-add operation.

Arguments:
""""""""""

The arguments and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``fma``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.
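
Example:
""""""""

The sketch below contrasts the fused operation with a separate multiply and
add. As with the libm ``fma`` function, the fused form rounds once, whereas
the unfused sequence rounds after each instruction. The function names
``@fused`` and ``@unfused`` are hypothetical.

.. code-block:: llvm

      declare float @llvm.fma.f32(float, float, float)

      ; (%a * %b) + %c computed as one operation with a single rounding.
      define float @fused(float %a, float %b, float %c) {
        %r = call float @llvm.fma.f32(float %a, float %b, float %c)
        ret float %r
      }

      ; The same arithmetic as two instructions, each rounding its result.
      define float @unfused(float %a, float %b, float %c) {
        %m = fmul float %a, %b
        %r = fadd float %m, %c
        ret float %r
      }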

'``llvm.fabs.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.fabs`` on any
floating-point or vector of floating-point type. Not all targets support
all types, however.

::

      declare float @llvm.fabs.f32(float %Val)
      declare double @llvm.fabs.f64(double %Val)
      declare x86_fp80 @llvm.fabs.f80(x86_fp80 %Val)
      declare fp128 @llvm.fabs.f128(fp128 %Val)
      declare ppc_fp128 @llvm.fabs.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.fabs.*``' intrinsics return the absolute value of the
operand.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``fabs`` functions
would, and handles error conditions in the same way.
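
Example:
""""""""

A small sketch with constant operands. The expected values in the comments
follow from the libm ``fabs`` behavior; the function name ``@fabs_cases`` is
hypothetical.

.. code-block:: llvm

      declare float @llvm.fabs.f32(float)

      define void @fabs_cases() {
        %a = call float @llvm.fabs.f32(float -2.5)   ; 2.5
        %b = call float @llvm.fabs.f32(float 2.5)    ; 2.5
        %c = call float @llvm.fabs.f32(float -0.0)   ; 0.0 (sign bit cleared)
        ret void
      }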

'``llvm.minnum.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.minnum`` on any
floating-point or vector of floating-point type. Not all targets support
all types, however.

::

      declare float @llvm.minnum.f32(float %Val0, float %Val1)
      declare double @llvm.minnum.f64(double %Val0, double %Val1)
      declare x86_fp80 @llvm.minnum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
      declare fp128 @llvm.minnum.f128(fp128 %Val0, fp128 %Val1)
      declare ppc_fp128 @llvm.minnum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)

Overview:
"""""""""

The '``llvm.minnum.*``' intrinsics return the minimum of the two
arguments.

Arguments:
""""""""""

The arguments and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

Follows the IEEE-754 semantics for minNum, which also match those of
libm's ``fmin``.

If either operand is a NaN, returns the other non-NaN operand. Returns
NaN only if both operands are NaN. If the operands compare equal,
returns a value that compares equal to both operands. This means that
``fmin(+/-0.0, +/-0.0)`` could return either -0.0 or 0.0.
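
Example:
""""""""

The cases described above can be sketched with constant operands. The
constant ``0x7FF8000000000000`` is used here as a quiet NaN, and the function
name ``@minnum_cases`` is hypothetical.

.. code-block:: llvm

      declare float @llvm.minnum.f32(float, float)

      define void @minnum_cases() {
        ; Two ordinary operands: 1.0
        %a = call float @llvm.minnum.f32(float 1.0, float 2.0)
        ; One NaN operand: the other operand is returned, 2.0
        %b = call float @llvm.minnum.f32(float 0x7FF8000000000000, float 2.0)
        ; Both operands NaN: NaN
        %c = call float @llvm.minnum.f32(float 0x7FF8000000000000, float 0x7FF8000000000000)
        ; Zeros of either sign compare equal: the result may be -0.0 or 0.0
        %d = call float @llvm.minnum.f32(float -0.0, float 0.0)
        ret void
      }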

'``llvm.maxnum.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.maxnum`` on any
floating-point or vector of floating-point type. Not all targets support
all types, however.

::

      declare float @llvm.maxnum.f32(float %Val0, float %Val1)
      declare double @llvm.maxnum.f64(double %Val0, double %Val1)
      declare x86_fp80 @llvm.maxnum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
      declare fp128 @llvm.maxnum.f128(fp128 %Val0, fp128 %Val1)
      declare ppc_fp128 @llvm.maxnum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)

Overview:
"""""""""

The '``llvm.maxnum.*``' intrinsics return the maximum of the two
arguments.

Arguments:
""""""""""

The arguments and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

Follows the IEEE-754 semantics for maxNum, which also match those of
libm's ``fmax``.

If either operand is a NaN, returns the other non-NaN operand. Returns
NaN only if both operands are NaN. If the operands compare equal,
returns a value that compares equal to both operands. This means that
``fmax(+/-0.0, +/-0.0)`` could return either -0.0 or 0.0.
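
Example:
""""""""

One possible use of the NaN rule is a clamp that maps NaN inputs to the lower
bound. The sketch below pairs ``llvm.maxnum`` with ``llvm.minnum`` to clamp a
value to the range [0.0, 1.0]; the function name ``@clamp01`` is hypothetical.

.. code-block:: llvm

      declare float @llvm.maxnum.f32(float, float)
      declare float @llvm.minnum.f32(float, float)

      define float @clamp01(float %x) {
        ; Raise values below 0.0 to 0.0. If %x is NaN, the non-NaN
        ; operand 0.0 is returned.
        %lo = call float @llvm.maxnum.f32(float %x, float 0.0)
        ; Lower values above 1.0 to 1.0.
        %r = call float @llvm.minnum.f32(float %lo, float 1.0)
        ret float %r
      }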

'``llvm.copysign.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.copysign`` on any
floating-point or vector of floating-point type. Not all targets support
all types, however.

::

      declare float @llvm.copysign.f32(float %Mag, float %Sgn)
      declare double @llvm.copysign.f64(double %Mag, double %Sgn)
      declare x86_fp80 @llvm.copysign.f80(x86_fp80 %Mag, x86_fp80 %Sgn)
      declare fp128 @llvm.copysign.f128(fp128 %Mag, fp128 %Sgn)
      declare ppc_fp128 @llvm.copysign.ppcf128(ppc_fp128 %Mag, ppc_fp128 %Sgn)

Overview:
"""""""""

The '``llvm.copysign.*``' intrinsics return a value with the magnitude of the
first operand and the sign of the second operand.

Arguments:
""""""""""

The arguments and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``copysign``
functions would, and handles error conditions in the same way.
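
Example:
""""""""

A brief sketch with constant operands, showing that only the sign of the
second operand matters. The function name ``@copysign_cases`` is
hypothetical.

.. code-block:: llvm

      declare double @llvm.copysign.f64(double, double)

      define void @copysign_cases() {
        %a = call double @llvm.copysign.f64(double 3.0, double -1.0)   ; -3.0
        %b = call double @llvm.copysign.f64(double -3.0, double 2.0)   ; 3.0
        %c = call double @llvm.copysign.f64(double 3.0, double -0.0)   ; -3.0 (sign of negative zero)
        ret void
      }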

'``llvm.floor.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.floor`` on any
floating-point or vector of floating-point type. Not all targets support
all types, however.

::

      declare float @llvm.floor.f32(float %Val)
      declare double @llvm.floor.f64(double %Val)
      declare x86_fp80 @llvm.floor.f80(x86_fp80 %Val)
      declare fp128 @llvm.floor.f128(fp128 %Val)
      declare ppc_fp128 @llvm.floor.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.floor.*``' intrinsics return the floor of the operand.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``floor`` functions
would, and handles error conditions in the same way.

'``llvm.ceil.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.ceil`` on any
floating-point or vector of floating-point type. Not all targets support
all types, however.

::

      declare float @llvm.ceil.f32(float %Val)
      declare double @llvm.ceil.f64(double %Val)
      declare x86_fp80 @llvm.ceil.f80(x86_fp80 %Val)
      declare fp128 @llvm.ceil.f128(fp128 %Val)
      declare ppc_fp128 @llvm.ceil.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.ceil.*``' intrinsics return the ceiling of the operand.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``ceil`` functions
would, and handles error conditions in the same way.

'``llvm.trunc.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.trunc`` on any
floating-point or vector of floating-point type. Not all targets support
all types, however.

::

      declare float @llvm.trunc.f32(float %Val)
      declare double @llvm.trunc.f64(double %Val)
      declare x86_fp80 @llvm.trunc.f80(x86_fp80 %Val)
      declare fp128 @llvm.trunc.f128(fp128 %Val)
      declare ppc_fp128 @llvm.trunc.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.trunc.*``' intrinsics return the operand rounded to the
nearest integer not larger in magnitude than the operand.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``trunc`` functions
would, and handles error conditions in the same way.

'``llvm.rint.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.rint`` on any
floating-point or vector of floating-point type. Not all targets support
all types, however.

::

      declare float @llvm.rint.f32(float %Val)
      declare double @llvm.rint.f64(double %Val)
      declare x86_fp80 @llvm.rint.f80(x86_fp80 %Val)
      declare fp128 @llvm.rint.f128(fp128 %Val)
      declare ppc_fp128 @llvm.rint.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.rint.*``' intrinsics return the operand rounded to the
nearest integer. They may raise an inexact floating-point exception if the
operand is not an integer.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``rint`` functions
would, and handles error conditions in the same way.

'``llvm.nearbyint.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.nearbyint`` on any
floating-point or vector of floating-point type. Not all targets support
all types, however.

::

      declare float @llvm.nearbyint.f32(float %Val)
      declare double @llvm.nearbyint.f64(double %Val)
      declare x86_fp80 @llvm.nearbyint.f80(x86_fp80 %Val)
      declare fp128 @llvm.nearbyint.f128(fp128 %Val)
      declare ppc_fp128 @llvm.nearbyint.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.nearbyint.*``' intrinsics return the operand rounded to the
nearest integer. Unlike the libm ``rint`` functions, the libm ``nearbyint``
functions do not raise the inexact floating-point exception.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``nearbyint``
functions would, and handles error conditions in the same way.

'``llvm.round.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.round`` on any
floating-point or vector of floating-point type. Not all targets support
all types, however.

::

      declare float @llvm.round.f32(float %Val)
      declare double @llvm.round.f64(double %Val)
      declare x86_fp80 @llvm.round.f80(x86_fp80 %Val)
      declare fp128 @llvm.round.f128(fp128 %Val)
      declare ppc_fp128 @llvm.round.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.round.*``' intrinsics return the operand rounded to the
nearest integer, with halfway cases rounded away from zero as in the libm
``round`` functions.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``round``
functions would, and handles error conditions in the same way.
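
Example:
""""""""

The sketch below applies ``llvm.round`` and three related rounding intrinsics
to the same halfway value so that their differences are visible. The expected
values in the comments follow from the corresponding libm functions.
``llvm.rint`` and ``llvm.nearbyint`` are left out because their results
depend on the current rounding mode. The function name ``@rounding_cases`` is
hypothetical.

.. code-block:: llvm

      declare float @llvm.floor.f32(float)
      declare float @llvm.ceil.f32(float)
      declare float @llvm.trunc.f32(float)
      declare float @llvm.round.f32(float)

      define void @rounding_cases() {
        %f = call float @llvm.floor.f32(float -2.5)   ; -3.0 (toward negative infinity)
        %c = call float @llvm.ceil.f32(float -2.5)    ; -2.0 (toward positive infinity)
        %t = call float @llvm.trunc.f32(float -2.5)   ; -2.0 (toward zero)
        %r = call float @llvm.round.f32(float -2.5)   ; -3.0 (halfway case, away from zero)
        ret void
      }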

Bit Manipulation Intrinsics
---------------------------

LLVM provides intrinsics for a few important bit manipulation
operations. These allow efficient code generation for some algorithms.

'``llvm.bitreverse.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic function. You can use ``llvm.bitreverse``
on any integer type.

::

      declare i16 @llvm.bitreverse.i16(i16 <id>)
      declare i32 @llvm.bitreverse.i32(i32 <id>)
      declare i64 @llvm.bitreverse.i64(i64 <id>)

Overview:
"""""""""

The '``llvm.bitreverse``' family of intrinsics is used to reverse the
bit pattern of an integer value; for example, ``0b10110110`` becomes
``0b01101101``.

Semantics:
""""""""""

The ``llvm.bitreverse.iN`` intrinsic returns an ``iN`` value that has bit
``M`` in the input moved to bit ``N-M-1`` in the output, where bits are
numbered from 0.
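
Example:
""""""""

The 8-bit pattern from the Overview can be sketched as a call on ``i8``. The
operand is written as ``-74``, which has the bit pattern ``0b10110110``. The
function name ``@bitreverse_case`` is hypothetical.

.. code-block:: llvm

      declare i8 @llvm.bitreverse.i8(i8)

      define i8 @bitreverse_case() {
        ; 0b10110110 -> 0b01101101 (109)
        %r = call i8 @llvm.bitreverse.i8(i8 -74)
        ret i8 %r
      }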

'``llvm.bswap.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic function. You can use ``llvm.bswap`` on any
integer type that is an even number of bytes (i.e. ``BitWidth % 16 == 0``).

::

      declare i16 @llvm.bswap.i16(i16 <id>)
      declare i32 @llvm.bswap.i32(i32 <id>)
      declare i64 @llvm.bswap.i64(i64 <id>)

Overview:
"""""""""

The '``llvm.bswap``' family of intrinsics is used to byte swap integer
values with an even number of bytes (positive multiple of 16 bits).
These are useful for performing operations on data that is not in the
target's native byte order.

Semantics:
""""""""""

The ``llvm.bswap.i16`` intrinsic returns an i16 value that has the high
and low byte of the input i16 swapped. Similarly, the ``llvm.bswap.i32``
intrinsic returns an i32 value that has the four bytes of the input i32
swapped, so that if the input bytes are numbered 0, 1, 2, 3 then the
returned i32 will have its bytes in 3, 2, 1, 0 order. The
``llvm.bswap.i48``, ``llvm.bswap.i64`` and other intrinsics extend this
concept to additional even-byte lengths (6 bytes, 8 bytes and more,
respectively).
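
Example:
""""""""

A short sketch of the 16-bit and 32-bit cases with constant operands. The
hexadecimal values appear in the comments because integer constants are
written in decimal here. The function name ``@bswap_cases`` is hypothetical.

.. code-block:: llvm

      declare i16 @llvm.bswap.i16(i16)
      declare i32 @llvm.bswap.i32(i32)

      define void @bswap_cases() {
        ; 0x1234 (4660) -> 0x3412 (13330)
        %h = call i16 @llvm.bswap.i16(i16 4660)
        ; 0x12345678 (305419896) -> 0x78563412 (2018915346)
        %w = call i32 @llvm.bswap.i32(i32 305419896)
        ret void
      }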

'``llvm.ctpop.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.ctpop`` on any integer
bit width, or on any vector with integer elements. Not all targets
support all bit widths or vector types, however.

::

      declare i8 @llvm.ctpop.i8(i8 <src>)
      declare i16 @llvm.ctpop.i16(i16 <src>)
      declare i32 @llvm.ctpop.i32(i32 <src>)
      declare i64 @llvm.ctpop.i64(i64 <src>)
      declare i256 @llvm.ctpop.i256(i256 <src>)
      declare <2 x i32> @llvm.ctpop.v2i32(<2 x i32> <src>)

Overview:
"""""""""

The '``llvm.ctpop``' family of intrinsics counts the number of bits set
in a value.

Arguments:
""""""""""

The only argument is the value to be counted. The argument may be of any
integer type, or a vector with integer elements. The return type must
match the argument type.

Semantics:
""""""""""

The '``llvm.ctpop``' intrinsic counts the 1's in a variable, or within
each element of a vector.
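
Example:
""""""""

A brief sketch of the scalar and vector forms with constant operands. The
function name ``@ctpop_cases`` is hypothetical.

.. code-block:: llvm

      declare i32 @llvm.ctpop.i32(i32)
      declare <2 x i32> @llvm.ctpop.v2i32(<2 x i32>)

      define void @ctpop_cases() {
        ; 182 is 0b10110110, which has five set bits: 5
        %a = call i32 @llvm.ctpop.i32(i32 182)
        ; All 32 bits set: 32
        %b = call i32 @llvm.ctpop.i32(i32 -1)
        ; Counted per element: <i32 0, i32 3>
        %c = call <2 x i32> @llvm.ctpop.v2i32(<2 x i32> <i32 0, i32 7>)
        ret void
      }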

'``llvm.ctlz.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.ctlz`` on any
integer bit width, or any vector whose elements are integers. Not all
targets support all bit widths or vector types, however.

::

      declare i8 @llvm.ctlz.i8(i8 <src>, i1 <is_zero_undef>)
      declare i16 @llvm.ctlz.i16(i16 <src>, i1 <is_zero_undef>)
      declare i32 @llvm.ctlz.i32(i32 <src>, i1 <is_zero_undef>)
      declare i64 @llvm.ctlz.i64(i64 <src>, i1 <is_zero_undef>)
      declare i256 @llvm.ctlz.i256(i256 <src>, i1 <is_zero_undef>)
      declare <2 x i32> @llvm.ctlz.v2i32(<2 x i32> <src>, i1 <is_zero_undef>)

Overview:
"""""""""

The '``llvm.ctlz``' family of intrinsic functions counts the number of
leading zeros in a variable.

Arguments:
""""""""""

The first argument is the value to be counted. This argument may be of
any integer type, or a vector with integer element type. The return
type must match the first argument type.

The second argument must be a constant and is a flag to indicate whether
the intrinsic should ensure that a zero as the first argument produces a
defined result. Historically some architectures did not provide a
defined result for zero values as efficiently, and many algorithms are
now predicated on avoiding zero-value inputs.

Semantics:
""""""""""

The '``llvm.ctlz``' intrinsic counts the leading (most significant)
zeros in a variable, or within each element of the vector. If
``src == 0`` then the result is the size in bits of the type of ``src``
if ``is_zero_undef == 0`` and ``undef`` otherwise. For example,
``llvm.ctlz(i32 2) = 30``.
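
Example:
""""""""

The effect of the ``is_zero_undef`` flag can be sketched with constant
operands. The function name ``@ctlz_cases`` is hypothetical.

.. code-block:: llvm

      declare i32 @llvm.ctlz.i32(i32, i1)

      define void @ctlz_cases() {
        ; 30 leading zeros above bit 1
        %a = call i32 @llvm.ctlz.i32(i32 2, i1 false)
        ; Zero input with is_zero_undef == 0: the bit width, 32
        %b = call i32 @llvm.ctlz.i32(i32 0, i1 false)
        ; Zero input with is_zero_undef == 1: undef
        %c = call i32 @llvm.ctlz.i32(i32 0, i1 true)
        ret void
      }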

'``llvm.cttz.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.cttz`` on any
integer bit width, or any vector of integer elements. Not all targets
support all bit widths or vector types, however.

::

      declare i8 @llvm.cttz.i8(i8 <src>, i1 <is_zero_undef>)
      declare i16 @llvm.cttz.i16(i16 <src>, i1 <is_zero_undef>)
      declare i32 @llvm.cttz.i32(i32 <src>, i1 <is_zero_undef>)
      declare i64 @llvm.cttz.i64(i64 <src>, i1 <is_zero_undef>)
      declare i256 @llvm.cttz.i256(i256 <src>, i1 <is_zero_undef>)
      declare <2 x i32> @llvm.cttz.v2i32(<2 x i32> <src>, i1 <is_zero_undef>)

Overview:
"""""""""

The '``llvm.cttz``' family of intrinsic functions counts the number of
trailing zeros.

Arguments:
""""""""""

The first argument is the value to be counted. This argument may be of
any integer type, or a vector with integer element type. The return
type must match the first argument type.

The second argument must be a constant and is a flag to indicate whether
the intrinsic should ensure that a zero as the first argument produces a
defined result. Historically some architectures did not provide a
defined result for zero values as efficiently, and many algorithms are
now predicated on avoiding zero-value inputs.

Semantics:
""""""""""

The '``llvm.cttz``' intrinsic counts the trailing (least significant)
zeros in a variable, or within each element of a vector. If ``src == 0``
then the result is the size in bits of the type of ``src`` if
``is_zero_undef == 0`` and ``undef`` otherwise. For example,
``llvm.cttz(i32 2) = 1``.
11880
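Examples:
"""""""""

The following sketch applies the semantics above to a few constants,
including a vector operand. The function name ``@cttz_examples`` is
hypothetical and exists only to make the snippet self-contained.

.. code-block:: llvm

      declare i32 @llvm.cttz.i32(i32, i1)
      declare <2 x i32> @llvm.cttz.v2i32(<2 x i32>, i1)

      define void @cttz_examples() {
        %a = call i32 @llvm.cttz.i32(i32 2, i1 false)   ; %a = 1
        %b = call i32 @llvm.cttz.i32(i32 80, i1 false)  ; %b = 4 (80 = 0b1010000)
        %c = call i32 @llvm.cttz.i32(i32 0, i1 false)   ; %c = 32 (bit width of i32)
        ; Each vector element is counted independently.
        %v = call <2 x i32> @llvm.cttz.v2i32(<2 x i32> <i32 8, i32 1>, i1 false) ; %v = <i32 3, i32 0>
        ret void
      }
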
.. _int_overflow:

'``llvm.fshl.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.fshl`` on any
integer bit width or any vector of integer elements. Not all targets
support all bit widths or vector types, however.

::

      declare i8 @llvm.fshl.i8(i8 %a, i8 %b, i8 %c)
      declare i67 @llvm.fshl.i67(i67 %a, i67 %b, i67 %c)
      declare <2 x i32> @llvm.fshl.v2i32(<2 x i32> %a, <2 x i32> %b, <2 x i32> %c)

Overview:
"""""""""

The '``llvm.fshl``' family of intrinsic functions performs a funnel shift left:
the first two values are concatenated as { %a : %b } (%a is the most significant
bits of the wide value), the combined value is shifted left, and the most
significant bits are extracted to produce a result that is the same size as the
original arguments. If the first 2 arguments are identical, this is equivalent
to a rotate left operation. For vector types, the operation occurs for each
element of the vector. The shift argument is treated as an unsigned amount
modulo the element size of the arguments.

Arguments:
""""""""""

The first two arguments are the values to be concatenated. The third
argument is the shift amount. The arguments may be any integer type or a
vector with integer element type. All arguments and the return value must
have the same type.

Example:
""""""""

.. code-block:: text

      %r = call i8 @llvm.fshl.i8(i8 %x, i8 %y, i8 %z)  ; %r = i8: msb_extract((concat(x, y) << (z % 8)), 8)
      %r = call i8 @llvm.fshl.i8(i8 255, i8 0, i8 15)  ; %r = i8: 128 (0b10000000)
      %r = call i8 @llvm.fshl.i8(i8 15, i8 15, i8 11)  ; %r = i8: 120 (0b01111000)
      %r = call i8 @llvm.fshl.i8(i8 0, i8 255, i8 8)   ; %r = i8: 0 (0b00000000)

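As noted in the overview, passing the same value for the first two arguments
gives a rotate left. A minimal sketch, with the hypothetical helper name
``@rotl32``:

.. code-block:: llvm

      declare i32 @llvm.fshl.i32(i32, i32, i32)

      ; Rotate %x left by %n bits; %n is taken modulo 32.
      define i32 @rotl32(i32 %x, i32 %n) {
        %r = call i32 @llvm.fshl.i32(i32 %x, i32 %x, i32 %n)
        ret i32 %r
      }

For instance, rotating ``0x80000001`` left by 1 moves the top bit around to
bit 0 and yields ``0x00000003``.
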
'``llvm.fshr.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.fshr`` on any
integer bit width or any vector of integer elements. Not all targets
support all bit widths or vector types, however.

::

      declare i8 @llvm.fshr.i8(i8 %a, i8 %b, i8 %c)
      declare i67 @llvm.fshr.i67(i67 %a, i67 %b, i67 %c)
      declare <2 x i32> @llvm.fshr.v2i32(<2 x i32> %a, <2 x i32> %b, <2 x i32> %c)

Overview:
"""""""""

The '``llvm.fshr``' family of intrinsic functions performs a funnel shift right:
the first two values are concatenated as { %a : %b } (%a is the most significant
bits of the wide value), the combined value is shifted right, and the least
significant bits are extracted to produce a result that is the same size as the
original arguments. If the first 2 arguments are identical, this is equivalent
to a rotate right operation. For vector types, the operation occurs for each
element of the vector. The shift argument is treated as an unsigned amount
modulo the element size of the arguments.

Arguments:
""""""""""

The first two arguments are the values to be concatenated. The third
argument is the shift amount. The arguments may be any integer type or a
vector with integer element type. All arguments and the return value must
have the same type.

Example:
""""""""

.. code-block:: text

      %r = call i8 @llvm.fshr.i8(i8 %x, i8 %y, i8 %z)  ; %r = i8: lsb_extract((concat(x, y) >> (z % 8)), 8)
      %r = call i8 @llvm.fshr.i8(i8 255, i8 0, i8 15)  ; %r = i8: 254 (0b11111110)
      %r = call i8 @llvm.fshr.i8(i8 15, i8 15, i8 11)  ; %r = i8: 225 (0b11100001)
      %r = call i8 @llvm.fshr.i8(i8 0, i8 255, i8 8)   ; %r = i8: 255 (0b11111111)

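The rotate-right case mentioned in the overview can be sketched the same way.
The helper name ``@rotr32`` is hypothetical:

.. code-block:: llvm

      declare i32 @llvm.fshr.i32(i32, i32, i32)

      ; Rotate %x right by %n bits; %n is taken modulo 32.
      define i32 @rotr32(i32 %x, i32 %n) {
        %r = call i32 @llvm.fshr.i32(i32 %x, i32 %x, i32 %n)
        ret i32 %r
      }

For instance, rotating ``0x00000003`` right by 1 moves bit 0 around to the top
bit and yields ``0x80000001``.
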
Arithmetic with Overflow Intrinsics
-----------------------------------

LLVM provides intrinsics for fast arithmetic overflow checking.

Each of these intrinsics returns a two-element struct. The first
element of this struct contains the result of the corresponding
arithmetic operation modulo 2\ :sup:`n`\ , where n is the bit width of
the result. Therefore, for example, the first element of the struct
returned by ``llvm.sadd.with.overflow.i32`` is always the same as the
result of a 32-bit ``add`` instruction with the same operands, where
the ``add`` is *not* modified by an ``nsw`` or ``nuw`` flag.

The second element of the result is an ``i1`` that is 1 if the
arithmetic operation overflowed and 0 otherwise. An operation
overflows if, for any values of its operands ``A`` and ``B`` and for
any ``N`` larger than the operands' width, ``ext(A op B) to iN`` is
not equal to ``(ext(A) to iN) op (ext(B) to iN)`` where ``ext`` is
``sext`` for signed overflow and ``zext`` for unsigned overflow, and
``op`` is the underlying arithmetic operation.

The behavior of these intrinsics is well-defined for all argument
values.

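To illustrate the definition, consider ``i8`` operands: ``127 + 1`` wraps to
``-128``, and ``sext(-128) to i16`` is ``-128`` while
``(sext(127) to i16) + (sext(1) to i16)`` is ``128``, so the signed addition
overflows. The sketch below lists a few such constant cases; the function name
``@overflow_examples`` is hypothetical, and the ``i8`` overloads are used only
to keep the numbers small.

.. code-block:: llvm

      declare {i8, i1} @llvm.sadd.with.overflow.i8(i8, i8)
      declare {i8, i1} @llvm.uadd.with.overflow.i8(i8, i8)
      declare {i8, i1} @llvm.usub.with.overflow.i8(i8, i8)
      declare {i8, i1} @llvm.smul.with.overflow.i8(i8, i8)
      declare {i8, i1} @llvm.umul.with.overflow.i8(i8, i8)

      define void @overflow_examples() {
        %a = call {i8, i1} @llvm.sadd.with.overflow.i8(i8 100, i8 27)  ; {i8 127, i1 false}
        %b = call {i8, i1} @llvm.sadd.with.overflow.i8(i8 127, i8 1)   ; {i8 -128, i1 true}
        %c = call {i8, i1} @llvm.uadd.with.overflow.i8(i8 255, i8 1)   ; {i8 0, i1 true}
        %d = call {i8, i1} @llvm.usub.with.overflow.i8(i8 0, i8 1)     ; {i8 -1 (255 unsigned), i1 true}
        %e = call {i8, i1} @llvm.smul.with.overflow.i8(i8 16, i8 8)    ; {i8 -128, i1 true}
        %f = call {i8, i1} @llvm.umul.with.overflow.i8(i8 16, i8 16)   ; {i8 0, i1 true}
        ret void
      }
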
'``llvm.sadd.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.sadd.with.overflow``
on any integer bit width.

::

      declare {i16, i1} @llvm.sadd.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.sadd.with.overflow.i64(i64 %a, i64 %b)

Overview:
"""""""""

The '``llvm.sadd.with.overflow``' family of intrinsic functions performs
a signed addition of the two arguments and indicates whether an overflow
occurred during the signed summation.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) and the first element of the result
structure may be of integer types of any bit width, but they must have the
same bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
addition.

Semantics:
""""""""""

The '``llvm.sadd.with.overflow``' family of intrinsic functions performs
a signed addition of the two arguments. Each intrinsic returns a structure
whose first element is the signed sum and whose second element is a bit
specifying whether the signed summation resulted in an overflow.

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
      %sum = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal

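One possible use of the overflow bit, sketched here as an illustration rather
than a recommended lowering, is a saturating signed add. The function name
``@saturating_sadd`` is hypothetical. Signed addition can only overflow when
both operands have the same sign, so the sign of ``%a`` selects the bound.

.. code-block:: llvm

      declare {i32, i1} @llvm.sadd.with.overflow.i32(i32, i32)

      define i32 @saturating_sadd(i32 %a, i32 %b) {
        %res = call {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
        %sum = extractvalue {i32, i1} %res, 0
        %obit = extractvalue {i32, i1} %res, 1
        ; On overflow, clamp towards the sign of the operands.
        %neg = icmp slt i32 %a, 0
        %sat = select i1 %neg, i32 -2147483648, i32 2147483647
        %r = select i1 %obit, i32 %sat, i32 %sum
        ret i32 %r
      }
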
'``llvm.uadd.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.uadd.with.overflow``
on any integer bit width.

::

      declare {i16, i1} @llvm.uadd.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.uadd.with.overflow.i64(i64 %a, i64 %b)

Overview:
"""""""""

The '``llvm.uadd.with.overflow``' family of intrinsic functions performs
an unsigned addition of the two arguments and indicates whether a carry
occurred during the unsigned summation.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) and the first element of the result
structure may be of integer types of any bit width, but they must have the
same bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
addition.

Semantics:
""""""""""

The '``llvm.uadd.with.overflow``' family of intrinsic functions performs
an unsigned addition of the two arguments. Each intrinsic returns a structure
whose first element is the sum and whose second element is a bit specifying
whether the unsigned summation resulted in a carry.

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b)
      %sum = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %carry, label %normal

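Because the second element is a carry, it can be propagated into a wider
addition. The following sketch adds two 128-bit values represented as pairs of
64-bit halves; the function name ``@add128`` and the low/high argument layout
are assumptions made for the illustration.

.. code-block:: llvm

      declare {i64, i1} @llvm.uadd.with.overflow.i64(i64, i64)

      define {i64, i64} @add128(i64 %alo, i64 %ahi, i64 %blo, i64 %bhi) {
        ; Add the low halves and capture the carry out.
        %lo = call {i64, i1} @llvm.uadd.with.overflow.i64(i64 %alo, i64 %blo)
        %lo.sum = extractvalue {i64, i1} %lo, 0
        %carry = extractvalue {i64, i1} %lo, 1
        ; Add the high halves plus the carry (wrapping on overflow).
        %carry.ext = zext i1 %carry to i64
        %hi.tmp = add i64 %ahi, %bhi
        %hi.sum = add i64 %hi.tmp, %carry.ext
        %r0 = insertvalue {i64, i64} undef, i64 %lo.sum, 0
        %r1 = insertvalue {i64, i64} %r0, i64 %hi.sum, 1
        ret {i64, i64} %r1
      }
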
'``llvm.ssub.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.ssub.with.overflow``
on any integer bit width.

::

      declare {i16, i1} @llvm.ssub.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.ssub.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.ssub.with.overflow.i64(i64 %a, i64 %b)

Overview:
"""""""""

The '``llvm.ssub.with.overflow``' family of intrinsic functions performs
a signed subtraction of the two arguments and indicates whether an
overflow occurred during the signed subtraction.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) and the first element of the result
structure may be of integer types of any bit width, but they must have the
same bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
subtraction.

Semantics:
""""""""""

The '``llvm.ssub.with.overflow``' family of intrinsic functions performs
a signed subtraction of the two arguments. Each intrinsic returns a structure
whose first element is the difference and whose second element is a bit
specifying whether the signed subtraction resulted in an overflow.

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.ssub.with.overflow.i32(i32 %a, i32 %b)
      %diff = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal

'``llvm.usub.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.usub.with.overflow``
on any integer bit width.

::

      declare {i16, i1} @llvm.usub.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.usub.with.overflow.i64(i64 %a, i64 %b)

Overview:
"""""""""

The '``llvm.usub.with.overflow``' family of intrinsic functions performs
an unsigned subtraction of the two arguments and indicates whether an
overflow occurred during the unsigned subtraction.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) and the first element of the result
structure may be of integer types of any bit width, but they must have the
same bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
subtraction.

Semantics:
""""""""""

The '``llvm.usub.with.overflow``' family of intrinsic functions performs
an unsigned subtraction of the two arguments. Each intrinsic returns a
structure whose first element is the difference and whose second element is a
bit specifying whether the unsigned subtraction resulted in an overflow.

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b)
      %diff = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal

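Instead of branching, the overflow bit can also feed a ``select``. The sketch
below clamps an unsigned difference at zero; the function name
``@usub_clamped`` is hypothetical.

.. code-block:: llvm

      declare {i32, i1} @llvm.usub.with.overflow.i32(i32, i32)

      define i32 @usub_clamped(i32 %a, i32 %b) {
        %res = call {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b)
        %diff = extractvalue {i32, i1} %res, 0
        %obit = extractvalue {i32, i1} %res, 1
        ; The overflow bit is set exactly when %b is larger than %a.
        %r = select i1 %obit, i32 0, i32 %diff
        ret i32 %r
      }
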
'``llvm.smul.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.smul.with.overflow``
on any integer bit width.

::

      declare {i16, i1} @llvm.smul.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.smul.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.smul.with.overflow.i64(i64 %a, i64 %b)

Overview:
"""""""""

The '``llvm.smul.with.overflow``' family of intrinsic functions performs
a signed multiplication of the two arguments and indicates whether an
overflow occurred during the signed multiplication.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) and the first element of the result
structure may be of integer types of any bit width, but they must have the
same bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
multiplication.

Semantics:
""""""""""

The '``llvm.smul.with.overflow``' family of intrinsic functions performs
a signed multiplication of the two arguments. Each intrinsic returns a
structure whose first element is the product and whose second element is a
bit specifying whether the signed multiplication resulted in an overflow.

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.smul.with.overflow.i32(i32 %a, i32 %b)
      %prod = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal

'``llvm.umul.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.umul.with.overflow``
on any integer bit width.

::

      declare {i16, i1} @llvm.umul.with.overflow.i16(i16 %a, i16 %b)
      declare {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
      declare {i64, i1} @llvm.umul.with.overflow.i64(i64 %a, i64 %b)

Overview:
"""""""""

The '``llvm.umul.with.overflow``' family of intrinsic functions performs
an unsigned multiplication of the two arguments and indicates whether an
overflow occurred during the unsigned multiplication.

Arguments:
""""""""""

The arguments (``%a`` and ``%b``) and the first element of the result
structure may be of integer types of any bit width, but they must have the
same bit width. The second element of the result structure must be of type
``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
multiplication.

Semantics:
""""""""""

The '``llvm.umul.with.overflow``' family of intrinsic functions performs
an unsigned multiplication of the two arguments. Each intrinsic returns a
structure whose first element is the product and whose second element is a
bit specifying whether the unsigned multiplication resulted in an overflow.

Examples:
"""""""""

.. code-block:: llvm

      %res = call {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
      %prod = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal

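A complete function built around the same pattern might look like the sketch
below, which computes an allocation size and reports overflow with an
all-ones result. The function name ``@checked_size`` and the ``-1`` error
convention are assumptions for the illustration.

.. code-block:: llvm

      declare {i64, i1} @llvm.umul.with.overflow.i64(i64, i64)

      define i64 @checked_size(i64 %count, i64 %elt_size) {
      entry:
        %res = call {i64, i1} @llvm.umul.with.overflow.i64(i64 %count, i64 %elt_size)
        %bytes = extractvalue {i64, i1} %res, 0
        %obit = extractvalue {i64, i1} %res, 1
        br i1 %obit, label %overflow, label %normal

      overflow:
        ; The product does not fit in 64 bits.
        ret i64 -1

      normal:
        ret i64 %bytes
      }
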
Specialised Arithmetic Intrinsics
---------------------------------

'``llvm.canonicalize.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.canonicalize.f32(float %a)
      declare double @llvm.canonicalize.f64(double %b)

Overview:
"""""""""

The '``llvm.canonicalize.*``' intrinsic returns the platform-specific canonical
encoding of a floating-point number. This canonicalization is useful for
implementing certain numeric primitives such as frexp. The canonical encoding is
defined by IEEE-754-2008 to be:

::

      2.1.8 canonical encoding: The preferred encoding of a floating-point
      representation in a format. Applied to declets, significands of finite
      numbers, infinities, and NaNs, especially in decimal formats.

This operation can also be considered equivalent to the IEEE-754-2008
conversion of a floating-point value to the same format. NaNs are handled
according to section 6.2.

Examples of non-canonical encodings:

- x87 pseudo denormals, pseudo NaNs, pseudo Infinity, Unnormals. These are
  converted to a canonical representation per hardware-specific protocol.
- Many normal decimal floating-point numbers have non-canonical alternative
  encodings.
- Some machines, like GPUs or ARMv7 NEON, do not support subnormal values.
  These are treated as non-canonical encodings of zero and will be flushed to
  a zero of the same sign by this operation.

Note that per IEEE-754-2008 6.2, systems that support signaling NaNs with
default exception handling must signal an invalid exception, and produce a
quiet NaN result.

This function should always be implementable as multiplication by 1.0, provided
that the compiler does not constant fold the operation. Likewise, division by
1.0 and ``llvm.minnum(x, x)`` are possible implementations. Addition with
-0.0 is also sufficient provided that the rounding mode is not -Infinity.

``@llvm.canonicalize`` must preserve the equality relation. That is:

- ``(@llvm.canonicalize(x) == x)`` is equivalent to ``(x == x)``
- ``(@llvm.canonicalize(x) == @llvm.canonicalize(y))`` is equivalent to
  ``(x == y)``

Additionally, the sign of zero must be conserved:
``@llvm.canonicalize(-0.0) = -0.0`` and ``@llvm.canonicalize(+0.0) = +0.0``

The payload bits of a NaN must be conserved, with two exceptions.
First, environments which use only a single canonical representation of NaN
must perform said canonicalization. Second, SNaNs must be quieted per the
usual methods.

The canonicalization operation may be optimized away if:

- The input is known to be canonical. For example, it was produced by a
  floating-point operation that is required by the standard to be canonical.
- The result is consumed only by (or fused with) other floating-point
  operations. That is, the bits of the floating-point value are not examined.

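Examples:
"""""""""

The sketch below canonicalizes a value before its bits are inspected. Because
the result is consumed by a ``bitcast`` rather than by another floating-point
operation, the second condition above does not apply to this call. The
function name ``@canonical_bits`` is hypothetical.

.. code-block:: llvm

      declare float @llvm.canonicalize.f32(float)

      define i32 @canonical_bits(float %x) {
        %c = call float @llvm.canonicalize.f32(float %x)
        ; The bit pattern is examined here, so the encoding matters.
        %bits = bitcast float %c to i32
        ret i32 %bits
      }
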
'``llvm.fmuladd.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.fmuladd.f32(float %a, float %b, float %c)
      declare double @llvm.fmuladd.f64(double %a, double %b, double %c)

Overview:
"""""""""

The '``llvm.fmuladd.*``' intrinsic functions represent multiply-add
expressions that can be fused if the code generator determines that (a) the
target instruction set has support for a fused operation, and (b) that the
fused operation is more efficient than the equivalent, separate pair of mul
and add instructions.

Arguments:
""""""""""

The '``llvm.fmuladd.*``' intrinsics each take three arguments: two
multiplicands, a and b, and an addend c.

Semantics:
""""""""""

The expression:

::

      %0 = call float @llvm.fmuladd.f32(float %a, float %b, float %c)

is equivalent to the expression ``a * b + c``, except that rounding will
not be performed between the multiplication and addition steps if the
code generator fuses the operations. Fusion is not guaranteed, even if
the target platform supports it. If a fused multiply-add is required, the
corresponding '``llvm.fma.*``' intrinsic function should be used
instead. Like '``llvm.fma.*``', this intrinsic never sets ``errno``.

Examples:
"""""""""

.. code-block:: llvm

      %r2 = call float @llvm.fmuladd.f32(float %a, float %b, float %c) ; yields float:r2 = (a * b) + c

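For comparison, the sketch below places the separate multiply and add next to
the intrinsic call. The function names ``@mul_then_add`` and ``@mul_add`` are
hypothetical; which lowering the second function receives is up to the code
generator, as described above.

.. code-block:: llvm

      declare float @llvm.fmuladd.f32(float, float, float)

      ; Separate operations: the product is rounded before the addition.
      define float @mul_then_add(float %a, float %b, float %c) {
        %mul = fmul float %a, %b
        %sum = fadd float %mul, %c
        ret float %sum
      }

      ; May be lowered either as the pair above or as a single fused operation.
      define float @mul_add(float %a, float %b, float %c) {
        %r = call float @llvm.fmuladd.f32(float %a, float %b, float %c)
        ret float %r
      }
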
Experimental Vector Reduction Intrinsics
----------------------------------------

Horizontal reductions of vectors can be expressed using the following
intrinsics. Each one takes a vector operand as an input and applies its
respective operation across all elements of the vector, returning a single
scalar result of the same element type.

'``llvm.experimental.vector.reduce.add.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.experimental.vector.reduce.add.i32.v4i32(<4 x i32> %a)
      declare i64 @llvm.experimental.vector.reduce.add.i64.v2i64(<2 x i64> %a)

Overview:
"""""""""

The '``llvm.experimental.vector.reduce.add.*``' intrinsics do an integer ``ADD``
reduction of a vector, returning the result as a scalar. The return type matches
the element-type of the vector input.

Arguments:
""""""""""

The argument to this intrinsic must be a vector of integer values.

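Examples:
"""""""""

A minimal sketch with a constant operand, so the result can be checked by
hand. The function name ``@sum4`` is hypothetical.

.. code-block:: llvm

      declare i32 @llvm.experimental.vector.reduce.add.i32.v4i32(<4 x i32>)

      define i32 @sum4() {
        ; 1 + 2 + 3 + 4 = 10
        %r = call i32 @llvm.experimental.vector.reduce.add.i32.v4i32(<4 x i32> <i32 1, i32 2, i32 3, i32 4>)
        ret i32 %r
      }
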
'``llvm.experimental.vector.reduce.fadd.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.experimental.vector.reduce.fadd.f32.v4f32(float %acc, <4 x float> %a)
      declare double @llvm.experimental.vector.reduce.fadd.f64.v2f64(double %acc, <2 x double> %a)

Overview:
"""""""""

The '``llvm.experimental.vector.reduce.fadd.*``' intrinsics do a floating-point
``ADD`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.

If the intrinsic call has fast-math flags, then the reduction will not preserve
the associativity of an equivalent scalarized counterpart. If it does not have
fast-math flags, then the reduction will be *ordered*, implying that the
operation respects the associativity of a scalarized reduction.

Arguments:
""""""""""

The first argument to this intrinsic is a scalar accumulator value, which is
only used when there are no fast-math flags attached. This argument may be undef
when fast-math flags are used.

The second argument must be a vector of floating-point values.

Examples:
"""""""""

.. code-block:: llvm

      %fast = call fast float @llvm.experimental.vector.reduce.fadd.f32.v4f32(float undef, <4 x float> %input) ; fast reduction
      %ord = call float @llvm.experimental.vector.reduce.fadd.f32.v4f32(float %acc, <4 x float> %input) ; ordered reduction

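The scalarized counterpart of the ordered form might be sketched as follows.
This assumes that the reduction starts from ``%acc`` and folds in the elements
in lane order; the text above only states that the scalarized associativity is
respected, so treat the exact order as an assumption. The function name
``@ordered_fadd_v4f32`` is hypothetical.

.. code-block:: llvm

      define float @ordered_fadd_v4f32(float %acc, <4 x float> %input) {
        %e0 = extractelement <4 x float> %input, i32 0
        %e1 = extractelement <4 x float> %input, i32 1
        %e2 = extractelement <4 x float> %input, i32 2
        %e3 = extractelement <4 x float> %input, i32 3
        ; (((acc + e0) + e1) + e2) + e3, with rounding after every step.
        %s0 = fadd float %acc, %e0
        %s1 = fadd float %s0, %e1
        %s2 = fadd float %s1, %e2
        %s3 = fadd float %s2, %e3
        ret float %s3
      }
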
'``llvm.experimental.vector.reduce.mul.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.experimental.vector.reduce.mul.i32.v4i32(<4 x i32> %a)
      declare i64 @llvm.experimental.vector.reduce.mul.i64.v2i64(<2 x i64> %a)

Overview:
"""""""""

The '``llvm.experimental.vector.reduce.mul.*``' intrinsics do an integer ``MUL``
reduction of a vector, returning the result as a scalar. The return type matches
the element-type of the vector input.

Arguments:
""""""""""

The argument to this intrinsic must be a vector of integer values.

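Examples:
"""""""""

A minimal sketch with a constant operand. The function name ``@product4`` is
hypothetical.

.. code-block:: llvm

      declare i32 @llvm.experimental.vector.reduce.mul.i32.v4i32(<4 x i32>)

      define i32 @product4() {
        ; 1 * 2 * 3 * 4 = 24
        %r = call i32 @llvm.experimental.vector.reduce.mul.i32.v4i32(<4 x i32> <i32 1, i32 2, i32 3, i32 4>)
        ret i32 %r
      }
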
'``llvm.experimental.vector.reduce.fmul.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.experimental.vector.reduce.fmul.f32.v4f32(float %acc, <4 x float> %a)
      declare double @llvm.experimental.vector.reduce.fmul.f64.v2f64(double %acc, <2 x double> %a)

Overview:
"""""""""

The '``llvm.experimental.vector.reduce.fmul.*``' intrinsics do a floating-point
``MUL`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.

If the intrinsic call has fast-math flags, then the reduction will not preserve
the associativity of an equivalent scalarized counterpart. If it does not have
fast-math flags, then the reduction will be *ordered*, implying that the
operation respects the associativity of a scalarized reduction.

Arguments:
""""""""""

The first argument to this intrinsic is a scalar accumulator value, which is
only used when there are no fast-math flags attached. This argument may be undef
when fast-math flags are used.

The second argument must be a vector of floating-point values.

Examples:
"""""""""

.. code-block:: llvm

      %fast = call fast float @llvm.experimental.vector.reduce.fmul.f32.v4f32(float undef, <4 x float> %input) ; fast reduction
      %ord = call float @llvm.experimental.vector.reduce.fmul.f32.v4f32(float %acc, <4 x float> %input) ; ordered reduction

'``llvm.experimental.vector.reduce.and.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.experimental.vector.reduce.and.i32.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.experimental.vector.reduce.and.*``' intrinsics do a bitwise ``AND``
reduction of a vector, returning the result as a scalar. The return type matches
the element-type of the vector input.

Arguments:
""""""""""

The argument to this intrinsic must be a vector of integer values.

'``llvm.experimental.vector.reduce.or.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.experimental.vector.reduce.or.i32.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.experimental.vector.reduce.or.*``' intrinsics do a bitwise ``OR`` reduction
of a vector, returning the result as a scalar. The return type matches the
element-type of the vector input.

Arguments:
""""""""""

The argument to this intrinsic must be a vector of integer values.

'``llvm.experimental.vector.reduce.xor.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.experimental.vector.reduce.xor.i32.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.experimental.vector.reduce.xor.*``' intrinsics do a bitwise ``XOR``
reduction of a vector, returning the result as a scalar. The return type matches
the element-type of the vector input.

Arguments:
""""""""""

The argument to this intrinsic must be a vector of integer values.

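Examples:
"""""""""

The three bitwise reductions (``and``, ``or`` and ``xor``) can be compared on
constant operands. This is a sketch only, and the function name
``@bitwise_reductions`` is hypothetical.

.. code-block:: llvm

      declare i32 @llvm.experimental.vector.reduce.and.i32.v4i32(<4 x i32>)
      declare i32 @llvm.experimental.vector.reduce.or.i32.v4i32(<4 x i32>)
      declare i32 @llvm.experimental.vector.reduce.xor.i32.v4i32(<4 x i32>)

      define void @bitwise_reductions() {
        ; 0b1100 & 0b1010 & 0b1110 & 0b1111 = 0b1000
        %and = call i32 @llvm.experimental.vector.reduce.and.i32.v4i32(<4 x i32> <i32 12, i32 10, i32 14, i32 15>) ; %and = 8
        ; 0b0001 | 0b0010 | 0b0100 | 0b1000 = 0b1111
        %or = call i32 @llvm.experimental.vector.reduce.or.i32.v4i32(<4 x i32> <i32 1, i32 2, i32 4, i32 8>)       ; %or = 15
        ; 0b001 ^ 0b011 ^ 0b101 ^ 0b111 = 0b000
        %xor = call i32 @llvm.experimental.vector.reduce.xor.i32.v4i32(<4 x i32> <i32 1, i32 3, i32 5, i32 7>)     ; %xor = 0
        ret void
      }
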
'``llvm.experimental.vector.reduce.smax.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.experimental.vector.reduce.smax.i32.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.experimental.vector.reduce.smax.*``' intrinsics do a signed integer
``MAX`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.

Arguments:
""""""""""

The argument to this intrinsic must be a vector of integer values.

'``llvm.experimental.vector.reduce.smin.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.experimental.vector.reduce.smin.i32.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.experimental.vector.reduce.smin.*``' intrinsics do a signed integer
``MIN`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.

Arguments:
""""""""""

The argument to this intrinsic must be a vector of integer values.

'``llvm.experimental.vector.reduce.umax.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.experimental.vector.reduce.umax.i32.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.experimental.vector.reduce.umax.*``' intrinsics do an unsigned
integer ``MAX`` reduction of a vector, returning the result as a scalar. The
return type matches the element-type of the vector input.

Arguments:
""""""""""

The argument to this intrinsic must be a vector of integer values.

'``llvm.experimental.vector.reduce.umin.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.experimental.vector.reduce.umin.i32.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.experimental.vector.reduce.umin.*``' intrinsics do an unsigned
integer ``MIN`` reduction of a vector, returning the result as a scalar. The
return type matches the element-type of the vector input.

Arguments:
""""""""""

The argument to this intrinsic must be a vector of integer values.

12701'``llvm.experimental.vector.reduce.fmax.*``' Intrinsic
12702^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12703
12704Syntax:
12705"""""""
12706
12707::
12708
12709 declare float @llvm.experimental.vector.reduce.fmax.f32.v4f32(<4 x float> %a)
12710 declare double @llvm.experimental.vector.reduce.fmax.f64.v2f64(<2 x double> %a)
12711
12712Overview:
12713"""""""""
12714
Sanjay Patel85fa9ef2018-03-21 14:15:33 +000012715The '``llvm.experimental.vector.reduce.fmax.*``' intrinsics do a floating-point
Amara Emersoncf9daa32017-05-09 10:43:25 +000012716``MAX`` reduction of a vector, returning the result as a scalar. The return type
12717matches the element-type of the vector input.
12718
12719If the intrinsic call has the ``nnan`` fast-math flag then the operation can
12720assume that NaNs are not present in the input vector.
12721
12722Arguments:
12723""""""""""
Sanjay Patel85fa9ef2018-03-21 14:15:33 +000012724The argument to this intrinsic must be a vector of floating-point values.
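
The following C sketch models only the NaN-free case that the ``nnan`` flag
describes; this section does not say what the result is when NaNs are present,
so the model rejects them. The function name ``reduce_fmax_nnan_f32`` is
hypothetical.

.. code-block:: c

      #include <assert.h>
      #include <stddef.h>

      /* Hypothetical model of a floating-point MAX reduction on NaN-free input. */
      float reduce_fmax_nnan_f32(const float *v, size_t n) {
        assert(n > 0);
        assert(v[0] == v[0]);       /* a NaN compares unequal to itself */
        float acc = v[0];
        for (size_t i = 1; i < n; ++i) {
          assert(v[i] == v[i]);     /* nnan: the caller promises no NaNs */
          if (v[i] > acc)
            acc = v[i];
        }
        return acc;
      }
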

'``llvm.experimental.vector.reduce.fmin.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.experimental.vector.reduce.fmin.f32.v4f32(<4 x float> %a)
      declare double @llvm.experimental.vector.reduce.fmin.f64.v2f64(<2 x double> %a)

Overview:
"""""""""

The '``llvm.experimental.vector.reduce.fmin.*``' intrinsics do a floating-point
``MIN`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.

If the intrinsic call has the ``nnan`` fast-math flag then the operation can
assume that NaNs are not present in the input vector.

Arguments:
""""""""""

The argument to this intrinsic must be a vector of floating-point values.

Half Precision Floating-Point Intrinsics
----------------------------------------

For most target platforms, half precision floating-point is a
storage-only format. This means that it is a dense encoding (in memory)
but does not support computation in the format.

As a result, code must first load the half-precision floating-point
value as an i16, then convert it to float with
:ref:`llvm.convert.from.fp16 <int_convert_from_fp16>`. Computation can
then be performed on the float value (including extending to double
etc.). To store the value back to memory, it is first converted to float
if needed, then converted to i16 with
:ref:`llvm.convert.to.fp16 <int_convert_to_fp16>`, and then stored as an
i16 value.

.. _int_convert_to_fp16:

'``llvm.convert.to.fp16``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i16 @llvm.convert.to.fp16.f32(float %a)
      declare i16 @llvm.convert.to.fp16.f64(double %a)

Overview:
"""""""""

The '``llvm.convert.to.fp16``' intrinsic function performs a conversion from a
conventional floating-point type to half precision floating-point format.

Arguments:
""""""""""

The intrinsic function takes a single argument: the value to be
converted.

Semantics:
""""""""""

The '``llvm.convert.to.fp16``' intrinsic function performs a conversion from a
conventional floating-point format to half precision floating-point format. The
return value is an ``i16`` which contains the converted number.

Examples:
"""""""""

.. code-block:: llvm

      %res = call i16 @llvm.convert.to.fp16.f32(float %a)
      store i16 %res, i16* @x, align 2

.. _int_convert_from_fp16:

'``llvm.convert.from.fp16``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.convert.from.fp16.f32(i16 %a)
      declare double @llvm.convert.from.fp16.f64(i16 %a)

Overview:
"""""""""

The '``llvm.convert.from.fp16``' intrinsic function performs a
conversion from half precision floating-point format to a conventional
floating-point format (single or double precision, matching the
overload used).

Arguments:
""""""""""

The intrinsic function takes a single argument: the value to be
converted.

Semantics:
""""""""""

The '``llvm.convert.from.fp16``' intrinsic function performs a
conversion from half precision floating-point format to a conventional
floating-point format. The input half-float value is
represented by an ``i16`` value.
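
To make the bit-level conversion concrete, the sketch below widens half
precision bits to ``float`` in portable C. It assumes the IEEE 754 binary16
layout (1 sign bit, 5 exponent bits, 10 fraction bits) and a 32-bit IEEE
``float``; the function name ``half_bits_to_float`` is hypothetical and this is
not how LLVM implements the intrinsic.

.. code-block:: c

      #include <assert.h>
      #include <stdint.h>
      #include <string.h>

      /* Hypothetical model: widen IEEE binary16 bits (held in an i16) to float. */
      float half_bits_to_float(uint16_t h) {
        assert(sizeof(float) == sizeof(uint32_t));
        uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
        uint32_t exp = (h >> 10) & 0x1Fu;
        uint32_t man = h & 0x3FFu;
        uint32_t bits;
        if (exp == 0x1Fu) {
          bits = sign | 0x7F800000u | (man << 13);    /* infinity or NaN */
        } else if (exp != 0) {
          bits = sign | ((exp + 112u) << 23) | (man << 13); /* rebias 15 -> 127 */
        } else if (man == 0) {
          bits = sign;                                /* signed zero */
        } else {
          int shift = 0;                              /* subnormal: normalize */
          while ((man & 0x400u) == 0) {
            man <<= 1;
            ++shift;
          }
          man &= 0x3FFu;                              /* drop the implicit bit */
          bits = sign | ((uint32_t)(113 - shift) << 23) | (man << 13);
        }
        float f;
        memcpy(&f, &bits, sizeof f);                  /* reinterpret the bits */
        return f;
      }

Every half precision value is exactly representable as a ``float``, so this
direction needs no rounding.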

Examples:
"""""""""

.. code-block:: llvm

      %a = load i16, i16* @x, align 2
      %res = call float @llvm.convert.from.fp16.f32(i16 %a)

.. _dbg_intrinsics:

Debugger Intrinsics
-------------------

The LLVM debugger intrinsics (which all start with the ``llvm.dbg.``
prefix) are described in the `LLVM Source Level
Debugging <SourceLevelDebugging.html#format-common-intrinsics>`_
document.

Exception Handling Intrinsics
-----------------------------

The LLVM exception handling intrinsics (which all start with the
``llvm.eh.`` prefix) are described in the `LLVM Exception
Handling <ExceptionHandling.html#format-common-intrinsics>`_ document.

.. _int_trampoline:

Trampoline Intrinsics
---------------------

These intrinsics make it possible to excise one parameter, marked with
the :ref:`nest <nest>` attribute, from a function. The result is a
callable function pointer lacking the nest parameter - the caller does
not need to provide a value for it. Instead, the value to use is stored
in advance in a "trampoline", a block of memory usually allocated on the
stack, which also contains code to splice the nest value into the
argument list. This is used to implement the GCC nested function address
extension.

For example, if the function is ``i32 f(i8* nest %c, i32 %x, i32 %y)``
then the resulting function pointer has signature ``i32 (i32, i32)*``.
It can be created as follows:

.. code-block:: llvm

      %tramp = alloca [10 x i8], align 4 ; size and alignment only correct for X86
      %tramp1 = getelementptr [10 x i8], [10 x i8]* %tramp, i32 0, i32 0
      call void @llvm.init.trampoline(i8* %tramp1, i8* bitcast (i32 (i8*, i32, i32)* @f to i8*), i8* %nval)
      %p = call i8* @llvm.adjust.trampoline(i8* %tramp1)
      %fp = bitcast i8* %p to i32 (i32, i32)*

The call ``%val = call i32 %fp(i32 %x, i32 %y)`` is then equivalent to
``%val = call i32 @f(i8* %nval, i32 %x, i32 %y)``.

.. _int_it:

'``llvm.init.trampoline``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.init.trampoline(i8* <tramp>, i8* <func>, i8* <nval>)

Overview:
"""""""""

This fills the memory pointed to by ``tramp`` with executable code,
turning it into a trampoline.

Arguments:
""""""""""

The ``llvm.init.trampoline`` intrinsic takes three arguments, all
pointers. The ``tramp`` argument must point to a sufficiently large and
sufficiently aligned block of memory; this memory is written to by the
intrinsic. Note that the size and the alignment are target-specific -
LLVM currently provides no portable way of determining them, so a
front-end that generates this intrinsic needs to have some
target-specific knowledge. The ``func`` argument must hold a function
bitcast to an ``i8*``.

Semantics:
""""""""""

The block of memory pointed to by ``tramp`` is filled with
target-dependent code, turning it into a function. Then ``tramp`` needs to be
passed to :ref:`llvm.adjust.trampoline <int_at>` to get a pointer which can
be :ref:`bitcast (to a new function) and called <int_trampoline>`. The new
function's signature is the same as that of ``func`` with any arguments
marked with the ``nest`` attribute removed. At most one such ``nest``
argument is allowed, and it must be of pointer type. Calling the new
function is equivalent to calling ``func`` with the same argument list,
but with ``nval`` used for the missing ``nest`` argument. If, after
calling ``llvm.init.trampoline``, the memory pointed to by ``tramp`` is
modified, then the effect of any later call to the returned function
pointer is undefined.

.. _int_at:

'``llvm.adjust.trampoline``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.adjust.trampoline(i8* <tramp>)

Overview:
"""""""""

This performs any required machine-specific adjustment to the address of
a trampoline (passed as ``tramp``).

Arguments:
""""""""""

``tramp`` must point to a block of memory which already has trampoline
code filled in by a previous call to
:ref:`llvm.init.trampoline <int_it>`.

Semantics:
""""""""""

On some architectures the address of the code to be executed needs to be
different than the address where the trampoline is actually stored. This
intrinsic returns the executable address corresponding to ``tramp``
after performing the required machine-specific adjustments. The pointer
returned can then be :ref:`bitcast and executed <int_trampoline>`.

.. _int_mload_mstore:

Masked Vector Load and Store Intrinsics
---------------------------------------

LLVM provides intrinsics for predicated vector load and store operations. The predicate is specified by a mask operand, which holds one bit per vector element, switching the associated vector lane on or off. The memory addresses corresponding to the "off" lanes are not accessed. When all bits of the mask are on, the intrinsic is identical to a regular vector load or store. When all bits are off, no memory is accessed.

.. _int_mload:

'``llvm.masked.load.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. The loaded data is a vector of any integer, floating-point or pointer data type.

::

      declare <16 x float> @llvm.masked.load.v16f32.p0v16f32 (<16 x float>* <ptr>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>)
      declare <2 x double> @llvm.masked.load.v2f64.p0v2f64 (<2 x double>* <ptr>, i32 <alignment>, <2 x i1> <mask>, <2 x double> <passthru>)
      ;; The data is a vector of pointers to double
      declare <8 x double*> @llvm.masked.load.v8p0f64.p0v8p0f64 (<8 x double*>* <ptr>, i32 <alignment>, <8 x i1> <mask>, <8 x double*> <passthru>)
      ;; The data is a vector of function pointers
      declare <8 x i32 ()*> @llvm.masked.load.v8p0f_i32f.p0v8p0f_i32f (<8 x i32 ()*>* <ptr>, i32 <alignment>, <8 x i1> <mask>, <8 x i32 ()*> <passthru>)

Overview:
"""""""""

Reads a vector from memory according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes. The masked-off lanes in the result vector are taken from the corresponding lanes of the '``passthru``' operand.


Arguments:
""""""""""

The first operand is the base pointer for the load. The second operand is the alignment of the source location. It must be a constant integer value. The third operand, mask, is a vector of boolean values with the same number of elements as the return type. The fourth is a pass-through value that is used to fill the masked-off lanes of the result. The return type, underlying type of the base pointer and the type of the '``passthru``' operand are the same vector types.


Semantics:
""""""""""

The '``llvm.masked.load``' intrinsic is designed for conditional reading of selected vector elements in a single IR operation. It is useful for targets that support vector masked loads and allows vectorizing predicated basic blocks on these targets. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar load operations.
The result of this operation is equivalent to a regular vector load instruction followed by a 'select' between the loaded and the passthru values, predicated on the same mask. However, using this intrinsic prevents exceptions on memory access to masked-off lanes.
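
The per-lane behaviour can be sketched as a scalar loop in C. This is only an
illustrative model with hypothetical names (``masked_load_f32`` and its
parameters); it ignores the alignment operand.

.. code-block:: c

      #include <assert.h>
      #include <stdbool.h>
      #include <stddef.h>

      /* Hypothetical scalar model of llvm.masked.load for float lanes. */
      void masked_load_f32(const float *ptr, const bool *mask,
                           const float *passthru, float *result, size_t lanes) {
        assert(mask && passthru && result);
        for (size_t i = 0; i < lanes; ++i) {
          /* Memory behind a masked-off lane is never read. */
          result[i] = mask[i] ? ptr[i] : passthru[i];
        }
      }

Because ``ptr[i]`` is evaluated only for enabled lanes, the model does not touch
``ptr`` at all when every mask bit is off.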


::

      %res = call <16 x float> @llvm.masked.load.v16f32.p0v16f32 (<16 x float>* %ptr, i32 4, <16 x i1> %mask, <16 x float> %passthru)

      ;; The result of the two following instructions is identical aside from potential memory access exceptions
      %loadlal = load <16 x float>, <16 x float>* %ptr, align 4
      %res = select <16 x i1> %mask, <16 x float> %loadlal, <16 x float> %passthru

.. _int_mstore:

'``llvm.masked.store.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. The data stored in memory is a vector of any integer, floating-point or pointer data type.

::

      declare void @llvm.masked.store.v8i32.p0v8i32 (<8 x i32> <value>, <8 x i32>* <ptr>, i32 <alignment>, <8 x i1> <mask>)
      declare void @llvm.masked.store.v16f32.p0v16f32 (<16 x float> <value>, <16 x float>* <ptr>, i32 <alignment>, <16 x i1> <mask>)
      ;; The data is a vector of pointers to double
      declare void @llvm.masked.store.v8p0f64.p0v8p0f64 (<8 x double*> <value>, <8 x double*>* <ptr>, i32 <alignment>, <8 x i1> <mask>)
      ;; The data is a vector of function pointers
      declare void @llvm.masked.store.v4p0f_i32f.p0v4p0f_i32f (<4 x i32 ()*> <value>, <4 x i32 ()*>* <ptr>, i32 <alignment>, <4 x i1> <mask>)

Overview:
"""""""""

Writes a vector to memory according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes.

Arguments:
""""""""""

The first operand is the vector value to be written to memory. The second operand is the base pointer for the store; it has the same underlying type as the value operand. The third operand is the alignment of the destination location. The fourth operand, mask, is a vector of boolean values. The types of the mask and the value operand must have the same number of vector elements.


Semantics:
""""""""""

The '``llvm.masked.store``' intrinsic is designed for conditional writing of selected vector elements in a single IR operation. It is useful for targets that support vector masked store and allows vectorizing predicated basic blocks on these targets. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar store operations.
The result of this operation is equivalent to a load-modify-store sequence. However, using this intrinsic prevents exceptions and data races on memory access to masked-off lanes.
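
A scalar C model makes the difference from a load-modify-store sequence
visible: masked-off lanes are neither read nor written. The name
``masked_store_f32`` and its parameters are hypothetical, and the alignment
operand is not modelled.

.. code-block:: c

      #include <assert.h>
      #include <stdbool.h>
      #include <stddef.h>

      /* Hypothetical scalar model of llvm.masked.store for float lanes. */
      void masked_store_f32(const float *value, float *ptr,
                            const bool *mask, size_t lanes) {
        assert(value && mask);
        for (size_t i = 0; i < lanes; ++i) {
          if (mask[i])              /* masked-off lanes leave memory untouched */
            ptr[i] = value[i];
        }
      }
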

::

      call void @llvm.masked.store.v16f32.p0v16f32(<16 x float> %value, <16 x float>* %ptr, i32 4, <16 x i1> %mask)

      ;; The result of the following instructions is identical aside from potential data races and memory access exceptions
      %oldval = load <16 x float>, <16 x float>* %ptr, align 4
      %res = select <16 x i1> %mask, <16 x float> %value, <16 x float> %oldval
      store <16 x float> %res, <16 x float>* %ptr, align 4


Masked Vector Gather and Scatter Intrinsics
-------------------------------------------

LLVM provides intrinsics for vector gather and scatter operations. They are similar to :ref:`Masked Vector Load and Store <int_mload_mstore>`, except they are designed for arbitrary memory accesses, rather than sequential memory accesses. Gather and scatter also employ a mask operand, which holds one bit per vector element, switching the associated vector lane on or off. The memory addresses corresponding to the "off" lanes are not accessed. When all bits are off, no memory is accessed.

.. _int_mgather:

'``llvm.masked.gather.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. The loaded data are multiple scalar values of any integer, floating-point or pointer data type gathered together into one vector.

::

      declare <16 x float> @llvm.masked.gather.v16f32.v16p0f32 (<16 x float*> <ptrs>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>)
      declare <2 x double> @llvm.masked.gather.v2f64.v2p1f64 (<2 x double addrspace(1)*> <ptrs>, i32 <alignment>, <2 x i1> <mask>, <2 x double> <passthru>)
      declare <8 x float*> @llvm.masked.gather.v8p0f32.v8p0p0f32 (<8 x float**> <ptrs>, i32 <alignment>, <8 x i1> <mask>, <8 x float*> <passthru>)

Overview:
"""""""""

Reads scalar values from arbitrary memory locations and gathers them into one vector. The memory locations are provided in the vector of pointers '``ptrs``'. The memory is accessed according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes. The masked-off lanes in the result vector are taken from the corresponding lanes of the '``passthru``' operand.


Arguments:
""""""""""

The first operand is a vector of pointers which holds all memory addresses to read. The second operand is the alignment of the source addresses. It must be a constant integer value. The third operand, mask, is a vector of boolean values with the same number of elements as the return type. The fourth is a pass-through value that is used to fill the masked-off lanes of the result. The return type, underlying type of the vector of pointers and the type of the '``passthru``' operand are the same vector types.


Semantics:
""""""""""

The '``llvm.masked.gather``' intrinsic is designed for conditional reading of multiple scalar values from arbitrary memory locations in a single IR operation. It is useful for targets that support vector masked gathers and allows vectorizing basic blocks with data and control divergence. Other targets may support this intrinsic differently, for example by lowering it into a sequence of scalar load operations.
The semantics of this operation are equivalent to a sequence of conditional scalar loads followed by gathering all loaded values into a single vector. The mask restricts memory access to certain lanes and facilitates vectorization of predicated basic blocks.
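
The "sequence of conditional scalar loads" can be written out as a small C
model. The function ``masked_gather_f64`` and its parameter names are
hypothetical, and alignment is not modelled.

.. code-block:: c

      #include <assert.h>
      #include <stdbool.h>
      #include <stddef.h>

      /* Hypothetical scalar model of llvm.masked.gather for double lanes. */
      void masked_gather_f64(double *const *ptrs, const bool *mask,
                             const double *passthru, double *result,
                             size_t lanes) {
        assert(ptrs && mask && passthru && result);
        for (size_t i = 0; i < lanes; ++i) {
          /* Each enabled lane loads through its own pointer;
             a disabled lane never dereferences ptrs[i]. */
          result[i] = mask[i] ? *ptrs[i] : passthru[i];
        }
      }

A masked-off lane may therefore carry an invalid pointer, such as a null
pointer, without the model faulting.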


::

      %res = call <4 x double> @llvm.masked.gather.v4f64.v4p0f64 (<4 x double*> %ptrs, i32 8, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, <4 x double> undef)

      ;; The gather with all-true mask is equivalent to the following instruction sequence
      %ptr0 = extractelement <4 x double*> %ptrs, i32 0
      %ptr1 = extractelement <4 x double*> %ptrs, i32 1
      %ptr2 = extractelement <4 x double*> %ptrs, i32 2
      %ptr3 = extractelement <4 x double*> %ptrs, i32 3

      %val0 = load double, double* %ptr0, align 8
      %val1 = load double, double* %ptr1, align 8
      %val2 = load double, double* %ptr2, align 8
      %val3 = load double, double* %ptr3, align 8

      %vec0    = insertelement <4 x double> undef, double %val0, i32 0
      %vec01   = insertelement <4 x double> %vec0, double %val1, i32 1
      %vec012  = insertelement <4 x double> %vec01, double %val2, i32 2
      %vec0123 = insertelement <4 x double> %vec012, double %val3, i32 3

.. _int_mscatter:

'``llvm.masked.scatter.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. The data stored in memory is a vector of any integer, floating-point or pointer data type. Each vector element is stored in an arbitrary memory address. Scatter with overlapping addresses is guaranteed to be ordered from least-significant to most-significant element.

::

      declare void @llvm.masked.scatter.v8i32.v8p0i32 (<8 x i32> <value>, <8 x i32*> <ptrs>, i32 <alignment>, <8 x i1> <mask>)
      declare void @llvm.masked.scatter.v16f32.v16p1f32 (<16 x float> <value>, <16 x float addrspace(1)*> <ptrs>, i32 <alignment>, <16 x i1> <mask>)
      declare void @llvm.masked.scatter.v4p0f64.v4p0p0f64 (<4 x double*> <value>, <4 x double**> <ptrs>, i32 <alignment>, <4 x i1> <mask>)

Overview:
"""""""""

Writes each element from the value vector to the corresponding memory address. The memory addresses are represented as a vector of pointers. Writing is done according to the provided mask. The mask holds a bit for each vector lane, and is used to prevent memory accesses to the masked-off lanes.

Arguments:
""""""""""

The first operand is a vector value to be written to memory. The second operand is a vector of pointers, pointing to where the value elements should be stored. It has the same underlying type as the value operand. The third operand is the alignment of the destination addresses. The fourth operand, mask, is a vector of boolean values. The types of the mask and the value operand must have the same number of vector elements.


Semantics:
""""""""""

The '``llvm.masked.scatter``' intrinsic is designed for writing selected vector elements to arbitrary memory addresses in a single IR operation. The operation may be conditional, when not all bits in the mask are switched on. It is useful for targets that support vector masked scatter and allows vectorizing basic blocks with data and control divergence. Other targets may support this intrinsic differently, for example by lowering it into a sequence of branches that guard scalar store operations.

::

      ;; This instruction unconditionally stores data vector in multiple addresses
      call void @llvm.masked.scatter.v8i32.v8p0i32 (<8 x i32> %value, <8 x i32*> %ptrs, i32 4, <8 x i1> <i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true>)

      ;; It is equivalent to a list of scalar stores
      %val0 = extractelement <8 x i32> %value, i32 0
      %val1 = extractelement <8 x i32> %value, i32 1
      ..
      %val7 = extractelement <8 x i32> %value, i32 7
      %ptr0 = extractelement <8 x i32*> %ptrs, i32 0
      %ptr1 = extractelement <8 x i32*> %ptrs, i32 1
      ..
      %ptr7 = extractelement <8 x i32*> %ptrs, i32 7
      ;; Note: the order of the following stores is important when they overlap:
      store i32 %val0, i32* %ptr0, align 4
      store i32 %val1, i32* %ptr1, align 4
      ..
      store i32 %val7, i32* %ptr7, align 4
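
The ordering guarantee for overlapping addresses can be demonstrated with a
scalar C model. The name ``masked_scatter_i32`` and its parameters are
hypothetical, and alignment is not modelled.

.. code-block:: c

      #include <assert.h>
      #include <stdbool.h>
      #include <stddef.h>
      #include <stdint.h>

      /* Hypothetical scalar model of llvm.masked.scatter for i32 lanes. */
      void masked_scatter_i32(const int32_t *value, int32_t *const *ptrs,
                              const bool *mask, size_t lanes) {
        assert(value && ptrs && mask);
        /* Lane 0 is stored first and the highest lane last, so when two
           enabled lanes share an address the higher lane's value survives. */
        for (size_t i = 0; i < lanes; ++i) {
          if (mask[i])
            *ptrs[i] = value[i];
        }
      }

If lanes 0 and 2 point at the same location and both are enabled, that location
ends up holding the element from lane 2.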


Masked Vector Expanding Load and Compressing Store Intrinsics
-------------------------------------------------------------

LLVM provides intrinsics for expanding load and compressing store operations. Data selected from a vector according to a mask is stored in consecutive memory addresses (compressed store), and vice-versa (expanding load). These operations effectively map to "if (cond.i) a[j++] = v.i" and "if (cond.i) v.i = a[j++]" patterns, respectively. Note that when the mask starts with '1' bits followed by '0' bits, these operations are identical to :ref:`llvm.masked.store <int_mstore>` and :ref:`llvm.masked.load <int_mload>`.

.. _int_expandload:

'``llvm.masked.expandload.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. Several values of integer, floating-point or pointer data type are loaded from consecutive memory addresses and stored into the elements of a vector according to the mask.

::

      declare <16 x float> @llvm.masked.expandload.v16f32 (float* <ptr>, <16 x i1> <mask>, <16 x float> <passthru>)
      declare <2 x i64> @llvm.masked.expandload.v2i64 (i64* <ptr>, <2 x i1> <mask>, <2 x i64> <passthru>)

Overview:
"""""""""

Reads a number of scalar values sequentially from the memory location provided in '``ptr``' and spreads them in a vector. The '``mask``' holds a bit for each vector lane. The number of elements read from memory is equal to the number of '1' bits in the mask. The loaded elements are positioned in the destination vector according to the sequence of '1' and '0' bits in the mask. E.g., if the mask vector is '10010001', "expandload" reads 3 values from memory addresses ptr, ptr+1, ptr+2 and places them in lanes 0, 3 and 7 accordingly. The masked-off lanes are filled by elements from the corresponding lanes of the '``passthru``' operand.


Arguments:
""""""""""

The first operand is the base pointer for the load. It has the same underlying type as the element of the returned vector. The second operand, mask, is a vector of boolean values with the same number of elements as the return type. The third is a pass-through value that is used to fill the masked-off lanes of the result. The return type and the type of the '``passthru``' operand have the same vector type.

Semantics:
""""""""""

The '``llvm.masked.expandload``' intrinsic is designed for reading multiple scalar values from adjacent memory addresses into possibly non-adjacent vector lanes. It is useful for targets that support vector expanding loads and allows vectorizing loops with cross-iteration dependences, as in the following example:

.. code-block:: c

      // In this loop we load from B and spread the elements into array A.
      double *A, *B; int *C;
      int j = 0;
      for (int i = 0; i < size; ++i) {
        if (C[i] != 0)
          A[i] = B[j++];
      }


.. code-block:: llvm

      ; Load several elements from array B and expand them in a vector.
      ; The number of loaded elements is equal to the number of '1' elements in the Mask.
      %Tmp = call <8 x double> @llvm.masked.expandload.v8f64(double* %Bptr, <8 x i1> %Mask, <8 x double> undef)
      ; Store the result in A
      call void @llvm.masked.store.v8f64.p0v8f64(<8 x double> %Tmp, <8 x double>* %Aptr, i32 8, <8 x i1> %Mask)

      ; %Bptr should be increased on each iteration according to the number of '1' elements in the Mask.
      %MaskI = bitcast <8 x i1> %Mask to i8
      %MaskIPopcnt = call i8 @llvm.ctpop.i8(i8 %MaskI)
      %MaskI64 = zext i8 %MaskIPopcnt to i64
      %BNextInd = add i64 %BInd, %MaskI64


Other targets may support this intrinsic differently, for example, by lowering it into a sequence of conditional scalar load operations and shuffles.
If all mask elements are '1', the intrinsic behavior is equivalent to the regular unmasked vector load.
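
A scalar C model of the lane placement may help when checking a lowering by
hand. The name ``expandload_f64`` and its parameters are hypothetical; the
return value is an addition of this sketch and corresponds to the popcount
computed in the IR example.

.. code-block:: c

      #include <assert.h>
      #include <stdbool.h>
      #include <stddef.h>

      /* Hypothetical scalar model of llvm.masked.expandload for double lanes.
         Returns the number of elements consumed from memory. */
      size_t expandload_f64(const double *ptr, const bool *mask,
                            const double *passthru, double *result,
                            size_t lanes) {
        assert(mask && passthru && result);
        size_t next = 0;            /* index of the next consecutive element */
        for (size_t i = 0; i < lanes; ++i) {
          if (mask[i])
            result[i] = ptr[next++];
          else
            result[i] = passthru[i];
        }
        return next;
      }

With the mask '10010001' (lane 0 first), the model reads three elements and
places them in lanes 0, 3 and 7.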

.. _int_compressstore:

'``llvm.masked.compressstore.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. A number of scalar values of integer, floating-point or pointer data type are collected from an input vector and stored into adjacent memory addresses. A mask defines which elements to collect from the vector.

::

      declare void @llvm.masked.compressstore.v8i32 (<8 x i32> <value>, i32* <ptr>, <8 x i1> <mask>)
      declare void @llvm.masked.compressstore.v16f32 (<16 x float> <value>, float* <ptr>, <16 x i1> <mask>)

Overview:
"""""""""

Selects elements from input vector '``value``' according to the '``mask``'. All selected elements are written into adjacent memory addresses starting at address '``ptr``', from lower to higher. The mask holds a bit for each vector lane, and is used to select elements to be stored. The number of elements to be stored is equal to the number of active bits in the mask.

Arguments:
""""""""""

The first operand is the input vector, from which elements are collected and written to memory. The second operand is the base pointer for the store; it has the same underlying type as the element of the input vector operand. The third operand is the mask, a vector of boolean values. The mask and the input vector must have the same number of vector elements.
13269
13270
13271Semantics:
13272""""""""""
13273
13274The '``llvm.masked.compressstore``' intrinsic is designed for compressing data in memory. It allows to collect elements from possibly non-adjacent lanes of a vector and store them contiguously in memory in one IR operation. It is useful for targets that support compressing store operations and allows vectorizing loops with cross-iteration dependences like in the following example:
13275
13276.. code-block:: c
13277
13278 // In this loop we load elements from A and store them consecutively in B
13279 double *A, B; int *C;
13280 for (int i = 0; i < size; ++i) {
13281 if (C[i] != 0)
13282 B[j++] = A[i]
13283 }
13284
13285
13286.. code-block:: llvm
13287
13288 ; Load elements from A.
13289 %Tmp = call <8 x double> @llvm.masked.load.v8f64.p0v8f64(<8 x double>* %Aptr, i32 8, <8 x i1> %Mask, <8 x double> undef)
13290 ; Store all selected elements consecutively in array B
13291 call <void> @llvm.masked.compressstore.v8f64(<8 x double> %Tmp, double* %Bptr, <8 x i1> %Mask)
13292
13293 ; %Bptr should be increased on each iteration according to the number of '1' elements in the Mask.
13294 %MaskI = bitcast <8 x i1> %Mask to i8
13295 %MaskIPopcnt = call i8 @llvm.ctpop.i8(i8 %MaskI)
13296 %MaskI64 = zext i8 %MaskIPopcnt to i64
13297 %BNextInd = add i64 %BInd, %MaskI64
13298
13299
13300Other targets may support this intrinsic differently, for example, by lowering it into a sequence of branches that guard scalar store operations.
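
A scalar reference model makes the selection rule concrete. This is a sketch
under assumptions, not a description of any actual lowering: the name
``compressstore_f64`` and the byte-per-lane mask are hypothetical.

.. code-block:: c

     #include <stddef.h>

     /* Hypothetical scalar model of a compressing store over `lanes` lanes.
      * Elements of `value` whose mask lane is set are written to consecutive
      * addresses starting at `base`, from lower to higher lanes.
      * Returns the number of elements written. */
     size_t compressstore_f64(const double *value, const unsigned char *mask,
                              double *base, size_t lanes) {
       size_t next = 0; /* index of the next free slot in memory */
       for (size_t i = 0; i < lanes; ++i) {
         if (mask[i])
           base[next++] = value[i];
       }
       return next;
     }

Memory past the last written element is left untouched, which matches the
statement that the number of stored elements equals the number of active mask
bits.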


Memory Use Markers
------------------

This class of intrinsics provides information about the lifetime of
memory objects and ranges where variables are immutable.

.. _int_lifestart:

'``llvm.lifetime.start``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.lifetime.start(i64 <size>, i8* nocapture <ptr>)

Overview:
"""""""""

The '``llvm.lifetime.start``' intrinsic specifies the start of a memory
object's lifetime.

Arguments:
""""""""""

The first argument is a constant integer representing the size of the
object, or -1 if it is variable sized. The second argument is a pointer
to the object.

Semantics:
""""""""""

This intrinsic indicates that before this point in the code, the value
of the memory pointed to by ``ptr`` is dead. This means that it is known
to never be used and has an undefined value. A load from the pointer
that precedes this intrinsic can be replaced with ``'undef'``.

.. _int_lifeend:

'``llvm.lifetime.end``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.lifetime.end(i64 <size>, i8* nocapture <ptr>)

Overview:
"""""""""

The '``llvm.lifetime.end``' intrinsic specifies the end of a memory
object's lifetime.

Arguments:
""""""""""

The first argument is a constant integer representing the size of the
object, or -1 if it is variable sized. The second argument is a pointer
to the object.

Semantics:
""""""""""

This intrinsic indicates that after this point in the code, the value of
the memory pointed to by ``ptr`` is dead. This means that it is known to
never be used and has an undefined value. Any stores into the memory
object following this intrinsic may be removed as dead.

'``llvm.invariant.start``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The memory object can belong to any address space.

::

      declare {}* @llvm.invariant.start.p0i8(i64 <size>, i8* nocapture <ptr>)

Overview:
"""""""""

The '``llvm.invariant.start``' intrinsic specifies that the contents of
a memory object will not change.

Arguments:
""""""""""

The first argument is a constant integer representing the size of the
object, or -1 if it is variable sized. The second argument is a pointer
to the object.

Semantics:
""""""""""

This intrinsic indicates that until an ``llvm.invariant.end`` that uses
the return value, the referenced memory location is constant and
unchanging.

'``llvm.invariant.end``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The memory object can belong to any address space.

::

      declare void @llvm.invariant.end.p0i8({}* <start>, i64 <size>, i8* nocapture <ptr>)

Overview:
"""""""""

The '``llvm.invariant.end``' intrinsic specifies that the contents of a
memory object are mutable.

Arguments:
""""""""""

The first argument is the matching ``llvm.invariant.start`` intrinsic.
The second argument is a constant integer representing the size of the
object, or -1 if it is variable sized. The third argument is a
pointer to the object.

Semantics:
""""""""""

This intrinsic indicates that the memory is mutable again.

'``llvm.launder.invariant.group``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The memory object can belong to any address
space. The returned pointer must belong to the same address space as the
argument.

::

      declare i8* @llvm.launder.invariant.group.p0i8(i8* <ptr>)

Overview:
"""""""""

The '``llvm.launder.invariant.group``' intrinsic can be used when an invariant
established by ``invariant.group`` metadata no longer holds, to obtain a new
pointer value that carries fresh invariant group information. It is an
experimental intrinsic, which means that its semantics might change in the
future.


Arguments:
""""""""""

The ``llvm.launder.invariant.group`` intrinsic takes only one argument, which
is a pointer to the memory.

Semantics:
""""""""""

Returns another pointer that aliases its argument but which is considered different
for the purposes of ``load``/``store`` ``invariant.group`` metadata.
It does not read any accessible memory and the execution can be speculated.

'``llvm.strip.invariant.group``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The memory object can belong to any address
space. The returned pointer must belong to the same address space as the
argument.

::

      declare i8* @llvm.strip.invariant.group.p0i8(i8* <ptr>)

Overview:
"""""""""

The '``llvm.strip.invariant.group``' intrinsic can be used when an invariant
established by ``invariant.group`` metadata no longer holds, to obtain a new pointer
value that does not carry the invariant information. It is an experimental
intrinsic, which means that its semantics might change in the future.


Arguments:
""""""""""

The ``llvm.strip.invariant.group`` intrinsic takes only one argument, which is
a pointer to the memory.

Semantics:
""""""""""

Returns another pointer that aliases its argument but which has no associated
``invariant.group`` metadata.
It does not read any memory and can be speculated.



.. _constrainedfp:

Constrained Floating-Point Intrinsics
-------------------------------------

These intrinsics are used to provide special handling of floating-point
operations when a specific rounding mode or floating-point exception behavior
is required. By default, LLVM optimization passes assume that the rounding mode
is round-to-nearest and that floating-point exceptions will not be monitored.
Constrained FP intrinsics are used to support non-default rounding modes and
accurately preserve exception behavior without compromising LLVM's ability to
optimize FP code when the default behavior is used.

Each of these intrinsics corresponds to a normal floating-point operation. The
first two arguments and the return value are the same as the corresponding FP
operation.

The third argument is a metadata argument specifying the rounding mode to be
assumed. This argument must be one of the following strings:

::

      "round.dynamic"
      "round.tonearest"
      "round.downward"
      "round.upward"
      "round.towardzero"

If this argument is "round.dynamic", optimization passes must assume that the
rounding mode is unknown and may change at runtime. No transformations that
depend on rounding mode may be performed in this case.

The other possible values for the rounding mode argument correspond to the
similarly named IEEE rounding modes. If the argument is any of these values,
optimization passes may perform transformations as long as they are consistent
with the specified rounding mode.

For example, 'x-0'->'x' is not a valid transformation if the rounding mode is
"round.downward" or "round.dynamic" because if the value of 'x' is +0 then
'x-0' should evaluate to '-0' when rounding downward. However, this
transformation is legal for all other rounding modes.
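
The signed-zero case can be observed from C through ``<fenv.h>``. The following
is a minimal sketch, assuming a target whose C library supports switching the
rounding mode at runtime; the helper name ``sign_of_zero_minus_zero`` is
hypothetical, and ``volatile`` is used only to keep the compiler from folding
or moving the subtraction.

.. code-block:: c

     #include <fenv.h>
     #include <math.h>

     /* Returns the sign bit (0 or 1) of (+0.0 - 0.0) evaluated under the
      * requested rounding mode, or -1 if that mode cannot be selected. */
     int sign_of_zero_minus_zero(int rounding_mode) {
       const int saved_mode = fegetround();
       if (fesetround(rounding_mode) != 0)
         return -1;
       volatile double x = 0.0;
       volatile double zero = 0.0;
       /* The volatile store forces the subtraction to complete before the
        * rounding mode is restored. */
       volatile double difference = x - zero;
       fesetround(saved_mode);
       const double observed = difference;
       return signbit(observed) ? 1 : 0;
     }

On an IEEE 754 conforming target this is expected to return 0 for
``FE_TONEAREST`` and 1 for ``FE_DOWNWARD``, which is why replacing 'x-0' with
'x' changes the result only in the downward (or unknown) case.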

For values other than "round.dynamic", optimization passes may assume that the
actual runtime rounding mode (as defined in a target-specific manner) matches
the specified rounding mode, but this is not guaranteed. Using a specific
non-dynamic rounding mode which does not match the actual rounding mode at
runtime results in undefined behavior.

The fourth argument to the constrained floating-point intrinsics specifies the
required exception behavior. This argument must be one of the following
strings:

::

      "fpexcept.ignore"
      "fpexcept.maytrap"
      "fpexcept.strict"

If this argument is "fpexcept.ignore", optimization passes may assume that the
exception status flags will not be read and that floating-point exceptions will
be masked. This allows transformations to be performed that may change the
exception semantics of the original code. For example, FP operations may be
speculatively executed in this case whereas they must not be for either of the
other possible values of this argument.

If the exception behavior argument is "fpexcept.maytrap", optimization passes
must avoid transformations that may raise exceptions that would not have been
raised by the original code (such as speculatively executing FP operations), but
passes are not required to preserve all exceptions that are implied by the
original code. For example, exceptions may be potentially hidden by constant
folding.

If the exception behavior argument is "fpexcept.strict", all transformations must
strictly preserve the floating-point exception semantics of the original code.
Any FP exception that would have been raised by the original code must be raised
by the transformed code, and the transformed code must not raise any FP
exceptions that would not have been raised by the original code. This is the
exception behavior argument that will be used if the code being compiled reads
the FP exception status flags, but this mode can also be used with code that
unmasks FP exceptions.

The number and order of floating-point exceptions is NOT guaranteed. For
example, a series of FP operations that each may raise exceptions may be
vectorized into a single instruction that raises each unique exception a single
time.


'``llvm.experimental.constrained.fadd``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.fadd(<type> <op1>, <type> <op2>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fadd``' intrinsic returns the sum of its
two operands.


Arguments:
""""""""""

The first two arguments to the '``llvm.experimental.constrained.fadd``'
intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
of floating-point values. Both arguments must have identical types.

The third and fourth arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

The value produced is the floating-point sum of the two value operands and has
the same type as the operands.


'``llvm.experimental.constrained.fsub``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.fsub(<type> <op1>, <type> <op2>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fsub``' intrinsic returns the difference
of its two operands.


Arguments:
""""""""""

The first two arguments to the '``llvm.experimental.constrained.fsub``'
intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
of floating-point values. Both arguments must have identical types.

The third and fourth arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

The value produced is the floating-point difference of the two value operands
and has the same type as the operands.


'``llvm.experimental.constrained.fmul``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.fmul(<type> <op1>, <type> <op2>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fmul``' intrinsic returns the product of
its two operands.


Arguments:
""""""""""

The first two arguments to the '``llvm.experimental.constrained.fmul``'
intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
of floating-point values. Both arguments must have identical types.

The third and fourth arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

The value produced is the floating-point product of the two value operands and
has the same type as the operands.


'``llvm.experimental.constrained.fdiv``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.fdiv(<type> <op1>, <type> <op2>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fdiv``' intrinsic returns the quotient of
its two operands.


Arguments:
""""""""""

The first two arguments to the '``llvm.experimental.constrained.fdiv``'
intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
of floating-point values. Both arguments must have identical types.

The third and fourth arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

The value produced is the floating-point quotient of the two value operands and
has the same type as the operands.


'``llvm.experimental.constrained.frem``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.frem(<type> <op1>, <type> <op2>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.frem``' intrinsic returns the remainder
from the division of its two operands.


Arguments:
""""""""""

The first two arguments to the '``llvm.experimental.constrained.frem``'
intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector <t_vector>`
of floating-point values. Both arguments must have identical types.

The third and fourth arguments specify the rounding mode and exception
behavior as described above. The rounding mode argument has no effect, since
the result of frem is never rounded, but the argument is included for
consistency with the other constrained floating-point intrinsics.

Semantics:
""""""""""

The value produced is the floating-point remainder from the division of the two
value operands and has the same type as the operands. The remainder has the
same sign as the dividend.

'``llvm.experimental.constrained.fma``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.fma(<type> <op1>, <type> <op2>, <type> <op3>,
                                         metadata <rounding mode>,
                                         metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fma``' intrinsic returns the result of a
fused-multiply-add operation on its operands.

Arguments:
""""""""""

The first three arguments to the '``llvm.experimental.constrained.fma``'
intrinsic must be :ref:`floating-point <t_floating>` or :ref:`vector
<t_vector>` of floating-point values. All arguments must have identical types.

The fourth and fifth arguments specify the rounding mode and exception behavior
as described above.

Semantics:
""""""""""

The result produced is the product of the first two operands added to the third
operand computed with infinite precision, and then rounded to the target
precision.

Constrained libm-equivalent Intrinsics
--------------------------------------

In addition to the basic floating-point operations for which constrained
intrinsics are described above, there are constrained versions of various
operations which provide equivalent behavior to a corresponding libm function.
These intrinsics allow the precise behavior of these operations with respect to
rounding mode and exception behavior to be controlled.

As with the basic constrained floating-point intrinsics, the rounding mode
and exception behavior arguments only control the behavior of the optimizer.
They do not change the runtime floating-point environment.


'``llvm.experimental.constrained.sqrt``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.sqrt(<type> <op1>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.sqrt``' intrinsic returns the square root
of the specified value, returning the same value as the libm '``sqrt``'
functions would, but without setting ``errno``.

Arguments:
""""""""""

The first argument and the return type are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the nonnegative square root of the specified value.
If the value is less than negative zero, a floating-point exception occurs
and the return value is architecture specific.


'``llvm.experimental.constrained.pow``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.pow(<type> <op1>, <type> <op2>,
                                         metadata <rounding mode>,
                                         metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.pow``' intrinsic returns the first operand
raised to the (positive or negative) power specified by the second operand.

Arguments:
""""""""""

The first two arguments and the return value are floating-point numbers of the
same type. The second argument specifies the power to which the first argument
should be raised.

The third and fourth arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the first value raised to the second power,
returning the same values as the libm ``pow`` functions would, and
handles error conditions in the same way.


'``llvm.experimental.constrained.powi``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.powi(<type> <op1>, i32 <op2>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.powi``' intrinsic returns the first operand
raised to the (positive or negative) power specified by the second operand. The
order of evaluation of multiplications is not defined. When a vector of
floating-point type is used, the second argument remains a scalar integer value.


Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type. The second argument is a 32-bit signed integer specifying the power to
which the first argument should be raised.

The third and fourth arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the first value raised to the second power with an
unspecified sequence of rounding operations.
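
To illustrate what an unspecified sequence of rounding operations permits, the
sketch below computes an integer power in two different multiplication orders.
Both helper names are hypothetical, and handling a negative exponent by taking
a reciprocal at the end is an assumption made for the example, not a statement
about how LLVM lowers the intrinsic.

.. code-block:: c

     /* Multiplies one factor at a time: ((x * x) * x) * ... */
     double powi_sequential(double x, int n) {
       unsigned magnitude = n < 0 ? 0u - (unsigned)n : (unsigned)n;
       double result = 1.0;
       for (unsigned k = 0; k < magnitude; ++k)
         result *= x;
       return n < 0 ? 1.0 / result : result;
     }

     /* Repeated squaring: fewer multiplications, rounded at different points. */
     double powi_by_squaring(double x, int n) {
       unsigned magnitude = n < 0 ? 0u - (unsigned)n : (unsigned)n;
       double result = 1.0;
       double factor = x;
       while (magnitude != 0) {
         if (magnitude & 1u)
           result *= factor;
         factor *= factor;
         magnitude >>= 1;
       }
       return n < 0 ? 1.0 / result : result;
     }

When every intermediate product is exactly representable the two orders agree,
but in general they may differ in the low-order bits, so code should not depend
on a particular evaluation order.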


'``llvm.experimental.constrained.sin``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.sin(<type> <op1>,
                                         metadata <rounding mode>,
                                         metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.sin``' intrinsic returns the sine of the
first operand.

Arguments:
""""""""""

The first argument and the return type are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the sine of the specified operand, returning the
same values as the libm ``sin`` functions would, and handles error
conditions in the same way.


13982'``llvm.experimental.constrained.cos``' Intrinsic
13983^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13984
13985Syntax:
13986"""""""
13987
13988::
13989
Jonas Devlieghereaaecdc42017-11-06 11:47:24 +000013990 declare <type>
Andrew Kaylorf4660012017-05-25 21:31:00 +000013991 @llvm.experimental.constrained.cos(<type> <op1>,
13992 metadata <rounding mode>,
13993 metadata <exception behavior>)
13994
13995Overview:
13996"""""""""
13997
13998The '``llvm.experimental.constrained.cos``' intrinsic returns the cosine of the
13999first operand.
14000
14001Arguments:
14002""""""""""
14003
Sanjay Patel85fa9ef2018-03-21 14:15:33 +000014004The first argument and the return type are floating-point numbers of the same
Andrew Kaylorf4660012017-05-25 21:31:00 +000014005type.
14006
14007The second and third arguments specify the rounding mode and exception
14008behavior as described above.
14009
14010Semantics:
14011""""""""""
14012
14013This function returns the cosine of the specified operand, returning the
14014same values as the libm ``cos`` functions would, and handles error
14015conditions in the same way.


'``llvm.experimental.constrained.exp``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.exp(<type> <op1>,
                                         metadata <rounding mode>,
                                         metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.exp``' intrinsic computes the base-e
exponential of the specified value.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``exp`` functions
would, and handles error conditions in the same way.


'``llvm.experimental.constrained.exp2``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.exp2(<type> <op1>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.exp2``' intrinsic computes the base-2
exponential of the specified value.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``exp2`` functions
would, and handles error conditions in the same way.


'``llvm.experimental.constrained.log``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.log(<type> <op1>,
                                         metadata <rounding mode>,
                                         metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.log``' intrinsic computes the base-e
logarithm of the specified value.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``log`` functions
would, and handles error conditions in the same way.


'``llvm.experimental.constrained.log10``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.log10(<type> <op1>,
                                           metadata <rounding mode>,
                                           metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.log10``' intrinsic computes the base-10
logarithm of the specified value.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``log10`` functions
would, and handles error conditions in the same way.


'``llvm.experimental.constrained.log2``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.log2(<type> <op1>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.log2``' intrinsic computes the base-2
logarithm of the specified value.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``log2`` functions
would, and handles error conditions in the same way.


'``llvm.experimental.constrained.rint``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.rint(<type> <op1>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.rint``' intrinsic returns the first
operand rounded to the nearest integer. It may raise an inexact floating-point
exception if the operand is not an integer.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``rint`` functions
would, and handles error conditions in the same way. The rounding mode is
described, not determined, by the rounding mode argument. The actual rounding
mode is determined by the runtime floating-point environment. The rounding
mode argument is only intended as information to the compiler.


'``llvm.experimental.constrained.nearbyint``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.nearbyint(<type> <op1>,
                                               metadata <rounding mode>,
                                               metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.nearbyint``' intrinsic returns the first
operand rounded to the nearest integer. It will not raise an inexact
floating-point exception if the operand is not an integer.

Arguments:
""""""""""

The first argument and the return value are floating-point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``nearbyint`` functions
would, and handles error conditions in the same way. The rounding mode is
described, not determined, by the rounding mode argument. The actual rounding
mode is determined by the runtime floating-point environment. The rounding
mode argument is only intended as information to the compiler.
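
The difference between ``rint`` and ``nearbyint`` is only in whether the
inexact exception may be raised. The sketch below places the two calls side by
side; it assumes the ``.f32`` overload suffix and the ``round.dynamic`` and
``fpexcept.strict`` metadata strings, and ``@round_both`` is a hypothetical
function name.

.. code-block:: llvm

      declare float @llvm.experimental.constrained.rint.f32(float, metadata, metadata)
      declare float @llvm.experimental.constrained.nearbyint.f32(float, metadata, metadata)

      define { float, float } @round_both(float %x) {
      entry:
        ; May raise the inexact exception when %x is not an integer.
        %a = call float @llvm.experimental.constrained.rint.f32(
                 float %x, metadata !"round.dynamic", metadata !"fpexcept.strict")
        ; Rounds the same way, but does not raise the inexact exception.
        %b = call float @llvm.experimental.constrained.nearbyint.f32(
                 float %x, metadata !"round.dynamic", metadata !"fpexcept.strict")
        %r0 = insertvalue { float, float } undef, float %a, 0
        %r1 = insertvalue { float, float } %r0, float %b, 1
        ret { float, float } %r1
      }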


General Intrinsics
------------------

This class of intrinsics is designed to be generic and has no specific
purpose.

'``llvm.var.annotation``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.var.annotation(i8* <val>, i8* <str>, i8* <str>, i32 <int>)

Overview:
"""""""""

The '``llvm.var.annotation``' intrinsic annotates a local variable with an
arbitrary string.

Arguments:
""""""""""

The first argument is a pointer to a value, the second is a pointer to a
global string, the third is a pointer to a global string which is the
source file name, and the last argument is the line number.

Semantics:
""""""""""

This intrinsic allows annotation of local variables with arbitrary
strings. This can be useful for special purpose optimizations that want
to look for these annotations. These have no other defined use; they are
ignored by code generation and optimization.
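
A minimal sketch of annotating a stack slot follows. The global names, the
annotation text, the file name, and the line number are all made up for
illustration, and the ``section "llvm.metadata"`` placement is an assumption
modelled on typical frontend output.

.. code-block:: llvm

      @.str.annot = private unnamed_addr constant [9 x i8] c"my_annot\00", section "llvm.metadata"
      @.str.file = private unnamed_addr constant [7 x i8] c"test.c\00", section "llvm.metadata"

      declare void @llvm.var.annotation(i8*, i8*, i8*, i32)

      define void @f() {
      entry:
        %x = alloca i32, align 4
        %x.i8 = bitcast i32* %x to i8*
        ; Attach the string "my_annot" to %x, recorded as test.c line 3.
        call void @llvm.var.annotation(
            i8* %x.i8,
            i8* getelementptr inbounds ([9 x i8], [9 x i8]* @.str.annot, i32 0, i32 0),
            i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.str.file, i32 0, i32 0),
            i32 3)
        ret void
      }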

'``llvm.ptr.annotation.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use '``llvm.ptr.annotation``' on a
pointer to an integer of any width. *NOTE* you must specify an address space for
the pointer. The identifier for the default address space is the integer
'``0``'.

::

      declare i8*   @llvm.ptr.annotation.p<address space>i8(i8* <val>, i8* <str>, i8* <str>, i32  <int>)
      declare i16*  @llvm.ptr.annotation.p<address space>i16(i16* <val>, i8* <str>, i8* <str>, i32  <int>)
      declare i32*  @llvm.ptr.annotation.p<address space>i32(i32* <val>, i8* <str>, i8* <str>, i32  <int>)
      declare i64*  @llvm.ptr.annotation.p<address space>i64(i64* <val>, i8* <str>, i8* <str>, i32  <int>)
      declare i256* @llvm.ptr.annotation.p<address space>i256(i256* <val>, i8* <str>, i8* <str>, i32  <int>)

Overview:
"""""""""

The '``llvm.ptr.annotation``' intrinsic annotates a pointer to an integer with
an arbitrary string.

Arguments:
""""""""""

The first argument is a pointer to an integer value of arbitrary bitwidth
(result of some expression), the second is a pointer to a global string, the
third is a pointer to a global string which is the source file name, and the
last argument is the line number. It returns the value of the first argument.

Semantics:
""""""""""

This intrinsic allows annotation of a pointer to an integer with arbitrary
strings. This can be useful for special purpose optimizations that want to look
for these annotations. These have no other defined use; they are ignored by code
generation and optimization.

'``llvm.annotation.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use '``llvm.annotation``' on
any integer bit width.

::

      declare i8 @llvm.annotation.i8(i8 <val>, i8* <str>, i8* <str>, i32  <int>)
      declare i16 @llvm.annotation.i16(i16 <val>, i8* <str>, i8* <str>, i32  <int>)
      declare i32 @llvm.annotation.i32(i32 <val>, i8* <str>, i8* <str>, i32  <int>)
      declare i64 @llvm.annotation.i64(i64 <val>, i8* <str>, i8* <str>, i32  <int>)
      declare i256 @llvm.annotation.i256(i256 <val>, i8* <str>, i8* <str>, i32  <int>)

Overview:
"""""""""

The '``llvm.annotation``' intrinsic annotates an integer expression with an
arbitrary string.

Arguments:
""""""""""

The first argument is an integer value (result of some expression), the
second is a pointer to a global string, the third is a pointer to a
global string which is the source file name, and the last argument is
the line number. It returns the value of the first argument.

Semantics:
""""""""""

This intrinsic allows annotations to be put on arbitrary expressions
with arbitrary strings. This can be useful for special purpose
optimizations that want to look for these annotations. These have no
other defined use; they are ignored by code generation and optimization.

'``llvm.codeview.annotation``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This annotation emits a label at its program point and an associated
``S_ANNOTATION`` codeview record with some additional string metadata. This is
used to implement MSVC's ``__annotation`` intrinsic. It is marked
``noduplicate``, so calls to this intrinsic prevent inlining and should be
considered expensive.

::

      declare void @llvm.codeview.annotation(metadata)

Arguments:
""""""""""

The argument should be an MDTuple containing any number of MDStrings.

'``llvm.trap``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.trap() noreturn nounwind

Overview:
"""""""""

The '``llvm.trap``' intrinsic causes execution to trap.

Arguments:
""""""""""

None.

Semantics:
""""""""""

This intrinsic is lowered to the target dependent trap instruction. If
the target does not have a trap instruction, this intrinsic will be
lowered to a call of the ``abort()`` function.
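
A typical use is to terminate a path that must never continue. The sketch
below traps on a zero divisor; the function ``@checked_div`` and its block
names are hypothetical.

.. code-block:: llvm

      declare void @llvm.trap() noreturn nounwind

      define i32 @checked_div(i32 %a, i32 %b) {
      entry:
        %is.zero = icmp eq i32 %b, 0
        br i1 %is.zero, label %fail, label %ok

      fail:
        ; The intrinsic does not return, so the block ends in unreachable.
        call void @llvm.trap()
        unreachable

      ok:
        %q = udiv i32 %a, %b
        ret i32 %q
      }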

'``llvm.debugtrap``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.debugtrap() nounwind

Overview:
"""""""""

The '``llvm.debugtrap``' intrinsic causes an execution trap that requests the
attention of a debugger.

Arguments:
""""""""""

None.

Semantics:
""""""""""

This intrinsic is lowered to code which is intended to cause an
execution trap with the intention of requesting the attention of a
debugger.

'``llvm.stackprotector``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.stackprotector(i8* <guard>, i8** <slot>)

Overview:
"""""""""

The ``llvm.stackprotector`` intrinsic takes the ``guard`` and stores it
onto the stack at ``slot``. The stack slot is adjusted to ensure that it
is placed on the stack before local variables.

Arguments:
""""""""""

The ``llvm.stackprotector`` intrinsic requires two pointer arguments.
The first argument is the value loaded from the stack guard
``@__stack_chk_guard``. The second argument is an ``alloca`` that has
enough space to hold the value of the guard.

Semantics:
""""""""""

This intrinsic causes the prologue/epilogue inserter to force the position of
the ``AllocaInst`` stack slot to be before local variables on the stack. This is
to ensure that if a local variable on the stack is overwritten, it will destroy
the value of the guard. When the function exits, the guard on the stack is
checked against the original guard by ``llvm.stackprotectorcheck``. If they are
different, then ``llvm.stackprotectorcheck`` causes the program to abort by
calling the ``__stack_chk_fail()`` function.

'``llvm.stackguard``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.stackguard()

Overview:
"""""""""

The ``llvm.stackguard`` intrinsic returns the system stack guard value.

It should not be generated by frontends, since it is only for internal usage.
This intrinsic exists because the IR form of the stack protector is still
supported in FastISel.

Arguments:
""""""""""

None.

Semantics:
""""""""""

On some platforms, the value returned by this intrinsic remains unchanged
between loads in the same thread. On other platforms, it returns the same
global variable value, if any, e.g. ``@__stack_chk_guard``.

Currently some platforms have IR-level customized stack guard loading (e.g.
X86 Linux) that is not handled by ``llvm.stackguard()``, although it should be
in the future.

'``llvm.objectsize``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.objectsize.i32(i8* <object>, i1 <min>, i1 <nullunknown>)
      declare i64 @llvm.objectsize.i64(i8* <object>, i1 <min>, i1 <nullunknown>)

Overview:
"""""""""

The ``llvm.objectsize`` intrinsic is designed to provide information to
the optimizers to determine at compile time whether a) an operation
(like memcpy) will overflow a buffer that corresponds to an object, or
b) that a runtime check for overflow isn't necessary. An object in this
context means an allocation of a specific class, structure, array, or
other object.

Arguments:
""""""""""

The ``llvm.objectsize`` intrinsic takes three arguments. The first argument is
a pointer to or into the ``object``. The second argument determines whether
``llvm.objectsize`` returns 0 (if true) or -1 (if false) when the object size
is unknown. The third argument controls how ``llvm.objectsize`` acts when
``null`` in address space 0 is used as its pointer argument. If it's ``false``,
``llvm.objectsize`` reports 0 bytes available when given ``null``. Otherwise, if
the ``null`` is in a non-zero address space or if ``true`` is given for the
third argument of ``llvm.objectsize``, we assume its size is unknown.

The second and third arguments only accept constants.

Semantics:
""""""""""

The ``llvm.objectsize`` intrinsic is lowered to a constant representing
the size of the object concerned. If the size cannot be determined at
compile time, ``llvm.objectsize`` returns ``i32/i64 -1 or 0`` (depending
on the ``min`` argument).
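
The following sketch queries the size of a local buffer. It uses the
declaration exactly as written in the Syntax section; the function name
``@size_of_local`` is hypothetical. Because the pointer refers to a visible
16-byte ``alloca``, the call is expected to be lowered to the constant 16.

.. code-block:: llvm

      declare i64 @llvm.objectsize.i64(i8*, i1, i1)

      define i64 @size_of_local() {
      entry:
        %buf = alloca [16 x i8], align 1
        %p = getelementptr inbounds [16 x i8], [16 x i8]* %buf, i64 0, i64 0
        ; min = false: report -1 if the size were unknown.
        ; nullunknown = true: treat a null pointer as having unknown size.
        %size = call i64 @llvm.objectsize.i64(i8* %p, i1 false, i1 true)
        ret i64 %size
      }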

'``llvm.expect``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.expect`` on any
integer bit width.

::

      declare i1 @llvm.expect.i1(i1 <val>, i1 <expected_val>)
      declare i32 @llvm.expect.i32(i32 <val>, i32 <expected_val>)
      declare i64 @llvm.expect.i64(i64 <val>, i64 <expected_val>)

Overview:
"""""""""

The ``llvm.expect`` intrinsic provides information about the expected (most
probable) value of ``val``, which can be used by optimizers.

Arguments:
""""""""""

The ``llvm.expect`` intrinsic takes two arguments. The first argument is
a value. The second argument is the expected value; it must be a constant,
and variables are not allowed.

Semantics:
""""""""""

This intrinsic is lowered to the ``val``.
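
For illustration, the sketch below marks an error path as unlikely before
branching on it. The function ``@handle`` and its block names are
hypothetical.

.. code-block:: llvm

      declare i1 @llvm.expect.i1(i1, i1)

      define i32 @handle(i32 %err) {
      entry:
        %failed = icmp ne i32 %err, 0
        ; Tell the optimizer that %failed is most likely false.
        %hint = call i1 @llvm.expect.i1(i1 %failed, i1 false)
        br i1 %hint, label %slow, label %fast

      slow:
        ret i32 -1

      fast:
        ret i32 0
      }

The result ``%hint`` always equals ``%failed``; only the probability
information differs.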

.. _int_assume:

'``llvm.assume``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.assume(i1 %cond)

Overview:
"""""""""

The ``llvm.assume`` intrinsic allows the optimizer to assume that the provided
condition is true. This information can then be used in simplifying other parts
of the code.

Arguments:
""""""""""

The condition which the optimizer may assume is always true.

Semantics:
""""""""""

The intrinsic allows the optimizer to assume that the provided condition is
always true whenever the control flow reaches the intrinsic call. No code is
generated for this intrinsic, and instructions that contribute only to the
provided condition are not used for code generation. If the condition is
violated during execution, the behavior is undefined.

Note that the optimizer might limit the transformations performed on values
used by the ``llvm.assume`` intrinsic in order to preserve the instructions
only used to form the intrinsic's input argument. This might prove undesirable
if the extra information provided by the ``llvm.assume`` intrinsic does not cause
sufficient overall improvement in code quality. For this reason,
``llvm.assume`` should not be used to document basic mathematical invariants
that the optimizer can otherwise deduce or facts that are of little use to the
optimizer.
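
As a sketch of a fact the optimizer cannot otherwise deduce, the example below
asserts that a pointer is 16-byte aligned before loading through it. The
function name ``@load_aligned`` is hypothetical.

.. code-block:: llvm

      declare void @llvm.assume(i1)

      define i32 @load_aligned(i32* %p) {
      entry:
        ; These three instructions only feed the assumption and are not
        ; used for code generation.
        %addr = ptrtoint i32* %p to i64
        %low = and i64 %addr, 15
        %aligned = icmp eq i64 %low, 0
        call void @llvm.assume(i1 %aligned)
        %v = load i32, i32* %p, align 4
        ret i32 %v
      }

If ``%p`` is not 16-byte aligned when the call is reached, the behavior is
undefined.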

.. _int_ssa_copy:

'``llvm.ssa_copy``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare type @llvm.ssa_copy(type %operand) returned(1) readnone

Arguments:
""""""""""

The first argument is an operand which is used as the returned value.

Overview:
"""""""""

The ``llvm.ssa_copy`` intrinsic can be used to attach information to
operations by copying them and giving them new names. For example,
the PredicateInfo utility uses it to build Extended SSA form, and
attach various forms of information to operands that dominate specific
uses. It is not meant for general use, only for building temporary
renaming forms that require value splits at certain points.

.. _type.test:

'``llvm.type.test``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i1 @llvm.type.test(i8* %ptr, metadata %type) nounwind readnone


Arguments:
""""""""""

The first argument is a pointer to be tested. The second argument is a
metadata object representing a :doc:`type identifier <TypeMetadata>`.

Overview:
"""""""""

The ``llvm.type.test`` intrinsic tests whether the given pointer is associated
with the given type identifier.
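
A minimal sketch of guarding a use of a pointer with a type test follows. The
identifier ``!"typeid1"``, the function ``@checked_use``, and the choice to
trap on failure are assumptions made for illustration.

.. code-block:: llvm

      declare i1 @llvm.type.test(i8*, metadata)
      declare void @llvm.trap() noreturn nounwind

      define void @checked_use(i8* %vtable) {
      entry:
        ; True if %vtable is associated with the type identifier "typeid1".
        %ok = call i1 @llvm.type.test(i8* %vtable, metadata !"typeid1")
        br i1 %ok, label %cont, label %fail

      fail:
        call void @llvm.trap()
        unreachable

      cont:
        ret void
      }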

'``llvm.type.checked.load``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare {i8*, i1} @llvm.type.checked.load(i8* %ptr, i32 %offset, metadata %type) argmemonly nounwind readonly


Arguments:
""""""""""

The first argument is a pointer from which to load a function pointer. The
second argument is the byte offset from which to load the function pointer. The
third argument is a metadata object representing a :doc:`type identifier
<TypeMetadata>`.

Overview:
"""""""""

The ``llvm.type.checked.load`` intrinsic safely loads a function pointer from a
virtual table pointer using type metadata. This intrinsic is used to implement
control flow integrity in conjunction with virtual call optimization. The
virtual call optimization pass will optimize away ``llvm.type.checked.load``
intrinsics associated with devirtualized calls, thereby removing the type
check in cases where it is not needed to enforce the control flow integrity
constraint.

If the given pointer is associated with a type metadata identifier, this
function returns true as the second element of its return value. (Note that
the function may also return true if the given pointer is not associated
with a type metadata identifier.) If the function's return value's second
element is true, the following rules apply to the first element:

- If the given pointer is associated with the given type metadata identifier,
  it is the function pointer loaded from the given byte offset from the given
  pointer.

- If the given pointer is not associated with the given type metadata
  identifier, it is one of the following (the choice of which is unspecified):

  1. The function pointer that would have been loaded from an arbitrarily chosen
     (through an unspecified mechanism) pointer associated with the type
     metadata.

  2. If the function has a non-void return type, a pointer to a function that
     returns an unspecified value without causing side effects.

If the function's return value's second element is false, the value of the
first element is undefined.


'``llvm.donothing``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.donothing() nounwind readnone

Overview:
"""""""""

The ``llvm.donothing`` intrinsic doesn't perform any operation. It's one of only
three intrinsics (besides ``llvm.experimental.patchpoint`` and
``llvm.experimental.gc.statepoint``) that can be called with an invoke
instruction.

Arguments:
""""""""""

None.

Semantics:
""""""""""

This intrinsic does nothing, and it's removed by optimizers and ignored
by codegen.

'``llvm.experimental.deoptimize``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare type @llvm.experimental.deoptimize(...) [ "deopt"(...) ]

Overview:
"""""""""

This intrinsic, together with :ref:`deoptimization operand bundles
<deopt_opbundles>`, allows frontends to express transfer of control and
frame-local state from the currently executing (typically more specialized,
hence faster) version of a function into another (typically more generic, hence
slower) version.

In languages with a fully integrated managed runtime like Java and JavaScript
this intrinsic can be used to implement "uncommon trap" or "side exit" like
functionality. In unmanaged languages like C and C++, this intrinsic can be
used to represent the slow paths of specialized functions.


Arguments:
""""""""""

The intrinsic takes an arbitrary number of arguments, whose meaning is
decided by the :ref:`lowering strategy<deoptimize_lowering>`.

Semantics:
""""""""""

The ``@llvm.experimental.deoptimize`` intrinsic executes an attached
deoptimization continuation (denoted using a :ref:`deoptimization
operand bundle <deopt_opbundles>`) and returns the value returned by
the deoptimization continuation. Defining the semantic properties of
the continuation itself is out of scope of the language reference --
as far as LLVM is concerned, the deoptimization continuation can
invoke arbitrary side effects, including reading from and writing to
the entire heap.

Deoptimization continuations expressed using ``"deopt"`` operand bundles always
continue execution to the end of the physical frame containing them, so all
calls to ``@llvm.experimental.deoptimize`` must be in "tail position":

   - ``@llvm.experimental.deoptimize`` cannot be invoked.
   - The call must immediately precede a :ref:`ret <i_ret>` instruction.
   - The ``ret`` instruction must return the value produced by the
     ``@llvm.experimental.deoptimize`` call if there is one, or void.

Note that the above restrictions imply that the return type for a call to
``@llvm.experimental.deoptimize`` will match the return type of its immediate
caller.

The inliner composes the ``"deopt"`` continuations of the caller into the
``"deopt"`` continuations present in the inlinee, and also updates calls to this
intrinsic to return directly from the frame of the function it inlined into.

All declarations of ``@llvm.experimental.deoptimize`` must share the
same calling convention.

.. _deoptimize_lowering:

Lowering:
"""""""""

Calls to ``@llvm.experimental.deoptimize`` are lowered to calls to the
symbol ``__llvm_deoptimize`` (it is the frontend's responsibility to
ensure that this symbol is defined). The call arguments to
``@llvm.experimental.deoptimize`` are lowered as if they were formal
arguments of the specified types, and not as varargs.
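
The tail-position rules can be seen in the sketch below, where the call is
immediately followed by a ``ret`` of its result. The ``.i32`` overload suffix,
the function ``@slow_path``, and the meaning of the ``i32 0`` argument and the
``"deopt"`` state are assumptions; a real frontend decides these.

.. code-block:: llvm

      declare i32 @llvm.experimental.deoptimize.i32(...)

      define i32 @slow_path(i32 %a) {
      entry:
        ; The argument is interpreted by the runtime's __llvm_deoptimize;
        ; the "deopt" bundle carries the frame-local state to transfer.
        %r = call i32 (...) @llvm.experimental.deoptimize.i32(i32 0) [ "deopt"(i32 %a) ]
        ret i32 %r
      }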


'``llvm.experimental.guard``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.experimental.guard(i1, ...) [ "deopt"(...) ]

Overview:
"""""""""

This intrinsic, together with :ref:`deoptimization operand bundles
<deopt_opbundles>`, allows frontends to express guards or checks on
optimistic assumptions made during compilation. The semantics of
``@llvm.experimental.guard`` are defined in terms of
``@llvm.experimental.deoptimize`` -- its body is defined to be
equivalent to:

.. code-block:: text

  define void @llvm.experimental.guard(i1 %pred, <args...>) {
    %realPred = and i1 %pred, undef
    br i1 %realPred, label %continue, label %leave [, !make.implicit !{}]

  leave:
    call void @llvm.experimental.deoptimize(<args...>) [ "deopt"() ]
    ret void

  continue:
    ret void
  }


with the optional ``[, !make.implicit !{}]`` present if and only if it
is present on the call site. For more details on ``!make.implicit``,
see :doc:`FaultMaps`.

In words, ``@llvm.experimental.guard`` executes the attached
``"deopt"`` continuation if (but **not** only if) its first argument
is ``false``. Since the optimizer is allowed to replace the ``undef``
with an arbitrary value, it can optimize a guard to fail "spuriously",
i.e. without the original condition being false (hence the "not only
if"). This freedom is what allows "check widening" optimizations.

``@llvm.experimental.guard`` cannot be invoked.
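
The following is a small sketch of how a frontend might use a guard for a
bounds check. The function ``@load_checked``, its parameters, and the contents
of the ``"deopt"`` bundle are hypothetical; a real frontend would record
whatever abstract state its runtime needs to resume in the interpreter.

.. code-block:: llvm

      declare void @llvm.experimental.guard(i1, ...)

      define i32 @load_checked(i32* %arr, i32 %idx, i32 %len) {
      entry:
        ; Optimistic assumption: the index is in bounds.
        %in.bounds = icmp ult i32 %idx, %len
        ; If %in.bounds is false (or the guard fails spuriously), control
        ; leaves through the "deopt" continuation and never reaches the load.
        call void (i1, ...) @llvm.experimental.guard(i1 %in.bounds) [ "deopt"(i32 %idx, i32 %len) ]
        %addr = getelementptr inbounds i32, i32* %arr, i32 %idx
        %val = load i32, i32* %addr
        ret i32 %val
      }

Code after the guard may assume that ``%in.bounds`` was true, since every path
on which it was false has already left the function.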


'``llvm.load.relative``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.load.relative.iN(i8* %ptr, iN %offset) argmemonly nounwind readonly

Overview:
"""""""""

This intrinsic loads a 32-bit value from the address ``%ptr + %offset``,
adds ``%ptr`` to that value and returns the result. The constant folder
specifically recognizes the form of this intrinsic and the constant
initializers it may load from; if a loaded constant initializer is known to
have the form ``i32 trunc(x - %ptr)``, the intrinsic call is folded to ``x``.

LLVM guarantees that the calculation of such a constant initializer will
not overflow at link time under the medium code model if ``x`` is an
``unnamed_addr`` function. However, it does not provide this guarantee for
a constant initializer folded into a function body. This intrinsic can be
used to avoid the possibility of overflows when loading from such a constant.
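
As an illustrative sketch, the table below stores two entries of the form
``i32 trunc(x - %ptr)``. The names ``@table``, ``@f``, ``@g`` and
``@lookup_g`` are invented for this example and are not part of LLVM.

.. code-block:: llvm

      define void @f() unnamed_addr {
        ret void
      }

      define void @g() unnamed_addr {
        ret void
      }

      ; Each entry holds the 32-bit distance from @table to a function.
      @table = private unnamed_addr constant [2 x i32] [
        i32 trunc (i64 sub (i64 ptrtoint (void ()* @f to i64),
                            i64 ptrtoint ([2 x i32]* @table to i64)) to i32),
        i32 trunc (i64 sub (i64 ptrtoint (void ()* @g to i64),
                            i64 ptrtoint ([2 x i32]* @table to i64)) to i32)
      ]

      declare i8* @llvm.load.relative.i32(i8*, i32)

      define i8* @lookup_g() {
        %base = bitcast [2 x i32]* @table to i8*
        ; Loads the i32 at %base + 4 and adds %base to it.
        %fn = call i8* @llvm.load.relative.i32(i8* %base, i32 4)
        ret i8* %fn
      }

Because the second initializer has the recognized form, the call in
``@lookup_g`` is a candidate for being folded to the address of ``@g``.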

'``llvm.sideeffect``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.sideeffect() inaccessiblememonly nounwind

Overview:
"""""""""

The ``llvm.sideeffect`` intrinsic doesn't perform any operation. Optimizers
treat it as having side effects, so it can be inserted into a loop to
indicate that the loop shouldn't be assumed to terminate (which could
potentially lead to the loop being optimized away entirely), even if it's
an infinite loop with no other side effects.

Arguments:
""""""""""

None.

Semantics:
""""""""""

This intrinsic actually does nothing, but optimizers must assume that it
has externally observable side effects.
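
A minimal sketch of the intended use follows; the function name ``@spin`` is
a placeholder. The loop has no other side effects, so the call is what tells
optimizers not to assume that it terminates.

.. code-block:: llvm

      declare void @llvm.sideeffect()

      define void @spin() {
      entry:
        br label %loop

      loop:
        ; No operation is performed, but the call counts as a side effect.
        call void @llvm.sideeffect()
        br label %loop
      }

A frontend for a language in which such loops are well defined could emit
this call in every loop that it cannot prove to terminate.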

Stack Map Intrinsics
--------------------

LLVM provides experimental intrinsics to support runtime patching
mechanisms commonly desired in dynamic language JITs. These intrinsics
are described in :doc:`StackMaps`.

Element Wise Atomic Memory Intrinsics
-------------------------------------

These intrinsics are similar to the standard library memory intrinsics except
that they perform memory transfer as a sequence of atomic memory accesses.

.. _int_memcpy_element_unordered_atomic:

'``llvm.memcpy.element.unordered.atomic``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.memcpy.element.unordered.atomic`` on
any integer bit width and for different address spaces. Not all targets
support all bit widths, however.

::

      declare void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i32(i8* <dest>,
                                                                       i8* <src>,
                                                                       i32 <len>,
                                                                       i32 <element_size>)
      declare void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i64(i8* <dest>,
                                                                       i8* <src>,
                                                                       i64 <len>,
                                                                       i32 <element_size>)

Overview:
"""""""""

The '``llvm.memcpy.element.unordered.atomic.*``' intrinsic is a specialization of the
'``llvm.memcpy.*``' intrinsic. It differs in that the ``dest`` and ``src`` are treated
as arrays with elements that are exactly ``element_size`` bytes, and the copy between
buffers uses a sequence of :ref:`unordered atomic <ordering>` load/store operations
that are a positive integer multiple of the ``element_size`` in size.

Arguments:
""""""""""

The first three arguments are the same as they are in the :ref:`@llvm.memcpy <int_memcpy>`
intrinsic, with the added constraint that ``len`` is required to be a positive integer
multiple of the ``element_size``. If ``len`` is not a positive integer multiple of
``element_size``, then the behaviour of the intrinsic is undefined.

``element_size`` must be a compile-time constant positive power of two no greater than
a target-specific atomic access size limit.

For each of the input pointers, the ``align`` parameter attribute must be specified. It
must be a power of two no less than the ``element_size``. The caller guarantees that
both the source and destination pointers are aligned to that boundary.

Semantics:
""""""""""

The '``llvm.memcpy.element.unordered.atomic.*``' intrinsic copies ``len`` bytes of
memory from the source location to the destination location. These locations are not
allowed to overlap. The memory copy is performed as a sequence of load/store operations
where each access is guaranteed to be a multiple of ``element_size`` bytes wide and
aligned at an ``element_size`` boundary.

The order of the copy is unspecified. The same value may be read from the source
buffer many times, but only one write is issued to the destination buffer per
element. It is well defined to have concurrent reads and writes to both source and
destination provided those reads and writes are unordered atomic when specified.

This intrinsic does not provide any additional ordering guarantees over those
provided by a set of unordered loads from the source location and stores to the
destination.

Lowering:
"""""""""

In the most general case, a call to the '``llvm.memcpy.element.unordered.atomic.*``'
intrinsic is lowered to a call to the symbol
``__llvm_memcpy_element_unordered_atomic_*``, where '*' is replaced with the actual
element size.

The optimizer is allowed to inline the memory copy when it's profitable to do so.
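
As a sketch of a well-formed call (the function ``@copy16`` and its parameter
names are hypothetical), the following copies sixteen 4-byte elements. It
assumes the caller knows both buffers are 4-byte aligned and do not overlap.

.. code-block:: llvm

      declare void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i32(i8*, i8*, i32, i32)

      define void @copy16(i32* %to, i32* %from) {
      entry:
        %dest = bitcast i32* %to to i8*
        %src = bitcast i32* %from to i8*
        ; len = 64 is a positive multiple of element_size = 4, and both
        ; pointers carry an align attribute no less than element_size.
        call void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 64, i32 4)
        ret void
      }

Following the naming scheme described above, a call like this that is not
inlined would be expected to reach the symbol
``__llvm_memcpy_element_unordered_atomic_4``.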

'``llvm.memmove.element.unordered.atomic``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use
``llvm.memmove.element.unordered.atomic`` on any integer bit width and for
different address spaces. Not all targets support all bit widths, however.

::

      declare void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i32(i8* <dest>,
                                                                        i8* <src>,
                                                                        i32 <len>,
                                                                        i32 <element_size>)
      declare void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i64(i8* <dest>,
                                                                        i8* <src>,
                                                                        i64 <len>,
                                                                        i32 <element_size>)

Overview:
"""""""""

The '``llvm.memmove.element.unordered.atomic.*``' intrinsic is a specialization
of the '``llvm.memmove.*``' intrinsic. It differs in that the ``dest`` and
``src`` are treated as arrays with elements that are exactly ``element_size``
bytes, and the copy between buffers uses a sequence of
:ref:`unordered atomic <ordering>` load/store operations that are a positive
integer multiple of the ``element_size`` in size.

Arguments:
""""""""""

The first three arguments are the same as they are in the
:ref:`@llvm.memmove <int_memmove>` intrinsic, with the added constraint that
``len`` is required to be a positive integer multiple of the ``element_size``.
If ``len`` is not a positive integer multiple of ``element_size``, then the
behaviour of the intrinsic is undefined.

``element_size`` must be a compile-time constant positive power of two no
greater than a target-specific atomic access size limit.

For each of the input pointers, the ``align`` parameter attribute must be
specified. It must be a power of two no less than the ``element_size``. The
caller guarantees that both the source and destination pointers are aligned to
that boundary.

Semantics:
""""""""""

The '``llvm.memmove.element.unordered.atomic.*``' intrinsic copies ``len`` bytes
of memory from the source location to the destination location. These locations
are allowed to overlap. The memory copy is performed as a sequence of load/store
operations where each access is guaranteed to be a multiple of ``element_size``
bytes wide and aligned at an ``element_size`` boundary.

The order of the copy is unspecified. The same value may be read from the source
buffer many times, but only one write is issued to the destination buffer per
element. It is well defined to have concurrent reads and writes to both source
and destination provided those reads and writes are unordered atomic when
specified.

This intrinsic does not provide any additional ordering guarantees over those
provided by a set of unordered loads from the source location and stores to the
destination.

Lowering:
"""""""""

In the most general case, a call to the
'``llvm.memmove.element.unordered.atomic.*``' intrinsic is lowered to a call to
the symbol ``__llvm_memmove_element_unordered_atomic_*``, where '*' is replaced
with the actual element size.

The optimizer is allowed to inline the memory copy when it's profitable to do so.
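
The sketch below shows a case that the ``memmove`` form permits and the
``memcpy`` form does not: shifting elements within a single buffer. The
function ``@shift_down`` is hypothetical, and the example assumes ``%buf``
points to at least sixteen 4-byte-aligned ``i32`` elements.

.. code-block:: llvm

      declare void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i32(i8*, i8*, i32, i32)

      define void @shift_down(i32* %buf) {
      entry:
        %dest = bitcast i32* %buf to i8*
        %second = getelementptr inbounds i32, i32* %buf, i64 1
        %src = bitcast i32* %second to i8*
        ; Move 15 four-byte elements (60 bytes) down by one slot. The source
        ; and destination ranges overlap.
        call void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 60, i32 4)
        ret void
      }

Under the naming scheme described above, the corresponding library symbol for
this element size would be ``__llvm_memmove_element_unordered_atomic_4``.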

.. _int_memset_element_unordered_atomic:

'``llvm.memset.element.unordered.atomic``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.memset.element.unordered.atomic`` on
any integer bit width and for different address spaces. Not all targets
support all bit widths, however.

::

      declare void @llvm.memset.element.unordered.atomic.p0i8.i32(i8* <dest>,
                                                                  i8 <value>,
                                                                  i32 <len>,
                                                                  i32 <element_size>)
      declare void @llvm.memset.element.unordered.atomic.p0i8.i64(i8* <dest>,
                                                                  i8 <value>,
                                                                  i64 <len>,
                                                                  i32 <element_size>)

Overview:
"""""""""

The '``llvm.memset.element.unordered.atomic.*``' intrinsic is a specialization of the
'``llvm.memset.*``' intrinsic. It differs in that the ``dest`` is treated as an array
with elements that are exactly ``element_size`` bytes, and the assignment to that array
uses a sequence of :ref:`unordered atomic <ordering>` store operations
that are a positive integer multiple of the ``element_size`` in size.

Arguments:
""""""""""

The first three arguments are the same as they are in the :ref:`@llvm.memset <int_memset>`
intrinsic, with the added constraint that ``len`` is required to be a positive integer
multiple of the ``element_size``. If ``len`` is not a positive integer multiple of
``element_size``, then the behaviour of the intrinsic is undefined.

``element_size`` must be a compile-time constant positive power of two no greater than
a target-specific atomic access size limit.

The ``dest`` input pointer must have the ``align`` parameter attribute specified. It
must be a power of two no less than the ``element_size``. The caller guarantees that
the destination pointer is aligned to that boundary.

Semantics:
""""""""""

The '``llvm.memset.element.unordered.atomic.*``' intrinsic sets the ``len`` bytes of
memory starting at the destination location to the given ``value``. The memory is
set with a sequence of store operations where each access is guaranteed to be a
multiple of ``element_size`` bytes wide and aligned at an ``element_size`` boundary.

The order of the assignment is unspecified. Only one write is issued to the
destination buffer per element. It is well defined to have concurrent reads and
writes to the destination provided those reads and writes are unordered atomic
when specified.

This intrinsic does not provide any additional ordering guarantees over those
provided by a set of unordered stores to the destination.

Lowering:
"""""""""

In the most general case, a call to the '``llvm.memset.element.unordered.atomic.*``'
intrinsic is lowered to a call to the symbol
``__llvm_memset_element_unordered_atomic_*``, where '*' is replaced with the actual
element size.

The optimizer is allowed to inline the memory assignment when it's profitable to do so.
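
As a brief sketch (the function ``@clear16`` is hypothetical, and ``%buf`` is
assumed to point to at least sixteen 8-byte-aligned ``i64`` elements), the
following zeroes a buffer using 8-byte elements:

.. code-block:: llvm

      declare void @llvm.memset.element.unordered.atomic.p0i8.i32(i8*, i8, i32, i32)

      define void @clear16(i64* %buf) {
      entry:
        %dest = bitcast i64* %buf to i8*
        ; len = 128 is a positive multiple of element_size = 8, and the
        ; destination carries an align attribute no less than element_size.
        call void @llvm.memset.element.unordered.atomic.p0i8.i32(i8* align 8 %dest, i8 0, i32 128, i32 8)
        ret void
      }

Under the naming scheme described above, the corresponding library symbol for
this element size would be ``__llvm_memset_element_unordered_atomic_8``.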