==============================
LLVM Language Reference Manual
==============================

.. contents::
   :local:
   :depth: 4

Abstract
========

This document is a reference manual for the LLVM assembly language. LLVM
is a Static Single Assignment (SSA) based representation that provides
type safety, low-level operations, flexibility, and the capability of
representing 'all' high-level languages cleanly. It is the common code
representation used throughout all phases of the LLVM compilation
strategy.

Introduction
============

The LLVM code representation is designed to be used in three different
forms: as an in-memory compiler IR, as an on-disk bitcode representation
(suitable for fast loading by a Just-In-Time compiler), and as a human
readable assembly language representation. This allows LLVM to provide a
powerful intermediate representation for efficient compiler
transformations and analysis, while providing a natural means to debug
and visualize the transformations. The three different forms of LLVM are
all equivalent. This document describes the human readable
representation and notation.

The LLVM representation aims to be light-weight and low-level while
being expressive, typed, and extensible at the same time. It aims to be
a "universal IR" of sorts, by being at a low enough level that
high-level ideas may be cleanly mapped to it (similar to how
microprocessors are "universal IR's", allowing many source languages to
be mapped to them). By providing type information, LLVM can be used as
the target of optimizations: for example, through pointer analysis, it
can be proven that a C automatic variable is never accessed outside of
the current function, allowing it to be promoted to a simple SSA value
instead of a memory location.
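
As a rough sketch of that promotion, the two functions below compute the
same result. The function names are hypothetical and the second form is
written by hand here to show the idea; it is not the output of any
particular pass.

.. code-block:: llvm

    ; Before promotion: the C automatic variable lives in a stack slot.
    define i32 @square(i32 %x) {
    entry:
      %tmp = alloca i32
      store i32 %x, i32* %tmp
      %v = load i32, i32* %tmp
      %r = mul i32 %v, %v
      ret i32 %r
    }

    ; After promotion: the stack slot never escapes the function, so the
    ; loaded value can be replaced by the SSA value %x directly.
    define i32 @square.promoted(i32 %x) {
    entry:
      %r = mul i32 %x, %x
      ret i32 %r
    }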

.. _wellformed:

Well-Formedness
---------------

It is important to note that this document describes 'well formed' LLVM
assembly language. There is a difference between what the parser accepts
and what is considered 'well formed'. For example, the following
instruction is syntactically okay, but not well formed:

.. code-block:: llvm

    %x = add i32 1, %x

because the definition of ``%x`` does not dominate all of its uses. The
LLVM infrastructure provides a verification pass that may be used to
verify that an LLVM module is well formed. This pass is automatically
run by the parser after parsing input assembly and by the optimizer
before it outputs bitcode. The violations pointed out by the verifier
pass indicate bugs in transformation passes or input to the parser.
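
For contrast, here is a minimal sketch of a well-formed fragment in which
a value feeds back into its own computation through a ``phi`` node. The
function and value names are made up for illustration.

.. code-block:: llvm

    define i32 @count_to(i32 %n) {
    entry:
      br label %loop

    loop:
      ; %x takes 0 on entry and %x.next on every later iteration. The
      ; use of %x.next counts as occurring at the end of the %loop
      ; predecessor, where its definition has already executed.
      %x = phi i32 [ 0, %entry ], [ %x.next, %loop ]
      %x.next = add i32 %x, 1
      %done = icmp eq i32 %x.next, %n
      br i1 %done, label %exit, label %loop

    exit:
      ret i32 %x.next
    }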

.. _identifiers:

Identifiers
===========

LLVM identifiers come in two basic types: global and local. Global
identifiers (functions, global variables) begin with the ``'@'``
character. Local identifiers (register names, types) begin with the
``'%'`` character. Additionally, there are three different formats for
identifiers, for different purposes:

#. Named values are represented as a string of characters with their
   prefix. For example, ``%foo``, ``@DivisionByZero``,
   ``%a.really.long.identifier``. The actual regular expression used is
   '``[%@][-a-zA-Z$._][-a-zA-Z$._0-9]*``'. Identifiers that require other
   characters in their names can be surrounded with quotes. Special
   characters may be escaped using ``"\xx"`` where ``xx`` is the ASCII
   code for the character in hexadecimal. In this way, any character can
   be used in a named value, even quotes themselves. The ``"\01"`` prefix
   can be used on global variables to suppress mangling.
#. Unnamed values are represented as an unsigned numeric value with
   their prefix. For example, ``%12``, ``@2``, ``%44``.
#. Constants, which are described in the section Constants_ below.
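
The identifier formats above can be sketched with a few global and type
definitions. All names here are invented for illustration.

.. code-block:: llvm

    @DivisionByZero = global i32 0        ; plain named global
    @"name with spaces" = global i32 1    ; quoted name with other characters
    @"say \22hi\22" = global i32 2        ; \22 is the hex ASCII code for '"'
    @"\01raw_symbol" = global i32 3       ; \01 prefix suppresses mangling
    @0 = global i32 4                     ; unnamed global

    %"my struct" = type { i32, i32 }      ; quoted local (type) identifier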

LLVM requires that values start with a prefix for two reasons: Compilers
don't need to worry about name clashes with reserved words, and the set
of reserved words may be expanded in the future without penalty.
Additionally, unnamed identifiers allow a compiler to quickly come up
with a temporary variable without having to avoid symbol table
conflicts.

Reserved words in LLVM are very similar to reserved words in other
languages. There are keywords for different opcodes ('``add``',
'``bitcast``', '``ret``', etc...), for primitive type names ('``void``',
'``i32``', etc...), and others. These reserved words cannot conflict
with variable names, because none of them start with a prefix character
(``'%'`` or ``'@'``).

Here is an example of LLVM code to multiply the integer variable
'``%X``' by 8:

The easy way:

.. code-block:: llvm

    %result = mul i32 %X, 8

After strength reduction:

.. code-block:: llvm

    %result = shl i32 %X, 3

And the hard way:

.. code-block:: llvm

    %0 = add i32 %X, %X           ; yields i32:%0
    %1 = add i32 %0, %0           ; yields i32:%1
    %result = add i32 %1, %1

This last way of multiplying ``%X`` by 8 illustrates several important
lexical features of LLVM:

#. Comments are delimited with a '``;``' and go until the end of line.
#. Unnamed temporaries are created when the result of a computation is
   not assigned to a named value.
#. Unnamed temporaries are numbered sequentially (using a per-function
   incrementing counter, starting with 0). Note that basic blocks and unnamed
   function parameters are included in this numbering. For example, if the
   entry basic block is not given a label name and all function parameters are
   named, then it will get number 0.

It also shows a convention that we follow in this document. When
demonstrating instructions, we will follow an instruction with a comment
that defines the type and name of the value produced.
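
As a small sketch of the numbering rule for unnamed values (the function
names ``@sum`` and ``@twice`` are made up for illustration):

.. code-block:: llvm

    define i32 @sum(i32, i32) {
      ; The two unnamed parameters are %0 and %1 and the unlabeled entry
      ; block takes %2, so the first unnamed instruction is %3.
      %3 = add i32 %0, %1
      ret i32 %3
    }

    define i32 @twice(i32 %x) {
      ; All parameters are named, so the unlabeled entry block is %0 and
      ; the first unnamed instruction is %1.
      %1 = add i32 %x, %x
      ret i32 %1
    }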

High Level Structure
====================

Module Structure
----------------

LLVM programs are composed of ``Module``'s, each of which is a
translation unit of the input programs. Each module consists of
functions, global variables, and symbol table entries. Modules may be
combined together with the LLVM linker, which merges function (and
global variable) definitions, resolves forward declarations, and merges
symbol table entries. Here is an example of the "hello world" module:

.. code-block:: llvm

    ; Declare the string constant as a global constant.
    @.str = private unnamed_addr constant [13 x i8] c"hello world\0A\00"

    ; External declaration of the puts function
    declare i32 @puts(i8* nocapture) nounwind

    ; Definition of main function
    define i32 @main() {   ; i32()*
      ; Convert [13 x i8]* to i8*...
      %cast210 = getelementptr [13 x i8], [13 x i8]* @.str, i64 0, i64 0

      ; Call puts function to write out the string to stdout.
      call i32 @puts(i8* %cast210)
      ret i32 0
    }

    ; Named metadata
    !0 = !{i32 42, null, !"string"}
    !foo = !{!0}

This example is made up of a :ref:`global variable <globalvars>` named
"``.str``", an external declaration of the "``puts``" function, a
:ref:`function definition <functionstructure>` for "``main``" and
:ref:`named metadata <namedmetadatastructure>` "``foo``".

In general, a module is made up of a list of global values (where both
functions and global variables are global values). Global values are
represented by a pointer to a memory location (in this case, a pointer
to an array of char, and a pointer to a function), and have one of the
following :ref:`linkage types <linkage>`.

.. _linkage:

Linkage Types
-------------

All Global Variables and Functions have one of the following types of
linkage:

``private``
    Global values with "``private``" linkage are only directly
    accessible by objects in the current module. In particular, linking
    code into a module with a private global value may cause the
    private global to be renamed as necessary to avoid collisions. Because
    the symbol is private to the module, all references can be updated. A
    private symbol doesn't show up in any symbol table in the object file.
``internal``
    Similar to private, but the value shows as a local symbol
    (``STB_LOCAL`` in the case of ELF) in the object file. This
    corresponds to the notion of the '``static``' keyword in C.
``available_externally``
    Globals with "``available_externally``" linkage are never emitted into
    the object file corresponding to the LLVM module. From the linker's
    perspective, an ``available_externally`` global is equivalent to
    an external declaration. They exist to allow inlining and other
    optimizations to take place given knowledge of the definition of the
    global, which is known to be somewhere outside the module. Globals
    with ``available_externally`` linkage are allowed to be discarded at
    will. This linkage type is only allowed on definitions, not
    declarations.
``linkonce``
    Globals with "``linkonce``" linkage are merged with other globals of
    the same name when linkage occurs. This can be used to implement
    some forms of inline functions, templates, or other code which must
    be generated in each translation unit that uses it, but where the
    body may be overridden with a more definitive definition later.
    Unreferenced ``linkonce`` globals are allowed to be discarded. Note
    that ``linkonce`` linkage does not actually allow the optimizer to
    inline the body of this function into callers because it doesn't
    know if this definition of the function is the definitive definition
    within the program or whether it will be overridden by a stronger
    definition. To enable inlining and other optimizations, use
    "``linkonce_odr``" linkage.
``weak``
    "``weak``" linkage has the same merging semantics as ``linkonce``
    linkage, except that unreferenced globals with ``weak`` linkage may
    not be discarded. This is used for globals that are declared "weak"
    in C source code.
``common``
    "``common``" linkage is most similar to "``weak``" linkage, but it
    is used for tentative definitions in C, such as "``int X;``" at
    global scope. Symbols with "``common``" linkage are merged in the
    same way as ``weak`` symbols, and they may not be deleted if
    unreferenced. ``common`` symbols may not have an explicit section,
    must have a zero initializer, and may not be marked
    ':ref:`constant <globalvars>`'. Functions and aliases may not have
    common linkage.

.. _linkage_appending:

``appending``
    "``appending``" linkage may only be applied to global variables of
    pointer to array type. When two global variables with appending
    linkage are linked together, the two global arrays are appended
    together. This is the LLVM, typesafe, equivalent of having the
    system linker append together "sections" with identical names when
    .o files are linked.

    Unfortunately this doesn't correspond to any feature in .o files, so it
    can only be used for variables like ``llvm.global_ctors`` which llvm
    interprets specially.

``extern_weak``
    The semantics of this linkage follow the ELF object file model: the
    symbol is weak until linked; if not linked, the symbol becomes null
    instead of being an undefined reference.
``linkonce_odr``, ``weak_odr``
    Some languages allow differing globals to be merged, such as two
    functions with different semantics. Other languages, such as
    ``C++``, ensure that only equivalent globals are ever merged (the
    "one definition rule" --- "ODR"). Such languages can use the
    ``linkonce_odr`` and ``weak_odr`` linkage types to indicate that the
    global will only be merged with equivalent globals. These linkage
    types are otherwise the same as their non-``odr`` versions.
``external``
    If none of the above identifiers are used, the global is externally
    visible, meaning that it participates in linkage and can be used to
    resolve external symbol references.

It is illegal for a function *declaration* to have any linkage type
other than ``external`` or ``extern_weak``.
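
The following sketch shows where the linkage keyword appears on a few
globals and functions. All names are invented for illustration, and the
choice of linkage for each symbol is only an example.

.. code-block:: llvm

    @scratch = private global i32 0          ; no symbol table entry
    @counter = internal global i32 0         ; like a C 'static' variable
    @tentative = common global i32 0         ; C tentative definition, zero init
    @optional_var = extern_weak global i32   ; null if never linked

    ; Mergeable with equivalent definitions from other modules.
    define linkonce_odr i32 @inline_helper(i32 %x) {
      %r = add i32 %x, 1
      ret i32 %r
    }

    ; May be overridden by a stronger definition; never discarded.
    define weak void @default_hook() {
      ret void
    }

    ; No linkage keyword: external linkage.
    define void @api() {
      ret void
    }

    ; Declarations may only be external or extern_weak.
    declare extern_weak void @optional_fn()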

.. _callingconv:

Calling Conventions
-------------------

LLVM :ref:`functions <functionstructure>`, :ref:`calls <i_call>` and
:ref:`invokes <i_invoke>` can all have an optional calling convention
specified for the call. The calling convention of any pair of dynamic
caller/callee must match, or the behavior of the program is undefined.
The following calling conventions are supported by LLVM, and more may be
added in the future:

"``ccc``" - The C calling convention
    This calling convention (the default if no other calling convention
    is specified) matches the target C calling conventions. This calling
    convention supports varargs function calls and tolerates some
    mismatch in the declared prototype and implemented declaration of
    the function (as does normal C).
"``fastcc``" - The fast calling convention
    This calling convention attempts to make calls as fast as possible
    (e.g. by passing things in registers). This calling convention
    allows the target to use whatever tricks it wants to produce fast
    code for the target, without having to conform to an externally
    specified ABI (Application Binary Interface). `Tail calls can only
    be optimized when this, the GHC or the HiPE convention is
    used. <CodeGenerator.html#id80>`_ This calling convention does not
    support varargs and requires the prototype of all callees to exactly
    match the prototype of the function definition.
"``coldcc``" - The cold calling convention
    This calling convention attempts to make code in the caller as
    efficient as possible under the assumption that the call is not
    commonly executed. As such, these calls often preserve all registers
    so that the call does not break any live ranges in the caller side.
    This calling convention does not support varargs and requires the
    prototype of all callees to exactly match the prototype of the
    function definition. Furthermore the inliner doesn't consider such function
    calls for inlining.
"``cc 10``" - GHC convention
    This calling convention has been implemented specifically for use by
    the `Glasgow Haskell Compiler (GHC) <http://www.haskell.org/ghc>`_.
    It passes everything in registers, going to extremes to achieve this
    by disabling callee save registers. This calling convention should
    not be used lightly but only for specific situations such as an
    alternative to the *register pinning* performance technique often
    used when implementing functional programming languages. At the
    moment only X86 supports this convention and it has the following
    limitations:

    -  On *X86-32* only supports up to 4 bit type parameters. No
       floating point types are supported.
    -  On *X86-64* only supports up to 10 bit type parameters and 6
       floating point parameters.

    This calling convention supports `tail call
    optimization <CodeGenerator.html#id80>`_ but requires that both the
    caller and callee use it.
"``cc 11``" - The HiPE calling convention
    This calling convention has been implemented specifically for use by
    the `High-Performance Erlang
    (HiPE) <http://www.it.uu.se/research/group/hipe/>`_ compiler, *the*
    native code compiler of the `Ericsson's Open Source Erlang/OTP
    system <http://www.erlang.org/download.shtml>`_. It uses more
    registers for argument passing than the ordinary C calling
    convention and defines no callee-saved registers. The calling
    convention properly supports `tail call
    optimization <CodeGenerator.html#id80>`_ but requires that both the
    caller and the callee use it. It uses a *register pinning*
    mechanism, similar to GHC's convention, for keeping frequently
    accessed runtime components pinned to specific hardware registers.
    At the moment only X86 supports this convention (both 32 and 64
    bit).
"``webkit_jscc``" - WebKit's JavaScript calling convention
    This calling convention has been implemented for `WebKit FTL JIT
    <https://trac.webkit.org/wiki/FTLJIT>`_. It passes arguments on the
    stack right to left (as cdecl does), and returns a value in the
    platform's customary return register.
"``anyregcc``" - Dynamic calling convention for code patching
    This is a special convention that supports patching an arbitrary code
    sequence in place of a call site. This convention forces the call
    arguments into registers but allows them to be dynamically
    allocated. This can currently only be used with calls to
    ``llvm.experimental.patchpoint`` because only this intrinsic records
    the location of its arguments in a side table. See :doc:`StackMaps`.
"``preserve_mostcc``" - The `PreserveMost` calling convention
    This calling convention attempts to make the code in the caller as
    unintrusive as possible. This convention behaves identically to the `C`
    calling convention on how arguments and return values are passed, but it
    uses a different set of caller/callee-saved registers. This alleviates the
    burden of saving and recovering a large register set before and after the
    call in the caller. If the arguments are passed in callee-saved registers,
    then they will be preserved by the callee across the call. This doesn't
    apply for values returned in callee-saved registers.

    - On X86-64 the callee preserves all general purpose registers, except for
      R11. R11 can be used as a scratch register. Floating-point registers
      (XMMs/YMMs) are not preserved and need to be saved by the caller.

    The idea behind this convention is to support calls to runtime functions
    that have a hot path and a cold path. The hot path is usually a small piece
    of code that doesn't use many registers. The cold path might need to call out to
    another function and therefore only needs to preserve the caller-saved
    registers, which haven't already been saved by the caller. The
    `PreserveMost` calling convention is very similar to the `cold` calling
    convention in terms of caller/callee-saved registers, but they are used for
    different types of function calls. `coldcc` is for function calls that are
    rarely executed, whereas `preserve_mostcc` function calls are intended to be
    on the hot path and definitely executed a lot. Furthermore `preserve_mostcc`
    doesn't prevent the inliner from inlining the function call.

    This calling convention will be used by a future version of the ObjectiveC
    runtime and should therefore still be considered experimental at this time.
    Although this convention was created to optimize certain runtime calls to
    the ObjectiveC runtime, it is not limited to this runtime and might be used
    by other runtimes in the future too. The current implementation only
    supports X86-64, but the intention is to support more architectures in the
    future.
"``preserve_allcc``" - The `PreserveAll` calling convention
    This calling convention attempts to make the code in the caller even less
    intrusive than the `PreserveMost` calling convention. This calling
    convention also behaves identically to the `C` calling convention on how
    arguments and return values are passed, but it uses a different set of
    caller/callee-saved registers. This removes the burden of saving and
    recovering a large register set before and after the call in the caller. If
    the arguments are passed in callee-saved registers, then they will be
    preserved by the callee across the call. This doesn't apply for values
    returned in callee-saved registers.

    - On X86-64 the callee preserves all general purpose registers, except for
      R11. R11 can be used as a scratch register. Furthermore it also preserves
      all floating-point registers (XMMs/YMMs).

    The idea behind this convention is to support calls to runtime functions
    that don't need to call out to any other functions.

    This calling convention, like the `PreserveMost` calling convention, will be
    used by a future version of the ObjectiveC runtime and should be considered
    experimental at this time.
"``cxx_fast_tlscc``" - The `CXX_FAST_TLS` calling convention for access functions
    Clang generates an access function to access C++-style TLS. The access
    function generally has an entry block, an exit block and an initialization
    block that is run the first time the function is called. The entry and
    exit blocks can access a few TLS IR variables; each access will be lowered
    to a platform-specific sequence.

    This calling convention aims to minimize overhead in the caller by
    preserving as many registers as possible (all the registers that are
    preserved on the fast path, composed of the entry and exit blocks).

    This calling convention behaves identically to the `C` calling convention on
    how arguments and return values are passed, but it uses a different set of
    caller/callee-saved registers.

    Given that each platform has its own lowering sequence, hence its own set
    of preserved registers, we can't use the existing `PreserveMost`.

    - On X86-64 the callee preserves all general purpose registers, except for
      RDI and RAX.
"``swiftcc``" - This calling convention is used for the Swift language.
    - On X86-64 RCX and R8 are available for additional integer returns, and
      XMM2 and XMM3 are available for additional FP/vector returns.
    - On iOS platforms, we use the AAPCS-VFP calling convention.
"``cc <n>``" - Numbered convention
    Any calling convention may be specified by number, allowing
    target-specific calling conventions to be used. Target specific
    calling conventions start at 64.

More calling conventions can be added/defined on an as-needed basis, to
support Pascal conventions or any other well-known target-independent
convention.
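
As a minimal sketch of how a calling convention is written, the keyword
appears both on the function and on each call to it, and the two must
agree. The function names below are hypothetical; whether a given
convention is usable depends on the target.

.. code-block:: llvm

    ; Definition using the fast calling convention.
    define fastcc i32 @add_fast(i32 %a, i32 %b) {
      %sum = add i32 %a, %b
      ret i32 %sum
    }

    ; Rarely executed error path, declared with the cold convention.
    declare coldcc void @report_error(i32)

    ; A convention given by number ("cc 10" is the GHC convention).
    declare cc 10 void @ghc_entry(i64)

    ; The caller itself uses the default C convention ("ccc").
    define i32 @caller(i32 %x) {
      ; The convention on each call matches the one on the callee;
      ; a mismatch would make the behavior undefined.
      %r = call fastcc i32 @add_fast(i32 %x, i32 1)
      call coldcc void @report_error(i32 %r)
      ret i32 %r
    }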

.. _visibilitystyles:

Visibility Styles
-----------------

All Global Variables and Functions have one of the following visibility
styles:

"``default``" - Default style
    On targets that use the ELF object file format, default visibility
    means that the declaration is visible to other modules and, in
    shared libraries, means that the declared entity may be overridden.
    On Darwin, default visibility means that the declaration is visible
    to other modules. Default visibility corresponds to "external
    linkage" in the language.
"``hidden``" - Hidden style
    Two declarations of an object with hidden visibility refer to the
    same object if they are in the same shared object. Usually, hidden
    visibility indicates that the symbol will not be placed into the
    dynamic symbol table, so no other module (executable or shared
    library) can reference it directly.
"``protected``" - Protected style
    On ELF, protected visibility indicates that the symbol will be
    placed in the dynamic symbol table, but that references within the
    defining module will bind to the local symbol. That is, the symbol
    cannot be overridden by another module.

A symbol with ``internal`` or ``private`` linkage must have ``default``
visibility.
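
A short sketch of where the visibility keyword is written (all symbol
names are invented for illustration):

.. code-block:: llvm

    @visible = default global i32 0       ; same as writing no visibility
    @impl_detail = hidden global i32 0    ; kept out of the dynamic symbol table

    ; Exported, but calls from within the defining module bind locally.
    define protected void @entry_point() {
      ret void
    }

    declare hidden void @helper()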

.. _dllstorageclass:

DLL Storage Classes
-------------------

All Global Variables, Functions and Aliases can have one of the following
DLL storage classes:

``dllimport``
    "``dllimport``" causes the compiler to reference a function or variable via
    a global pointer to a pointer that is set up by the DLL exporting the
    symbol. On Microsoft Windows targets, the pointer name is formed by
    combining ``__imp_`` and the function or variable name.
``dllexport``
    "``dllexport``" causes the compiler to provide a global pointer to a pointer
    in a DLL, so that it can be referenced with the ``dllimport`` attribute. On
    Microsoft Windows targets, the pointer name is formed by combining
    ``__imp_`` and the function or variable name. Since this storage class
    exists for defining a dll interface, the compiler, assembler and linker know
    it is externally referenced and must refrain from deleting the symbol.
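
A brief sketch of both storage classes, using hypothetical symbol names:

.. code-block:: llvm

    ; Symbols provided by another DLL.
    @imported_var = external dllimport global i32
    declare dllimport void @imported_fn()

    ; Symbols this module makes available to importers.
    @exported_var = dllexport global i32 0

    define dllexport void @exported_fn() {
      ret void
    }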

.. _tls_model:

Thread Local Storage Models
---------------------------

A variable may be defined as ``thread_local``, which means that it will
not be shared by threads (each thread will have a separate copy of the
variable). Not all targets support thread-local variables. Optionally, a
TLS model may be specified:

``localdynamic``
    For variables that are only used within the current shared library.
``initialexec``
    For variables in modules that will not be loaded dynamically.
``localexec``
    For variables defined in the executable and only used within it.

If no explicit model is given, the "general dynamic" model is used.

The models correspond to the ELF TLS models; see `ELF Handling For
Thread-Local Storage <http://people.redhat.com/drepper/tls.pdf>`_ for
more information on the circumstances under which the different models
may be used. The target may choose a different TLS model if the specified
model is not supported, or if a better choice of model can be made.

A model can also be specified in an alias, but then it only governs how
the alias is accessed. It will not have any effect on the aliasee.

For platforms without linker support for the ELF TLS model, the
``-femulated-tls`` flag can be used to generate GCC compatible emulated
TLS code.
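
For illustration, the model is written in parentheses after the
``thread_local`` keyword. The variable names below are made up.

.. code-block:: llvm

    @tls_default = thread_local global i32 0              ; general dynamic
    @tls_ld = thread_local(localdynamic) global i32 0
    @tls_ie = thread_local(initialexec) global i32 0
    @tls_le = thread_local(localexec) global i32 0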

.. _runtime_preemption_model:

Runtime Preemption Specifiers
-----------------------------

Global variables, functions and aliases may have an optional runtime preemption
specifier. If a preemption specifier isn't given explicitly, then a
symbol is assumed to be ``dso_preemptable``.

``dso_preemptable``
    Indicates that the function or variable may be replaced by a symbol from
    outside the linkage unit at runtime.

``dso_local``
    The compiler may assume that a function or variable marked as ``dso_local``
    will resolve to a symbol within the same linkage unit. Direct access will
    be generated even if the definition is not within this compilation unit.
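
A small sketch of the two specifiers on hypothetical symbols:

.. code-block:: llvm

    ; Assumed to resolve within this linkage unit, so it can be accessed
    ; directly even though only a declaration is visible here.
    declare dso_local void @local_fn()
    @local_var = dso_local global i32 0

    ; Spelled out explicitly; this is also the default when omitted.
    @replaceable = dso_preemptable global i32 0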

.. _namedtypes:

Structure Types
---------------

LLVM IR allows you to specify both "identified" and "literal" :ref:`structure
types <t_struct>`. Literal types are uniqued structurally, but identified types
are never uniqued. An :ref:`opaque structural type <t_opaque>` can also be used
to forward declare a type that is not yet available.

An example of an identified structure specification is:

.. code-block:: llvm

    %mytype = type { %mytype*, i32 }

Prior to the LLVM 3.0 release, identified types were structurally uniqued. Only
literal types are uniqued in recent versions of LLVM.
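
To make the distinction concrete, here is a sketch that uses an
identified type, an opaque type and a literal type side by side. All
names are invented for illustration.

.. code-block:: llvm

    %node = type { %node*, i32 }     ; identified: may refer to itself
    %handle = type opaque            ; forward declaration, body unknown

    ; A literal struct type is written inline wherever it is used.
    @origin = global { i32, i32 } { i32 0, i32 0 }

    ; An identified type is used by name.
    @head = global %node { %node* null, i32 7 }

    ; An opaque type can still name the type of an external global.
    @resource = external global %handle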
Sean Silvab084af42012-12-07 10:36:55 +0000566
Sanjoy Dasc6af5ea2016-07-28 23:43:38 +0000567.. _nointptrtype:
568
569Non-Integral Pointer Type
570-------------------------
571
572Note: non-integral pointer types are a work in progress, and they should be
573considered experimental at this time.
574
575LLVM IR optionally allows the frontend to denote pointers in certain address
Sanjoy Das63752e62016-08-10 21:48:24 +0000576spaces as "non-integral" via the :ref:`datalayout string<langref_datalayout>`.
577Non-integral pointer types represent pointers that have an *unspecified* bitwise
578representation; that is, the integral representation may be target dependent or
579unstable (not backed by a fixed integer).
Sanjoy Dasc6af5ea2016-07-28 23:43:38 +0000580
581``inttoptr`` instructions converting integers to non-integral pointer types are
582ill-typed, and so are ``ptrtoint`` instructions converting values of
583non-integral pointer types to integers. Vector versions of said instructions
584are ill-typed as well.
585
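As a rough sketch, assuming the ``ni`` component of the datalayout string is
used to list the non-integral address spaces (address space 1 is chosen
arbitrarily here), both casts below would be rejected as ill-typed:

.. code-block:: llvm

    ; Assumption: "ni:1" marks address space 1 as non-integral.
    target datalayout = "e-ni:1"

    define i8 addrspace(1)* @bad_inttoptr(i64 %v) {
      ; Ill-typed: integer converted to a non-integral pointer.
      %p = inttoptr i64 %v to i8 addrspace(1)*
      ret i8 addrspace(1)* %p
    }

    define i64 @bad_ptrtoint(i8 addrspace(1)* %p) {
      ; Ill-typed: non-integral pointer converted to an integer.
      %v = ptrtoint i8 addrspace(1)* %p to i64
      ret i64 %v
    }
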
Sean Silvab084af42012-12-07 10:36:55 +0000586.. _globalvars:
587
588Global Variables
589----------------
590
591Global variables define regions of memory allocated at compilation time
Rafael Espindola5d1b7452013-10-29 13:44:11 +0000592instead of run-time.
593
Eric Christopher1e61ffd2015-02-19 18:46:25 +0000594Global variable definitions must be initialized.
Rafael Espindola5d1b7452013-10-29 13:44:11 +0000595
596Global variables in other translation units can also be declared, in which
597case they don't have an initializer.
Sean Silvab084af42012-12-07 10:36:55 +0000598
Either global variable definitions or declarations may have an explicit section
to be placed in and may have an optional explicit alignment specified. If there
is a mismatch between the explicit or inferred section information for the
variable declaration and its definition, the resulting behavior is undefined.
Bob Wilson85b24f22014-06-12 20:40:33 +0000603
A variable may be defined as a global ``constant``, which indicates that
the contents of the variable will **never** be modified (enabling better
optimization, allowing the global data to be placed in the read-only
section of an executable, etc.). Note that variables that need runtime
initialization cannot be marked ``constant`` as there is a store to the
variable.
610
611LLVM explicitly allows *declarations* of global variables to be marked
612constant, even if the final definition of the global is not. This
613capability can be used to enable slightly better optimization of the
614program, but requires the language definition to guarantee that
615optimizations based on the 'constantness' are valid for the translation
616units that do not include the definition.
617
618As SSA values, global variables define pointer values that are in scope
619(i.e. they dominate) all basic blocks in the program. Global variables
620always define a pointer to their "content" type because they describe a
621region of memory, and all memory objects in LLVM are accessed through
622pointers.
623
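A small sketch of the points above, using hypothetical names (``@primes``,
``@limit``, ``@first_prime``): a ``constant`` definition, a declaration marked
``constant``, and the fact that a global's name denotes a pointer to its
content type:

.. code-block:: llvm

    ; Definition whose contents are never modified.
    @primes = constant [3 x i32] [i32 2, i32 3, i32 5]

    ; Declaration marked constant; the definition lives in another
    ; translation unit and need not itself be marked constant.
    @limit = external constant i32

    define i32 @first_prime() {
      ; @primes has type [3 x i32]*, so it is accessed through a pointer.
      %p = getelementptr [3 x i32], [3 x i32]* @primes, i32 0, i32 0
      %v = load i32, i32* %p
      ret i32 %v
    }
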
Global variables can be marked with ``unnamed_addr`` which indicates
that the address is not significant, only the content. Constants marked
like this can be merged with other constants if they have the same
initializer. Note that a constant with significant address *can* be
merged with an ``unnamed_addr`` constant, the result being a constant
whose address is significant.
630
Peter Collingbourne96efdd62016-06-14 21:01:22 +0000631If the ``local_unnamed_addr`` attribute is given, the address is known to
632not be significant within the module.
633
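For illustration (the global names are invented), two ``unnamed_addr``
constants with the same initializer are candidates for merging, while
``local_unnamed_addr`` makes the weaker, module-local claim:

.. code-block:: llvm

    ; Same initializer and insignificant addresses: these may be merged.
    @.str.a = private unnamed_addr constant [3 x i8] c"hi\00"
    @.str.b = private unnamed_addr constant [3 x i8] c"hi\00"

    ; The address is only known to be insignificant within this module.
    @counter = local_unnamed_addr global i32 0
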
Sean Silvab084af42012-12-07 10:36:55 +0000634A global variable may be declared to reside in a target-specific
635numbered address space. For targets that support them, address spaces
636may affect how optimizations are performed and/or what target
637instructions are used to access the variable. The default address space
638is zero. The address space qualifier must precede any other attributes.
639
LLVM allows an explicit section to be specified for globals. If the
target supports it, it will emit globals to the section specified.
Additionally, the global can be placed in a comdat if the target has the
necessary support.
Sean Silvab084af42012-12-07 10:36:55 +0000644
Jonas Devlieghereaaecdc42017-11-06 11:47:24 +0000645External declarations may have an explicit section specified. Section
646information is retained in LLVM IR for targets that make use of this
647information. Attaching section information to an external declaration is an
648assertion that its definition is located in the specified section. If the
649definition is located in a different section, the behavior is undefined.
Erich Keane0343ef82017-08-22 15:30:43 +0000650
Michael Gottesmane743a302013-02-04 03:22:00 +0000651By default, global initializers are optimized by assuming that global
Michael Gottesmanef2bc772013-02-03 09:57:15 +0000652variables defined within the module are not modified from their
Sean Silvaa1190322015-08-06 22:56:48 +0000653initial values before the start of the global initializer. This is
Michael Gottesmanef2bc772013-02-03 09:57:15 +0000654true even for variables potentially accessible from outside the
655module, including those with external linkage or appearing in
Yunzhong Gaof5b769e2013-12-05 18:37:54 +0000656``@llvm.used`` or dllexported variables. This assumption may be suppressed
657by marking the variable with ``externally_initialized``.
Michael Gottesmanef2bc772013-02-03 09:57:15 +0000658
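As a minimal sketch (``@device_table`` is a hypothetical name), the following
global carries an initializer, but optimizers may not assume that it still
holds that value when the program starts:

.. code-block:: llvm

    ; The value may be changed externally before global initializers run.
    @device_table = externally_initialized global i32 0
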
An explicit alignment may be specified for a global, which must be a
power of 2. If not present, or if the alignment is set to zero, the
alignment of the global is set by the target to whatever it finds
convenient. If an explicit alignment is specified, the global is forced
to have exactly that alignment. Targets and optimizers are not allowed
to over-align the global if the global has an assigned section. In this
case, the extra alignment could be observable: for example, code could
assume that the globals are densely packed in their section and try to
iterate over them as an array; alignment padding would break this
iteration. The maximum alignment is ``1 << 29``.
Sean Silvab084af42012-12-07 10:36:55 +0000669
Globals can also have a :ref:`DLL storage class <dllstorageclass>`,
an optional :ref:`runtime preemption specifier <runtime_preemption_model>`,
optional :ref:`global attributes <glattrs>` and
an optional list of attached :ref:`metadata <metadata>`.
Nico Rieck7157bb72014-01-14 15:22:47 +0000674
Peter Collingbourne69ba0162015-02-04 00:42:45 +0000675Variables and aliases can have a
Rafael Espindola59f7eba2014-05-28 18:15:43 +0000676:ref:`Thread Local Storage Model <tls_model>`.
677
Syntax::

      @<GlobalVarName> = [Linkage] [PreemptionSpecifier] [Visibility]
                         [DLLStorageClass] [ThreadLocal]
                         [(unnamed_addr|local_unnamed_addr)] [AddrSpace]
                         [ExternallyInitialized]
                         <global | constant> <Type> [<InitializerConstant>]
                         [, section "name"] [, comdat [($name)]]
                         [, align <Alignment>] (, !name !N)*
Nico Rieck7157bb72014-01-14 15:22:47 +0000687
For example, the following defines a global in a numbered address space
with an initializer, section, and alignment:

.. code-block:: llvm

    @G = addrspace(5) constant float 1.0, section "foo", align 4
694
The following example just declares a global variable:

.. code-block:: llvm

    @G = external global i32
700
The following example defines a thread-local global with the
``initialexec`` TLS model:

.. code-block:: llvm

    @G = thread_local(initialexec) global i32 0, align 4
707
708.. _functionstructure:
709
710Functions
711---------
712
713LLVM function definitions consist of the "``define``" keyword, an
Sean Fertilec70d28b2017-10-26 15:00:26 +0000714optional :ref:`linkage type <linkage>`, an optional :ref:`runtime preemption
715specifier <runtime_preemption_model>`, an optional :ref:`visibility
Nico Rieck7157bb72014-01-14 15:22:47 +0000716style <visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`,
717an optional :ref:`calling convention <callingconv>`,
Sean Silvab084af42012-12-07 10:36:55 +0000718an optional ``unnamed_addr`` attribute, a return type, an optional
719:ref:`parameter attribute <paramattrs>` for the return type, a function
720name, a (possibly empty) argument list (each with optional :ref:`parameter
721attributes <paramattrs>`), optional :ref:`function attributes <fnattrs>`,
David Majnemerdad0a642014-06-27 18:19:56 +0000722an optional section, an optional alignment,
723an optional :ref:`comdat <langref_comdats>`,
Peter Collingbourne51d2de72014-12-03 02:08:38 +0000724an optional :ref:`garbage collector name <gc>`, an optional :ref:`prefix <prefixdata>`,
David Majnemer7fddecc2015-06-17 20:52:32 +0000725an optional :ref:`prologue <prologuedata>`,
726an optional :ref:`personality <personalityfn>`,
Peter Collingbourne50108682015-11-06 02:41:02 +0000727an optional list of attached :ref:`metadata <metadata>`,
David Majnemer7fddecc2015-06-17 20:52:32 +0000728an opening curly brace, a list of basic blocks, and a closing curly brace.
Sean Silvab084af42012-12-07 10:36:55 +0000729
730LLVM function declarations consist of the "``declare``" keyword, an
Peter Collingbourne96efdd62016-06-14 21:01:22 +0000731optional :ref:`linkage type <linkage>`, an optional :ref:`visibility style
732<visibility>`, an optional :ref:`DLL storage class <dllstorageclass>`, an
733optional :ref:`calling convention <callingconv>`, an optional ``unnamed_addr``
734or ``local_unnamed_addr`` attribute, a return type, an optional :ref:`parameter
735attribute <paramattrs>` for the return type, a function name, a possibly
736empty list of arguments, an optional alignment, an optional :ref:`garbage
737collector name <gc>`, an optional :ref:`prefix <prefixdata>`, and an optional
738:ref:`prologue <prologuedata>`.
Sean Silvab084af42012-12-07 10:36:55 +0000739
Bill Wendling6822ecb2013-10-27 05:09:12 +0000740A function definition contains a list of basic blocks, forming the CFG (Control
741Flow Graph) for the function. Each basic block may optionally start with a label
742(giving the basic block a symbol table entry), contains a list of instructions,
743and ends with a :ref:`terminator <terminators>` instruction (such as a branch or
744function return). If an explicit label is not provided, a block is assigned an
745implicit numbered label, using the next value from the same counter as used for
746unnamed temporaries (:ref:`see above<identifiers>`). For example, if a function
747entry block does not have an explicit label, it will be assigned label "%0",
748then the first unnamed temporary in that block will be "%1", etc.
Sean Silvab084af42012-12-07 10:36:55 +0000749
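The numbering rule can be seen in a small, hypothetical function. The argument
is named, so the unlabeled entry block takes ``%0`` and the first unnamed
temporary takes ``%1``:

.. code-block:: llvm

    define i32 @increment(i32 %x) {
      ; No explicit label here: the entry block is implicitly %0.
      %1 = add i32 %x, 1
      ret i32 %1
    }

Note that an unnamed argument would also draw from the same counter, which
shifts the numbers of the entry block and the temporaries that follow.
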
The first basic block in a function is special in two ways: it is
immediately executed on entrance to the function, and it is not allowed
to have predecessor basic blocks (i.e. there cannot be any branches to
the entry block of a function). Because the block can have no
predecessors, it also cannot have any :ref:`PHI nodes <i_phi>`.
755
756LLVM allows an explicit section to be specified for functions. If the
757target supports it, it will emit functions to the section specified.
Eric Christopher1e61ffd2015-02-19 18:46:25 +0000758Additionally, the function can be placed in a COMDAT.
Sean Silvab084af42012-12-07 10:36:55 +0000759
An explicit alignment may be specified for a function. If not present,
or if the alignment is set to zero, the alignment of the function is set
by the target to whatever it finds convenient. If an explicit alignment
is specified, the function is forced to have at least that much
alignment. All alignments must be a power of 2.
765
Eric Christopher1e61ffd2015-02-19 18:46:25 +0000766If the ``unnamed_addr`` attribute is given, the address is known to not
Sean Silvab084af42012-12-07 10:36:55 +0000767be significant and two identical functions can be merged.
768
Peter Collingbourne96efdd62016-06-14 21:01:22 +0000769If the ``local_unnamed_addr`` attribute is given, the address is known to
770not be significant within the module.
771
Syntax::

    define [linkage] [PreemptionSpecifier] [visibility] [DLLStorageClass]
           [cconv] [ret attrs]
           <ResultType> @<FunctionName> ([argument list])
           [(unnamed_addr|local_unnamed_addr)] [fn Attrs] [section "name"]
           [comdat [($name)]] [align N] [gc] [prefix Constant]
           [prologue Constant] [personality Constant] (!name !N)* { ... }

The argument list is a comma-separated sequence of arguments where each
argument is of the following form:

Syntax::

    <type> [parameter Attrs] [name]
787
788
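To make the ordering of the optional pieces concrete, here is a hedged sketch
of one definition and one declaration. The function names and the section
name are invented for the example:

.. code-block:: llvm

    ; linkage, calling convention, result type, name, argument list,
    ; unnamed_addr, function attributes, section, alignment, body.
    define internal fastcc i32 @add(i32 %a, i32 %b) unnamed_addr nounwind
        section ".text.add" align 16 {
    entry:
      %sum = add i32 %a, %b
      ret i32 %sum
    }

    ; A declaration has no body; each argument may carry parameter attributes.
    declare i32 @puts(i8* nocapture) nounwind
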
Eli Benderskyfdc529a2013-06-07 19:40:08 +0000789.. _langref_aliases:
790
Sean Silvab084af42012-12-07 10:36:55 +0000791Aliases
792-------
793
Aliases, unlike functions or variables, don't create any new data. They
are just a new symbol and metadata for an existing position.
796
797Aliases have a name and an aliasee that is either a global value or a
798constant expression.
799
Nico Rieck7157bb72014-01-14 15:22:47 +0000800Aliases may have an optional :ref:`linkage type <linkage>`, an optional
Sean Fertilec70d28b2017-10-26 15:00:26 +0000801:ref:`runtime preemption specifier <runtime_preemption_model>`, an optional
Rafael Espindola64c1e182014-06-03 02:41:57 +0000802:ref:`visibility style <visibility>`, an optional :ref:`DLL storage class
803<dllstorageclass>` and an optional :ref:`tls model <tls_model>`.
Sean Silvab084af42012-12-07 10:36:55 +0000804
Syntax::

    @<Name> = [Linkage] [PreemptionSpecifier] [Visibility] [DLLStorageClass] [ThreadLocal] [(unnamed_addr|local_unnamed_addr)] alias <AliaseeTy>, <AliaseeTy>* @<Aliasee>
Sean Silvab084af42012-12-07 10:36:55 +0000808
The linkage must be one of ``private``, ``internal``, ``linkonce``, ``weak``,
``linkonce_odr``, ``weak_odr``, or ``external``. Note that some system linkers
might not correctly handle dropping a weak symbol that is aliased.
Rafael Espindola78527052013-10-06 15:10:43 +0000812
Eric Christopher1e61ffd2015-02-19 18:46:25 +0000813Aliases that are not ``unnamed_addr`` are guaranteed to have the same address as
Rafael Espindola42a4c9f2014-06-06 01:20:28 +0000814the aliasee expression. ``unnamed_addr`` ones are only guaranteed to point
815to the same content.
Rafael Espindolaf3336bc2014-03-12 20:15:49 +0000816
Peter Collingbourne96efdd62016-06-14 21:01:22 +0000817If the ``local_unnamed_addr`` attribute is given, the address is known to
818not be significant within the module.
819
Rafael Espindola64c1e182014-06-03 02:41:57 +0000820Since aliases are only a second name, some restrictions apply, of which
821some can only be checked when producing an object file:
Rafael Espindolaf3336bc2014-03-12 20:15:49 +0000822
Rafael Espindola64c1e182014-06-03 02:41:57 +0000823* The expression defining the aliasee must be computable at assembly
824 time. Since it is just a name, no relocations can be used.
825
826* No alias in the expression can be weak as the possibility of the
827 intermediate alias being overridden cannot be represented in an
828 object file.
829
830* No global value in the expression can be a declaration, since that
831 would require a relocation, which is not possible.
Rafael Espindola24a669d2014-03-27 15:26:56 +0000832
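A minimal sketch of aliases that satisfy these restrictions (all names are
hypothetical). Both aliasees are definitions in the same module, so no
relocation is needed:

.. code-block:: llvm

    @counter = global i32 0
    ; A second name for the same variable.
    @counter_alias = alias i32, i32* @counter

    define void @impl() {
      ret void
    }
    ; A weak second name for the same function.
    @impl_v2 = weak alias void (), void ()* @impl
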
Dmitry Polukhina1feff72016-04-07 12:32:19 +0000833.. _langref_ifunc:
834
835IFuncs
836-------
837
IFuncs, like aliases, don't create any new data or functions. They are just a
new symbol that the dynamic linker resolves at runtime by calling a resolver
function.

IFuncs have a name and a resolver, which is a function called by the dynamic
linker that returns the address of another function associated with the name.

IFuncs may have an optional :ref:`linkage type <linkage>` and an optional
:ref:`visibility style <visibility>`.

Syntax::

    @<Name> = [Linkage] [Visibility] ifunc <IFuncTy>, <ResolverTy>* @<Resolver>
850
851
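The following sketch shows one plausible shape for an IFunc. It assumes a
resolver that returns a pointer to the selected implementation; the names
``@foo``, ``@foo_impl`` and ``@foo_resolver`` are invented for the example:

.. code-block:: llvm

    ; The implementation that the resolver selects.
    define i32 @foo_impl(i32 %x) {
      ret i32 %x
    }

    ; Called by the dynamic linker; returns the address to bind @foo to.
    define internal i32 (i32)* @foo_resolver() {
      ret i32 (i32)* @foo_impl
    }

    @foo = ifunc i32 (i32), i32 (i32)* ()* @foo_resolver
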
David Majnemerdad0a642014-06-27 18:19:56 +0000852.. _langref_comdats:
853
854Comdats
855-------
856
857Comdat IR provides access to COFF and ELF object file COMDAT functionality.
858
Sean Silvaa1190322015-08-06 22:56:48 +0000859Comdats have a name which represents the COMDAT key. All global objects that
David Majnemerdad0a642014-06-27 18:19:56 +0000860specify this key will only end up in the final object file if the linker chooses
Sean Silvaa1190322015-08-06 22:56:48 +0000861that key over some other key. Aliases are placed in the same COMDAT that their
David Majnemerdad0a642014-06-27 18:19:56 +0000862aliasee computes to, if any.
863
864Comdats have a selection kind to provide input on how the linker should
865choose between keys in two different object files.
866
Syntax::

    $<Name> = comdat SelectionKind

The selection kind must be one of the following:

``any``
    The linker may choose any COMDAT key; the choice is arbitrary.
``exactmatch``
    The linker may choose any COMDAT key, but the sections must contain the
    same data.
``largest``
    The linker will choose the section containing the largest COMDAT key.
``noduplicates``
    The linker requires that only one section with this COMDAT key exist.
``samesize``
    The linker may choose any COMDAT key, but the sections must contain the
    same amount of data.
885
Sam Cleggea7cace2018-01-09 23:43:14 +0000886Note that the Mach-O platform doesn't support COMDATs, and ELF and WebAssembly
887only support ``any`` as a selection kind.
David Majnemerdad0a642014-06-27 18:19:56 +0000888
Here is an example of a COMDAT group where a function will only be selected if
the COMDAT key's section is the largest:

.. code-block:: text

    $foo = comdat largest
    @foo = global i32 2, comdat($foo)

    define void @bar() comdat($foo) {
      ret void
    }
900
As syntactic sugar, the ``$name`` can be omitted if the name is the same as
the global name:

.. code-block:: text

    $foo = comdat any
    @foo = global i32 2, comdat
908
909
David Majnemerdad0a642014-06-27 18:19:56 +0000910In a COFF object file, this will create a COMDAT section with selection kind
911``IMAGE_COMDAT_SELECT_LARGEST`` containing the contents of the ``@foo`` symbol
912and another COMDAT section with selection kind
913``IMAGE_COMDAT_SELECT_ASSOCIATIVE`` which is associated with the first COMDAT
Hans Wennborg0def0662014-09-10 17:05:08 +0000914section and contains the contents of the ``@bar`` symbol.
David Majnemerdad0a642014-06-27 18:19:56 +0000915
916There are some restrictions on the properties of the global object.
917It, or an alias to it, must have the same name as the COMDAT group when
918targeting COFF.
919The contents and size of this object may be used during link-time to determine
920which COMDAT groups get selected depending on the selection kind.
921Because the name of the object must match the name of the COMDAT group, the
922linkage of the global object must not be local; local symbols can get renamed
923if a collision occurs in the symbol table.
924
The combined use of COMDATs and section attributes may yield surprising results.
For example:

.. code-block:: text

    $foo = comdat any
    $bar = comdat any
    @g1 = global i32 42, section "sec", comdat($foo)
    @g2 = global i32 42, section "sec", comdat($bar)
David Majnemerdad0a642014-06-27 18:19:56 +0000934
935From the object file perspective, this requires the creation of two sections
Sean Silvaa1190322015-08-06 22:56:48 +0000936with the same name. This is necessary because both globals belong to different
David Majnemerdad0a642014-06-27 18:19:56 +0000937COMDAT groups and COMDATs, at the object file level, are represented by
938sections.
939
Note that certain IR constructs like global variables and functions may
create COMDATs in the object file in addition to any which are specified using
COMDAT IR. This arises when the code generator is configured to emit globals
in individual sections (e.g. when ``-data-sections`` or ``-function-sections``
is supplied to ``llc``).
David Majnemerdad0a642014-06-27 18:19:56 +0000945
Sean Silvab084af42012-12-07 10:36:55 +0000946.. _namedmetadatastructure:
947
948Named Metadata
949--------------
950
951Named metadata is a collection of metadata. :ref:`Metadata
952nodes <metadata>` (but not metadata strings) are the only valid
953operands for a named metadata.
954
Filipe Cabecinhas62431b12015-06-02 21:25:08 +0000955#. Named metadata are represented as a string of characters with the
956 metadata prefix. The rules for metadata names are the same as for
957 identifiers, but quoted names are not allowed. ``"\xx"`` type escapes
958 are still valid, which allows any character to be part of a name.
959
Syntax::

    ; Some unnamed metadata nodes, which are referenced by the named metadata.
    !0 = !{!"zero"}
    !1 = !{!"one"}
    !2 = !{!"two"}
    ; A named metadata.
    !name = !{!0, !1, !2}
968
969.. _paramattrs:
970
971Parameter Attributes
972--------------------
973
974The return type and each parameter of a function type may have a set of
975*parameter attributes* associated with them. Parameter attributes are
976used to communicate additional information about the result or
977parameters of a function. Parameter attributes are considered to be part
978of the function, not of the function type, so functions with different
979parameter attributes can have the same function type.
980
981Parameter attributes are simple keywords that follow the type specified.
982If multiple parameter attributes are needed, they are space separated.
983For example:
984
.. code-block:: llvm

    declare i32 @printf(i8* noalias nocapture, ...)
    declare i32 @atoi(i8 zeroext)
    declare signext i8 @returns_signed_char()
990
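Note that attributes for the function result (such as ``signext``) come
immediately before the result type, whereas :ref:`function attributes
<fnattrs>` (such as ``nounwind`` and ``readonly``) come immediately after the
argument list.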
993
994Currently, only the following parameter attributes are defined:
995
996``zeroext``
997 This indicates to the code generator that the parameter or return
998 value should be zero-extended to the extent required by the target's
Hans Wennborg850ec6c2016-02-08 19:34:30 +0000999 ABI by the caller (for a parameter) or the callee (for a return value).
``signext``
    This indicates to the code generator that the parameter or return
    value should be sign-extended to the extent required by the target's
    ABI (which is usually 32 bits) by the caller (for a parameter) or
    the callee (for a return value).
1005``inreg``
1006 This indicates that this parameter or return value should be treated
Sean Silva706fba52015-08-06 22:56:24 +00001007 in a special target-dependent fashion while emitting code for
Sean Silvab084af42012-12-07 10:36:55 +00001008 a function call or return (usually, by putting it in a register as
1009 opposed to memory, though some targets use it to distinguish between
1010 two different kinds of registers). Use of this attribute is
1011 target-specific.
1012``byval``
1013 This indicates that the pointer parameter should really be passed by
1014 value to the function. The attribute implies that a hidden copy of
1015 the pointee is made between the caller and the callee, so the callee
1016 is unable to modify the value in the caller. This attribute is only
1017 valid on LLVM pointer arguments. It is generally used to pass
1018 structs and arrays by value, but is also valid on pointers to
1019 scalars. The copy is considered to belong to the caller not the
1020 callee (for example, ``readonly`` functions should not write to
1021 ``byval`` parameters). This is not a valid attribute for return
1022 values.
1023
1024 The byval attribute also supports specifying an alignment with the
1025 align attribute. It indicates the alignment of the stack slot to
1026 form and the known alignment of the pointer specified to the call
1027 site. If the alignment is not specified, then the code generator
1028 makes a target-specific assumption.
1029
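    A minimal sketch of passing a struct by value. The type ``%struct.S`` and
    the functions ``@take`` and ``@caller`` are hypothetical:

    .. code-block:: llvm

        %struct.S = type { i32, i32 }

        declare void @take(%struct.S* byval align 4)

        define void @caller(%struct.S* %p) {
          ; The callee receives a hidden copy of the pointee, so it cannot
          ; modify the caller's %struct.S through this argument.
          call void @take(%struct.S* byval align 4 %p)
          ret void
        }
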
Reid Klecknera534a382013-12-19 02:14:12 +00001030.. _attr_inalloca:
1031
1032``inalloca``
1033
Reid Kleckner60d3a832014-01-16 22:59:24 +00001034 The ``inalloca`` argument attribute allows the caller to take the
Sean Silvaa1190322015-08-06 22:56:48 +00001035 address of outgoing stack arguments. An ``inalloca`` argument must
Reid Kleckner436c42e2014-01-17 23:58:17 +00001036 be a pointer to stack memory produced by an ``alloca`` instruction.
1037 The alloca, or argument allocation, must also be tagged with the
Sean Silvaa1190322015-08-06 22:56:48 +00001038 inalloca keyword. Only the last argument may have the ``inalloca``
Reid Kleckner436c42e2014-01-17 23:58:17 +00001039 attribute, and that argument is guaranteed to be passed in memory.
Reid Klecknera534a382013-12-19 02:14:12 +00001040
Reid Kleckner436c42e2014-01-17 23:58:17 +00001041 An argument allocation may be used by a call at most once because
Sean Silvaa1190322015-08-06 22:56:48 +00001042 the call may deallocate it. The ``inalloca`` attribute cannot be
Reid Kleckner436c42e2014-01-17 23:58:17 +00001043 used in conjunction with other attributes that affect argument
Sean Silvaa1190322015-08-06 22:56:48 +00001044 storage, like ``inreg``, ``nest``, ``sret``, or ``byval``. The
Reid Klecknerf5b76512014-01-31 23:50:57 +00001045 ``inalloca`` attribute also disables LLVM's implicit lowering of
1046 large aggregate return values, which means that frontend authors
1047 must lower them with ``sret`` pointers.
Reid Klecknera534a382013-12-19 02:14:12 +00001048
Reid Kleckner60d3a832014-01-16 22:59:24 +00001049 When the call site is reached, the argument allocation must have
1050 been the most recent stack allocation that is still live, or the
Sean Silvaa1190322015-08-06 22:56:48 +00001051 results are undefined. It is possible to allocate additional stack
Reid Kleckner60d3a832014-01-16 22:59:24 +00001052 space after an argument allocation and before its call site, but it
1053 must be cleared off with :ref:`llvm.stackrestore
1054 <int_stackrestore>`.
Reid Klecknera534a382013-12-19 02:14:12 +00001055
1056 See :doc:`InAlloca` for more information on how to use this
1057 attribute.
1058
Sean Silvab084af42012-12-07 10:36:55 +00001059``sret``
1060 This indicates that the pointer parameter specifies the address of a
1061 structure that is the return value of the function in the source
1062 program. This pointer must be guaranteed by the caller to be valid:
Reid Kleckner1361c0c2016-09-08 15:45:27 +00001063 loads and stores to the structure may be assumed by the callee not
1064 to trap and to be properly aligned. This is not a valid attribute
1065 for return values.
Sean Silva1703e702014-04-08 21:06:22 +00001066
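    As an illustration (``%struct.Big``, ``@make`` and ``@use`` are invented
    names), the caller allocates the storage for the result and passes its
    address as the ``sret`` argument:

    .. code-block:: llvm

        %struct.Big = type { [16 x i32] }

        declare void @make(%struct.Big* sret)

        define i32 @use() {
          ; The caller guarantees that %tmp is valid and properly aligned.
          %tmp = alloca %struct.Big
          call void @make(%struct.Big* sret %tmp)
          %f = getelementptr %struct.Big, %struct.Big* %tmp, i32 0, i32 0, i32 0
          %v = load i32, i32* %f
          ret i32 %v
        }
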
Daniel Neilson1e687242018-01-19 17:13:12 +00001067.. _attr_align:
Elena Demikhovsky945b7e52018-02-14 06:58:08 +00001068
Hal Finkelccc70902014-07-22 16:58:55 +00001069``align <n>``
1070 This indicates that the pointer value may be assumed by the optimizer to
1071 have the specified alignment.
1072
1073 Note that this attribute has additional semantics when combined with the
1074 ``byval`` attribute.
1075
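    For example, a hypothetical declaration whose pointer argument may be
    assumed by the optimizer to be 16-byte aligned:

    .. code-block:: llvm

        declare void @scale(float* align 16 %data, i64 %n)
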
Sean Silva1703e702014-04-08 21:06:22 +00001076.. _noalias:
1077
Sean Silvab084af42012-12-07 10:36:55 +00001078``noalias``
Hal Finkel12d36302014-11-21 02:22:46 +00001079 This indicates that objects accessed via pointer values
1080 :ref:`based <pointeraliasing>` on the argument or return value are not also
1081 accessed, during the execution of the function, via pointer values not
1082 *based* on the argument or return value. The attribute on a return value
1083 also has additional semantics described below. The caller shares the
1084 responsibility with the callee for ensuring that these requirements are met.
1085 For further details, please see the discussion of the NoAlias response in
1086 :ref:`alias analysis <Must, May, or No>`.
Sean Silvab084af42012-12-07 10:36:55 +00001087
1088 Note that this definition of ``noalias`` is intentionally similar
Hal Finkel12d36302014-11-21 02:22:46 +00001089 to the definition of ``restrict`` in C99 for function arguments.
Sean Silvab084af42012-12-07 10:36:55 +00001090
1091 For function return values, C99's ``restrict`` is not meaningful,
Hal Finkel12d36302014-11-21 02:22:46 +00001092 while LLVM's ``noalias`` is. Furthermore, the semantics of the ``noalias``
1093 attribute on return values are stronger than the semantics of the attribute
1094 when used on function arguments. On function return values, the ``noalias``
1095 attribute indicates that the function acts like a system memory allocation
1096 function, returning a pointer to allocated storage disjoint from the
1097 storage for any other object accessible to the caller.
1098
Sean Silvab084af42012-12-07 10:36:55 +00001099``nocapture``
1100 This indicates that the callee does not make any copies of the
1101 pointer that outlive the callee itself. This is not a valid
David Majnemer7f324202016-05-26 17:36:22 +00001102 attribute for return values. Addresses used in volatile operations
1103 are considered to be captured.
Sean Silvab084af42012-12-07 10:36:55 +00001104
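    The two attributes above are often combined. In this sketch (function
    names are hypothetical), ``noalias`` appears both on a return value and on
    arguments, and ``nocapture`` is applied to the arguments only:

    .. code-block:: llvm

        ; Behaves like an allocator: the result points to storage disjoint
        ; from any other object accessible to the caller.
        declare noalias i8* @my_alloc(i64 %size)

        ; During the call, the objects reached through %dst and %src are not
        ; accessed via pointers not based on them, and neither pointer is
        ; captured by the callee.
        declare void @copy_bytes(i8* noalias nocapture %dst,
                                 i8* noalias nocapture %src, i64 %n)
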
1105.. _nest:
1106
1107``nest``
1108 This indicates that the pointer parameter can be excised using the
1109 :ref:`trampoline intrinsics <int_trampoline>`. This is not a valid
Stephen Linb8bd2322013-04-20 05:14:40 +00001110 attribute for return values and can only be applied to one parameter.
1111
1112``returned``
Stephen Linfec5b0b2013-06-20 21:55:10 +00001113 This indicates that the function always returns the argument as its return
Hal Finkel3b66caa2016-07-10 21:52:39 +00001114 value. This is a hint to the optimizer and code generator used when
1115 generating the caller, allowing value propagation, tail call optimization,
1116 and omission of register saves and restores in some cases; it is not
1117 checked or enforced when generating the callee. The parameter and the
1118 function return type must be valid operands for the
1119 :ref:`bitcast instruction <i_bitcast>`. This is not a valid attribute for
1120 return values and can only be applied to one parameter.
Sean Silvab084af42012-12-07 10:36:55 +00001121
Nick Lewyckyd52b1522014-05-20 01:23:40 +00001122``nonnull``
1123 This indicates that the parameter or return pointer is not null. This
    attribute may only be applied to pointer typed parameters. This is not
    checked or enforced by LLVM; the caller must ensure that the pointer
    passed in is non-null, or the callee must ensure that the returned pointer
    is non-null.
1128
Hal Finkelb0407ba2014-07-18 15:51:28 +00001129``dereferenceable(<n>)``
1130 This indicates that the parameter or return pointer is dereferenceable. This
1131 attribute may only be applied to pointer typed parameters. A pointer that
1132 is dereferenceable can be loaded from speculatively without a risk of
1133 trapping. The number of bytes known to be dereferenceable must be provided
1134 in parentheses. It is legal for the number of bytes to be less than the
1135 size of the pointee type. The ``nonnull`` attribute does not imply
1136 dereferenceability (consider a pointer to one element past the end of an
1137 array), however ``dereferenceable(<n>)`` does imply ``nonnull`` in
1138 ``addrspace(0)`` (which is the default address space).
1139
Sanjoy Das31ea6d12015-04-16 20:29:50 +00001140``dereferenceable_or_null(<n>)``
1141 This indicates that the parameter or return value isn't both
1142 non-null and non-dereferenceable (up to ``<n>`` bytes) at the same
Sean Silvaa1190322015-08-06 22:56:48 +00001143 time. All non-null pointers tagged with
Sanjoy Das31ea6d12015-04-16 20:29:50 +00001144 ``dereferenceable_or_null(<n>)`` are ``dereferenceable(<n>)``.
1145 For address space 0 ``dereferenceable_or_null(<n>)`` implies that
1146 a pointer is exactly one of ``dereferenceable(<n>)`` or ``null``,
1147 and in other address spaces ``dereferenceable_or_null(<n>)``
1148 implies that a pointer is at least one of ``dereferenceable(<n>)``
1149 or ``null`` (i.e. it may be both ``null`` and
Sean Silvaa1190322015-08-06 22:56:48 +00001150 ``dereferenceable(<n>)``). This attribute may only be applied to
Sanjoy Das31ea6d12015-04-16 20:29:50 +00001151 pointer typed parameters.
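
    The sketch below shows how ``nonnull``, ``dereferenceable(<n>)`` and
    ``dereferenceable_or_null(<n>)`` might be written on a parameter and on a
    return value. The function names are hypothetical, and whether the load is
    actually hoisted depends on the optimizer:

    .. code-block:: llvm

        ; Hypothetical function whose result is either null or points to at
        ; least 8 dereferenceable bytes.
        declare dereferenceable_or_null(8) i8* @maybe_get()

        define i32 @read_if(i32* nonnull align 4 dereferenceable(4) %p, i1 %c) {
        entry:
          br i1 %c, label %then, label %done
        then:
          ; %p is known to be dereferenceable for 4 bytes, so an optimizer is
          ; permitted to execute this load speculatively, before the branch.
          %v = load i32, i32* %p, align 4
          br label %done
        done:
          %r = phi i32 [ %v, %then ], [ 0, %entry ]
          ret i32 %r
        }
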
1152
Manman Renf46262e2016-03-29 17:37:21 +00001153``swiftself``
1154 This indicates that the parameter is the self/context parameter. This is not
1155 a valid attribute for return values and can only be applied to one
1156 parameter.
1157
Manman Ren9bfd0d02016-04-01 21:41:15 +00001158``swifterror``
    This attribute is motivated by the need to model and optimize Swift error
    handling. It can be applied to a parameter with pointer-to-pointer type or
    a pointer-sized alloca. At the call site, the actual argument that
    corresponds to a ``swifterror`` parameter has to come from a ``swifterror``
    alloca or the ``swifterror`` parameter of the caller. A ``swifterror`` value
    (either the parameter or the alloca) can only be loaded from and stored to,
    or used as a ``swifterror`` argument. This is not a valid attribute for
    return values and can only be applied to one parameter.
Manman Ren9bfd0d02016-04-01 21:41:15 +00001167
1168 These constraints allow the calling convention to optimize access to
1169 ``swifterror`` variables by associating them with a specific register at
1170 call boundaries rather than placing them in memory. Since this does change
1171 the calling convention, a function which uses the ``swifterror`` attribute
1172 on a parameter is not ABI-compatible with one which does not.
1173
1174 These constraints also allow LLVM to assume that a ``swifterror`` argument
1175 does not alias any other memory visible within a function and that a
1176 ``swifterror`` alloca passed as an argument does not escape.
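
    The constraints above can be sketched as follows. The type
    ``%swift.error`` and the function ``@may_fail`` are hypothetical names
    chosen for illustration, and code generation for ``swifterror`` is only
    available on targets that support it:

    .. code-block:: llvm

        %swift.error = type opaque

        ; Hypothetical callee taking the error slot as a swifterror parameter.
        declare void @may_fail(%swift.error** swifterror)

        define i1 @caller() {
          ; The error slot is a pointer-sized swifterror alloca.
          %err = alloca swifterror %swift.error*
          store %swift.error* null, %swift.error** %err
          ; The argument is marked swifterror at the call site as well.
          call void @may_fail(%swift.error** swifterror %err)
          ; The slot is otherwise only loaded from and stored to.
          %e = load %swift.error*, %swift.error** %err
          %failed = icmp ne %swift.error* %e, null
          ret i1 %failed
        }
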
1177
Sean Silvab084af42012-12-07 10:36:55 +00001178.. _gc:
1179
Philip Reamesf80bbff2015-02-25 23:45:20 +00001180Garbage Collector Strategy Names
1181--------------------------------
Sean Silvab084af42012-12-07 10:36:55 +00001182
Philip Reamesf80bbff2015-02-25 23:45:20 +00001183Each function may specify a garbage collector strategy name, which is simply a
Sean Silvab084af42012-12-07 10:36:55 +00001184string:
1185
1186.. code-block:: llvm
1187
1188 define void @f() gc "name" { ... }
1189
The supported values of *name* include those :ref:`built in to LLVM
<builtin-gc-strategies>` and any provided by loaded plugins. Specifying a GC
strategy will cause the compiler to alter its output in order to support the
named garbage collection algorithm. Note that LLVM itself does not contain a
garbage collector; this functionality is restricted to generating machine code
which can interoperate with a collector provided externally.
Sean Silvab084af42012-12-07 10:36:55 +00001196
Peter Collingbourne3fa50f92013-09-16 01:08:15 +00001197.. _prefixdata:
1198
1199Prefix Data
1200-----------
1201
Peter Collingbourne51d2de72014-12-03 02:08:38 +00001202Prefix data is data associated with a function which the code
1203generator will emit immediately before the function's entrypoint.
1204The purpose of this feature is to allow frontends to associate
1205language-specific runtime metadata with specific functions and make it
1206available through the function pointer while still allowing the
1207function pointer to be called.
Peter Collingbourne3fa50f92013-09-16 01:08:15 +00001208
Peter Collingbourne51d2de72014-12-03 02:08:38 +00001209To access the data for a given function, a program may bitcast the
1210function pointer to a pointer to the constant's type and dereference
Sean Silvaa1190322015-08-06 22:56:48 +00001211index -1. This implies that the IR symbol points just past the end of
Peter Collingbourne51d2de72014-12-03 02:08:38 +00001212the prefix data. For instance, take the example of a function annotated
1213with a single ``i32``,
1214
1215.. code-block:: llvm
1216
1217 define void @f() prefix i32 123 { ... }
1218
1219The prefix data can be referenced as,
1220
1221.. code-block:: llvm
1222
    %0 = bitcast void ()* @f to i32*
    %a = getelementptr inbounds i32, i32* %0, i32 -1
    %b = load i32, i32* %a
Peter Collingbourne51d2de72014-12-03 02:08:38 +00001226
1227Prefix data is laid out as if it were an initializer for a global variable
Sean Silvaa1190322015-08-06 22:56:48 +00001228of the prefix data's type. The function will be placed such that the
Peter Collingbourne51d2de72014-12-03 02:08:38 +00001229beginning of the prefix data is aligned. This means that if the size
1230of the prefix data is not a multiple of the alignment size, the
1231function's entrypoint will not be aligned. If alignment of the
1232function's entrypoint is desired, padding must be added to the prefix
1233data.
1234
Sean Silvaa1190322015-08-06 22:56:48 +00001235A function may have prefix data but no body. This has similar semantics
Peter Collingbourne51d2de72014-12-03 02:08:38 +00001236to the ``available_externally`` linkage in that the data may be used by the
1237optimizers but will not be emitted in the object file.
1238
1239.. _prologuedata:
1240
1241Prologue Data
1242-------------
1243
1244The ``prologue`` attribute allows arbitrary code (encoded as bytes) to
1245be inserted prior to the function body. This can be used for enabling
1246function hot-patching and instrumentation.
1247
1248To maintain the semantics of ordinary function calls, the prologue data must
Sean Silvaa1190322015-08-06 22:56:48 +00001249have a particular format. Specifically, it must begin with a sequence of
Peter Collingbourne3fa50f92013-09-16 01:08:15 +00001250bytes which decode to a sequence of machine instructions, valid for the
1251module's target, which transfer control to the point immediately succeeding
Sean Silvaa1190322015-08-06 22:56:48 +00001252the prologue data, without performing any other visible action. This allows
Peter Collingbourne3fa50f92013-09-16 01:08:15 +00001253the inliner and other passes to reason about the semantics of the function
Sean Silvaa1190322015-08-06 22:56:48 +00001254definition without needing to reason about the prologue data. Obviously this
Peter Collingbourne51d2de72014-12-03 02:08:38 +00001255makes the format of the prologue data highly target dependent.
Peter Collingbourne3fa50f92013-09-16 01:08:15 +00001256
Peter Collingbourne51d2de72014-12-03 02:08:38 +00001257A trivial example of valid prologue data for the x86 architecture is ``i8 144``,
Peter Collingbourne3fa50f92013-09-16 01:08:15 +00001258which encodes the ``nop`` instruction:
1259
Renato Golin124f2592016-07-20 12:16:38 +00001260.. code-block:: text
Peter Collingbourne3fa50f92013-09-16 01:08:15 +00001261
Peter Collingbourne51d2de72014-12-03 02:08:38 +00001262 define void @f() prologue i8 144 { ... }
Peter Collingbourne3fa50f92013-09-16 01:08:15 +00001263
Peter Collingbourne51d2de72014-12-03 02:08:38 +00001264Generally prologue data can be formed by encoding a relative branch instruction
1265which skips the metadata, as in this example of valid prologue data for the
Peter Collingbourne3fa50f92013-09-16 01:08:15 +00001266x86_64 architecture, where the first two bytes encode ``jmp .+10``:
1267
Renato Golin124f2592016-07-20 12:16:38 +00001268.. code-block:: text
Peter Collingbourne3fa50f92013-09-16 01:08:15 +00001269
1270 %0 = type <{ i8, i8, i8* }>
1271
Peter Collingbourne51d2de72014-12-03 02:08:38 +00001272 define void @f() prologue %0 <{ i8 235, i8 8, i8* @md}> { ... }
Peter Collingbourne3fa50f92013-09-16 01:08:15 +00001273
Sean Silvaa1190322015-08-06 22:56:48 +00001274A function may have prologue data but no body. This has similar semantics
Peter Collingbourne3fa50f92013-09-16 01:08:15 +00001275to the ``available_externally`` linkage in that the data may be used by the
1276optimizers but will not be emitted in the object file.
1277
David Majnemer7fddecc2015-06-17 20:52:32 +00001278.. _personalityfn:
1279
1280Personality Function
David Majnemerc5ad8a92015-06-17 21:21:16 +00001281--------------------
David Majnemer7fddecc2015-06-17 20:52:32 +00001282
1283The ``personality`` attribute permits functions to specify what function
1284to use for exception handling.
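
As a sketch, a function using the Itanium C++ personality routine
``__gxx_personality_v0`` might look like the following. The callee
``@may_throw`` is a hypothetical name used only for illustration:

.. code-block:: llvm

    declare i32 @__gxx_personality_v0(...)
    declare void @may_throw()

    define void @f() personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
    entry:
      invoke void @may_throw()
              to label %cont unwind label %lpad
    cont:
      ret void
    lpad:
      ; Cleanup-only landing pad that simply continues unwinding.
      %lp = landingpad { i8*, i32 }
              cleanup
      resume { i8*, i32 } %lp
    }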
1285
Bill Wendling63b88192013-02-06 06:52:58 +00001286.. _attrgrp:
1287
1288Attribute Groups
1289----------------
1290
1291Attribute groups are groups of attributes that are referenced by objects within
1292the IR. They are important for keeping ``.ll`` files readable, because a lot of
functions will use the same set of attributes. In the degenerate case of a
1294``.ll`` file that corresponds to a single ``.c`` file, the single attribute
1295group will capture the important command line flags used to build that file.
1296
1297An attribute group is a module-level object. To use an attribute group, an
1298object references the attribute group's ID (e.g. ``#37``). An object may refer
1299to more than one attribute group. In that situation, the attributes from the
1300different groups are merged.
1301
1302Here is an example of attribute groups for a function that should always be
1303inlined, has a stack alignment of 4, and which shouldn't use SSE instructions:
1304
1305.. code-block:: llvm
1306
1307 ; Target-independent attributes:
Eli Bendersky97ad9242013-04-18 16:11:44 +00001308 attributes #0 = { alwaysinline alignstack=4 }
Bill Wendling63b88192013-02-06 06:52:58 +00001309
1310 ; Target-dependent attributes:
Eli Bendersky97ad9242013-04-18 16:11:44 +00001311 attributes #1 = { "no-sse" }
Bill Wendling63b88192013-02-06 06:52:58 +00001312
1313 ; Function @f has attributes: alwaysinline, alignstack=4, and "no-sse".
1314 define void @f() #0 #1 { ... }
1315
Sean Silvab084af42012-12-07 10:36:55 +00001316.. _fnattrs:
1317
1318Function Attributes
1319-------------------
1320
1321Function attributes are set to communicate additional information about
1322a function. Function attributes are considered to be part of the
1323function, not of the function type, so functions with different function
1324attributes can have the same function type.
1325
1326Function attributes are simple keywords that follow the type specified.
1327If multiple attributes are needed, they are space separated. For
1328example:
1329
1330.. code-block:: llvm
1331
1332 define void @f() noinline { ... }
1333 define void @f() alwaysinline { ... }
1334 define void @f() alwaysinline optsize { ... }
1335 define void @f() optsize { ... }
1336
Sean Silvab084af42012-12-07 10:36:55 +00001337``alignstack(<n>)``
1338 This attribute indicates that, when emitting the prologue and
1339 epilogue, the backend should forcibly align the stack pointer.
1340 Specify the desired alignment, which must be a power of two, in
1341 parentheses.
George Burgess IV278199f2016-04-12 01:05:35 +00001342``allocsize(<EltSizeParam>[, <NumEltsParam>])``
1343 This attribute indicates that the annotated function will always return at
1344 least a given number of bytes (or null). Its arguments are zero-indexed
1345 parameter numbers; if one argument is provided, then it's assumed that at
1346 least ``CallSite.Args[EltSizeParam]`` bytes will be available at the
1347 returned pointer. If two are provided, then it's assumed that
1348 ``CallSite.Args[EltSizeParam] * CallSite.Args[NumEltsParam]`` bytes are
1349 available. The referenced parameters must be integer types. No assumptions
1350 are made about the contents of the returned block of memory.
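
    A minimal sketch of both forms, using the hypothetical allocator names
    ``@my_malloc`` and ``@my_calloc``:

    .. code-block:: llvm

        ; One argument: parameter 0 gives the number of bytes.
        declare i8* @my_malloc(i64) allocsize(0)

        ; Two arguments: parameter 0 times parameter 1 gives the number of bytes.
        declare i8* @my_calloc(i64, i64) allocsize(0, 1)

        define i8* @f() {
          ; Assumed to return null or a pointer to at least 4 * 10 = 40 bytes.
          %p = call i8* @my_calloc(i64 4, i64 10)
          ret i8* %p
        }
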
Sean Silvab084af42012-12-07 10:36:55 +00001351``alwaysinline``
1352 This attribute indicates that the inliner should attempt to inline
1353 this function into callers whenever possible, ignoring any active
1354 inlining size threshold for this caller.
Michael Gottesman41748d72013-06-27 00:25:01 +00001355``builtin``
1356 This indicates that the callee function at a call site should be
1357 recognized as a built-in function, even though the function's declaration
Michael Gottesman3a6a9672013-07-02 21:32:56 +00001358 uses the ``nobuiltin`` attribute. This is only valid at call sites for
Richard Smith32dbdf62014-07-31 04:25:36 +00001359 direct calls to functions that are declared with the ``nobuiltin``
Michael Gottesman41748d72013-06-27 00:25:01 +00001360 attribute.
Michael Gottesman296adb82013-06-27 22:48:08 +00001361``cold``
1362 This attribute indicates that this function is rarely called. When
1363 computing edge weights, basic blocks post-dominated by a cold
1364 function call are also considered to be cold; and, thus, given low
1365 weight.
Owen Anderson85fa7d52015-05-26 23:48:40 +00001366``convergent``
Justin Lebard5fb6952016-02-09 23:03:17 +00001367 In some parallel execution models, there exist operations that cannot be
1368 made control-dependent on any additional values. We call such operations
Justin Lebar58535b12016-02-17 17:46:41 +00001369 ``convergent``, and mark them with this attribute.
Justin Lebard5fb6952016-02-09 23:03:17 +00001370
Justin Lebar58535b12016-02-17 17:46:41 +00001371 The ``convergent`` attribute may appear on functions or call/invoke
1372 instructions. When it appears on a function, it indicates that calls to
1373 this function should not be made control-dependent on additional values.
Justin Bognera4635372016-07-06 20:02:45 +00001374 For example, the intrinsic ``llvm.nvvm.barrier0`` is ``convergent``, so
Justin Lebard5fb6952016-02-09 23:03:17 +00001375 calls to this intrinsic cannot be made control-dependent on additional
Justin Lebar58535b12016-02-17 17:46:41 +00001376 values.
Justin Lebard5fb6952016-02-09 23:03:17 +00001377
Justin Lebar58535b12016-02-17 17:46:41 +00001378 When it appears on a call/invoke, the ``convergent`` attribute indicates
1379 that we should treat the call as though we're calling a convergent
1380 function. This is particularly useful on indirect calls; without this we
1381 may treat such calls as though the target is non-convergent.
1382
1383 The optimizer may remove the ``convergent`` attribute on functions when it
1384 can prove that the function does not execute any convergent operations.
1385 Similarly, the optimizer may remove ``convergent`` on calls/invokes when it
1386 can prove that the call/invoke cannot call a convergent function.
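
    The two placements described above might be written as in this sketch.
    ``@my_barrier`` is a hypothetical function standing in for a barrier-like
    operation:

    .. code-block:: llvm

        ; Hypothetical convergent function.
        declare void @my_barrier() convergent

        define void @kernel(void ()* %fp) convergent {
          call void @my_barrier()
          ; The target of this indirect call is unknown, so the call site itself
          ; is marked convergent.
          call void %fp() convergent
          ret void
        }
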
Vaivaswatha Nagarajfb3f4902015-12-16 16:16:19 +00001387``inaccessiblememonly``
1388 This attribute indicates that the function may only access memory that
1389 is not accessible by the module being compiled. This is a weaker form
1390 of ``readnone``.
1391``inaccessiblemem_or_argmemonly``
1392 This attribute indicates that the function may only access memory that is
1393 either not accessible by the module being compiled, or is pointed to
    by its pointer arguments. This is a weaker form of ``argmemonly``.
Sean Silvab084af42012-12-07 10:36:55 +00001395``inlinehint``
1396 This attribute indicates that the source code contained a hint that
1397 inlining this function is desirable (such as the "inline" keyword in
1398 C/C++). It is just a hint; it imposes no requirements on the
1399 inliner.
Tom Roeder44cb65f2014-06-05 19:29:43 +00001400``jumptable``
1401 This attribute indicates that the function should be added to a
1402 jump-instruction table at code-generation time, and that all address-taken
1403 references to this function should be replaced with a reference to the
1404 appropriate jump-instruction-table function pointer. Note that this creates
1405 a new pointer for the original function, which means that code that depends
1406 on function-pointer identity can break. So, any function annotated with
1407 ``jumptable`` must also be ``unnamed_addr``.
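
    A minimal sketch of a definition that satisfies this requirement (the
    function name is arbitrary):

    .. code-block:: llvm

        ; unnamed_addr is required alongside jumptable.
        define void @handler() unnamed_addr jumptable {
          ret void
        }
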
Andrea Di Biagio9b5d23b2013-08-09 18:42:18 +00001408``minsize``
1409 This attribute suggests that optimization passes and code generator
1410 passes make choices that keep the code size of this function as small
Andrew Trickd4d1d9c2013-10-31 17:18:07 +00001411 as possible and perform optimizations that may sacrifice runtime
Andrea Di Biagio9b5d23b2013-08-09 18:42:18 +00001412 performance in order to minimize the size of the generated code.
Sean Silvab084af42012-12-07 10:36:55 +00001413``naked``
1414 This attribute disables prologue / epilogue emission for the
1415 function. This can have very system-specific consequences.
``"no-jump-tables"``
    When this attribute is set to true, the jump tables and lookup tables that
    can be generated when lowering a switch are disabled.
Eli Bendersky97ad9242013-04-18 16:11:44 +00001419``nobuiltin``
Michael Gottesman41748d72013-06-27 00:25:01 +00001420 This indicates that the callee function at a call site is not recognized as
1421 a built-in function. LLVM will retain the original call and not replace it
1422 with equivalent code based on the semantics of the built-in function, unless
1423 the call site uses the ``builtin`` attribute. This is valid at call sites
1424 and on function declarations and definitions.
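
    The interaction between ``nobuiltin`` and ``builtin`` can be sketched as
    follows, using the Itanium-mangled name of ``operator new(unsigned long)``
    as an example of a function that is sometimes declared this way:

    .. code-block:: llvm

        ; The declaration opts out of built-in treatment by default.
        declare i8* @_Znwm(i64) nobuiltin

        define i8* @f() {
          ; This particular direct call opts back in.
          %p = call i8* @_Znwm(i64 8) builtin
          ret i8* %p
        }
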
Bill Wendlingbf902f12013-02-06 06:22:58 +00001425``noduplicate``
1426 This attribute indicates that calls to the function cannot be
1427 duplicated. A call to a ``noduplicate`` function may be moved
1428 within its parent function, but may not be duplicated within
1429 its parent function.
1430
1431 A function containing a ``noduplicate`` call may still
1432 be an inlining candidate, provided that the call is not
1433 duplicated by inlining. That implies that the function has
1434 internal linkage and only has one call site, so the original
1435 call is dead after inlining.
Sean Silvab084af42012-12-07 10:36:55 +00001436``noimplicitfloat``
    This attribute disables implicit floating point instructions.
1438``noinline``
1439 This attribute indicates that the inliner should never inline this
1440 function in any situation. This attribute may not be used together
1441 with the ``alwaysinline`` attribute.
Sean Silva1cbbcf12013-08-06 19:34:37 +00001442``nonlazybind``
1443 This attribute suppresses lazy symbol binding for the function. This
1444 may make calls to the function faster, at the cost of extra program
1445 startup time if the function is not called during program startup.
Sean Silvab084af42012-12-07 10:36:55 +00001446``noredzone``
1447 This attribute indicates that the code generator should not use a
1448 red zone, even if the target-specific ABI normally permits it.
1449``noreturn``
1450 This function attribute indicates that the function never returns
1451 normally. This produces undefined behavior at runtime if the
1452 function ever does dynamically return.
James Molloye6f87ca2015-11-06 10:32:53 +00001453``norecurse``
1454 This function attribute indicates that the function does not call itself
1455 either directly or indirectly down any possible call path. This produces
1456 undefined behavior at runtime if the function ever does recurse.
Sean Silvab084af42012-12-07 10:36:55 +00001457``nounwind``
Reid Kleckner96d01132015-02-11 01:23:16 +00001458 This function attribute indicates that the function never raises an
1459 exception. If the function does raise an exception, its runtime
1460 behavior is undefined. However, functions marked nounwind may still
1461 trap or generate asynchronous exceptions. Exception handling schemes
1462 that are recognized by LLVM to handle asynchronous exceptions, such
1463 as SEH, will still provide their implementation defined semantics.
Andrea Di Biagio377496b2013-08-23 11:53:55 +00001464``optnone``
Paul Robinsona2550a62015-11-30 21:56:16 +00001465 This function attribute indicates that most optimization passes will skip
1466 this function, with the exception of interprocedural optimization passes.
1467 Code generation defaults to the "fast" instruction selector.
Andrea Di Biagio377496b2013-08-23 11:53:55 +00001468 This attribute cannot be used together with the ``alwaysinline``
1469 attribute; this attribute is also incompatible
1470 with the ``minsize`` attribute and the ``optsize`` attribute.
Andrew Trickd4d1d9c2013-10-31 17:18:07 +00001471
Paul Robinsondcbe35b2013-11-18 21:44:03 +00001472 This attribute requires the ``noinline`` attribute to be specified on
1473 the function as well, so the function is never inlined into any caller.
Andrea Di Biagio377496b2013-08-23 11:53:55 +00001474 Only functions with the ``alwaysinline`` attribute are valid
Paul Robinsondcbe35b2013-11-18 21:44:03 +00001475 candidates for inlining into the body of this function.
Sean Silvab084af42012-12-07 10:36:55 +00001476``optsize``
1477 This attribute suggests that optimization passes and code generator
1478 passes make choices that keep the code size of this function low,
Andrea Di Biagio9b5d23b2013-08-09 18:42:18 +00001479 and otherwise do optimizations specifically to reduce code size as
1480 long as they do not significantly impact runtime performance.
Sanjoy Dasc0441c22016-04-19 05:24:47 +00001481``"patchable-function"``
1482 This attribute tells the code generator that the code
1483 generated for this function needs to follow certain conventions that
1484 make it possible for a runtime function to patch over it later.
1485 The exact effect of this attribute depends on its string value,
Charles Davise9c32c72016-08-08 21:20:15 +00001486 for which there currently is one legal possibility:
Sanjoy Dasc0441c22016-04-19 05:24:47 +00001487
1488 * ``"prologue-short-redirect"`` - This style of patchable
1489 function is intended to support patching a function prologue to
1490 redirect control away from the function in a thread safe
1491 manner. It guarantees that the first instruction of the
1492 function will be large enough to accommodate a short jump
1493 instruction, and will be sufficiently aligned to allow being
1494 fully changed via an atomic compare-and-swap instruction.
       While the first requirement can be satisfied by inserting a large
       enough NOP, LLVM can and will try to re-purpose an existing
       instruction (i.e. one that would have to be emitted anyway) as
       the patchable instruction larger than a short jump.
1499
1500 ``"prologue-short-redirect"`` is currently only supported on
1501 x86-64.
1502
    This attribute by itself does not imply restrictions on
    inter-procedural optimizations. Any semantic effects the patching
    may have must be conveyed separately via the linkage type.
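
    As a sketch, the attribute is written as a string key/value pair on the
    function (the function name is arbitrary):

    .. code-block:: llvm

        define void @patch_me() "patchable-function"="prologue-short-redirect" {
          ret void
        }
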
whitequarked54b4a2017-06-21 18:46:50 +00001506``"probe-stack"``
    This attribute indicates that the function will trigger a guard region
    at the end of the stack. It ensures that accesses to the stack are
    no further apart than the size of the guard region from a previous
    access of the stack. It takes one required string value, the name of
    the stack probing function that will be called.
1512
1513 If a function that has a ``"probe-stack"`` attribute is inlined into
1514 a function with another ``"probe-stack"`` attribute, the resulting
1515 function has the ``"probe-stack"`` attribute of the caller. If a
1516 function that has a ``"probe-stack"`` attribute is inlined into a
1517 function that has no ``"probe-stack"`` attribute at all, the resulting
1518 function has the ``"probe-stack"`` attribute of the callee.
Sean Silvab084af42012-12-07 10:36:55 +00001519``readnone``
Nick Lewyckyc2ec0722013-07-06 00:29:58 +00001520 On a function, this attribute indicates that the function computes its
1521 result (or decides to unwind an exception) based strictly on its arguments,
Sean Silvab084af42012-12-07 10:36:55 +00001522 without dereferencing any pointer arguments or otherwise accessing
1523 any mutable state (e.g. memory, control registers, etc) visible to
1524 caller functions. It does not write through any pointer arguments
1525 (including ``byval`` arguments) and never changes any state visible
Sanjoy Das5be2e842017-02-13 23:19:07 +00001526 to callers. This means while it cannot unwind exceptions by calling
1527 the ``C++`` exception throwing methods (since they write to memory), there may
1528 be non-``C++`` mechanisms that throw exceptions without writing to LLVM
1529 visible memory.
Andrew Trickd4d1d9c2013-10-31 17:18:07 +00001530
Nick Lewyckyc2ec0722013-07-06 00:29:58 +00001531 On an argument, this attribute indicates that the function does not
1532 dereference that pointer argument, even though it may read or write the
Nick Lewyckyefe31f22013-07-06 01:04:47 +00001533 memory that the pointer points to if accessed through other pointers.
Sean Silvab084af42012-12-07 10:36:55 +00001534``readonly``
Nick Lewyckyc2ec0722013-07-06 00:29:58 +00001535 On a function, this attribute indicates that the function does not write
1536 through any pointer arguments (including ``byval`` arguments) or otherwise
Sean Silvab084af42012-12-07 10:36:55 +00001537 modify any state (e.g. memory, control registers, etc) visible to
1538 caller functions. It may dereference pointer arguments and read
1539 state that may be set in the caller. A readonly function always
1540 returns the same value (or unwinds an exception identically) when
Sanjoy Das5be2e842017-02-13 23:19:07 +00001541 called with the same set of arguments and global state. This means while it
1542 cannot unwind exceptions by calling the ``C++`` exception throwing methods
1543 (since they write to memory), there may be non-``C++`` mechanisms that throw
1544 exceptions without writing to LLVM visible memory.
Andrew Trickd4d1d9c2013-10-31 17:18:07 +00001545
Nick Lewyckyc2ec0722013-07-06 00:29:58 +00001546 On an argument, this attribute indicates that the function does not write
1547 through this pointer argument, even though it may write to the memory that
1548 the pointer points to.
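
    A sketch of ``readnone`` and ``readonly`` on declarations, with
    ``readonly`` also applied to a pointer argument. Both function names are
    hypothetical:

    .. code-block:: llvm

        ; Hypothetical function whose result depends only on its argument.
        declare i32 @pure_square(i32) readnone nounwind

        ; Hypothetical function that reads, but never writes, memory.
        declare i64 @string_length(i8* nocapture readonly) readonly nounwind

        define i64 @twice(i8* %s) {
          %a = call i64 @string_length(i8* %s)
          ; Nothing writes memory between the two calls, so an optimizer is
          ; permitted to reuse %a instead of calling the function again.
          %b = call i64 @string_length(i8* %s)
          %sum = add i64 %a, %b
          ret i64 %sum
        }
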
whitequark08b20352017-06-22 23:22:36 +00001549``"stack-probe-size"``
1550 This attribute controls the behavior of stack probes: either
1551 the ``"probe-stack"`` attribute, or ABI-required stack probes, if any.
    It defines the size of the guard region. It ensures that if the function
    may use more stack space than the size of the guard region, a stack probing
    sequence will be emitted. It takes one required integer value, which
    is 4096 by default.
1556
1557 If a function that has a ``"stack-probe-size"`` attribute is inlined into
1558 a function with another ``"stack-probe-size"`` attribute, the resulting
1559 function has the ``"stack-probe-size"`` attribute that has the lower
1560 numeric value. If a function that has a ``"stack-probe-size"`` attribute is
1561 inlined into a function that has no ``"stack-probe-size"`` attribute
1562 at all, the resulting function has the ``"stack-probe-size"`` attribute
1563 of the callee.
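
    A sketch combining ``"probe-stack"`` with ``"stack-probe-size"``. The
    probe function name ``__my_probestack`` and the sizes are assumptions
    chosen for illustration:

    .. code-block:: llvm

        ; The frame (16384 bytes) exceeds the 8192-byte guard region, so a
        ; probing sequence calling the named function would be expected.
        define void @big_frame() "probe-stack"="__my_probestack" "stack-probe-size"="8192" {
          %buf = alloca [16384 x i8]
          ret void
        }
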
Nicolai Haehnle84c9f992016-07-04 08:01:29 +00001564``writeonly``
1565 On a function, this attribute indicates that the function may write to but
1566 does not read from memory.
1567
1568 On an argument, this attribute indicates that the function may write to but
1569 does not read through this pointer argument (even though it may read from
1570 the memory that the pointer points to).
Igor Laevsky39d662f2015-07-11 10:30:36 +00001571``argmemonly``
    This attribute indicates that the only memory accesses inside the function
    are loads and stores from objects pointed to by its pointer-typed arguments,
    with arbitrary offsets. In other words, all memory operations in the
    function can refer to memory only using pointers based on its function
    arguments.
    Note that ``argmemonly`` can be used together with the ``readonly``
    attribute to specify that the function reads only from its arguments.
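
    For illustration, a copy routine that touches memory only through its
    pointer arguments might be declared as in this sketch. The name
    ``@copy_bytes`` is hypothetical:

    .. code-block:: llvm

        ; Writes only through the first pointer, reads only through the second,
        ; and accesses no other memory.
        declare void @copy_bytes(i8* nocapture writeonly, i8* nocapture readonly, i64) argmemonly
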
Sean Silvab084af42012-12-07 10:36:55 +00001579``returns_twice``
1580 This attribute indicates that this function can return twice. The C
1581 ``setjmp`` is an example of such a function. The compiler disables
1582 some optimizations (like tail calls) in the caller of these
1583 functions.
Peter Collingbourne82437bf2015-06-15 21:07:11 +00001584``safestack``
1585 This attribute indicates that
1586 `SafeStack <http://clang.llvm.org/docs/SafeStack.html>`_
1587 protection is enabled for this function.
1588
1589 If a function that has a ``safestack`` attribute is inlined into a
1590 function that doesn't have a ``safestack`` attribute or which has an
1591 ``ssp``, ``sspstrong`` or ``sspreq`` attribute, then the resulting
1592 function will have a ``safestack`` attribute.
Kostya Serebryanycf880b92013-02-26 06:58:09 +00001593``sanitize_address``
1594 This attribute indicates that AddressSanitizer checks
1595 (dynamic address safety analysis) are enabled for this function.
1596``sanitize_memory``
1597 This attribute indicates that MemorySanitizer checks (dynamic detection
1598 of accesses to uninitialized memory) are enabled for this function.
1599``sanitize_thread``
1600 This attribute indicates that ThreadSanitizer checks
1601 (dynamic thread safety analysis) are enabled for this function.
Evgeniy Stepanovc667c1f2017-12-09 00:21:41 +00001602``sanitize_hwaddress``
1603 This attribute indicates that HWAddressSanitizer checks
1604 (dynamic address safety analysis based on tagged pointers) are enabled for
1605 this function.
Matt Arsenaultb19b57e2017-04-28 20:25:27 +00001606``speculatable``
1607 This function attribute indicates that the function does not have any
1608 effects besides calculating its result and does not have undefined behavior.
1609 Note that ``speculatable`` is not enough to conclude that along any
Xin Tongc7180202017-05-02 23:24:12 +00001610 particular execution path the number of calls to this function will not be
Matt Arsenaultb19b57e2017-04-28 20:25:27 +00001611 externally observable. This attribute is only valid on functions
1612 and declarations, not on individual call sites. If a function is
1613 incorrectly marked as speculatable and really does exhibit
1614 undefined behavior, the undefined behavior may be observed even
1615 if the call site is dead code.
1616
``ssp``
    This attribute indicates that the function should emit a stack
    smashing protector. It is in the form of a "canary" --- a random value
    placed on the stack before the local variables that's checked upon
    return from the function to see if it has been overwritten. A
    heuristic is used to determine if a function needs stack protectors
    or not. The heuristic used will enable protectors for functions with:

    - Character arrays larger than ``ssp-buffer-size`` (default 8).
    - Aggregates containing character arrays larger than ``ssp-buffer-size``.
    - Calls to alloca() with variable sizes or constant sizes greater than
      ``ssp-buffer-size``.

    Variables that are identified as requiring a protector will be arranged
    on the stack such that they are adjacent to the stack protector guard.

    If a function that has an ``ssp`` attribute is inlined into a
    function that doesn't have an ``ssp`` attribute, then the resulting
    function will have an ``ssp`` attribute.
``sspreq``
    This attribute indicates that the function should *always* emit a
    stack smashing protector. This overrides the ``ssp`` function
    attribute.

    Variables that are identified as requiring a protector will be arranged
    on the stack such that they are adjacent to the stack protector guard.
    The specific layout rules are:

    #. Large arrays and structures containing large arrays
       (``>= ssp-buffer-size``) are closest to the stack protector.
    #. Small arrays and structures containing small arrays
       (``< ssp-buffer-size``) are 2nd closest to the protector.
    #. Variables that have had their address taken are 3rd closest to the
       protector.

    If a function that has an ``sspreq`` attribute is inlined into a
    function that doesn't have an ``sspreq`` attribute or which has an
    ``ssp`` or ``sspstrong`` attribute, then the resulting function will have
    an ``sspreq`` attribute.
``sspstrong``
    This attribute indicates that the function should emit a stack smashing
    protector. This attribute causes a strong heuristic to be used when
    determining if a function needs stack protectors. The strong heuristic
    will enable protectors for functions with:

    - Arrays of any size and type.
    - Aggregates containing an array of any size and type.
    - Calls to alloca().
    - Local variables that have had their address taken.

    Variables that are identified as requiring a protector will be arranged
    on the stack such that they are adjacent to the stack protector guard.
    The specific layout rules are:

    #. Large arrays and structures containing large arrays
       (``>= ssp-buffer-size``) are closest to the stack protector.
    #. Small arrays and structures containing small arrays
       (``< ssp-buffer-size``) are 2nd closest to the protector.
    #. Variables that have had their address taken are 3rd closest to the
       protector.

    This overrides the ``ssp`` function attribute.

    If a function that has an ``sspstrong`` attribute is inlined into a
    function that doesn't have an ``sspstrong`` attribute, then the
    resulting function will have an ``sspstrong`` attribute.
``strictfp``
    This attribute indicates that the function was called from a scope that
    requires strict floating point semantics. LLVM will not attempt any
    optimizations that require assumptions about the floating point rounding
    mode or that might alter the state of floating point status flags that
    might otherwise be set or cleared by calling this function.
``"thunk"``
    This attribute indicates that the function will delegate to some other
    function with a tail call. The prototype of a thunk should not be used for
    optimization purposes. The caller is expected to cast the thunk prototype to
    match the thunk target prototype.
``uwtable``
    This attribute indicates that the ABI being targeted requires that
    an unwind table entry be produced for this function even if we can
    show that no exceptions pass through it. This is normally the case for
    the ELF x86-64 ABI, but it can be disabled for some compilation
    units.

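As a rough illustration of the ``ssp``, ``sspstrong`` and ``sspreq``
attributes described above, the sketch below attaches each of them to a small
function. The function names are hypothetical, the IR uses the typed-pointer
syntax found elsewhere in this document, and whether a protector is actually
emitted depends on the target and on the heuristics listed for each attribute.

.. code-block:: llvm

    declare i8* @strcpy(i8*, i8*)

    ; ssp: the default heuristic applies. The 16-byte character array is
    ; larger than the default ssp-buffer-size of 8, so it qualifies.
    define void @copy_default(i8* %src) ssp {
    entry:
      %buf = alloca [16 x i8], align 1
      %dst = getelementptr inbounds [16 x i8], [16 x i8]* %buf, i64 0, i64 0
      %ignored = call i8* @strcpy(i8* %dst, i8* %src)
      ret void
    }

    ; sspstrong: an array of any size and element type qualifies.
    define i32 @small_array() sspstrong {
    entry:
      %arr = alloca [2 x i32], align 4
      %slot = getelementptr inbounds [2 x i32], [2 x i32]* %arr, i64 0, i64 1
      store i32 7, i32* %slot, align 4
      %v = load i32, i32* %slot, align 4
      ret i32 %v
    }

    ; sspreq: a protector is always requested, even without local buffers.
    define i32 @always_protected(i32 %x) sspreq {
    entry:
      %r = add i32 %x, 1
      ret i32 %r
    }
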
.. _glattrs:

Global Attributes
-----------------

Attributes may be set to communicate additional information about a global variable.
Unlike :ref:`function attributes <fnattrs>`, attributes on a global variable
are grouped into a single :ref:`attribute group <attrgrp>`.

.. _opbundles:

Operand Bundles
---------------

Operand bundles are tagged sets of SSA values that can be associated
with certain LLVM instructions (currently only ``call`` s and
``invoke`` s). In a way they are like metadata, but dropping them is
incorrect and will change program semantics.

Syntax::

    operand bundle set ::= '[' operand bundle (, operand bundle )* ']'
    operand bundle ::= tag '(' [ bundle operand ] (, bundle operand )* ')'
    bundle operand ::= SSA value
    tag ::= string constant

Operand bundles are **not** part of a function's signature, and a
given function may be called from multiple places with different kinds
of operand bundles. This reflects the fact that the operand bundles
are conceptually a part of the ``call`` (or ``invoke``), not the
callee being dispatched to.

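The following sketch illustrates the syntax and the point about signatures:
one callee is called three times, each time with a different set of operand
bundles. The tags ``"example-tag"`` and ``"other-tag"`` and the function names
are made up for illustration; they are not tags that LLVM gives any special
meaning.

.. code-block:: llvm

    declare void @callee(i32)

    define void @caller(i32 %a, i8* %p) {
    entry:
      ; Plain call: no operand bundles.
      call void @callee(i32 %a)
      ; Same callee, one bundle with a single operand.
      call void @callee(i32 %a) [ "example-tag"(i32 %a) ]
      ; Same callee, two bundles; the second carries two operands.
      call void @callee(i32 %a) [ "example-tag"(i32 %a), "other-tag"(i8* %p, i32 0) ]
      ret void
    }
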
Operand bundles are a generic mechanism intended to support
runtime-introspection-like functionality for managed languages. While
the exact semantics of an operand bundle depend on the bundle tag,
there are certain limitations to how much the presence of an operand
bundle can influence the semantics of a program. These restrictions
are described as the semantics of an "unknown" operand bundle. As
long as the behavior of an operand bundle is describable within these
restrictions, LLVM does not need to have special knowledge of the
operand bundle to not miscompile programs containing it.

- The bundle operands for an unknown operand bundle escape in unknown
  ways before control is transferred to the callee or invokee.
- Calls and invokes with operand bundles have unknown read / write
  effect on the heap on entry and exit (even if the call target is
  ``readnone`` or ``readonly``), unless they're overridden with
  callsite specific attributes.
- An operand bundle at a call site cannot change the implementation
  of the called function. Inter-procedural optimizations work as
  usual as long as they take into account the first two properties.

More specific types of operand bundles are described below.

.. _deopt_opbundles:

Deoptimization Operand Bundles
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Deoptimization operand bundles are characterized by the ``"deopt"``
operand bundle tag. These operand bundles represent an alternate
"safe" continuation for the call site they're attached to, and can be
used by a suitable runtime to deoptimize the compiled frame at the
specified call site. There can be at most one ``"deopt"`` operand
bundle attached to a call site. Exact details of deoptimization are
out of scope for the language reference, but it usually involves
rewriting a compiled frame into a set of interpreted frames.

From the compiler's perspective, deoptimization operand bundles make
the call sites they're attached to at least ``readonly``. They read
through all of their pointer typed operands (even if they're not
otherwise escaped) and the entire visible heap. Deoptimization
operand bundles do not capture their operands except during
deoptimization, in which case control will not be returned to the
compiled frame.

The inliner knows how to inline through calls that have deoptimization
operand bundles. Just like inlining through a normal call site
involves composing the normal and exceptional continuations, inlining
through a call site with a deoptimization operand bundle needs to
appropriately compose the "safe" deoptimization continuation. The
inliner does this by prepending the parent's deoptimization
continuation to every deoptimization continuation in the inlined body.
E.g. inlining ``@f`` into ``@g`` in the following example

.. code-block:: llvm

    define void @f() {
      call void @x()  ;; no deopt state
      call void @y() [ "deopt"(i32 10) ]
      call void @y() [ "deopt"(i32 10), "unknown"(i8* null) ]
      ret void
    }

    define void @g() {
      call void @f() [ "deopt"(i32 20) ]
      ret void
    }

will result in

.. code-block:: llvm

    define void @g() {
      call void @x()  ;; still no deopt state
      call void @y() [ "deopt"(i32 20, i32 10) ]
      call void @y() [ "deopt"(i32 20, i32 10), "unknown"(i8* null) ]
      ret void
    }

It is the frontend's responsibility to structure or encode the
deoptimization state in a way that syntactically prepending the
caller's deoptimization state to the callee's deoptimization state is
semantically equivalent to composing the caller's deoptimization
continuation after the callee's deoptimization continuation.

.. _ob_funclet:

Funclet Operand Bundles
^^^^^^^^^^^^^^^^^^^^^^^

Funclet operand bundles are characterized by the ``"funclet"``
operand bundle tag. These operand bundles indicate that a call site
is within a particular funclet. There can be at most one
``"funclet"`` operand bundle attached to a call site and it must have
exactly one bundle operand.

If any funclet EH pads have been "entered" but not "exited" (per the
`description in the EH doc\ <ExceptionHandling.html#wineh-constraints>`_),
it is undefined behavior to execute a ``call`` or ``invoke`` which:

* does not have a ``"funclet"`` bundle and is not a ``call`` to a nounwind
  intrinsic, or
* has a ``"funclet"`` bundle whose operand is not the most-recently-entered
  not-yet-exited funclet EH pad.

Similarly, if no funclet EH pads have been entered-but-not-yet-exited,
executing a ``call`` or ``invoke`` with a ``"funclet"`` bundle is undefined behavior.

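A minimal sketch of these rules, assuming an MSVC-style C++ personality
function and the typed-pointer syntax used in this document: the call inside
the ``catchpad`` carries a ``"funclet"`` bundle naming the pad it executes in,
while the ``invoke`` in ``entry`` runs before any pad has been entered and so
carries none. The names ``@may_throw`` and ``@handler`` are hypothetical.

.. code-block:: llvm

    declare void @may_throw()
    declare void @handler()
    declare i32 @__CxxFrameHandler3(...)

    define void @f() personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
    entry:
      ; No funclet EH pad has been entered yet: no "funclet" bundle here.
      invoke void @may_throw()
              to label %done unwind label %dispatch

    dispatch:
      %cs = catchswitch within none [label %catch] unwind to caller

    catch:
      %cp = catchpad within %cs [i8* null, i32 64, i8* null]
      ; %cp is the most-recently-entered, not-yet-exited pad.
      call void @handler() [ "funclet"(token %cp) ]
      catchret from %cp to label %done

    done:
      ret void
    }
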
GC Transition Operand Bundles
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

GC transition operand bundles are characterized by the
``"gc-transition"`` operand bundle tag. These operand bundles mark a
call as a transition between a function with one GC strategy to a
function with a different GC strategy. If coordinating the transition
between GC strategies requires additional code generation at the call
site, these bundles may contain any values that are needed by the
generated code. For more details, see :ref:`GC Transitions
<gc_transition_args>`.

.. _moduleasm:

Module-Level Inline Assembly
----------------------------

Modules may contain "module-level inline asm" blocks, which correspond
to the GCC "file scope inline asm" blocks. These blocks are internally
concatenated by LLVM and treated as a single unit, but may be separated
in the ``.ll`` file if desired. The syntax is very simple:

.. code-block:: llvm

    module asm "inline asm code goes here"
    module asm "more can go here"

The strings can contain any character by escaping non-printable
characters. The escape sequence used is simply "\\xx" where "xx" is the
two digit hex code for the number.

Note that the assembly string *must* be parseable by LLVM's integrated assembler
(unless it is disabled), even when emitting a ``.s`` file.

.. _langref_datalayout:

Data Layout
-----------

A module may specify a target specific data layout string that specifies
how data is to be laid out in memory. The syntax for the data layout is
simply:

.. code-block:: llvm

    target datalayout = "layout specification"

The *layout specification* consists of a list of specifications
separated by the minus sign character ('-'). Each specification starts
with a letter and may include other information after the letter to
define some aspect of the data layout. The specifications accepted are
as follows:

``E``
    Specifies that the target lays out data in big-endian form. That is,
    the bits with the most significance have the lowest address
    location.
``e``
    Specifies that the target lays out data in little-endian form. That
    is, the bits with the least significance have the lowest address
    location.
``S<size>``
    Specifies the natural alignment of the stack in bits. Alignment
    promotion of stack variables is limited to the natural stack
    alignment to avoid dynamic stack realignment. The stack alignment
    must be a multiple of 8 bits. If omitted, the natural stack
    alignment defaults to "unspecified", which does not prevent any
    alignment promotions.
``P<address space>``
    Specifies the address space that corresponds to program memory.
    Harvard architectures can use this to specify what space LLVM
    should place things such as functions into. If omitted, the
    program memory space defaults to the default address space of 0,
    which corresponds to a Von Neumann architecture that has code
    and data in the same space.
``A<address space>``
    Specifies the address space of objects created by '``alloca``'.
    Defaults to the default address space of 0.
``p[n]:<size>:<abi>:<pref>:<idx>``
    This specifies the *size* of a pointer and its ``<abi>`` and
    ``<pref>``\erred alignments for address space ``n``. The fourth parameter
    ``<idx>`` is the size of the index that is used for address calculation.
    If not specified, the default index size is equal to the pointer size. All
    sizes are in bits. The address space, ``n``, is optional, and if not
    specified, denotes the default address space 0. The value of ``n`` must be
    in the range [1,2^23).
``i<size>:<abi>:<pref>``
    This specifies the alignment for an integer type of a given bit
    ``<size>``. The value of ``<size>`` must be in the range [1,2^23).
``v<size>:<abi>:<pref>``
    This specifies the alignment for a vector type of a given bit
    ``<size>``.
``f<size>:<abi>:<pref>``
    This specifies the alignment for a floating point type of a given bit
    ``<size>``. Only values of ``<size>`` that are supported by the target
    will work. 32 (float) and 64 (double) are supported on all targets; 80
    or 128 (different flavors of long double) are also supported on some
    targets.
``a:<abi>:<pref>``
    This specifies the alignment for an object of aggregate type.
``m:<mangling>``
    If present, specifies that LLVM names are mangled in the output. The
    options are:

    * ``e``: ELF mangling: Private symbols get a ``.L`` prefix.
    * ``m``: Mips mangling: Private symbols get a ``$`` prefix.
    * ``o``: Mach-O mangling: Private symbols get an ``L`` prefix. Other
      symbols get a ``_`` prefix.
    * ``w``: Windows COFF prefix: Similar to Mach-O, but stdcall and fastcall
      functions also get a suffix based on the frame size.
    * ``x``: Windows x86 COFF prefix: Similar to Windows COFF, but use a ``_``
      prefix for ``__cdecl`` functions.
``n<size1>:<size2>:<size3>...``
    This specifies a set of native integer widths for the target CPU in
    bits. For example, it might contain ``n32`` for 32-bit PowerPC,
    ``n32:64`` for PowerPC 64, or ``n8:16:32:64`` for X86-64. Elements of
    this set are considered to support most general arithmetic operations
    efficiently.
``ni:<address space0>:<address space1>:<address space2>...``
    This specifies pointer types with the specified address spaces
    as :ref:`Non-Integral Pointer Type <nointptrtype>` s. The ``0``
    address space cannot be specified as non-integral.

On every specification that takes a ``<abi>:<pref>``, specifying the
``<pref>`` alignment is optional. If omitted, the preceding ``:``
should be omitted too and ``<pref>`` will be equal to ``<abi>``.

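To make the notation concrete, here is a layout string of the kind used for
x86-64 ELF targets, broken down specification by specification. Treat the
exact string as an illustrative assumption rather than as the authoritative
layout for any particular target; the point is how the pieces described above
combine.

.. code-block:: llvm

    ; e            little-endian
    ; m:e          ELF mangling (private symbols get a .L prefix)
    ; i64:64       i64 has 64-bit ABI alignment; <pref> is omitted, so it is
    ;              also 64 bits
    ; f80:128      the 80-bit floating point type is 128-bit aligned
    ; n8:16:32:64  native integer widths are 8, 16, 32 and 64 bits
    ; S128         natural stack alignment is 128 bits
    target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"

Specifications that are not mentioned in the string, such as the pointer size,
are not part of this example and are left to the defaults.
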
When constructing the data layout for a given target, LLVM starts with a
default set of specifications which are then (possibly) overridden by
the specifications in the ``datalayout`` keyword. The default
specifications are given in this list:

- ``E`` - big endian
- ``p:64:64:64`` - 64-bit pointers with 64-bit alignment.
- ``p[n]:64:64:64`` - Other address spaces are assumed to be the
  same as the default address space.
- ``S0`` - natural stack alignment is unspecified
- ``i1:8:8`` - i1 is 8-bit (byte) aligned
- ``i8:8:8`` - i8 is 8-bit (byte) aligned
- ``i16:16:16`` - i16 is 16-bit aligned
- ``i32:32:32`` - i32 is 32-bit aligned
- ``i64:32:64`` - i64 has ABI alignment of 32-bits but preferred
  alignment of 64-bits
- ``f16:16:16`` - half is 16-bit aligned
- ``f32:32:32`` - float is 32-bit aligned
- ``f64:64:64`` - double is 64-bit aligned
- ``f128:128:128`` - quad is 128-bit aligned
- ``v64:64:64`` - 64-bit vector is 64-bit aligned
- ``v128:128:128`` - 128-bit vector is 128-bit aligned
- ``a:0:64`` - aggregates are 64-bit aligned

When LLVM is determining the alignment for a given type, it uses the
following rules:

#. If the type sought is an exact match for one of the specifications,
   that specification is used.
#. If no match is found, and the type sought is an integer type, then
   the smallest integer type that is larger than the bitwidth of the
   sought type is used. If none of the specifications are larger than
   the bitwidth then the largest integer type is used. For example,
   given the default specifications above, the i7 type will use the
   alignment of i8 (next largest) while both i65 and i256 will use the
   alignment of i64 (largest specified).
#. If no match is found, and the type sought is a vector type, then the
   largest vector type that is smaller than the sought vector type will
   be used as a fall back. This happens because <128 x double> can be
   implemented in terms of 64 <2 x double>, for example.

The function of the data layout string may not be what you expect.
Notably, this is not a specification from the frontend of what alignment
the code generator should use.

Instead, if specified, the target data layout is required to match what
the ultimate *code generator* expects. This string is used by the
mid-level optimizers to improve code, and this only works if it matches
what the ultimate code generator uses. There is no way to generate IR
that does not embed this target-specific detail into the IR. If you
don't specify the string, the default specifications will be used to
generate a Data Layout and the optimization phases will operate
accordingly and introduce target specificity into the IR with respect to
these default specifications.

.. _langref_triple:

Target Triple
-------------

A module may specify a target triple string that describes the target
host. The syntax for the target triple is simply:

.. code-block:: llvm

    target triple = "x86_64-apple-macosx10.7.0"

The *target triple* string consists of a series of identifiers delimited
by the minus sign character ('-'). The canonical forms are:

::

    ARCHITECTURE-VENDOR-OPERATING_SYSTEM
    ARCHITECTURE-VENDOR-OPERATING_SYSTEM-ENVIRONMENT

This information is passed along to the backend so that it generates
code for the proper architecture. It's possible to override this on the
command line with the ``-mtriple`` command line option.

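As a small illustration of the two canonical forms, the triple shown earlier
in this section has three components, while the one below has four. The
specific triple is just a commonly seen example chosen for illustration.

.. code-block:: llvm

    ; Three-component form, ARCHITECTURE-VENDOR-OPERATING_SYSTEM:
    ;   "x86_64-apple-macosx10.7.0"
    ;   architecture = x86_64, vendor = apple, operating system = macosx10.7.0
    ;
    ; Four-component form, ARCHITECTURE-VENDOR-OPERATING_SYSTEM-ENVIRONMENT:
    ;   architecture = x86_64, vendor = unknown, operating system = linux,
    ;   environment = gnu
    target triple = "x86_64-unknown-linux-gnu"
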
.. _pointeraliasing:

Pointer Aliasing Rules
----------------------

Any memory access must be done through a pointer value associated with
an address range of the memory access, otherwise the behavior is
undefined. Pointer values are associated with address ranges according
to the following rules:

- A pointer value is associated with the addresses associated with any
  value it is *based* on.
- An address of a global variable is associated with the address range
  of the variable's storage.
- The result value of an allocation instruction is associated with the
  address range of the allocated storage.
- A null pointer in the default address-space is associated with no
  address.
- An integer constant other than zero or a pointer value returned from
  a function not defined within LLVM may be associated with address
  ranges allocated through mechanisms other than those provided by
  LLVM. Such ranges shall not overlap with any ranges of addresses
  allocated by mechanisms provided by LLVM.

A pointer value is *based* on another pointer value according to the
following rules:

- A pointer value formed from a scalar ``getelementptr`` operation is *based* on
  the pointer-typed operand of the ``getelementptr``.
- The pointer in lane *l* of the result of a vector ``getelementptr`` operation
  is *based* on the pointer in lane *l* of the vector-of-pointers-typed operand
  of the ``getelementptr``.
- The result value of a ``bitcast`` is *based* on the operand of the
  ``bitcast``.
- A pointer value formed by an ``inttoptr`` is *based* on all pointer
  values that contribute (directly or indirectly) to the computation of
  the pointer's value.
- The "*based* on" relationship is transitive.

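The rules above can be traced through a short sketch. The global ``@g`` and
the function name are hypothetical, and the IR uses the typed-pointer syntax
of this document; the comments state which rule makes each pointer *based* on
another.

.. code-block:: llvm

    @g = global [4 x i32] zeroinitializer

    define i32 @based_on_example() {
    entry:
      ; Scalar getelementptr: %p is based on @g.
      %p = getelementptr inbounds [4 x i32], [4 x i32]* @g, i64 0, i64 1
      ; bitcast: %q is based on %p, and transitively on @g.
      %q = bitcast i32* %p to i8*
      ; %q contributes to the integer %j, so the inttoptr result %r.i8 is
      ; based on %q (and therefore on %p and @g).
      %i = ptrtoint i8* %q to i64
      %j = add i64 %i, 4
      %r.i8 = inttoptr i64 %j to i8*
      %r = bitcast i8* %r.i8 to i32*
      ; %r is associated with the storage of @g, so this load is a valid
      ; access to the third element of @g.
      %v = load i32, i32* %r, align 4
      ret i32 %v
    }
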
Note that this definition of *"based"* is intentionally similar to the
definition of *"based"* in C99, though it is slightly weaker.

LLVM IR does not associate types with memory. The result type of a
``load`` merely indicates the size and alignment of the memory from
which to load, as well as the interpretation of the value. The first
operand type of a ``store`` similarly only indicates the size and
alignment of the store.

Consequently, type-based alias analysis, aka TBAA, aka
``-fstrict-aliasing``, is not applicable to general unadorned LLVM IR.
:ref:`Metadata <metadata>` may be used to encode additional information
which specialized optimization passes may use to implement type-based
alias analysis.

.. _volatile:

Volatile Memory Accesses
------------------------

Certain memory accesses, such as :ref:`load <i_load>`'s,
:ref:`store <i_store>`'s, and :ref:`llvm.memcpy <int_memcpy>`'s may be
marked ``volatile``. The optimizers must not change the number of
volatile operations or change their order of execution relative to other
volatile operations. The optimizers *may* change the order of volatile
operations relative to non-volatile operations. This is not Java's
"volatile" and has no cross-thread synchronization behavior.

IR-level volatile loads and stores cannot safely be optimized into
llvm.memcpy or llvm.memmove intrinsics even when those intrinsics are
flagged volatile. Likewise, the backend should never split or merge
target-legal volatile load/store instructions.

.. admonition:: Rationale

   Platforms may rely on volatile loads and stores of natively supported
   data width to be executed as a single instruction. For example, in C
   this holds for an l-value of volatile primitive type with native
   hardware support, but not necessarily for aggregate types. The
   frontend upholds these expectations, which are intentionally
   unspecified in the IR. The rules above ensure that IR transformations
   do not violate the frontend's contract with the language.

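As a hedged sketch of what these rules mean in practice, consider a
hypothetical memory-mapped register ``@mmio_reg``. Both volatile loads and the
volatile store below must be kept, and kept in this order relative to each
other, even though a non-volatile version could fold the two loads into one.

.. code-block:: llvm

    @mmio_reg = external global i32

    define i32 @poll_twice() {
    entry:
      ; Two volatile loads of the same address: neither may be removed.
      %a = load volatile i32, i32* @mmio_reg, align 4
      %b = load volatile i32, i32* @mmio_reg, align 4
      ; The volatile store may not be moved above the volatile loads.
      store volatile i32 0, i32* @mmio_reg, align 4
      ; Non-volatile arithmetic may be reordered around the accesses above.
      %sum = add i32 %a, %b
      ret i32 %sum
    }
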
.. _memmodel:

Memory Model for Concurrent Operations
--------------------------------------

The LLVM IR does not define any way to start parallel threads of
execution or to register signal handlers. Nonetheless, there are
platform-specific ways to create them, and we define LLVM IR's behavior
in their presence. This model is inspired by the C++0x memory model.

For a more informal introduction to this model, see the :doc:`Atomics`.

We define a *happens-before* partial order as the least partial order
that

- Is a superset of single-thread program order, and
- When a *synchronizes-with* ``b``, includes an edge from ``a`` to
  ``b``. *Synchronizes-with* pairs are introduced by platform-specific
  techniques, like pthread locks, thread creation, thread joining,
  etc., and by atomic instructions. (See also :ref:`Atomic Memory Ordering
  Constraints <ordering>`).

Note that program order does not introduce *happens-before* edges
between a thread and signals executing inside that thread.

Every (defined) read operation (load instructions, memcpy, atomic
loads/read-modify-writes, etc.) R reads a series of bytes written by
(defined) write operations (store instructions, atomic
stores/read-modify-writes, memcpy, etc.). For the purposes of this
section, initialized globals are considered to have a write of the
initializer which is atomic and happens before any other read or write
of the memory in question. For each byte of a read R, R\ :sub:`byte`
may see any write to the same byte, except:

- If write\ :sub:`1` happens before write\ :sub:`2`, and
  write\ :sub:`2` happens before R\ :sub:`byte`, then
  R\ :sub:`byte` does not see write\ :sub:`1`.
- If R\ :sub:`byte` happens before write\ :sub:`3`, then
  R\ :sub:`byte` does not see write\ :sub:`3`.

Given that definition, R\ :sub:`byte` is defined as follows:

- If R is volatile, the result is target-dependent. (Volatile is
  supposed to give guarantees which can support ``sig_atomic_t`` in
  C/C++, and may be used for accesses to addresses that do not behave
  like normal memory. It does not generally provide cross-thread
  synchronization.)
- Otherwise, if there is no write to the same byte that happens before
  R\ :sub:`byte`, R\ :sub:`byte` returns ``undef`` for that byte.
- Otherwise, if R\ :sub:`byte` may see exactly one write,
  R\ :sub:`byte` returns the value written by that write.
- Otherwise, if R is atomic, and all the writes R\ :sub:`byte` may
  see are atomic, it chooses one of the values written. See the :ref:`Atomic
  Memory Ordering Constraints <ordering>` section for additional
  constraints on how the choice is made.
- Otherwise R\ :sub:`byte` returns ``undef``.

R returns the value composed of the series of bytes it read. This
implies that some bytes within the value may be ``undef`` **without**
the entire value being ``undef``. Note that this only defines the
semantics of the operation; it doesn't mean that targets will emit more
than one instruction to read the series of bytes.

Note that in cases where none of the atomic intrinsics are used, this
model places only one restriction on IR transformations on top of what
is required for single-threaded execution: introducing a store to a byte
which might not otherwise be stored is not allowed in general.
(Specifically, in the case where another thread might write to and read
from an address, introducing a store can change a load that may see
exactly one write into a load that may see multiple writes.)

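The restriction on introducing stores can be illustrated with a pair of
hypothetical functions. The second one is what a rewrite of the first into
straight-line code might look like; it is equivalent in a single thread, but
it is shown here as an example of a transformation that is *not* allowed in
general under this model.

.. code-block:: llvm

    @x = global i32 0

    ; Original: @x is only written when %c is true.
    define void @maybe_store(i1 %c) {
    entry:
      br i1 %c, label %then, label %exit
    then:
      store i32 1, i32* @x, align 4
      br label %exit
    exit:
      ret void
    }

    ; Rewritten: @x is now written on every path. When %c is false this
    ; introduces a store that the original never performed, so a load in
    ; another thread that could see exactly one write may now see several.
    define void @maybe_store_rewritten(i1 %c) {
    entry:
      %old = load i32, i32* @x, align 4
      %new = select i1 %c, i32 1, i32 %old
      store i32 %new, i32* @x, align 4
      ret void
    }
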
.. _ordering:

Atomic Memory Ordering Constraints
----------------------------------

Atomic instructions (:ref:`cmpxchg <i_cmpxchg>`,
:ref:`atomicrmw <i_atomicrmw>`, :ref:`fence <i_fence>`,
:ref:`atomic load <i_load>`, and :ref:`atomic store <i_store>`) take
ordering parameters that determine which other atomic instructions on
the same address they *synchronize with*. These semantics are borrowed
from Java and C++0x, but are somewhat more colloquial. If these
descriptions aren't precise enough, check those specs (see spec
references in the :doc:`atomics guide <Atomics>`).
:ref:`fence <i_fence>` instructions treat these orderings somewhat
differently since they don't take an address. See that instruction's
documentation for details.

For a simpler introduction to the ordering constraints, see the
:doc:`Atomics`.

``unordered``
   The set of values that can be read is governed by the happens-before
   partial order. A value cannot be read unless some operation wrote
   it. This is intended to provide a guarantee strong enough to model
   Java's non-volatile shared variables. This ordering cannot be
   specified for read-modify-write operations; it is not strong enough
   to make them atomic in any interesting way.
``monotonic``
   In addition to the guarantees of ``unordered``, there is a single
   total order for modifications by ``monotonic`` operations on each
   address. All modification orders must be compatible with the
   happens-before order. There is no guarantee that the modification
   orders can be combined to a global total order for the whole program
   (and this often will not be possible). The read in an atomic
   read-modify-write operation (:ref:`cmpxchg <i_cmpxchg>` and
   :ref:`atomicrmw <i_atomicrmw>`) reads the value in the modification
   order immediately before the value it writes. If one atomic read
   happens before another atomic read of the same address, the later
   read must see the same value or a later value in the address's
   modification order. This disallows reordering of ``monotonic`` (or
   stronger) operations on the same address. If an address is written
   ``monotonic``-ally by one thread, and other threads ``monotonic``-ally
   read that address repeatedly, the other threads must eventually see
   the write. This corresponds to the C++0x/C1x
   ``memory_order_relaxed``.
``acquire``
   In addition to the guarantees of ``monotonic``, a
   *synchronizes-with* edge may be formed with a ``release`` operation.
   This is intended to model C++'s ``memory_order_acquire``.
``release``
   In addition to the guarantees of ``monotonic``, if this operation
   writes a value which is subsequently read by an ``acquire``
   operation, it *synchronizes-with* that operation. (This isn't a
   complete description; see the C++0x definition of a release
   sequence.) This corresponds to the C++0x/C1x
   ``memory_order_release``.
``acq_rel`` (acquire+release)
   Acts as both an ``acquire`` and ``release`` operation on its
   address. This corresponds to the C++0x/C1x ``memory_order_acq_rel``.
``seq_cst`` (sequentially consistent)
   In addition to the guarantees of ``acq_rel`` (``acquire`` for an
   operation that only reads, ``release`` for an operation that only
   writes), there is a global total order on all
   sequentially-consistent operations on all addresses, which is
   consistent with the *happens-before* partial order and with the
   modification orders of all the affected addresses. Each
   sequentially-consistent read sees the last preceding write to the
   same address in this global order. This corresponds to the C++0x/C1x
   ``memory_order_seq_cst`` and Java volatile.

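As a minimal sketch of how a ``release`` store pairs with an ``acquire``
load, consider the following message-passing pattern. The globals ``@data``
and ``@flag`` and the functions ``@producer`` and ``@consumer`` are
hypothetical names, and the two functions are assumed to run on different
threads.

.. code-block:: llvm

   @data = global i32 0
   @flag = global i32 0

   define void @producer() {
   entry:
     ; Ordinary (non-atomic) write of the payload.
     store i32 42, i32* @data, align 4
     ; Publish it: a release store to the flag.
     store atomic i32 1, i32* @flag release, align 4
     ret void
   }

   define i32 @consumer() {
   entry:
     br label %spin

   spin:
     ; Acquire load of the flag; loop until the producer has published.
     %f = load atomic i32, i32* @flag acquire, align 4
     %ready = icmp eq i32 %f, 1
     br i1 %ready, label %done, label %spin

   done:
     ; Ordinary read of the payload.
     %v = load i32, i32* @data, align 4
     ret i32 %v
   }

Once the ``acquire`` load reads the value written by the ``release`` store,
the store *synchronizes-with* the load, so the write of 42 happens before the
final load of ``@data``.
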
.. _syncscope:

If an atomic operation is marked ``syncscope("singlethread")``, it only
*synchronizes with* and only participates in the seq\_cst total orderings of
other operations running in the same thread (for example, in signal handlers).

If an atomic operation is marked ``syncscope("<target-scope>")``, where
``<target-scope>`` is a target-specific synchronization scope, then it is
target-dependent whether it *synchronizes with* and participates in the
seq\_cst total orderings of other operations.

Otherwise, an atomic operation that is not marked ``syncscope("singlethread")``
or ``syncscope("<target-scope>")`` *synchronizes with* and participates in the
seq\_cst total orderings of other operations that are not marked
``syncscope("singlethread")`` or ``syncscope("<target-scope>")``.

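The following sketch shows where the ``syncscope`` marker is written. The
global ``@pending`` and both function names are hypothetical; the first
function only needs ordering with respect to code running in the same thread
(such as a signal handler), while the second uses the default scope.

.. code-block:: llvm

   @pending = global i32 0

   define void @publish_to_handler(i32 %v) {
   entry:
     ; Ordered only against operations running in the same thread.
     store atomic i32 %v, i32* @pending syncscope("singlethread") release, align 4
     fence syncscope("singlethread") seq_cst
     ret void
   }

   define i32 @publish_to_threads(i32 %v) {
   entry:
     ; No syncscope marker: participates in the ordinary cross-thread orderings.
     %old = atomicrmw xchg i32* @pending, i32 %v seq_cst
     ret i32 %old
   }
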
.. _fastmath:

Fast-Math Flags
---------------

LLVM IR floating-point operations (:ref:`fadd <i_fadd>`,
:ref:`fsub <i_fsub>`, :ref:`fmul <i_fmul>`, :ref:`fdiv <i_fdiv>`,
:ref:`frem <i_frem>`, :ref:`fcmp <i_fcmp>`) and :ref:`call <i_call>`
may use the following flags to enable otherwise unsafe
floating-point transformations.

``nnan``
   No NaNs - Allow optimizations to assume the arguments and result are not
   NaN. Such optimizations are required to retain defined behavior over
   NaNs, but the value of the result is undefined.

``ninf``
   No Infs - Allow optimizations to assume the arguments and result are not
   +/-Inf. Such optimizations are required to retain defined behavior over
   +/-Inf, but the value of the result is undefined.

``nsz``
   No Signed Zeros - Allow optimizations to treat the sign of a zero
   argument or result as insignificant.

``arcp``
   Allow Reciprocal - Allow optimizations to use the reciprocal of an
   argument rather than perform division.

``contract``
   Allow floating-point contraction (e.g. fusing a multiply followed by an
   addition into a fused multiply-and-add).

``afn``
   Approximate functions - Allow substitution of approximate calculations for
   functions (sin, log, sqrt, etc.). See floating-point intrinsic definitions
   for places where this can apply to LLVM's intrinsic math functions.

``reassoc``
   Allow reassociation transformations for floating-point instructions.
   This may dramatically change floating-point results.

``fast``
   This flag implies all of the others.

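As a brief sketch of the syntax, the flags are written between the opcode
and the type and may be combined. The function name ``@fmf_examples`` is
hypothetical; ``@llvm.sqrt.f32`` is the standard square-root intrinsic.

.. code-block:: llvm

   declare float @llvm.sqrt.f32(float)

   define float @fmf_examples(float %a, float %b, float %c) {
   entry:
     ; Assume neither the operands nor the result are NaN or +/-Inf.
     %sum = fadd nnan ninf float %a, %b
     ; Both operations carry 'contract', so they may be fused into one
     ; multiply-and-add.
     %mul = fmul contract float %sum, %c
     %fma = fadd contract float %mul, %a
     ; The division may be replaced by a multiplication by a reciprocal.
     %div = fdiv arcp float %fma, %b
     ; An approximate square root may be substituted for this call.
     %root = call afn float @llvm.sqrt.f32(float %div)
     ; 'fast' implies all of the flags above.
     %res = fadd fast float %root, %c
     ret float %res
   }
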
.. _uselistorder:

Use-list Order Directives
-------------------------

Use-list directives encode the in-memory order of each use-list, allowing the
order to be recreated. ``<order-indexes>`` is a comma-separated list of
indexes that are assigned to the referenced value's uses. The referenced
value's use-list is immediately sorted by these indexes.

Use-list directives may appear at function scope or global scope. They are not
instructions, and have no effect on the semantics of the IR. When they're at
function scope, they must appear after the terminator of the final basic block.

If basic blocks have their address taken via ``blockaddress()`` expressions,
``uselistorder_bb`` can be used to reorder their use-lists from outside their
function's scope.

:Syntax:

::

    uselistorder <ty> <value>, { <order-indexes> }
    uselistorder_bb @function, %block, { <order-indexes> }

:Examples:

::

    define void @foo(i32 %arg1, i32 %arg2) {
    entry:
      ; ... instructions ...
    bb:
      ; ... instructions ...

      ; At function scope.
      uselistorder i32 %arg1, { 1, 0, 2 }
      uselistorder label %bb, { 1, 0 }
    }

    ; At global scope.
    uselistorder i32* @global, { 1, 2, 0 }
    uselistorder i32 7, { 1, 0 }
    uselistorder i32 (i32) @bar, { 1, 0 }
    uselistorder_bb @foo, %bb, { 5, 1, 3, 2, 0, 4 }

.. _source_filename:

Source Filename
---------------

The *source filename* string is set to the original module identifier,
which will be the name of the compiled source file when compiling from
source through the clang front end, for example. It is then preserved through
the IR and bitcode.

This is currently necessary to generate a consistent unique global
identifier for local functions used in profile data, which prepends the
source file name to the local function name.

The syntax for the source file name is simply:

.. code-block:: text

    source_filename = "/path/to/source.c"

.. _typesystem:

Type System
===========

The LLVM type system is one of the most important features of the
intermediate representation. Being typed enables a number of
optimizations to be performed on the intermediate representation
directly, without having to do extra analyses on the side before the
transformation. A strong type system makes it easier to read the
generated code and enables novel analyses and transformations that are
not feasible to perform on normal three address code representations.

.. _t_void:

Void Type
---------

:Overview:

The void type does not represent any value and has no size.

:Syntax:

::

      void

.. _t_function:

Function Type
-------------

:Overview:

The function type can be thought of as a function signature. It consists of a
return type and a list of formal parameter types. The return type of a function
type is a void type or a first class type other than
:ref:`label <t_label>` and :ref:`metadata <t_metadata>`.

:Syntax:

::

      <returntype> (<parameter list>)

...where '``<parameter list>``' is a comma-separated list of type
specifiers. Optionally, the parameter list may include a type ``...``, which
indicates that the function takes a variable number of arguments. Variable
argument functions can access their arguments with the :ref:`variable argument
handling intrinsic <int_varargs>` functions. '``<returntype>``' is any type
except :ref:`label <t_label>` and :ref:`metadata <t_metadata>`.

:Examples:

.. list-table::

   * - ``i32 (i32)``
     - A function taking an ``i32``, returning an ``i32``.

   * - ``float (i16, i32 *) *``
     - :ref:`Pointer <t_pointer>` to a function that takes an ``i16`` and a
       :ref:`pointer <t_pointer>` to ``i32``, returning ``float``.

   * - ``i32 (i8*, ...)``
     - A vararg function that takes at least one :ref:`pointer <t_pointer>`
       to ``i8`` (char in C), which returns an integer. This is the signature
       for ``printf`` in LLVM.

   * - ``{i32, i32} (i32)``
     - A function taking an ``i32``, returning a :ref:`structure <t_struct>`
       containing two ``i32`` values.

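To show these function types in use, here is a hedged sketch. ``@printf`` is
declared with the vararg signature from the table above; ``@log_value`` and
``@apply`` are hypothetical functions introduced only for illustration.

.. code-block:: llvm

   ; A vararg declaration: i32 (i8*, ...)
   declare i32 @printf(i8*, ...)

   ; Calling a vararg function spells out the full function type.
   define i32 @log_value(i8* %fmt, i32 %x) {
   entry:
     %n = call i32 (i8*, ...) @printf(i8* %fmt, i32 %x)
     ret i32 %n
   }

   ; %fn has type "pointer to i32 (i32)" and is called indirectly.
   define i32 @apply(i32 (i32)* %fn, i32 %x) {
   entry:
     %r = call i32 %fn(i32 %x)
     ret i32 %r
   }
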
.. _t_firstclass:

First Class Types
-----------------

The :ref:`first class <t_firstclass>` types are perhaps the most important.
Values of these types are the only ones which can be produced by
instructions.

.. _t_single_value:

Single Value Types
^^^^^^^^^^^^^^^^^^

These are the types that are valid in registers from CodeGen's perspective.

.. _t_integer:

Integer Type
""""""""""""

:Overview:

The integer type is a very simple type that simply specifies an
arbitrary bit width for the integer type desired. Any bit width from 1
bit to 2\ :sup:`23`\ -1 (about 8 million) can be specified.

:Syntax:

::

      iN

The number of bits the integer will occupy is specified by the ``N``
value.

:Examples:

.. list-table::

   * - ``i1``
     - A single-bit integer.

   * - ``i32``
     - A 32-bit integer.

   * - ``i1942652``
     - A really big integer of over 1 million bits.

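Because the bit width is arbitrary, integer instructions work the same way on
unusual widths as on common ones. The sketch below uses a hypothetical
function ``@add_carry`` operating on a 17-bit integer.

.. code-block:: llvm

   define i17 @add_carry(i17 %x, i1 %carry) {
   entry:
     ; Widen the single-bit value to 17 bits, then add.
     %wide = zext i1 %carry to i17
     %sum = add i17 %x, %wide
     ret i17 %sum
   }
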
.. _t_floating:

Floating Point Types
""""""""""""""""""""

.. list-table::
   :header-rows: 1

   * - Type
     - Description

   * - ``half``
     - 16-bit floating point value

   * - ``float``
     - 32-bit floating point value

   * - ``double``
     - 64-bit floating point value

   * - ``fp128``
     - 128-bit floating point value (112-bit mantissa)

   * - ``x86_fp80``
     - 80-bit floating point value (X87)

   * - ``ppc_fp128``
     - 128-bit floating point value (two 64-bit values)

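Values move between these types with explicit conversion instructions. A
small sketch, using a hypothetical function ``@narrow``:

.. code-block:: llvm

   define half @narrow(float %f, double %d) {
   entry:
     ; float -> double is a widening conversion.
     %wide = fpext float %f to double
     %sum = fadd double %wide, %d
     ; double -> half is a narrowing conversion and may round.
     %small = fptrunc double %sum to half
     ret half %small
   }
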
X86_mmx Type
""""""""""""

:Overview:

The x86_mmx type represents a value held in an MMX register on an x86
machine. The operations allowed on it are quite limited: parameters and
return values, load and store, and bitcast. User-specified MMX
instructions are represented as intrinsic or asm calls with arguments
and/or results of this type. There are no arrays, vectors or constants
of this type.

:Syntax:

::

      x86_mmx

.. _t_pointer:

Pointer Type
""""""""""""

:Overview:

The pointer type is used to specify memory locations. Pointers are
commonly used to reference objects in memory.

Pointer types may have an optional address space attribute defining the
numbered address space where the pointed-to object resides. The default
address space is number zero. The semantics of non-zero address spaces
are target-specific.

Note that LLVM does not permit pointers to void (``void*``) nor does it
permit pointers to labels (``label*``). Use ``i8*`` instead.

:Syntax:

::

      <type> *

:Examples:

.. list-table::

   * - ``[4 x i32]*``
     - A :ref:`pointer <t_pointer>` to :ref:`array <t_array>` of four
       ``i32`` values.

   * - ``i32 (i32*) *``
     - A :ref:`pointer <t_pointer>` to a :ref:`function <t_function>` that
       takes an ``i32*``, returning an ``i32``.

   * - ``i32 addrspace(5)*``
     - A :ref:`pointer <t_pointer>` to an ``i32`` value that resides in
       address space #5.

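The following sketch shows pointers being used to access memory, a pointer
into a non-default address space, and ``i8*`` standing in for a ``void*``.
All three function names are hypothetical.

.. code-block:: llvm

   ; Read, modify and write the i32 that %p points to.
   define i32 @bump(i32* %p) {
   entry:
     %old = load i32, i32* %p
     %new = add i32 %old, 1
     store i32 %new, i32* %p
     ret i32 %new
   }

   ; The address space is part of the pointer type.
   define i32 @read_as5(i32 addrspace(5)* %p) {
   entry:
     %v = load i32, i32 addrspace(5)* %p
     ret i32 %v
   }

   ; There is no void*; an untyped address is passed around as i8*.
   define i8* @as_bytes(i32* %p) {
   entry:
     %raw = bitcast i32* %p to i8*
     ret i8* %raw
   }
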
.. _t_vector:

Vector Type
"""""""""""

:Overview:

A vector type is a simple derived type that represents a vector of
elements. Vector types are used when multiple primitive data are
operated on in parallel using a single instruction (SIMD). A vector type
requires a size (number of elements) and an underlying primitive data
type. Vector types are considered :ref:`first class <t_firstclass>`.

:Syntax:

::

      < <# elements> x <elementtype> >

The number of elements is a constant integer value larger than 0;
``elementtype`` may be any integer, floating point or pointer type. Vectors
of size zero are not allowed.

:Examples:

.. list-table::

   * - ``<4 x i32>``
     - Vector of 4 32-bit integer values.

   * - ``<8 x float>``
     - Vector of 8 32-bit floating-point values.

   * - ``<2 x i64>``
     - Vector of 2 64-bit integer values.

   * - ``<4 x i64*>``
     - Vector of 4 pointers to 64-bit integer values.

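A short sketch of a vector being operated on as a whole and then read one
element at a time; the function name ``@sum_lane0`` is hypothetical.

.. code-block:: llvm

   define i32 @sum_lane0(<4 x i32> %a, <4 x i32> %b) {
   entry:
     ; One add instruction operates on all four elements.
     %sum = add <4 x i32> %a, %b
     ; Read element 0 of the result.
     %lane = extractelement <4 x i32> %sum, i32 0
     ret i32 %lane
   }
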
.. _t_label:

Label Type
^^^^^^^^^^

:Overview:

The label type represents code labels.

:Syntax:

::

      label

.. _t_token:

Token Type
^^^^^^^^^^

:Overview:

The token type is used when a value is associated with an instruction
but all uses of the value must not attempt to introspect or obscure it.
As such, it is not appropriate to have a :ref:`phi <i_phi>` or
:ref:`select <i_select>` of type token.

:Syntax:

::

      token

.. _t_metadata:

Metadata Type
^^^^^^^^^^^^^

:Overview:

The metadata type represents embedded metadata. No derived types may be
created from metadata except for :ref:`function <t_function>` arguments.

:Syntax:

::

      metadata

.. _t_aggregate:

Aggregate Types
^^^^^^^^^^^^^^^

Aggregate Types are a subset of derived types that can contain multiple
member types. :ref:`Arrays <t_array>` and :ref:`structs <t_struct>` are
aggregate types. :ref:`Vectors <t_vector>` are not considered to be
aggregate types.

.. _t_array:

Array Type
""""""""""

:Overview:

The array type is a very simple derived type that arranges elements
sequentially in memory. The array type requires a size (number of
elements) and an underlying data type.

:Syntax:

::

      [<# elements> x <elementtype>]

The number of elements is a constant integer value; ``elementtype`` may
be any type with a size.

:Examples:

.. list-table::

   * - ``[40 x i32]``
     - Array of 40 32-bit integer values.

   * - ``[41 x i32]``
     - Array of 41 32-bit integer values.

   * - ``[4 x i8]``
     - Array of 4 8-bit integer values.

Here are some examples of multidimensional arrays:

.. list-table::

   * - ``[3 x [4 x i32]]``
     - 3x4 array of 32-bit integer values.

   * - ``[12 x [10 x float]]``
     - 12x10 array of single precision floating point values.

   * - ``[2 x [3 x [4 x i16]]]``
     - 2x3x4 array of 16-bit integer values.

There is no restriction on indexing beyond the end of the array implied
by a static type (though there are restrictions on indexing beyond the
bounds of an allocated object in some cases). This means that
single-dimension 'variable sized array' addressing can be implemented in
LLVM with a zero length array type. An implementation of 'pascal style
arrays' in LLVM could use the type "``{ i32, [0 x float]}``", for
example.

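A minimal sketch of the 'pascal style array' idea, assuming the ``i32`` field
holds the element count. The type name ``%pascal`` and the function
``@element`` are hypothetical.

.. code-block:: llvm

   %pascal = type { i32, [0 x float] }

   define float @element(%pascal* %arr, i64 %i) {
   entry:
     ; Step into field 1 (the zero-length array), then index element %i.
     ; The static array length of 0 does not limit the index.
     %slot = getelementptr %pascal, %pascal* %arr, i64 0, i32 1, i64 %i
     %val = load float, float* %slot
     ret float %val
   }

The caller remains responsible for ensuring that ``%i`` stays within the
memory actually allocated for the object.
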
.. _t_struct:

Structure Type
""""""""""""""

:Overview:

The structure type is used to represent a collection of data members
together in memory. The elements of a structure may be any type that has
a size.

Structures in memory are accessed using '``load``' and '``store``' by
getting a pointer to a field with the '``getelementptr``' instruction.
Structures in registers are accessed using the '``extractvalue``' and
'``insertvalue``' instructions.

Structures may optionally be "packed" structures, which indicate that
the alignment of the struct is one byte, and that there is no padding
between the elements. In non-packed structs, padding between field types
is inserted as defined by the DataLayout string in the module, which is
required to match what the underlying code generator expects.

Structures can either be "literal" or "identified". A literal structure
is defined inline with other types (e.g. ``{i32, i32}*``) whereas
identified types are always defined at the top level with a name.
Literal types are uniqued by their contents and can never be recursive
or opaque since there is no way to write one. Identified types can be
recursive, can be opaque, and are never uniqued.

:Syntax:

::

      %T1 = type { <type list> }     ; Identified normal struct type
      %T2 = type <{ <type list> }>   ; Identified packed struct type

:Examples:

.. list-table::

   * - ``{ i32, i32, i32 }``
     - A triple of three ``i32`` values.

   * - ``{ float, i32 (i32) * }``
     - A pair, where the first element is a ``float`` and the second element
       is a :ref:`pointer <t_pointer>` to a :ref:`function <t_function>`
       that takes an ``i32``, returning an ``i32``.

   * - ``<{ i8, i32 }>``
     - A packed struct known to be 5 bytes in size.

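The two access styles described above, plus a recursive identified type, can
be sketched as follows. The type names ``%pair`` and ``%node`` and both
function names are hypothetical.

.. code-block:: llvm

   %pair = type { i32, float }
   ; Identified types may refer to themselves.
   %node = type { i32, %node* }

   ; Struct in memory: compute a field address, then load through it.
   define float @second_in_memory(%pair* %p) {
   entry:
     %addr = getelementptr %pair, %pair* %p, i32 0, i32 1
     %v = load float, float* %addr
     ret float %v
   }

   ; Struct in a register: replace field 0 without touching memory.
   define %pair @set_first(%pair %agg, i32 %x) {
   entry:
     %new = insertvalue %pair %agg, i32 %x, 0
     ret %pair %new
   }
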
.. _t_opaque:

Opaque Structure Types
""""""""""""""""""""""

:Overview:

Opaque structure types are used to represent named structure types that
do not have a body specified. This corresponds (for example) to the C
notion of a forward declared structure.

:Syntax:

::

      %X = type opaque
      %52 = type opaque

:Examples:

.. list-table::

   * - ``opaque``
     - An opaque type.

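Since an opaque type has no body, code typically handles it only through
pointers, much like a forward declared C structure. A sketch with
hypothetical names ``%handle``, ``@handle_open`` and ``@handle_close``:

.. code-block:: llvm

   %handle = type opaque

   declare %handle* @handle_open()
   declare void @handle_close(%handle*)

   define void @use_handle() {
   entry:
     ; The pointer can be passed around without knowing the layout.
     %h = call %handle* @handle_open()
     call void @handle_close(%handle* %h)
     ret void
   }
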
.. _constants:

Constants
=========

LLVM has several different basic types of constants. This section
describes them all and their syntax.

Simple Constants
----------------

**Boolean constants**
    The two strings '``true``' and '``false``' are both valid constants
    of the ``i1`` type.
**Integer constants**
    Standard integers (such as '4') are constants of the
    :ref:`integer <t_integer>` type. Negative numbers may be used with
    integer types.
**Floating point constants**
    Floating point constants use standard decimal notation (e.g.
    123.421), exponential notation (e.g. 1.23421e+2), or a more precise
    hexadecimal notation (see below). The assembler requires the exact
    decimal value of a floating-point constant. For example, the
    assembler accepts 1.25 but rejects 1.3 because 1.3 is a repeating
    decimal in binary. Floating point constants must have a
    :ref:`floating point <t_floating>` type.
**Null pointer constants**
    The identifier '``null``' is recognized as a null pointer constant
    and must be of :ref:`pointer type <t_pointer>`.
**Token constants**
    The identifier '``none``' is recognized as an empty token constant
    and must be of :ref:`token type <t_token>`.

The one non-intuitive notation for constants is the hexadecimal form of
floating point constants. For example, the form
'``double 0x432ff973cafa8000``' is equivalent to (but harder to read
than) '``double 4.5e+15``'. The only time hexadecimal floating point
constants are required (and the only time that they are generated by the
disassembler) is when a floating point constant must be emitted but it
cannot be represented as a decimal floating point number in a reasonable
number of digits. For example, NaNs, infinities, and other special
values are represented in their IEEE hexadecimal format so that assembly
and disassembly do not cause any bits to change in the constants.

When using the hexadecimal form, constants of types half, float, and
double are represented using the 16-digit form shown above (which
matches the IEEE 754 representation for double); half and float values
must, however, be exactly representable as IEEE 754 half and single
precision, respectively. Hexadecimal format is always used for long
double, and there are three forms of long double. The 80-bit format used
by x86 is represented as ``0xK`` followed by 20 hexadecimal digits. The
128-bit format used by PowerPC (two adjacent doubles) is represented by
``0xM`` followed by 32 hexadecimal digits. The IEEE 128-bit format is
represented by ``0xL`` followed by 32 hexadecimal digits. Long doubles
will only work if they match the long double format on your target.
The IEEE 16-bit format (half precision) is represented by ``0xH``
followed by 4 hexadecimal digits. All hexadecimal formats are big-endian
(sign bit at the left).

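As a small illustrative sketch (the global names are invented for this
example and are not part of LLVM), the following fragment uses each kind
of simple constant described above, including two of the hexadecimal
floating point forms:

.. code-block:: llvm

    @flag  = global i1 true
    @count = global i32 -4
    @ratio = global float 1.25                  ; exactly representable in binary
    @big   = global double 0x432ff973cafa8000   ; same value as 4.5e+15
    @one_h = global half 0xH3C00                ; 1.0 in IEEE half precision
    @ptr   = global i32* null
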
There are no constants of type x86_mmx.

.. _complexconstants:

Complex Constants
-----------------

Complex constants are a (potentially recursive) combination of simple
constants and smaller complex constants.

**Structure constants**
    Structure constants are represented with notation similar to
    structure type definitions (a comma separated list of elements,
    surrounded by braces (``{}``)). For example:
    "``{ i32 4, float 17.0, i32* @G }``", where "``@G``" is declared as
    "``@G = external global i32``". Structure constants must have
    :ref:`structure type <t_struct>`, and the number and types of elements
    must match those specified by the type.
**Array constants**
    Array constants are represented with notation similar to array type
    definitions (a comma separated list of elements, surrounded by
    square brackets (``[]``)). For example:
    "``[ i32 42, i32 11, i32 74 ]``". Array constants must have
    :ref:`array type <t_array>`, and the number and types of elements must
    match those specified by the type. As a special case, character array
    constants may also be represented as a double-quoted string using the ``c``
    prefix. For example: "``c"Hello World\0A\00"``".
**Vector constants**
    Vector constants are represented with notation similar to vector
    type definitions (a comma separated list of elements, surrounded by
    less-than/greater-than's (``<>``)). For example:
    "``< i32 42, i32 11, i32 74, i32 100 >``". Vector constants
    must have :ref:`vector type <t_vector>`, and the number and types of
    elements must match those specified by the type.
**Zero initialization**
    The string '``zeroinitializer``' can be used to zero initialize a
    value of *any* type, including scalar and
    :ref:`aggregate <t_aggregate>` types. This is often used to avoid
    having to print large zero initializers (e.g. for large arrays) and
    is always exactly equivalent to using explicit zero initializers.
**Metadata node**
    A metadata node is a constant tuple without types. For example:
    "``!{!0, !{!2, !0}, !"test"}``". Metadata can reference constant values,
    for example: "``!{!0, i32 0, i8* @global, i64 (i64)* @function, !"str"}``".
    Unlike other typed constants that are meant to be interpreted as part of
    the instruction stream, metadata is a place to attach additional
    information such as debug info.

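The aggregate forms above can be combined in global initializers. The
following is a minimal sketch; apart from ``@G``, which is taken from the
structure example, the global names are hypothetical:

.. code-block:: llvm

    @G   = external global i32

    @s   = global { i32, float, i32* } { i32 4, float 17.0, i32* @G }
    @a   = global [3 x i32] [ i32 42, i32 11, i32 74 ]
    @str = constant [13 x i8] c"Hello World\0A\00"   ; 11 characters + newline + NUL
    @v   = global <4 x i32> < i32 42, i32 11, i32 74, i32 100 >
    @z   = global [1024 x i32] zeroinitializer        ; all 1024 elements are zero
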
Global Variable and Function Addresses
--------------------------------------

The addresses of :ref:`global variables <globalvars>` and
:ref:`functions <functionstructure>` are always implicitly valid
(link-time) constants. These constants are explicitly referenced when
the :ref:`identifier for the global <identifiers>` is used and always have
:ref:`pointer <t_pointer>` type. For example, the following is a legal LLVM
file:

.. code-block:: llvm

    @X = global i32 17
    @Y = global i32 42
    @Z = global [2 x i32*] [ i32* @X, i32* @Y ]

.. _undefvalues:

Undefined Values
----------------

The string '``undef``' can be used anywhere a constant is expected, and
indicates that the user of the value may receive an unspecified
bit-pattern. Undefined values may be of any type other than '``label``'
or '``void``'.

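For instance, ``undef`` commonly appears as a placeholder initializer or as
the starting point when building up an aggregate one field at a time. The
sketch below assumes a hypothetical global ``@scratch`` and function
``@make_pair``:

.. code-block:: llvm

    @scratch = global i32 undef          ; initial contents are unspecified

    define { i32, float } @make_pair(i32 %a, float %b) {
      ; Start from an undefined aggregate and fill in both fields.
      %p0 = insertvalue { i32, float } undef, i32 %a, 0
      %p1 = insertvalue { i32, float } %p0, float %b, 1
      ret { i32, float } %p1
    }
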
Undefined values are useful because they indicate to the compiler that
the program is well defined no matter what value is used. This gives the
compiler more freedom to optimize. Here are some examples of
(potentially surprising) transformations that are valid (in pseudo IR):

.. code-block:: llvm

      %A = add %X, undef
      %B = sub %X, undef
      %C = xor %X, undef
    Safe:
      %A = undef
      %B = undef
      %C = undef

This is safe because all of the output bits are affected by the undef
bits. Any output bit can have a zero or one depending on the input bits.

.. code-block:: llvm

      %A = or %X, undef
      %B = and %X, undef
    Safe:
      %A = -1
      %B = 0
    Safe:
      %A = %X  ;; By choosing undef as 0
      %B = %X  ;; By choosing undef as -1
    Unsafe:
      %A = undef
      %B = undef

These logical operations have bits that are not always affected by the
input. For example, if ``%X`` has a zero bit, then the output of the
'``and``' operation will always be a zero for that bit, no matter what
the corresponding bit from the '``undef``' is. As such, it is unsafe to
optimize or assume that the result of the '``and``' is '``undef``'.
However, it is safe to assume that all bits of the '``undef``' could be
0, and optimize the '``and``' to 0. Likewise, it is safe to assume that
all the bits of the '``undef``' operand to the '``or``' could be set,
allowing the '``or``' to be folded to -1.

.. code-block:: llvm

      %A = select undef, %X, %Y
      %B = select undef, 42, %Y
      %C = select %X, %Y, undef
    Safe:
      %A = %X     (or %Y)
      %B = 42     (or %Y)
      %C = %Y
    Unsafe:
      %A = undef
      %B = undef
      %C = undef

This set of examples shows that undefined '``select``' (and conditional
branch) conditions can go *either way*, but they have to come from one
of the two operands. In the ``%A`` example, if ``%X`` and ``%Y`` were
both known to have a clear low bit, then ``%A`` would have to have a
cleared low bit. However, in the ``%C`` example, the optimizer is
allowed to assume that the '``undef``' operand could be the same as
``%Y``, allowing the whole '``select``' to be eliminated.

.. code-block:: text

      %A = xor undef, undef

      %B = undef
      %C = xor %B, %B

      %D = undef
      %E = icmp slt %D, 4
      %F = icmp sge %D, 4

    Safe:
      %A = undef
      %B = undef
      %C = undef
      %D = undef
      %E = undef
      %F = undef

This example points out that two '``undef``' operands are not
necessarily the same. This can be surprising to people who assume that
"``X^X``" is always zero, even if ``X`` is undefined (it also matches
C semantics). This isn't true for a number of reasons, but the
short answer is that an '``undef``' "variable" can arbitrarily change
its value over its "live range". This is true because the variable
doesn't actually *have a live range*. Instead, the value is logically
read from arbitrary registers that happen to be around when needed, so
the value is not necessarily consistent over time. In fact, ``%A`` and
``%C`` need to have the same semantics or the core LLVM "replace all
uses with" concept would not hold.

.. code-block:: llvm

      %A = fdiv undef, %X
      %B = fdiv %X, undef
    Safe:
      %A = undef
    b: unreachable

These examples show the crucial difference between an *undefined value*
and *undefined behavior*. An undefined value (like '``undef``') is
allowed to have an arbitrary bit-pattern. This means that the ``%A``
operation can be constant folded to '``undef``', because the '``undef``'
could be an SNaN, and ``fdiv`` is not (currently) defined on SNaNs.
However, in the second example, we can make a more aggressive
assumption: because the ``undef`` is allowed to be an arbitrary value,
we are allowed to assume that it could be zero. Since a divide by zero
has *undefined behavior*, we are allowed to assume that the operation
does not execute at all. This allows us to delete the divide and all
code after it. Because the undefined operation "can't happen", the
optimizer can assume that it occurs in dead code.

.. code-block:: text

    a:  store undef -> %X
    b:  store %X -> undef
    Safe:
    a: <deleted>
    b: unreachable

These examples reiterate the ``fdiv`` example: a store *of* an undefined
value can be assumed to not have any effect; we can assume that the
value is overwritten with bits that happen to match what was already
there. However, a store *to* an undefined location could clobber
arbitrary memory; therefore, it has undefined behavior.

.. _poisonvalues:

Poison Values
-------------

Poison values are similar to :ref:`undef values <undefvalues>`, however
they also represent the fact that an instruction or constant expression
that cannot evoke side effects has nevertheless detected a condition
that results in undefined behavior.

There is currently no way of representing a poison value in the IR; they
only exist when produced by operations such as :ref:`add <i_add>` with
the ``nsw`` flag.

Poison value behavior is defined in terms of value *dependence*:

-  Values other than :ref:`phi <i_phi>` nodes depend on their operands.
-  :ref:`Phi <i_phi>` nodes depend on the operand corresponding to
   their dynamic predecessor basic block.
-  Function arguments depend on the corresponding actual argument values
   in the dynamic callers of their functions.
-  :ref:`Call <i_call>` instructions depend on the :ref:`ret <i_ret>`
   instructions that dynamically transfer control back to them.
-  :ref:`Invoke <i_invoke>` instructions depend on the
   :ref:`ret <i_ret>`, :ref:`resume <i_resume>`, or exception-throwing
   call instructions that dynamically transfer control back to them.
-  Non-volatile loads and stores depend on the most recent stores to all
   of the referenced memory addresses, following the order in the IR
   (including loads and stores implied by intrinsics such as
   :ref:`@llvm.memcpy <int_memcpy>`).
-  An instruction with externally visible side effects depends on the
   most recent preceding instruction with externally visible side
   effects, following the order in the IR. (This includes :ref:`volatile
   operations <volatile>`.)
-  An instruction *control-depends* on a :ref:`terminator
   instruction <terminators>` if the terminator instruction has
   multiple successors and the instruction is always executed when
   control transfers to one of the successors, and may not be executed
   when control is transferred to another.
-  Additionally, an instruction also *control-depends* on a terminator
   instruction if the set of instructions it otherwise depends on would
   be different if the terminator had transferred control to a different
   successor.
-  Dependence is transitive.

Poison values have the same behavior as :ref:`undef values <undefvalues>`,
with the additional effect that any instruction that has a *dependence*
on a poison value has undefined behavior.

Here are some examples:

.. code-block:: llvm

    entry:
      %poison = sub nuw i32 0, 1           ; Results in a poison value.
      %still_poison = and i32 %poison, 0   ; 0, but also poison.
      %poison_yet_again = getelementptr i32, i32* @h, i32 %still_poison
      store i32 0, i32* %poison_yet_again  ; memory at @h[0] is poisoned

      store i32 %poison, i32* @g           ; Poison value stored to memory.
      %poison2 = load i32, i32* @g         ; Poison value loaded back from memory.

      store volatile i32 %poison, i32* @g  ; External observation; undefined behavior.

      %narrowaddr = bitcast i32* @g to i16*
      %wideaddr = bitcast i32* @g to i64*
      %poison3 = load i16, i16* %narrowaddr ; Returns a poison value.
      %poison4 = load i64, i64* %wideaddr  ; Returns a poison value.

      %cmp = icmp slt i32 %poison, 0       ; Returns a poison value.
      br i1 %cmp, label %true, label %end  ; Branch to either destination.

    true:
      store volatile i32 0, i32* @g        ; This is control-dependent on %cmp, so
                                           ; it has undefined behavior.
      br label %end

    end:
      %p = phi i32 [ 0, %entry ], [ 1, %true ]
                                           ; Both edges into this PHI are
                                           ; control-dependent on %cmp, so this
                                           ; always results in a poison value.

      store volatile i32 0, i32* @g        ; This would depend on the store in %true
                                           ; if %cmp is true, or the store in %entry
                                           ; otherwise, so this is undefined behavior.

      br i1 %cmp, label %second_true, label %second_end
                                           ; The same branch again, but this time the
                                           ; true block doesn't have side effects.

    second_true:
      ; No side effects!
      ret void

    second_end:
      store volatile i32 0, i32* @g        ; This time, the instruction always depends
                                           ; on the store in %end. Also, it is
                                           ; control-equivalent to %end, so this is
                                           ; well-defined (ignoring earlier undefined
                                           ; behavior in this example).

.. _blockaddress:

Addresses of Basic Blocks
-------------------------

``blockaddress(@function, %block)``

The '``blockaddress``' constant computes the address of the specified
basic block in the specified function, and always has an ``i8*`` type.
Taking the address of the entry block is illegal.

This value only has defined behavior when used as an operand to the
':ref:`indirectbr <i_indirectbr>`' instruction, or for comparisons
against null. Pointer equality tests between label addresses result in
undefined behavior, though, again, comparison against null is ok, and
no label is equal to the null pointer. This may be passed around as an
opaque pointer sized value as long as the bits are not inspected. This
allows ``ptrtoint`` and arithmetic to be performed on these values so
long as the original value is reconstituted before the ``indirectbr``
instruction.

Finally, some targets may provide defined semantics when using the value
as the operand to an inline assembly, but that is target specific.

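A minimal sketch of the intended use follows. The names ``@target`` and
``@dispatch`` are hypothetical; the block address is stored in a global and
later consumed by an ``indirectbr``, which must list every block the
address may refer to:

.. code-block:: llvm

    ; The address of a non-entry block, usable as an i8* constant.
    @target = global i8* blockaddress(@dispatch, %case1)

    define i32 @dispatch(i8* %addr) {
    entry:
      ; %addr must be the address of one of the listed blocks.
      indirectbr i8* %addr, [ label %case0, label %case1 ]

    case0:
      ret i32 0

    case1:
      ret i32 1
    }
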
.. _constantexprs:

Constant Expressions
--------------------

Constant expressions are used to allow expressions involving other
constants to be used as constants. Constant expressions may be of any
:ref:`first class <t_firstclass>` type and may involve any LLVM operation
that does not have side effects (e.g. load and call are not supported).
The following is the syntax for constant expressions:

``trunc (CST to TYPE)``
    Perform the :ref:`trunc operation <i_trunc>` on constants.
``zext (CST to TYPE)``
    Perform the :ref:`zext operation <i_zext>` on constants.
``sext (CST to TYPE)``
    Perform the :ref:`sext operation <i_sext>` on constants.
``fptrunc (CST to TYPE)``
    Truncate a floating point constant to another floating point type.
    The size of CST must be larger than the size of TYPE. Both types
    must be floating point.
``fpext (CST to TYPE)``
    Floating point extend a constant to another type. The size of CST
    must be smaller than or equal to the size of TYPE. Both types must be
    floating point.
``fptoui (CST to TYPE)``
    Convert a floating point constant to the corresponding unsigned
    integer constant. TYPE must be a scalar or vector integer type. CST
    must be of scalar or vector floating point type. Both CST and TYPE
    must be scalars, or vectors of the same number of elements. If the
    value won't fit in the integer type, the results are undefined.
``fptosi (CST to TYPE)``
    Convert a floating point constant to the corresponding signed
    integer constant. TYPE must be a scalar or vector integer type. CST
    must be of scalar or vector floating point type. Both CST and TYPE
    must be scalars, or vectors of the same number of elements. If the
    value won't fit in the integer type, the results are undefined.
``uitofp (CST to TYPE)``
    Convert an unsigned integer constant to the corresponding floating
    point constant. TYPE must be a scalar or vector floating point type.
    CST must be of scalar or vector integer type. Both CST and TYPE must
    be scalars, or vectors of the same number of elements. If the value
    won't fit in the floating point type, the results are undefined.
``sitofp (CST to TYPE)``
    Convert a signed integer constant to the corresponding floating
    point constant. TYPE must be a scalar or vector floating point type.
    CST must be of scalar or vector integer type. Both CST and TYPE must
    be scalars, or vectors of the same number of elements. If the value
    won't fit in the floating point type, the results are undefined.
``ptrtoint (CST to TYPE)``
    Perform the :ref:`ptrtoint operation <i_ptrtoint>` on constants.
``inttoptr (CST to TYPE)``
    Perform the :ref:`inttoptr operation <i_inttoptr>` on constants.
    This one is *really* dangerous!
``bitcast (CST to TYPE)``
    Convert a constant, CST, to another TYPE.
    The constraints of the operands are the same as those for the
    :ref:`bitcast instruction <i_bitcast>`.
``addrspacecast (CST to TYPE)``
    Convert a constant pointer or constant vector of pointers, CST, to another
    TYPE in a different address space. The constraints of the operands are the
    same as those for the :ref:`addrspacecast instruction <i_addrspacecast>`.
``getelementptr (TY, CSTPTR, IDX0, IDX1, ...)``, ``getelementptr inbounds (TY, CSTPTR, IDX0, IDX1, ...)``
    Perform the :ref:`getelementptr operation <i_getelementptr>` on
    constants. As with the :ref:`getelementptr <i_getelementptr>`
    instruction, the index list may have one or more indexes, which are
    required to make sense for the type of "pointer to TY".
``select (COND, VAL1, VAL2)``
    Perform the :ref:`select operation <i_select>` on constants.
``icmp COND (VAL1, VAL2)``
    Perform the :ref:`icmp operation <i_icmp>` on constants.
``fcmp COND (VAL1, VAL2)``
    Perform the :ref:`fcmp operation <i_fcmp>` on constants.
``extractelement (VAL, IDX)``
    Perform the :ref:`extractelement operation <i_extractelement>` on
    constants.
``insertelement (VAL, ELT, IDX)``
    Perform the :ref:`insertelement operation <i_insertelement>` on
    constants.
``shufflevector (VEC1, VEC2, IDXMASK)``
    Perform the :ref:`shufflevector operation <i_shufflevector>` on
    constants.
``extractvalue (VAL, IDX0, IDX1, ...)``
    Perform the :ref:`extractvalue operation <i_extractvalue>` on
    constants. The index list is interpreted in a similar manner as
    indices in a ':ref:`getelementptr <i_getelementptr>`' operation. At
    least one index value must be specified.
``insertvalue (VAL, ELT, IDX0, IDX1, ...)``
    Perform the :ref:`insertvalue operation <i_insertvalue>` on constants.
    The index list is interpreted in a similar manner as indices in a
    ':ref:`getelementptr <i_getelementptr>`' operation. At least one index
    value must be specified.
``OPCODE (LHS, RHS)``
    Perform the specified operation on the LHS and RHS constants. OPCODE
    may be any of the :ref:`binary <binaryops>` or :ref:`bitwise
    binary <bitwiseops>` operations. The constraints on operands are
    the same as those for the corresponding instruction (e.g. no bitwise
    operations on floating point values are allowed).

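To make the syntax concrete, here is a short sketch of constant expressions
used as global initializers. All global names are hypothetical, and the
resulting values are only resolved once ``@arr`` has an address (at link
or load time):

.. code-block:: llvm

    @arr   = global [4 x i32] [ i32 1, i32 2, i32 3, i32 4 ]

    ; Address of the third element of @arr.
    @third = global i32* getelementptr inbounds ([4 x i32], [4 x i32]* @arr, i64 0, i64 2)

    ; The address of @arr reinterpreted as a byte pointer and as an integer.
    @bytes = global i8* bitcast ([4 x i32]* @arr to i8*)
    @addr  = global i64 ptrtoint ([4 x i32]* @arr to i64)

    ; A binary operation whose operand is itself a constant expression.
    @plus8 = global i64 add (i64 ptrtoint ([4 x i32]* @arr to i64), i64 8)
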
Other Values
============

.. _inlineasmexprs:

Inline Assembler Expressions
----------------------------

LLVM supports inline assembler expressions (as opposed to :ref:`Module-Level
Inline Assembly <moduleasm>`) through the use of a special value. This value
represents the inline assembler as a template string (containing the
instructions to emit), a list of operand constraints (stored as a string), a
flag that indicates whether or not the inline asm expression has side effects,
and a flag indicating whether the function containing the asm needs to align its
stack conservatively.

The template string supports argument substitution of the operands using "``$``"
followed by a number, to indicate substitution of the given register/memory
location, as specified by the constraint string. "``${NUM:MODIFIER}``" may also
be used, where ``MODIFIER`` is a target-specific annotation for how to print the
operand (See :ref:`inline-asm-modifiers`).

A literal "``$``" may be included by using "``$$``" in the template. To include
other special characters into the output, the usual "``\XX``" escapes may be
used, just as in other strings. Note that after template substitution, the
resulting assembly string is parsed by LLVM's integrated assembler unless it is
disabled -- even when emitting a ``.s`` file -- and thus must contain assembly
syntax known to LLVM.

LLVM also supports a few more substitutions useful for writing inline assembly:

- ``${:uid}``: Expands to a decimal integer unique to this inline assembly blob.
  This substitution is useful when declaring a local label. Many standard
  compiler optimizations, such as inlining, may duplicate an inline asm blob.
  Adding a blob-unique identifier ensures that the two labels will not conflict
  during assembly. This is used to implement `GCC's %= special format
  string <https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html>`_.
- ``${:comment}``: Expands to the comment character of the current target's
  assembly dialect. This is usually ``#``, but many targets use other strings,
  such as ``;``, ``//``, or ``!``.
- ``${:private}``: Expands to the assembler private label prefix. Labels with
  this prefix will not appear in the symbol table of the assembled object.
  Typically the prefix is ``L``, but targets may use other strings. ``.L`` is
  relatively popular.

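As a hedged illustration, the sketch below combines ``${:private}`` and
``${:uid}`` to declare a local label that stays unique even if the blob is
duplicated. It assumes an x86 target with AT&T syntax; the function name
``@skip_nop`` and the label stem ``skip`` are invented for this example:

.. code-block:: llvm

    define void @skip_nop() {
      ; Both occurrences of ${:uid} expand to the same number within this
      ; blob, so the jump and the label definition always match.
      call void asm sideeffect "jmp ${:private}skip${:uid}\0A\09nop\0A${:private}skip${:uid}:", ""()
      ret void
    }
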
LLVM's support for inline asm is modeled closely on the requirements of Clang's
GCC-compatible inline-asm support. Thus, the feature-set and the constraint and
modifier codes listed here are similar or identical to those in GCC's inline asm
support. However, to be clear, the syntax of the template and constraint strings
described here is *not* the same as the syntax accepted by GCC and Clang, and,
while most constraint letters are passed through as-is by Clang, some get
translated to other codes when converting from the C source to the LLVM
assembly.

An example inline assembler expression is:

.. code-block:: llvm

    i32 (i32) asm "bswap $0", "=r,r"

Inline assembler expressions may **only** be used as the callee operand
of a :ref:`call <i_call>` or an :ref:`invoke <i_invoke>` instruction.
Thus, typically we have:

.. code-block:: llvm

    %X = call i32 asm "bswap $0", "=r,r"(i32 %Y)

Inline asms with side effects not visible in the constraint list must be
marked as having side effects. This is done through the use of the
'``sideeffect``' keyword, like so:

.. code-block:: llvm

    call void asm sideeffect "eieio", ""()

In some cases inline asms will contain code that will not work unless
the stack is aligned in some way, such as calls or SSE instructions on
x86, yet will not contain code that does that alignment within the asm.
The compiler should make conservative assumptions about what the asm
might contain and should generate its usual stack alignment code in the
prologue if the '``alignstack``' keyword is present:

.. code-block:: llvm

    call void asm alignstack "eieio", ""()

Inline asms also support using non-standard assembly dialects. The
assumed dialect is ATT. When the '``inteldialect``' keyword is present,
the inline asm is using the Intel dialect. Currently, ATT and Intel are
the only supported dialects. An example is:

.. code-block:: llvm

    call void asm inteldialect "eieio", ""()

If multiple keywords appear the '``sideeffect``' keyword must come
first, the '``alignstack``' keyword second and the '``inteldialect``'
keyword last.

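For example, a call that uses all three keywords would be written in the
following order. This is only a sketch: the function name is hypothetical
and ``nop`` is assumed to be a valid instruction on the target (as it is
on x86):

.. code-block:: llvm

    define void @all_keywords() {
      ; sideeffect, then alignstack, then inteldialect.
      call void asm sideeffect alignstack inteldialect "nop", ""()
      ret void
    }
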
Inline Asm Constraint String
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The constraint list is a comma-separated string, each element containing one or
more constraint codes.

For each element in the constraint list an appropriate register or memory
operand will be chosen, and it will be made available to assembly template
string expansion as ``$0`` for the first constraint in the list, ``$1`` for the
second, etc.

There are three different types of constraints, which are distinguished by a
prefix symbol in front of the constraint code: Output, Input, and Clobber. The
constraints must always be given in that order: outputs first, then inputs, then
clobbers. They cannot be intermingled.

There are also three different categories of constraint codes:

- Register constraint. This is either a register class, or a fixed physical
  register. This kind of constraint will allocate a register, and if necessary,
  bitcast the argument or result to the appropriate type.
- Memory constraint. This kind of constraint is for use with an instruction
  taking a memory operand. Different constraints allow for different addressing
  modes used by the target.
- Immediate value constraint. This kind of constraint is for an integer or other
  immediate value which can be rendered directly into an instruction. The
  various target-specific constraints allow the selection of a value in the
  proper range for the instruction you wish to use it with.

Output constraints
""""""""""""""""""

Output constraints are specified by an "``=``" prefix (e.g. "``=r``"). This
indicates that the assembly will write to this operand, and the operand will
then be made available as a return value of the ``asm`` expression. Output
constraints do not consume an argument from the call instruction. (Except, see
below about indirect outputs).

Normally, it is expected that no output locations are written to by the assembly
expression until *all* of the inputs have been read. As such, LLVM may assign
the same register to an output and an input. If this is not safe (e.g. if the
assembly contains two instructions, where the first writes to one output, and
the second reads an input and writes to a second output), then the "``&``"
modifier must be used (e.g. "``=&r``") to specify that the output is an
"early-clobber" output. Marking an output as "early-clobber" ensures that LLVM
will not use the same register for any inputs (other than an input tied to this
output).

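The following sketch shows a case where "``=&r``" is needed. It assumes an
x86 target with AT&T syntax, and the function name ``@sum_two`` is
hypothetical. The first instruction writes the output ``$0`` before the
second instruction reads the input ``$2``, so the output must not share a
register with that input:

.. code-block:: llvm

    define i32 @sum_two(i32 %a, i32 %b) {
      ; $0 is the early-clobber output, $1 is %a, $2 is %b.
      %r = call i32 asm "movl $1, $0\0A\09addl $2, $0", "=&r,r,r"(i32 %a, i32 %b)
      ret i32 %r
    }
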
3444Input constraints
3445"""""""""""""""""
3446
3447Input constraints do not have a prefix -- just the constraint codes. Each input
3448constraint will consume one argument from the call instruction. It is not
3449permitted for the asm to write to any input register or memory location (unless
3450that input is tied to an output). Note also that multiple inputs may all be
3451assigned to the same register, if LLVM can determine that they necessarily all
3452contain the same value.
3453
3454Instead of providing a Constraint Code, input constraints may also "tie"
3455themselves to an output constraint, by providing an integer as the constraint
3456string. Tied inputs still consume an argument from the call instruction, and
3457take up a position in the asm template numbering as is usual -- they will simply
3458be constrained to always use the same register as the output they've been tied
3459to. For example, a constraint string of "``=r,0``" says to assign a register for
3460output, and use that register as an input as well (it being the 0'th
3461constraint).
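
A tied input can be sketched as follows, assuming an x86 target (the function
name ``@byte_swap`` is hypothetical). The constraint string "``=r,0``" allocates
one register that carries ``%x`` into the asm and the result out of it:

.. code-block:: llvm

   ; Hypothetical example; x86 assumed.
   define i32 @byte_swap(i32 %x) {
     ; Operand 0 is the output. Operand 1 is the input tied to it, so it
     ; still consumes the argument %x but shares operand 0's register.
     %r = call i32 asm "bswapl $0", "=r,0"(i32 %x)
     ret i32 %r
   }

The template only needs to mention ``$0``. The tied input still occupies
position ``$1`` in the numbering, but it names the same register.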
3462
3463It is permitted to tie an input to an "early-clobber" output. In that case, no
3464*other* input may share the same register as the input tied to the early-clobber
3465(even when the other input has the same value).
3466
3467You may only tie an input to an output which has a register constraint, not a
3468memory constraint. Only a single input may be tied to an output.
3469
3470There is also an "interesting" feature which deserves a bit of explanation: if a
3471register class constraint allocates a register which is too small for the value
3472type operand provided as input, the input value will be split into multiple
3473registers, and all of them passed to the inline asm.
3474
3475However, this feature is often not as useful as you might think.
3476
Firstly, the registers are *not* guaranteed to be consecutive. So, on those
architectures that have instructions which operate on multiple consecutive
registers, this is not an appropriate way to support them. (E.g. the 32-bit
SparcV8 has a 64-bit load instruction which takes a single 32-bit register. The
hardware then loads into both the named register and the next register. This
feature of inline asm would not be useful to support that.)
3483
3484A few of the targets provide a template string modifier allowing explicit access
3485to the second register of a two-register operand (e.g. MIPS ``L``, ``M``, and
3486``D``). On such an architecture, you can actually access the second allocated
3487register (yet, still, not any subsequent ones). But, in that case, you're still
3488probably better off simply splitting the value into two separate operands, for
clarity. (E.g. see the description of the ``A`` constraint on X86, which,
despite existing only for use with this feature, is not really a good idea to
use.)
3492
3493Indirect inputs and outputs
3494"""""""""""""""""""""""""""
3495
3496Indirect output or input constraints can be specified by the "``*``" modifier
3497(which goes after the "``=``" in case of an output). This indicates that the asm
3498will write to or read from the contents of an *address* provided as an input
3499argument. (Note that in this way, indirect outputs act more like an *input* than
3500an output: just like an input, they consume an argument of the call expression,
3501rather than producing a return value. An indirect output constraint is an
3502"output" only in that the asm is expected to write to the contents of the input
3503memory location, instead of just read from it).
3504
This is most typically used for a memory constraint, e.g. "``=*m``", to pass the
3506address of a variable as a value.
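
A minimal sketch of an indirect memory output, assuming an x86 target and the
typed-pointer syntax used elsewhere in this document (the function name
``@store_42`` is hypothetical). Newer LLVM releases spell the operand as
``ptr elementtype(i32) %p`` instead.

.. code-block:: llvm

   ; Hypothetical example; x86 assumed.
   define void @store_42(i32* %p) {
     ; "=*m" is an output in name only: it consumes the argument %p and
     ; produces no return value. The asm writes through the address.
     call void asm sideeffect "movl $$42, $0", "=*m"(i32* %p)
     ret void
   }

Because the call returns ``void``, the only visible effect is the store to the
memory that ``%p`` points to.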
3507
3508It is also possible to use an indirect *register* constraint, but only on output
3509(e.g. "``=*r``"). This will cause LLVM to allocate a register for an output
3510value normally, and then, separately emit a store to the address provided as
3511input, after the provided inline asm. (It's not clear what value this
3512functionality provides, compared to writing the store explicitly after the asm
3513statement, and it can only produce worse code, since it bypasses many
optimization passes. Its use is not recommended.)
3515
3516
3517Clobber constraints
3518"""""""""""""""""""
3519
3520A clobber constraint is indicated by a "``~``" prefix. A clobber does not
3521consume an input operand, nor generate an output. Clobbers cannot use any of the
3522general constraint code letters -- they may use only explicit register
3523constraints, e.g. "``~{eax}``". The one exception is that a clobber string of
3524"``~{memory}``" indicates that the assembly writes to arbitrary undeclared
3525memory locations -- not only the memory pointed to by a declared indirect
3526output.
3527
Note that clobbering named registers that are also present in output
3529constraints is not legal.
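
The two kinds of clobber can be sketched like this, assuming an x86 target (the
function name ``@barrier_and_scratch`` is hypothetical; "``~{flags}``" is the
spelling clang emits for the x86 flags register):

.. code-block:: llvm

   ; Hypothetical example; x86 assumed.
   define void @barrier_and_scratch() {
     ; "~{memory}": the asm may write arbitrary, undeclared memory.
     call void asm sideeffect "mfence", "~{memory}"()
     ; "~{eax}" and "~{flags}": the asm overwrites EAX and the flags.
     call void asm sideeffect "xorl %eax, %eax", "~{eax},~{flags}"()
     ret void
   }

Neither call takes arguments or returns a value, since clobbers do not consume
inputs or generate outputs.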
3530
3532Constraint Codes
3533""""""""""""""""
After a potential prefix comes the constraint code, or codes.
3535
3536A Constraint Code is either a single letter (e.g. "``r``"), a "``^``" character
3537followed by two letters (e.g. "``^wc``"), or "``{``" register-name "``}``"
3538(e.g. "``{eax}``").
3539
3540The one and two letter constraint codes are typically chosen to be the same as
3541GCC's constraint codes.
3542
A single constraint may include more than one constraint code, leaving
it up to LLVM to choose which one to use. This is included mainly for
compatibility with the translation of GCC inline asm coming from clang.
3546
3547There are two ways to specify alternatives, and either or both may be used in an
3548inline asm constraint list:
3549
35501) Append the codes to each other, making a constraint code set. E.g. "``im``"
3551 or "``{eax}m``". This means "choose any of the options in the set". The
3552 choice of constraint is made independently for each constraint in the
3553 constraint list.
3554
35552) Use "``|``" between constraint code sets, creating alternatives. Every
3556 constraint in the constraint list must have the same number of alternative
3557 sets. With this syntax, the same alternative in *all* of the items in the
3558 constraint list will be chosen together.
3559
Putting those together, you might have a two-operand constraint string like
``"rm|r,ri|rm"``. This indicates that if operand 0 is ``r`` or ``m``, then
operand 1 may be one of ``r`` or ``i``. If operand 0 is ``r``, then operand 1
may be one of ``r`` or ``m``. But operands 0 and 1 cannot both be of type ``m``.
3564
3565However, the use of either of the alternatives features is *NOT* recommended, as
3566LLVM is not able to make an intelligent choice about which one to use. (At the
3567point it currently needs to choose, not enough information is available to do so
3568in a smart way.) Thus, it simply tries to make a choice that's most likely to
compile, not one that will give optimal performance. (E.g., given "``rm``", it'll
3570always choose to use memory, not registers). And, if given multiple registers,
3571or multiple register classes, it will simply choose the first one. (In fact, it
3572doesn't currently even ensure explicitly specified physical registers are
3573unique, so specifying multiple physical registers as alternatives, like
3574``{r11}{r12},{r11}{r12}``, will assign r11 to both operands, not at all what was
3575intended.)
3576
3577Supported Constraint Code List
3578""""""""""""""""""""""""""""""
3579
3580The constraint codes are, in general, expected to behave the same way they do in
3581GCC. LLVM's support is often implemented on an 'as-needed' basis, to support C
3582inline asm code which was supported by GCC. A mismatch in behavior between LLVM
3583and GCC likely indicates a bug in LLVM.
3584
3585Some constraint codes are typically supported by all targets:
3586
3587- ``r``: A register in the target's general purpose register class.
3588- ``m``: A memory address operand. It is target-specific what addressing modes
3589 are supported, typical examples are register, or register + register offset,
3590 or register + immediate offset (of some target-specific size).
3591- ``i``: An integer constant (of target-specific width). Allows either a simple
3592 immediate, or a relocatable value.
3593- ``n``: An integer constant -- *not* including relocatable values.
3594- ``s``: An integer constant, but allowing *only* relocatable values.
3595- ``X``: Allows an operand of any kind, no constraint whatsoever. Typically
3596 useful to pass a label for an asm branch or call.
3597
3598 .. FIXME: but that surely isn't actually okay to jump out of an asm
3599 block without telling llvm about the control transfer???)
3600
3601- ``{register-name}``: Requires exactly the named physical register.
3602
3603Other constraints are target-specific:
3604
3605AArch64:
3606
3607- ``z``: An immediate integer 0. Outputs ``WZR`` or ``XZR``, as appropriate.
3608- ``I``: An immediate integer valid for an ``ADD`` or ``SUB`` instruction,
3609 i.e. 0 to 4095 with optional shift by 12.
3610- ``J``: An immediate integer that, when negated, is valid for an ``ADD`` or
3611 ``SUB`` instruction, i.e. -1 to -4095 with optional left shift by 12.
3612- ``K``: An immediate integer that is valid for the 'bitmask immediate 32' of a
3613 logical instruction like ``AND``, ``EOR``, or ``ORR`` with a 32-bit register.
3614- ``L``: An immediate integer that is valid for the 'bitmask immediate 64' of a
3615 logical instruction like ``AND``, ``EOR``, or ``ORR`` with a 64-bit register.
3616- ``M``: An immediate integer for use with the ``MOV`` assembly alias on a
3617 32-bit register. This is a superset of ``K``: in addition to the bitmask
3618 immediate, also allows immediate integers which can be loaded with a single
  ``MOVZ`` or ``MOVN`` instruction.
3620- ``N``: An immediate integer for use with the ``MOV`` assembly alias on a
3621 64-bit register. This is a superset of ``L``.
3622- ``Q``: Memory address operand must be in a single register (no
3623 offsets). (However, LLVM currently does this for the ``m`` constraint as
3624 well.)
3625- ``r``: A 32 or 64-bit integer register (W* or X*).
3626- ``w``: A 32, 64, or 128-bit floating-point/SIMD register.
3627- ``x``: A lower 128-bit floating-point/SIMD register (``V0`` to ``V15``).
3628
3629AMDGPU:
3630
3631- ``r``: A 32 or 64-bit integer register.
3632- ``[0-9]v``: The 32-bit VGPR register, number 0-9.
3633- ``[0-9]s``: The 32-bit SGPR register, number 0-9.
3634
3635
3636All ARM modes:
3637
3638- ``Q``, ``Um``, ``Un``, ``Uq``, ``Us``, ``Ut``, ``Uv``, ``Uy``: Memory address
3639 operand. Treated the same as operand ``m``, at the moment.
3640
3641ARM and ARM's Thumb2 mode:
3642
3643- ``j``: An immediate integer between 0 and 65535 (valid for ``MOVW``)
3644- ``I``: An immediate integer valid for a data-processing instruction.
3645- ``J``: An immediate integer between -4095 and 4095.
3646- ``K``: An immediate integer whose bitwise inverse is valid for a
3647 data-processing instruction. (Can be used with template modifier "``B``" to
3648 print the inverted value).
3649- ``L``: An immediate integer whose negation is valid for a data-processing
3650 instruction. (Can be used with template modifier "``n``" to print the negated
3651 value).
- ``M``: A power of two or an integer between 0 and 32.
3653- ``N``: Invalid immediate constraint.
3654- ``O``: Invalid immediate constraint.
3655- ``r``: A general-purpose 32-bit integer register (``r0-r15``).
3656- ``l``: In Thumb2 mode, low 32-bit GPR registers (``r0-r7``). In ARM mode, same
3657 as ``r``.
3658- ``h``: In Thumb2 mode, a high 32-bit GPR register (``r8-r15``). In ARM mode,
3659 invalid.
3660- ``w``: A 32, 64, or 128-bit floating-point/SIMD register: ``s0-s31``,
3661 ``d0-d31``, or ``q0-q15``.
3662- ``x``: A 32, 64, or 128-bit floating-point/SIMD register: ``s0-s15``,
3663 ``d0-d7``, or ``q0-q3``.
- ``t``: A low floating-point/SIMD register: ``s0-s31``, ``d0-d15``, or
  ``q0-q7``.

3667ARM's Thumb1 mode:
3668
3669- ``I``: An immediate integer between 0 and 255.
3670- ``J``: An immediate integer between -255 and -1.
3671- ``K``: An immediate integer between 0 and 255, with optional left-shift by
3672 some amount.
3673- ``L``: An immediate integer between -7 and 7.
3674- ``M``: An immediate integer which is a multiple of 4 between 0 and 1020.
3675- ``N``: An immediate integer between 0 and 31.
3676- ``O``: An immediate integer which is a multiple of 4 between -508 and 508.
3677- ``r``: A low 32-bit GPR register (``r0-r7``).
3678- ``l``: A low 32-bit GPR register (``r0-r7``).
- ``h``: A high GPR register (``r8-r15``).
3680- ``w``: A 32, 64, or 128-bit floating-point/SIMD register: ``s0-s31``,
3681 ``d0-d31``, or ``q0-q15``.
3682- ``x``: A 32, 64, or 128-bit floating-point/SIMD register: ``s0-s15``,
3683 ``d0-d7``, or ``q0-q3``.
- ``t``: A low floating-point/SIMD register: ``s0-s31``, ``d0-d15``, or
  ``q0-q7``.

3687
3688Hexagon:
3689
3690- ``o``, ``v``: A memory address operand, treated the same as constraint ``m``,
3691 at the moment.
3692- ``r``: A 32 or 64-bit register.
3693
3694MSP430:
3695
3696- ``r``: An 8 or 16-bit register.
3697
3698MIPS:
3699
3700- ``I``: An immediate signed 16-bit integer.
3701- ``J``: An immediate integer zero.
3702- ``K``: An immediate unsigned 16-bit integer.
3703- ``L``: An immediate 32-bit integer, where the lower 16 bits are 0.
3704- ``N``: An immediate integer between -65535 and -1.
3705- ``O``: An immediate signed 15-bit integer.
3706- ``P``: An immediate integer between 1 and 65535.
3707- ``m``: A memory address operand. In MIPS-SE mode, allows a base address
3708 register plus 16-bit immediate offset. In MIPS mode, just a base register.
3709- ``R``: A memory address operand. In MIPS-SE mode, allows a base address
3710 register plus a 9-bit signed offset. In MIPS mode, the same as constraint
3711 ``m``.
3712- ``ZC``: A memory address operand, suitable for use in a ``pref``, ``ll``, or
3713 ``sc`` instruction on the given subtarget (details vary).
3714- ``r``, ``d``, ``y``: A 32 or 64-bit GPR register.
3715- ``f``: A 32 or 64-bit FPU register (``F0-F31``), or a 128-bit MSA register
  (``W0-W31``). In the case of MSA registers, it is recommended to use the ``w``
  argument modifier for compatibility with GCC.
- ``c``: A 32-bit or 64-bit GPR register suitable for indirect jump (always
  ``25``).
3720- ``l``: The ``lo`` register, 32 or 64-bit.
3721- ``x``: Invalid.
3722
3723NVPTX:
3724
3725- ``b``: A 1-bit integer register.
3726- ``c`` or ``h``: A 16-bit integer register.
3727- ``r``: A 32-bit integer register.
3728- ``l`` or ``N``: A 64-bit integer register.
3729- ``f``: A 32-bit float register.
3730- ``d``: A 64-bit float register.
3731
3732
3733PowerPC:
3734
3735- ``I``: An immediate signed 16-bit integer.
3736- ``J``: An immediate unsigned 16-bit integer, shifted left 16 bits.
3737- ``K``: An immediate unsigned 16-bit integer.
3738- ``L``: An immediate signed 16-bit integer, shifted left 16 bits.
3739- ``M``: An immediate integer greater than 31.
3740- ``N``: An immediate integer that is an exact power of 2.
3741- ``O``: The immediate integer constant 0.
3742- ``P``: An immediate integer constant whose negation is a signed 16-bit
3743 constant.
3744- ``es``, ``o``, ``Q``, ``Z``, ``Zy``: A memory address operand, currently
3745 treated the same as ``m``.
3746- ``r``: A 32 or 64-bit integer register.
3747- ``b``: A 32 or 64-bit integer register, excluding ``R0`` (that is:
3748 ``R1-R31``).
3749- ``f``: A 32 or 64-bit float register (``F0-F31``), or when QPX is enabled, a
3750 128 or 256-bit QPX register (``Q0-Q31``; aliases the ``F`` registers).
3751- ``v``: For ``4 x f32`` or ``4 x f64`` types, when QPX is enabled, a
3752 128 or 256-bit QPX register (``Q0-Q31``), otherwise a 128-bit
3753 altivec vector register (``V0-V31``).
3754
3755 .. FIXME: is this a bug that v accepts QPX registers? I think this
3756 is supposed to only use the altivec vector registers?
3757
3758- ``y``: Condition register (``CR0-CR7``).
3759- ``wc``: An individual CR bit in a CR register.
3760- ``wa``, ``wd``, ``wf``: Any 128-bit VSX vector register, from the full VSX
3761 register set (overlapping both the floating-point and vector register files).
3762- ``ws``: A 32 or 64-bit floating point register, from the full VSX register
3763 set.
3764
3765Sparc:
3766
3767- ``I``: An immediate 13-bit signed integer.
3768- ``r``: A 32-bit integer register.
- ``f``: Any floating-point register on SparcV8, or a floating point
  register in the "low" half of the registers on SparcV9.
- ``e``: Any floating point register. (Same as ``f`` on SparcV8.)

3773SystemZ:
3774
3775- ``I``: An immediate unsigned 8-bit integer.
3776- ``J``: An immediate unsigned 12-bit integer.
3777- ``K``: An immediate signed 16-bit integer.
3778- ``L``: An immediate signed 20-bit integer.
3779- ``M``: An immediate integer 0x7fffffff.
- ``Q``: A memory address operand with a base address and a 12-bit immediate
3781 unsigned displacement.
3782- ``R``: A memory address operand with a base address, a 12-bit immediate
3783 unsigned displacement, and an index register.
3784- ``S``: A memory address operand with a base address and a 20-bit immediate
3785 signed displacement.
3786- ``T``: A memory address operand with a base address, a 20-bit immediate
3787 signed displacement, and an index register.
- ``r`` or ``d``: A 32, 64, or 128-bit integer register.
- ``a``: A 32, 64, or 128-bit integer address register (excludes R0, which in an
  address context evaluates as zero).
- ``h``: A 32-bit value in the high part of a 64-bit data register
  (LLVM-specific).
3793- ``f``: A 32, 64, or 128-bit floating point register.
3794
3795X86:
3796
3797- ``I``: An immediate integer between 0 and 31.
- ``J``: An immediate integer between 0 and 63.
3799- ``K``: An immediate signed 8-bit integer.
3800- ``L``: An immediate integer, 0xff or 0xffff or (in 64-bit mode only)
3801 0xffffffff.
3802- ``M``: An immediate integer between 0 and 3.
3803- ``N``: An immediate unsigned 8-bit integer.
3804- ``O``: An immediate integer between 0 and 127.
3805- ``e``: An immediate 32-bit signed integer.
3806- ``Z``: An immediate 32-bit unsigned integer.
3807- ``o``, ``v``: Treated the same as ``m``, at the moment.
3808- ``q``: An 8, 16, 32, or 64-bit register which can be accessed as an 8-bit
3809 ``l`` integer register. On X86-32, this is the ``a``, ``b``, ``c``, and ``d``
3810 registers, and on X86-64, it is all of the integer registers.
3811- ``Q``: An 8, 16, 32, or 64-bit register which can be accessed as an 8-bit
3812 ``h`` integer register. This is the ``a``, ``b``, ``c``, and ``d`` registers.
3813- ``r`` or ``l``: An 8, 16, 32, or 64-bit integer register.
3814- ``R``: An 8, 16, 32, or 64-bit "legacy" integer register -- one which has
3815 existed since i386, and can be accessed without the REX prefix.
3816- ``f``: A 32, 64, or 80-bit '387 FPU stack pseudo-register.
3817- ``y``: A 64-bit MMX register, if MMX is enabled.
3818- ``x``: If SSE is enabled: a 32 or 64-bit scalar operand, or 128-bit vector
3819 operand in a SSE register. If AVX is also enabled, can also be a 256-bit
3820 vector operand in an AVX register. If AVX-512 is also enabled, can also be a
  512-bit vector operand in an AVX512 register. Otherwise, an error.
3822- ``Y``: The same as ``x``, if *SSE2* is enabled, otherwise an error.
3823- ``A``: Special case: allocates EAX first, then EDX, for a single operand (in
3824 32-bit mode, a 64-bit integer operand will get split into two registers). It
3825 is not recommended to use this constraint, as in 64-bit mode, the 64-bit
3826 operand will get allocated only to RAX -- if two 32-bit operands are needed,
3827 you're better off splitting it yourself, before passing it to the asm
3828 statement.
3829
3830XCore:
3831
3832- ``r``: A 32-bit integer register.
3833
3834
3835.. _inline-asm-modifiers:
3836
3837Asm template argument modifiers
3838^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
3839
3840In the asm template string, modifiers can be used on the operand reference, like
3841"``${0:n}``".
3842
3843The modifiers are, in general, expected to behave the same way they do in
3844GCC. LLVM's support is often implemented on an 'as-needed' basis, to support C
3845inline asm code which was supported by GCC. A mismatch in behavior between LLVM
3846and GCC likely indicates a bug in LLVM.
3847
3848Target-independent:
3849
- ``c``: Print an immediate integer constant unadorned, without
  the target-specific immediate punctuation (e.g. no ``$`` prefix).
3852- ``n``: Negate and print immediate integer constant unadorned, without the
3853 target-specific immediate punctuation (e.g. no ``$`` prefix).
3854- ``l``: Print as an unadorned label, without the target-specific label
3855 punctuation (e.g. no ``$`` prefix).
3856
3857AArch64:
3858
3859- ``w``: Print a GPR register with a ``w*`` name instead of ``x*`` name. E.g.,
3860 instead of ``x30``, print ``w30``.
3861- ``x``: Print a GPR register with a ``x*`` name. (this is the default, anyhow).
3862- ``b``, ``h``, ``s``, ``d``, ``q``: Print a floating-point/SIMD register with a
3863 ``b*``, ``h*``, ``s*``, ``d*``, or ``q*`` name, rather than the default of
3864 ``v*``.
3865
3866AMDGPU:
3867
3868- ``r``: No effect.
3869
3870ARM:
3871
3872- ``a``: Print an operand as an address (with ``[`` and ``]`` surrounding a
3873 register).
3874- ``P``: No effect.
3875- ``q``: No effect.
3876- ``y``: Print a VFP single-precision register as an indexed double (e.g. print
3877 as ``d4[1]`` instead of ``s9``)
3878- ``B``: Bitwise invert and print an immediate integer constant without ``#``
3879 prefix.
- ``L``: Print the low 16 bits of an immediate integer constant.
3881- ``M``: Print as a register set suitable for ldm/stm. Also prints *all*
3882 register operands subsequent to the specified one (!), so use carefully.
3883- ``Q``: Print the low-order register of a register-pair, or the low-order
3884 register of a two-register operand.
3885- ``R``: Print the high-order register of a register-pair, or the high-order
3886 register of a two-register operand.
- ``H``: Print the second register of a register-pair. (On a big-endian system,
  ``H`` is equivalent to ``Q``, and on a little-endian system, ``H`` is
  equivalent to ``R``.)
3890
3891 .. FIXME: H doesn't currently support printing the second register
3892 of a two-register operand.
3893
3894- ``e``: Print the low doubleword register of a NEON quad register.
3895- ``f``: Print the high doubleword register of a NEON quad register.
3896- ``m``: Print the base register of a memory operand without the ``[`` and ``]``
3897 adornment.
3898
3899Hexagon:
3900
3901- ``L``: Print the second register of a two-register operand. Requires that it
3902 has been allocated consecutively to the first.
3903
3904 .. FIXME: why is it restricted to consecutive ones? And there's
3905 nothing that ensures that happens, is there?
3906
3907- ``I``: Print the letter 'i' if the operand is an integer constant, otherwise
3908 nothing. Used to print 'addi' vs 'add' instructions.
3909
3910MSP430:
3911
3912No additional modifiers.
3913
3914MIPS:
3915
- ``X``: Print an immediate integer as hexadecimal.
3917- ``x``: Print the low 16 bits of an immediate integer as hexadecimal.
3918- ``d``: Print an immediate integer as decimal.
3919- ``m``: Subtract one and print an immediate integer as decimal.
3920- ``z``: Print $0 if an immediate zero, otherwise print normally.
- ``L``: Print the low-order register of a two-register operand, or print the
  address of the low-order word of a double-word memory operand.
3923
3924 .. FIXME: L seems to be missing memory operand support.
3925
- ``M``: Print the high-order register of a two-register operand, or print the
  address of the high-order word of a double-word memory operand.
3928
3929 .. FIXME: M seems to be missing memory operand support.
3930
- ``D``: Print the second register of a two-register operand, or print the
  second word of a double-word memory operand. (On a big-endian system, ``D`` is
  equivalent to ``L``, and on a little-endian system, ``D`` is equivalent to
  ``M``.)
- ``w``: No effect. Provided for compatibility with GCC which requires this
  modifier in order to print MSA registers (``W0-W31``) with the ``f``
  constraint.

3939NVPTX:
3940
3941- ``r``: No effect.
3942
3943PowerPC:
3944
3945- ``L``: Print the second register of a two-register operand. Requires that it
3946 has been allocated consecutively to the first.
3947
3948 .. FIXME: why is it restricted to consecutive ones? And there's
3949 nothing that ensures that happens, is there?
3950
3951- ``I``: Print the letter 'i' if the operand is an integer constant, otherwise
3952 nothing. Used to print 'addi' vs 'add' instructions.
3953- ``y``: For a memory operand, prints formatter for a two-register X-form
3954 instruction. (Currently always prints ``r0,OPERAND``).
3955- ``U``: Prints 'u' if the memory operand is an update form, and nothing
3956 otherwise. (NOTE: LLVM does not support update form, so this will currently
3957 always print nothing)
3958- ``X``: Prints 'x' if the memory operand is an indexed form. (NOTE: LLVM does
3959 not support indexed form, so this will currently always print nothing)
3960
3961Sparc:
3962
3963- ``r``: No effect.
3964
3965SystemZ:
3966
3967SystemZ implements only ``n``, and does *not* support any of the other
3968target-independent modifiers.
3969
3970X86:
3971
3972- ``c``: Print an unadorned integer or symbol name. (The latter is
3973 target-specific behavior for this typically target-independent modifier).
3974- ``A``: Print a register name with a '``*``' before it.
3975- ``b``: Print an 8-bit register name (e.g. ``al``); do nothing on a memory
3976 operand.
3977- ``h``: Print the upper 8-bit register name (e.g. ``ah``); do nothing on a
3978 memory operand.
3979- ``w``: Print the 16-bit register name (e.g. ``ax``); do nothing on a memory
3980 operand.
3981- ``k``: Print the 32-bit register name (e.g. ``eax``); do nothing on a memory
3982 operand.
3983- ``q``: Print the 64-bit register name (e.g. ``rax``), if 64-bit registers are
3984 available, otherwise the 32-bit register name; do nothing on a memory operand.
3985- ``n``: Negate and print an unadorned integer, or, for operands other than an
3986 immediate integer (e.g. a relocatable symbol expression), print a '-' before
3987 the operand. (The behavior for relocatable symbol expressions is a
3988 target-specific behavior for this typically target-independent modifier)
3989- ``H``: Print a memory reference with additional offset +8.
3990- ``P``: Print a memory reference or operand for use as the argument of a call
3991 instruction. (E.g. omit ``(rip)``, even though it's PC-relative.)
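
As an illustration of the X86 size modifiers listed above, here is a minimal
sketch assuming an x86-64 target with AT&T syntax (the function name
``@low_byte`` is hypothetical). The ``b`` modifier makes the template print the
8-bit name of whichever register LLVM picks for operand 1:

.. code-block:: llvm

   ; Hypothetical example; x86-64 assumed.
   define i32 @low_byte(i32 %x) {
     ; "q" requests a register with an addressable low byte.
     ; ${1:b} prints that register's 8-bit name; $0 prints the 32-bit name.
     %r = call i32 asm "movzbl ${1:b}, $0", "=r,q"(i32 %x)
     ret i32 %r
   }

The register allocator still chooses the registers. The modifier only changes
how the chosen register is spelled in the emitted assembly.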
3992
3993XCore:
3994
3995No additional modifiers.
3996
3997
Inline Asm Metadata
3999^^^^^^^^^^^^^^^^^^^
4000
The call instructions that wrap inline asm nodes may have a
"``!srcloc``" MDNode attached to them that contains a list of constant
integers. If present, the code generator will use the integer as the
location cookie value when reporting errors through the ``LLVMContext``
error reporting mechanisms. This allows a front-end to correlate backend
errors that occur with inline asm back to the source code that produced
it. For example:
4008
4009.. code-block:: llvm
4010
4011 call void asm sideeffect "something bad", ""(), !srcloc !42
4012 ...
4013 !42 = !{ i32 1234567 }
4014
4015It is up to the front-end to make sense of the magic numbers it places
4016in the IR. If the MDNode contains multiple constants, the code generator
4017will use the one that corresponds to the line of the asm that the error
4018occurs on.
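
The multi-constant case might look like the following sketch. The function name
``@f`` and the cookie values are made up for illustration. The asm string has
two lines separated by ``\0A``, and the ``!srcloc`` node carries one cookie per
line:

.. code-block:: llvm

   ; Hypothetical example; the cookie values are arbitrary.
   define void @f() {
     ; Line 0 of the asm is "nop", line 1 is "something bad".
     call void asm sideeffect "nop\0Asomething bad", ""(), !srcloc !0
     ret void
   }

   ; One constant per asm line, in order.
   !0 = !{ i32 1234567, i32 1234568 }

Under the rule stated above, an error on the second asm line would be reported
with the second cookie.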
4019
4020.. _metadata:
4021
Metadata
========

4025LLVM IR allows metadata to be attached to instructions in the program
4026that can convey extra information about the code to the optimizers and
4027code generator. One example application of metadata is source-level
4028debug information. There are two metadata primitives: strings and nodes.

Metadata does not have a type, and is not a value. If referenced from a
``call`` instruction, it uses the ``metadata`` type.

All metadata are identified in syntax by an exclamation point ('``!``').
4034
.. _metadata-string:

Metadata Nodes and Metadata Strings
-----------------------------------

4040A metadata string is a string surrounded by double quotes. It can
4041contain any character by escaping non-printable characters with
4042"``\xx``" where "``xx``" is the two digit hex code. For example:
4043"``!"test\00"``".
4044
4045Metadata nodes are represented with notation similar to structure
4046constants (a comma separated list of elements, surrounded by braces and
4047preceded by an exclamation point). Metadata nodes can have any values as
4048their operand. For example:
4049
4050.. code-block:: llvm
4051
   !{ !"test\00", i32 10}

Metadata nodes that aren't uniqued use the ``distinct`` keyword. For example:

.. code-block:: text

   !0 = distinct !{!"test\00", i32 10}

``distinct`` nodes are useful when nodes shouldn't be merged based on their
content. They can also occur when transformations cause uniquing collisions
when metadata operands change.

A :ref:`named metadata <namedmetadatastructure>` is a collection of
4065metadata nodes, which can be looked up in the module symbol table. For
4066example:
4067
4068.. code-block:: llvm
4069
   !foo = !{!4, !3}

Metadata can be used as function arguments. Here the ``llvm.dbg.value``
intrinsic is using three metadata arguments:

.. code-block:: llvm

   call void @llvm.dbg.value(metadata !24, metadata !25, metadata !26)

Metadata can be attached to an instruction. Here metadata ``!21`` is attached
to the ``add`` instruction using the ``!dbg`` identifier:

4082.. code-block:: llvm
4083
4084 %indvar.next = add i64 %indvar, 1, !dbg !21
4085
Metadata can also be attached to a function or a global variable. Here metadata
``!22`` is attached to the ``f1`` and ``f2`` functions, and the globals ``g1``
and ``g2`` using the ``!dbg`` identifier:

.. code-block:: llvm

   declare !dbg !22 void @f1()
   define void @f2() !dbg !22 {
     ret void
   }

   @g1 = global i32 0, !dbg !22
   @g2 = external global i32, !dbg !22
4099
A transformation is required to drop any metadata attachment that it does not
know or knows it can't preserve. Currently there is an exception for metadata
attachment to globals for ``!type`` and ``!absolute_symbol`` which can't be
unconditionally dropped unless the global is itself deleted.

Metadata attached to a module using named metadata may not be dropped, with
the exception of debug metadata (named metadata with the name ``!llvm.dbg.*``).

More information about specific metadata nodes recognized by the
optimizers and code generator is found below.

.. _specialized-metadata:

Specialized Metadata Nodes
^^^^^^^^^^^^^^^^^^^^^^^^^^

Specialized metadata nodes are custom data structures in metadata (as opposed
to generic tuples). Their fields are labelled, and can be specified in any
order.

These aren't inherently debug info centric, but currently all the specialized
metadata nodes are related to debug info.

.. _DICompileUnit:

DICompileUnit
"""""""""""""

``DICompileUnit`` nodes represent a compile unit. The ``enums:``,
``retainedTypes:``, ``globals:``, ``imports:`` and ``macros:`` fields are tuples
containing the debug info to be emitted along with the compile unit, regardless
of code optimizations (some nodes are only emitted if there are references to
them from instructions). The ``debugInfoForProfiling:`` field is a boolean
indicating whether or not line-table discriminators are updated to provide
more-accurate debug info for profiling results.

.. code-block:: text

    !0 = !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "clang",
                        isOptimized: true, flags: "-O2", runtimeVersion: 2,
                        splitDebugFilename: "abc.debug", emissionKind: FullDebug,
                        enums: !2, retainedTypes: !3, globals: !4, imports: !5,
                        macros: !6, dwoId: 0x0abcd)

Compile unit descriptors provide the root scope for objects declared in a
specific compilation unit. File descriptors are defined using this scope. These
descriptors are collected by a named metadata node ``!llvm.dbg.cu``. They keep
track of global variables, type information, and imported entities (declarations
and namespaces).
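
For illustration, a module with a single compile unit might register it as
follows. This is a sketch: the file name, directory and field values are
placeholders, and only a subset of the fields shown above is set.

.. code-block:: text

    !llvm.dbg.cu = !{!0}

    !0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1,
                                 producer: "clang", isOptimized: false,
                                 runtimeVersion: 0, emissionKind: FullDebug)
    !1 = !DIFile(filename: "example.c", directory: "/path/to/dir")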

.. _DIFile:

DIFile
""""""

``DIFile`` nodes represent files. The ``filename:`` can include slashes.

.. code-block:: none

    !0 = !DIFile(filename: "path/to/file", directory: "/path/to/dir",
                 checksumkind: CSK_MD5,
                 checksum: "000102030405060708090a0b0c0d0e0f")

Files are sometimes used in ``scope:`` fields, and are the only valid target
for ``file:`` fields.
Valid values for the ``checksumkind:`` field are ``CSK_None``, ``CSK_MD5``, and
``CSK_SHA1``.

.. _DIBasicType:

DIBasicType
"""""""""""

``DIBasicType`` nodes represent primitive types, such as ``int``, ``bool`` and
``float``. ``tag:`` defaults to ``DW_TAG_base_type``.

.. code-block:: text

    !0 = !DIBasicType(name: "unsigned char", size: 8, align: 8,
                      encoding: DW_ATE_unsigned_char)
    !1 = !DIBasicType(tag: DW_TAG_unspecified_type, name: "decltype(nullptr)")

The ``encoding:`` describes the details of the type. Usually it's one of the
following:

.. code-block:: text

    DW_ATE_address       = 1
    DW_ATE_boolean       = 2
    DW_ATE_float         = 4
    DW_ATE_signed        = 5
    DW_ATE_signed_char   = 6
    DW_ATE_unsigned      = 7
    DW_ATE_unsigned_char = 8

.. _DISubroutineType:

DISubroutineType
""""""""""""""""

``DISubroutineType`` nodes represent subroutine types. Their ``types:`` field
refers to a tuple; the first operand is the return type, while the rest are the
types of the formal arguments in order. If the first operand is ``null``, that
represents a function with no return value (such as ``void foo() {}`` in C++).

.. code-block:: text

    !0 = !DIBasicType(name: "int", size: 32, align: 32, encoding: DW_ATE_signed)
    !1 = !DIBasicType(name: "char", size: 8, align: 8, encoding: DW_ATE_signed_char)
    !2 = !DISubroutineType(types: !{null, !0, !1}) ; void (int, char)

.. _DIDerivedType:

DIDerivedType
"""""""""""""

``DIDerivedType`` nodes represent types derived from other types, such as
qualified types.

.. code-block:: text

    !0 = !DIBasicType(name: "unsigned char", size: 8, align: 8,
                      encoding: DW_ATE_unsigned_char)
    !1 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !0, size: 32,
                        align: 32)

The following ``tag:`` values are valid:

.. code-block:: text

    DW_TAG_member             = 13
    DW_TAG_pointer_type       = 15
    DW_TAG_reference_type     = 16
    DW_TAG_typedef            = 22
    DW_TAG_inheritance        = 28
    DW_TAG_ptr_to_member_type = 31
    DW_TAG_const_type         = 38
    DW_TAG_friend             = 42
    DW_TAG_volatile_type      = 53
    DW_TAG_restrict_type      = 55
    DW_TAG_atomic_type        = 71

.. _DIDerivedTypeMember:

``DW_TAG_member`` is used to define a member of a :ref:`composite type
<DICompositeType>`. The type of the member is the ``baseType:``. The
``offset:`` is the member's bit offset. If the composite type has an ODR
``identifier:`` and does not set ``flags: DIFlagFwdDecl``, then the member is
uniqued based only on its ``name:`` and ``scope:``.

``DW_TAG_inheritance`` and ``DW_TAG_friend`` are used in the ``elements:``
field of :ref:`composite types <DICompositeType>` to describe parents and
friends.

``DW_TAG_typedef`` is used to provide a name for the ``baseType:``.

``DW_TAG_pointer_type``, ``DW_TAG_reference_type``, ``DW_TAG_const_type``,
``DW_TAG_volatile_type``, ``DW_TAG_restrict_type`` and ``DW_TAG_atomic_type``
are used to qualify the ``baseType:``.

Note that the ``void *`` type is expressed as a type derived from NULL.
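
As a sketch of how these tags compose (the node numbers and the 64-bit pointer
size are assumptions made for the example), ``void *`` and ``const char *``
could be described as:

.. code-block:: text

    ; void *: a pointer type whose baseType: is null
    !0 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: null, size: 64)

    ; const char *: pointer -> const -> char
    !1 = !DIBasicType(name: "char", size: 8, encoding: DW_ATE_signed_char)
    !2 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !1)
    !3 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !2, size: 64)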

.. _DICompositeType:

DICompositeType
"""""""""""""""

``DICompositeType`` nodes represent types composed of other types, like
structures and unions. ``elements:`` points to a tuple of the composed types.

If the source language supports ODR, the ``identifier:`` field gives the unique
identifier used for type merging between modules. When specified,
:ref:`subprogram declarations <DISubprogramDeclaration>` and :ref:`member
derived types <DIDerivedTypeMember>` that reference the ODR-type in their
``scope:`` change uniquing rules.

For a given ``identifier:``, there should only be a single composite type that
does not have ``flags: DIFlagFwdDecl`` set. LLVM tools that link modules
together will unique such definitions at parse time via the ``identifier:``
field, even if the nodes are ``distinct``.

.. code-block:: text

    !0 = !DIEnumerator(name: "SixKind", value: 6)
    !1 = !DIEnumerator(name: "SevenKind", value: 7)
    !2 = !DIEnumerator(name: "NegEightKind", value: -8)
    !3 = !DICompositeType(tag: DW_TAG_enumeration_type, name: "Enum", file: !12,
                          line: 2, size: 32, align: 32, identifier: "_M4Enum",
                          elements: !{!0, !1, !2})

The following ``tag:`` values are valid:

.. code-block:: text

    DW_TAG_array_type       = 1
    DW_TAG_class_type       = 2
    DW_TAG_enumeration_type = 4
    DW_TAG_structure_type   = 19
    DW_TAG_union_type       = 23

For ``DW_TAG_array_type``, the ``elements:`` should be :ref:`subrange
descriptors <DISubrange>`, each representing the range of subscripts at that
level of indexing. The ``DIFlagVector`` flag to ``flags:`` indicates that an
array type is a native packed vector.
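
For example, a C array declared as ``int a[3][5]`` could be described with one
subrange per dimension. This is an illustrative sketch; the node numbers are
arbitrary and the sizes assume a 32-bit ``int``:

.. code-block:: text

    !0 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !1 = !DICompositeType(tag: DW_TAG_array_type, baseType: !0, size: 480,
                          elements: !{!2, !3})
    !2 = !DISubrange(count: 3)   ; outer dimension
    !3 = !DISubrange(count: 5)   ; inner dimension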

For ``DW_TAG_enumeration_type``, the ``elements:`` should be :ref:`enumerator
descriptors <DIEnumerator>`, each representing the definition of an enumeration
value for the set. All enumeration type descriptors are collected in the
``enums:`` field of the :ref:`compile unit <DICompileUnit>`.

For ``DW_TAG_structure_type``, ``DW_TAG_class_type``, and
``DW_TAG_union_type``, the ``elements:`` should be :ref:`derived types
<DIDerivedType>` with ``tag: DW_TAG_member``, ``tag: DW_TAG_inheritance``, or
``tag: DW_TAG_friend``; or :ref:`subprograms <DISubprogram>` with
``isDefinition: false``.
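
A small sketch of a structure with two members follows. The type name, file
and line numbers are made up for the example, and the layout assumes a 32-bit
``int``:

.. code-block:: text

    !0 = !DIFile(filename: "point.c", directory: "/path/to/dir")
    !1 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !2 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "Point",
                                   file: !0, line: 1, size: 64,
                                   elements: !{!3, !4})
    !3 = !DIDerivedType(tag: DW_TAG_member, name: "x", scope: !2, file: !0,
                        line: 2, baseType: !1, size: 32)
    !4 = !DIDerivedType(tag: DW_TAG_member, name: "y", scope: !2, file: !0,
                        line: 3, baseType: !1, size: 32, offset: 32)

The composite type is marked ``distinct`` here because its members refer back
to it through ``scope:``.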

.. _DISubrange:

DISubrange
""""""""""

``DISubrange`` nodes are the elements for ``DW_TAG_array_type`` variants of
:ref:`DICompositeType`.

- ``count: -1`` indicates an empty array.
- ``count: !10`` describes the count with a :ref:`DILocalVariable`.
- ``count: !12`` describes the count with a :ref:`DIGlobalVariable`.

.. code-block:: llvm

    !0 = !DISubrange(count: 5, lowerBound: 0) ; array counting from 0
    !1 = !DISubrange(count: 5, lowerBound: 1) ; array counting from 1
    !2 = !DISubrange(count: -1) ; empty array.

    ; Scopes used in rest of example
    !6 = !DIFile(filename: "vla.c", directory: "/path/to/file")
    !7 = distinct !DICompileUnit(language: DW_LANG_C99, ...
    !8 = distinct !DISubprogram(name: "foo", scope: !7, file: !6, line: 5, ...

    ; Use of local variable as count value
    !9 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
    !10 = !DILocalVariable(name: "count", scope: !8, file: !6, line: 42, type: !9)
    !11 = !DISubrange(count: !10, lowerBound: 0)

    ; Use of global variable as count value
    !12 = !DIGlobalVariable(name: "count", scope: !8, file: !6, line: 22, type: !9)
    !13 = !DISubrange(count: !12, lowerBound: 0)

.. _DIEnumerator:

DIEnumerator
""""""""""""

``DIEnumerator`` nodes are the elements for ``DW_TAG_enumeration_type``
variants of :ref:`DICompositeType`.

.. code-block:: llvm

    !0 = !DIEnumerator(name: "SixKind", value: 6)
    !1 = !DIEnumerator(name: "SevenKind", value: 7)
    !2 = !DIEnumerator(name: "NegEightKind", value: -8)

DITemplateTypeParameter
"""""""""""""""""""""""

``DITemplateTypeParameter`` nodes represent type parameters to generic source
language constructs. They are used (optionally) in :ref:`DICompositeType` and
:ref:`DISubprogram` ``templateParams:`` fields.

.. code-block:: llvm

    !0 = !DITemplateTypeParameter(name: "Ty", type: !1)

DITemplateValueParameter
""""""""""""""""""""""""

``DITemplateValueParameter`` nodes represent value parameters to generic source
language constructs. ``tag:`` defaults to ``DW_TAG_template_value_parameter``,
but if specified can also be set to ``DW_TAG_GNU_template_template_param`` or
``DW_TAG_GNU_template_param_pack``. They are used (optionally) in
:ref:`DICompositeType` and :ref:`DISubprogram` ``templateParams:`` fields.

.. code-block:: llvm

    !0 = !DITemplateValueParameter(name: "Ty", type: !1, value: i32 7)

DINamespace
"""""""""""

``DINamespace`` nodes represent namespaces in the source language.

.. code-block:: llvm

    !0 = !DINamespace(name: "myawesomeproject", scope: !1, file: !2, line: 7)

.. _DIGlobalVariable:

DIGlobalVariable
""""""""""""""""

``DIGlobalVariable`` nodes represent global variables in the source language.

.. code-block:: llvm

    !0 = !DIGlobalVariable(name: "foo", linkageName: "foo", scope: !1,
                           file: !2, line: 7, type: !3, isLocal: true,
                           isDefinition: false, variable: i32* @foo,
                           declaration: !4)

All global variables should be referenced by the ``globals:`` field of a
:ref:`compile unit <DICompileUnit>`.

.. _DISubprogram:

DISubprogram
""""""""""""

``DISubprogram`` nodes represent functions from the source language. A
``DISubprogram`` may be attached to a function definition using ``!dbg``
metadata. The ``variables:`` field points at :ref:`variables <DILocalVariable>`
that must be retained, even if their IR counterparts are optimized out of
the IR. The ``type:`` field must point at a :ref:`DISubroutineType`.

.. _DISubprogramDeclaration:

When ``isDefinition: false``, subprograms describe a declaration in the type
tree as opposed to a definition of a function. If the scope is a composite
type with an ODR ``identifier:`` that does not set ``flags: DIFlagFwdDecl``,
then the subprogram declaration is uniqued based only on its ``linkageName:``
and ``scope:``.

.. code-block:: text

    define void @_Z3foov() !dbg !0 {
      ...
    }

    !0 = distinct !DISubprogram(name: "foo", linkageName: "_Z3foov", scope: !1,
                                file: !2, line: 7, type: !3, isLocal: true,
                                isDefinition: true, scopeLine: 8,
                                containingType: !4,
                                virtuality: DW_VIRTUALITY_pure_virtual,
                                virtualIndex: 10, flags: DIFlagPrototyped,
                                isOptimized: true, unit: !5, templateParams: !6,
                                declaration: !7, variables: !8, thrownTypes: !9)

.. _DILexicalBlock:

DILexicalBlock
""""""""""""""

``DILexicalBlock`` nodes describe nested blocks within a :ref:`subprogram
<DISubprogram>`. The line number and column numbers are used to distinguish
two lexical blocks at the same depth. They are valid targets for ``scope:``
fields.

.. code-block:: text

    !0 = distinct !DILexicalBlock(scope: !1, file: !2, line: 7, column: 35)

Usually lexical blocks are ``distinct`` to prevent node merging based on
operands.

.. _DILexicalBlockFile:

DILexicalBlockFile
""""""""""""""""""

``DILexicalBlockFile`` nodes are used to discriminate between sections of a
:ref:`lexical block <DILexicalBlock>`. The ``file:`` field can be changed to
indicate textual inclusion, or the ``discriminator:`` field can be used to
discriminate between control flow within a single block in the source language.

.. code-block:: llvm

    !0 = !DILexicalBlock(scope: !3, file: !4, line: 7, column: 35)
    !1 = !DILexicalBlockFile(scope: !0, file: !4, discriminator: 0)
    !2 = !DILexicalBlockFile(scope: !0, file: !4, discriminator: 1)

.. _DILocation:

DILocation
""""""""""

``DILocation`` nodes represent source debug locations. The ``scope:`` field is
mandatory, and points at a :ref:`DILexicalBlockFile`, a
:ref:`DILexicalBlock`, or a :ref:`DISubprogram`.

.. code-block:: llvm

    !0 = !DILocation(line: 2900, column: 42, scope: !1, inlinedAt: !2)
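
As an illustrative sketch (the node numbers, names and line/column values are
arbitrary), a location is attached to an instruction with ``!dbg``. For an
instruction that was inlined, ``inlinedAt:`` points at the location of the
call site in the caller:

.. code-block:: text

    %sum = add i32 %a, %b, !dbg !10

    !10 = !DILocation(line: 3, column: 12, scope: !11, inlinedAt: !12)
    !11 = distinct !DISubprogram(name: "callee", ...
    !12 = distinct !DILocation(line: 9, column: 5, scope: !13)
    !13 = distinct !DISubprogram(name: "caller", ...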

.. _DILocalVariable:

DILocalVariable
"""""""""""""""

``DILocalVariable`` nodes represent local variables in the source language. If
the ``arg:`` field is set to non-zero, then this variable is a subprogram
parameter, and it will be included in the ``variables:`` field of its
:ref:`DISubprogram`.

.. code-block:: text

    !0 = !DILocalVariable(name: "this", arg: 1, scope: !3, file: !2, line: 7,
                          type: !3, flags: DIFlagArtificial)
    !1 = !DILocalVariable(name: "x", arg: 2, scope: !4, file: !2, line: 7,
                          type: !3)
    !2 = !DILocalVariable(name: "y", scope: !5, file: !2, line: 7, type: !3)

DIExpression
""""""""""""

``DIExpression`` nodes represent expressions that are inspired by the DWARF
expression language. They are used in :ref:`debug intrinsics<dbg_intrinsics>`
(such as ``llvm.dbg.declare`` and ``llvm.dbg.value``) to describe how the
referenced LLVM variable relates to the source language variable.

The currently supported vocabulary is limited:

- ``DW_OP_deref`` dereferences the top of the expression stack.
- ``DW_OP_plus`` pops the last two entries from the expression stack, adds
  them together and appends the result to the expression stack.
- ``DW_OP_minus`` pops the last two entries from the expression stack, subtracts
  the last entry from the second last entry and appends the result to the
  expression stack.
- ``DW_OP_plus_uconst, 93`` adds ``93`` to the working expression.
- ``DW_OP_LLVM_fragment, 16, 8`` specifies the offset and size (``16`` and ``8``
  here, respectively) of the variable fragment from the working expression. Note
  that contrary to ``DW_OP_bit_piece``, the offset is describing the location
  within the described source variable.
- ``DW_OP_swap`` swaps the top two stack entries.
- ``DW_OP_xderef`` provides an extended dereference mechanism. The entry at the
  top of the stack is treated as an address. The second stack entry is treated
  as an address space identifier.
- ``DW_OP_stack_value`` marks a constant value.

DWARF specifies three kinds of simple location descriptions: register, memory,
and implicit location descriptions. Register and memory location descriptions
describe the *location* of a source variable (in the sense that a debugger might
modify its value), whereas implicit locations describe merely the *value* of a
source variable. DIExpressions also follow this model: a DIExpression that
doesn't have a trailing ``DW_OP_stack_value`` will describe an *address* when
combined with a concrete location.

.. code-block:: text

    !0 = !DIExpression(DW_OP_deref)
    !1 = !DIExpression(DW_OP_plus_uconst, 3)
    !2 = !DIExpression(DW_OP_constu, 3, DW_OP_plus)
    !3 = !DIExpression(DW_OP_bit_piece, 3, 7)
    !4 = !DIExpression(DW_OP_deref, DW_OP_constu, 3, DW_OP_plus, DW_OP_LLVM_fragment, 3, 7)
    !5 = !DIExpression(DW_OP_constu, 2, DW_OP_swap, DW_OP_xderef)
    !6 = !DIExpression(DW_OP_constu, 42, DW_OP_stack_value)
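
The difference between a location and a value can be sketched with the debug
intrinsics. The value names (``%x.addr``, ``%v``) and node numbers below are
hypothetical and only serve the example:

.. code-block:: text

    ; Empty expression, no DW_OP_stack_value: the variable lives in memory
    ; at %x.addr, so this describes a location.
    call void @llvm.dbg.declare(metadata i32* %x.addr, metadata !10, metadata !DIExpression()), !dbg !12

    ; Trailing DW_OP_stack_value: the variable's value is %v + 4, and there
    ; is no location that a debugger could write to.
    call void @llvm.dbg.value(metadata i32 %v, metadata !11, metadata !DIExpression(DW_OP_plus_uconst, 4, DW_OP_stack_value)), !dbg !12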

DIObjCProperty
""""""""""""""

``DIObjCProperty`` nodes represent Objective-C property nodes.

.. code-block:: llvm

    !3 = !DIObjCProperty(name: "foo", file: !1, line: 7, setter: "setFoo",
                         getter: "getFoo", attributes: 7, type: !2)

DIImportedEntity
""""""""""""""""

``DIImportedEntity`` nodes represent entities (such as modules) imported into a
compile unit.

.. code-block:: text

    !2 = !DIImportedEntity(tag: DW_TAG_imported_module, name: "foo", scope: !0,
                           entity: !1, line: 7)

DIMacro
"""""""

``DIMacro`` nodes represent the definition or undefinition of a macro
identifier. The ``name:`` field is the macro identifier, followed by macro
parameters when defining a function-like macro, and the ``value:`` field is
the token-string used to expand the macro identifier.

.. code-block:: text

    !2 = !DIMacro(macinfo: DW_MACINFO_define, line: 7, name: "foo(x)",
                  value: "((x) + 1)")
    !3 = !DIMacro(macinfo: DW_MACINFO_undef, line: 30, name: "foo")

DIMacroFile
"""""""""""

``DIMacroFile`` nodes represent inclusion of source files.
The ``nodes:`` field is a list of ``DIMacro`` and ``DIMacroFile`` nodes that
appear in the included source file.

.. code-block:: text

    !2 = !DIMacroFile(macinfo: DW_MACINFO_start_file, line: 7, file: !2,
                      nodes: !3)

'``tbaa``' Metadata
^^^^^^^^^^^^^^^^^^^

In LLVM IR, memory does not have types, so LLVM's own type system is not
suitable for doing type based alias analysis (TBAA). Instead, metadata is
added to the IR to describe a type system of a higher level language. This
can be used to implement C/C++ strict type aliasing rules, but it can also
be used to implement custom alias analysis behavior for other languages.

This description of LLVM's TBAA system is broken into two parts:
:ref:`Semantics<tbaa_node_semantics>` talks about high level issues, and
:ref:`Representation<tbaa_node_representation>` talks about the metadata
encoding of various entities.

It is always possible to trace any TBAA node to a "root" TBAA node (details
in the :ref:`Representation<tbaa_node_representation>` section). TBAA
nodes with different roots have an unknown aliasing relationship, and LLVM
conservatively infers ``MayAlias`` between them. The rules mentioned in
this section only pertain to TBAA nodes living under the same root.

.. _tbaa_node_semantics:

Semantics
"""""""""

The TBAA metadata system, referred to as "struct path TBAA" (not to be
confused with ``tbaa.struct``), consists of the following high level
concepts: *Type Descriptors*, further subdivided into scalar type
descriptors and struct type descriptors; and *Access Tags*.

**Type descriptors** describe the type system of the higher level language
being compiled. **Scalar type descriptors** describe types that do not
contain other types. Each scalar type has a parent type, which must also
be a scalar type or the TBAA root. Via this parent relation, scalar types
within a TBAA root form a tree. **Struct type descriptors** denote types
that contain a sequence of other type descriptors, at known offsets. These
contained type descriptors can either be struct type descriptors themselves
or scalar type descriptors.

**Access tags** are metadata nodes attached to load and store instructions.
Access tags use type descriptors to describe the *location* being accessed
in terms of the type system of the higher level language. Access tags are
tuples consisting of a base type, an access type and an offset. The base
type is a scalar type descriptor or a struct type descriptor, the access
type is a scalar type descriptor, and the offset is a constant integer.

The access tag ``(BaseTy, AccessTy, Offset)`` can describe one of two
things:

 * If ``BaseTy`` is a struct type, the tag describes a memory access (load
   or store) of a value of type ``AccessTy`` contained in the struct type
   ``BaseTy`` at offset ``Offset``.

 * If ``BaseTy`` is a scalar type, ``Offset`` must be 0 and ``BaseTy`` and
   ``AccessTy`` must be the same; and the access tag describes a scalar
   access with scalar type ``AccessTy``.

We first define an ``ImmediateParent`` relation on ``(BaseTy, Offset)``
tuples this way:

 * If ``BaseTy`` is a scalar type then ``ImmediateParent(BaseTy, 0)`` is
   ``(ParentTy, 0)`` where ``ParentTy`` is the parent of the scalar type as
   described in the TBAA metadata. ``ImmediateParent(BaseTy, Offset)`` is
   undefined if ``Offset`` is non-zero.

 * If ``BaseTy`` is a struct type then ``ImmediateParent(BaseTy, Offset)``
   is ``(NewTy, NewOffset)`` where ``NewTy`` is the type contained in
   ``BaseTy`` at offset ``Offset`` and ``NewOffset`` is ``Offset`` adjusted
   to be relative within that inner type.

A memory access with an access tag ``(BaseTy1, AccessTy1, Offset1)``
aliases a memory access with an access tag ``(BaseTy2, AccessTy2,
Offset2)`` if either ``(BaseTy1, Offset1)`` is reachable from ``(BaseTy2,
Offset2)`` via the ``ImmediateParent`` relation or vice versa.

As a concrete example, the type descriptor graph for the following program

.. code-block:: c

    struct Inner {
      int i;    // offset 0
      float f;  // offset 4
    };

    struct Outer {
      float f;  // offset 0
      double d; // offset 4
      struct Inner inner_a;  // offset 12
    };

    void f(struct Outer* outer, struct Inner* inner, float* f, int* i, char* c) {
      outer->f = 0;            // tag0: (OuterStructTy, FloatScalarTy, 0)
      outer->inner_a.i = 0;    // tag1: (OuterStructTy, IntScalarTy, 12)
      outer->inner_a.f = 0.0;  // tag2: (OuterStructTy, FloatScalarTy, 16)
      *f = 0.0;                // tag3: (FloatScalarTy, FloatScalarTy, 0)
    }

is (note that in C and C++, ``char`` can be used to access any arbitrary
type):

.. code-block:: text

    Root = "TBAA Root"
    CharScalarTy = ("char", Root, 0)
    FloatScalarTy = ("float", CharScalarTy, 0)
    DoubleScalarTy = ("double", CharScalarTy, 0)
    IntScalarTy = ("int", CharScalarTy, 0)
    InnerStructTy = {"Inner", (IntScalarTy, 0), (FloatScalarTy, 4)}
    OuterStructTy = {"Outer", (FloatScalarTy, 0), (DoubleScalarTy, 4),
                     (InnerStructTy, 12)}

with (e.g.) ``ImmediateParent(OuterStructTy, 12)`` = ``(InnerStructTy,
0)``, ``ImmediateParent(InnerStructTy, 0)`` = ``(IntScalarTy, 0)``, and
``ImmediateParent(IntScalarTy, 0)`` = ``(CharScalarTy, 0)``.

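As a worked illustration of the aliasing rule (a sketch derived only from the
definitions above, not an additional rule), following ``ImmediateParent``
from the ``(BaseTy, Offset)`` tuple of each access tag in the example, up to
the ``char`` descriptor, gives:

.. code-block:: text

    tag0: (OuterStructTy, 0)  -> (FloatScalarTy, 0) -> (CharScalarTy, 0)
    tag1: (OuterStructTy, 12) -> (InnerStructTy, 0) -> (IntScalarTy, 0)
                              -> (CharScalarTy, 0)
    tag2: (OuterStructTy, 16) -> (InnerStructTy, 4) -> (FloatScalarTy, 0)
                              -> (CharScalarTy, 0)
    tag3: (FloatScalarTy, 0)  -> (CharScalarTy, 0)

    tag0 vs. tag3: (FloatScalarTy, 0) is reachable from (OuterStructTy, 0)
                   => alias
    tag2 vs. tag3: (FloatScalarTy, 0) is reachable from (OuterStructTy, 16)
                   => alias
    tag1 vs. tag3: neither starting tuple is reachable from the other
                   => no alias
    tag0 vs. tag2: neither starting tuple is reachable from the other
                   => no alias

Note that ``tag0`` and ``tag2`` both reach ``(FloatScalarTy, 0)``, but the rule
asks whether one *starting* tuple is reachable from the other, which is not
the case for two distinct fields of ``Outer``.
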
.. _tbaa_node_representation:

Representation
""""""""""""""

The root node of a TBAA type hierarchy is an ``MDNode`` with 0 operands or
with exactly one ``MDString`` operand.

Scalar type descriptors are represented as ``MDNode`` s with two
operands. The first operand is an ``MDString`` denoting the name of the
scalar type. LLVM does not assign meaning to the value of this operand; it
only cares about it being an ``MDString``. The second operand is an
``MDNode`` which points to the parent for said scalar type descriptor,
which is either another scalar type descriptor or the TBAA root. Scalar
type descriptors can have an optional third operand, but that must be the
constant integer zero.

Struct type descriptors are represented as ``MDNode`` s with an odd number
of operands greater than 1. The first operand is an ``MDString`` denoting
the name of the struct type. Like in scalar type descriptors the actual
value of this name operand is irrelevant to LLVM. After the name operand,
the struct type descriptors have a sequence of alternating ``MDNode`` and
``ConstantInt`` operands. With N starting from 1, the 2N - 1 th operand,
an ``MDNode``, denotes a contained field, and the 2N th operand, a
``ConstantInt``, is the offset of the said contained field. The offsets
must be in non-decreasing order.

Access tags are represented as ``MDNode`` s with either 3 or 4 operands.
The first operand is an ``MDNode`` pointing to the node representing the
base type. The second operand is an ``MDNode`` pointing to the node
representing the access type. The third operand is a ``ConstantInt`` that
states the offset of the access. If a fourth operand is present, it must be
a ``ConstantInt`` valued at 0 or 1. If it is 1 then the access tag states
that the location being accessed is "constant" (meaning
``pointsToConstantMemory`` should return true; see `other useful
AliasAnalysis methods <AliasAnalysis.html#OtherItfs>`_). The TBAA root of
the access type and the base type of an access tag must be the same, and
that is the TBAA root of the access tag.

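As an illustrative sketch (the metadata numbering and the function signature
below are assumptions made for this example, not compiler output), the type
descriptors and access tags of the program in the
:ref:`Semantics<tbaa_node_semantics>` section could be encoded as follows,
with each store carrying its access tag through ``!tbaa``:

.. code-block:: llvm

    ; Hypothetical signature: each pointer is assumed to already address the
    ; location named in the comment on the corresponding store.
    define void @f(float* %outer_f, i32* %inner_i, float* %inner_f, float* %f) {
      store float 0.000000e+00, float* %outer_f, align 4, !tbaa !7  ; outer->f
      store i32 0, i32* %inner_i, align 4, !tbaa !8                 ; outer->inner_a.i
      store float 0.000000e+00, float* %inner_f, align 4, !tbaa !9  ; outer->inner_a.f
      store float 0.000000e+00, float* %f, align 4, !tbaa !10       ; *f
      ret void
    }

    ; Root and scalar type descriptors: (name, parent, 0)
    !0 = !{!"TBAA Root"}
    !1 = !{!"char", !0, i64 0}
    !2 = !{!"float", !1, i64 0}
    !3 = !{!"double", !1, i64 0}
    !4 = !{!"int", !1, i64 0}

    ; Struct type descriptors: name, then (field, offset) pairs
    !5 = !{!"Inner", !4, i64 0, !2, i64 4}
    !6 = !{!"Outer", !2, i64 0, !3, i64 4, !5, i64 12}

    ; Access tags: (base type, access type, offset)
    !7 = !{!6, !2, i64 0}    ; tag0
    !8 = !{!6, !4, i64 12}   ; tag1
    !9 = !{!6, !2, i64 16}   ; tag2
    !10 = !{!2, !2, i64 0}   ; tag3
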
'``tbaa.struct``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^

The :ref:`llvm.memcpy <int_memcpy>` intrinsic is often used to implement
aggregate assignment operations in C and similar languages. However, it
is defined to copy a contiguous region of memory, which is more than
strictly necessary for aggregate types which contain holes due to
padding. Also, it doesn't contain any TBAA information about the fields
of the aggregate.

``!tbaa.struct`` metadata can describe which memory subregions in a
memcpy are padding and what the TBAA tags of the struct are.

The current metadata format is very simple. ``!tbaa.struct`` metadata
nodes are a list of operands which are in conceptual groups of three.
For each group of three, the first operand gives the offset of a
field in bytes, the second gives its size in bytes, and the third gives
its tbaa tag. For example:

.. code-block:: llvm

    !4 = !{ i64 0, i64 4, !1, i64 8, i64 4, !2 }

This describes a struct with two fields. The first is at offset 0 bytes
with size 4 bytes, and has tbaa tag !1. The second is at offset 8 bytes
with size 4 bytes, and has tbaa tag !2.

Note that the fields need not be contiguous. In this example, there is a
4 byte gap between the two fields. This gap represents padding which
does not carry useful data and need not be preserved.

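A sketch of how such a node might be attached to a copy of that 12-byte
struct. Everything other than the ``!4`` node is an assumption made for
illustration: the function and value names, the ``int`` and ``float`` access
tags standing in for ``!1`` and ``!2``, and the ``llvm.memcpy`` signature,
which is taken here to be the five-operand form with an explicit ``i32``
alignment argument.

.. code-block:: llvm

    ; Assumed signature: (dest, src, length, alignment, isvolatile).
    declare void @llvm.memcpy.p0i8.p0i8.i64(i8*, i8*, i64, i32, i1)

    define void @copy(i8* %dst, i8* %src) {
      ; Copies 12 bytes; per !4, bytes [4, 8) are padding.
      call void @llvm.memcpy.p0i8.p0i8.i64(i8* %dst, i8* %src, i64 12,
                                           i32 4, i1 false), !tbaa.struct !4
      ret void
    }

    !0 = !{!"TBAA Root"}
    !1 = !{!5, !5, i64 0}        ; access tag for the int field
    !2 = !{!6, !6, i64 0}        ; access tag for the float field
    !3 = !{!"char", !0, i64 0}
    !4 = !{ i64 0, i64 4, !1, i64 8, i64 4, !2 }
    !5 = !{!"int", !3, i64 0}
    !6 = !{!"float", !3, i64 0}
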
'``noalias``' and '``alias.scope``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``noalias`` and ``alias.scope`` metadata provide the ability to specify generic
noalias memory-access sets. This means that some collection of memory access
instructions (loads, stores, memory-accessing calls, etc.) that carry
``noalias`` metadata can be declared not to alias with some other
collection of memory access instructions that carry ``alias.scope`` metadata.
Each type of metadata specifies a list of scopes where each scope has an id and
a domain.

When evaluating an aliasing query, if for some domain, the set
of scopes with that domain in one instruction's ``alias.scope`` list is a
subset of (or equal to) the set of scopes for that domain in another
instruction's ``noalias`` list, then the two memory accesses are assumed not to
alias.

Because scopes in one domain don't affect scopes in other domains, separate
domains can be used to compose multiple independent noalias sets. This is
used, for example, during inlining. As the noalias function parameters are
turned into noalias scope metadata, a new domain is used every time the
function is inlined.

The metadata identifying each domain is itself a list containing one or two
entries. The first entry is the name of the domain. Note that if the name is a
string then it can be combined across functions and translation units. A
self-reference can be used to create globally unique domain names. A
descriptive string may optionally be provided as a second list entry.

The metadata identifying each scope is also itself a list containing two or
three entries. The first entry is the name of the scope. Note that if the name
is a string then it can be combined across functions and translation units. A
self-reference can be used to create globally unique scope names. A metadata
reference to the scope's domain is the second entry. A descriptive string may
optionally be provided as a third list entry.

For example,

.. code-block:: llvm

    ; Two scope domains:
    !0 = !{!0}
    !1 = !{!1}

    ; Some scopes in these domains:
    !2 = !{!2, !0}
    !3 = !{!3, !0}
    !4 = !{!4, !1}

    ; Some scope lists:
    !5 = !{!4} ; A list containing only scope !4
    !6 = !{!4, !3, !2}
    !7 = !{!3}

    ; These two instructions don't alias:
    %0 = load float, float* %c, align 4, !alias.scope !5
    store float %0, float* %arrayidx.i, align 4, !noalias !5

    ; These two instructions also don't alias (for domain !1, the set of scopes
    ; in the !alias.scope equals that in the !noalias list):
    %2 = load float, float* %c, align 4, !alias.scope !5
    store float %2, float* %arrayidx.i2, align 4, !noalias !6

    ; These two instructions may alias (for domain !0, the set of scopes in
    ; the !noalias list is not a superset of, or equal to, the scopes in the
    ; !alias.scope list):
    %2 = load float, float* %c, align 4, !alias.scope !6
    store float %0, float* %arrayidx.i, align 4, !noalias !7

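As a further sketch of the inlining use case (the function, value names and
descriptive strings are hypothetical, not actual inliner output), two pointer
arguments that were ``noalias`` in an inlined callee might be annotated with
one domain for the inlined call and one scope per argument:

.. code-block:: llvm

    define void @caller(float* %a, float* %b) {
      ; Accesses based on %a carry scope !1 and exclude scope !2;
      ; accesses based on %b carry scope !2 and exclude scope !1.
      %v = load float, float* %a, align 4, !alias.scope !3, !noalias !4
      store float %v, float* %b, align 4, !alias.scope !4, !noalias !3
      ret void
    }

    !0 = !{!0, !"callee"}          ; domain, with a descriptive string
    !1 = !{!1, !0, !"callee: %a"}  ; scope for accesses through %a
    !2 = !{!2, !0, !"callee: %b"}  ; scope for accesses through %b
    !3 = !{!1}                     ; scope list containing only !1
    !4 = !{!2}                     ; scope list containing only !2

For domain ``!0``, the load's ``alias.scope`` set ``{!1}`` is a subset of the
store's ``noalias`` set ``{!1}``, so the two accesses are assumed not to alias.
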
'``fpmath``' Metadata
^^^^^^^^^^^^^^^^^^^^^

``fpmath`` metadata may be attached to any instruction of floating point
type. It can be used to express the maximum acceptable error in the
result of that instruction, in ULPs, thus potentially allowing the
compiler to use a more efficient but less accurate method of computing
it. ULP is defined as follows:

    If ``x`` is a real number that lies between two finite consecutive
    floating-point numbers ``a`` and ``b``, without being equal to one
    of them, then ``ulp(x) = |b - a|``, otherwise ``ulp(x)`` is the
    distance between the two non-equal finite floating-point numbers
    nearest ``x``. Moreover, ``ulp(NaN)`` is ``NaN``.

The metadata node shall consist of a single positive float type number
representing the maximum relative error, for example:

.. code-block:: llvm

    !0 = !{ float 2.5 } ; maximum acceptable inaccuracy is 2.5 ULPs

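A minimal sketch of the node in use (the function name is hypothetical): the
metadata is attached to the floating point instruction whose result may be
computed with reduced accuracy.

.. code-block:: llvm

    define float @approx_div(float %x, float %y) {
      ; The quotient may be off by up to 2.5 ULPs.
      %q = fdiv float %x, %y, !fpmath !0
      ret float %q
    }

    !0 = !{ float 2.5 }
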
.. _range-metadata:

'``range``' Metadata
^^^^^^^^^^^^^^^^^^^^

``range`` metadata may be attached only to ``load``, ``call`` and ``invoke`` of
integer types. It expresses the possible ranges that the loaded value, or the
value returned by the called function at this call site, is in. The ranges are
represented with a flattened list of integers. The loaded value or the value
returned is known to be in the union of the ranges defined by each consecutive
pair. Each pair has the following properties:

- The type must match the type loaded by the instruction.
- The pair ``a,b`` represents the range ``[a,b)``.
- Both ``a`` and ``b`` are constants.
- The range is allowed to wrap.
- The range should not represent the full or empty set. That is,
  ``a!=b``.

In addition, the pairs must be in signed order of the lower bound and
they must be non-contiguous.

Examples:

.. code-block:: llvm

      %a = load i8, i8* %x, align 1, !range !0 ; Can only be 0 or 1
      %b = load i8, i8* %y, align 1, !range !1 ; Can only be 255 (-1), 0 or 1
      %c = call i8 @foo(),       !range !2 ; Can only be 0, 1, 3, 4 or 5
      %d = invoke i8 @bar() to label %cont
             unwind label %lpad, !range !3 ; Can only be -2, -1, 3, 4 or 5
    ...
    !0 = !{ i8 0, i8 2 }
    !1 = !{ i8 255, i8 2 }
    !2 = !{ i8 0, i8 2, i8 3, i8 6 }
    !3 = !{ i8 -2, i8 0, i8 3, i8 6 }

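A sketch of how a consumer of the IR might exploit ``range`` metadata (the
function below is a hypothetical example; an optimizer is permitted, but not
required, to perform this simplification). Because ``%v`` is known to lie in
``[0,2)``, the comparison can only evaluate to true:

.. code-block:: llvm

    define i1 @is_bool(i8* %p) {
      %v = load i8, i8* %p, align 1, !range !0
      ; %v is 0 or 1, so this unsigned comparison against 2 always holds.
      %ok = icmp ult i8 %v, 2
      ret i1 %ok
    }

    !0 = !{ i8 0, i8 2 }
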
'``absolute_symbol``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``absolute_symbol`` metadata may be attached to a global variable
declaration. It marks the declaration as a reference to an absolute symbol,
which causes the backend to use absolute relocations for the symbol even
in position independent code, and expresses the possible ranges that the
global variable's *address* (not its value) is in, in the same format as
``range`` metadata, with the extension that the pair ``all-ones,all-ones``
may be used to represent the full set.

Example (assuming 64-bit pointers):

.. code-block:: llvm

      @a = external global i8, !absolute_symbol !0 ; Absolute symbol in range [0,256)
      @b = external global i8, !absolute_symbol !1 ; Absolute symbol in range [0,2^64)

    ...
    !0 = !{ i64 0, i64 256 }
    !1 = !{ i64 -1, i64 -1 }

'``callees``' Metadata
^^^^^^^^^^^^^^^^^^^^^^

``callees`` metadata may be attached to indirect call sites. If ``callees``
metadata is attached to a call site, and any callee is not among the set of
functions provided by the metadata, the behavior is undefined. The intent of
this metadata is to facilitate optimizations such as indirect-call promotion.
For example, in the code below, the call instruction may only target the
``add`` or ``sub`` functions:

.. code-block:: llvm

    %result = call i64 %binop(i64 %x, i64 %y), !callees !0

    ...
    !0 = !{i64 (i64, i64)* @add, i64 (i64, i64)* @sub}

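A sketch of what indirect-call promotion could produce for this call site
(the enclosing function ``@apply`` and the block names are assumptions for
illustration; an actual pass may generate different code). Since ``@add`` and
``@sub`` are the only permitted targets, a single pointer comparison is enough
to turn the indirect call into two direct calls:

.. code-block:: llvm

    declare i64 @add(i64, i64)
    declare i64 @sub(i64, i64)

    define i64 @apply(i64 (i64, i64)* %binop, i64 %x, i64 %y) {
    entry:
      %is_add = icmp eq i64 (i64, i64)* %binop, @add
      br i1 %is_add, label %call_add, label %call_sub

    call_add:
      %r0 = call i64 @add(i64 %x, i64 %y)
      br label %done

    call_sub:
      ; The only other target permitted by the !callees set.
      %r1 = call i64 @sub(i64 %x, i64 %y)
      br label %done

    done:
      %result = phi i64 [ %r0, %call_add ], [ %r1, %call_sub ]
      ret i64 %result
    }
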
'``unpredictable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``unpredictable`` metadata may be attached to any branch or switch
instruction. It can be used to express the unpredictability of control
flow. Similar to the ``llvm.expect`` intrinsic, it may be used to alter
optimizations related to compare and branch instructions. The metadata
is treated as a boolean value; if it exists, it signals that the branch
or switch that it is attached to is completely unpredictable.

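A minimal sketch of attaching the metadata (function and label names are
hypothetical). Because only the presence of the metadata matters, an empty
node is used as its operand here:

.. code-block:: llvm

    define i32 @select_path(i1 %cond) {
    entry:
      br i1 %cond, label %then, label %else, !unpredictable !0

    then:
      ret i32 1

    else:
      ret i32 2
    }

    !0 = !{}
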
'``llvm.loop``'
^^^^^^^^^^^^^^^

It is sometimes useful to attach information to loop constructs. Currently,
loop metadata is implemented as metadata attached to the branch instruction
in the loop latch block. This type of metadata refers to a metadata node that
is guaranteed to be separate for each loop. The loop identifier metadata is
specified with the name ``llvm.loop``.

The loop identifier metadata is implemented using a metadata that refers to
itself to avoid merging it with any other identifier metadata, e.g.,
during module linkage or function inlining. That is, each loop should refer
to its own identification metadata even if the loops reside in separate
functions. The following example contains loop identifier metadata for two
separate loop constructs:

.. code-block:: llvm

    !0 = !{!0}
    !1 = !{!1}

The loop identifier metadata can be used to specify additional
per-loop metadata. Any operands after the first operand can be treated
as user-defined metadata. For example, the ``llvm.loop.unroll.count``
metadata suggests an unroll factor to the loop unroller:

.. code-block:: llvm

      br i1 %exitcond, label %._crit_edge, label %.lr.ph, !llvm.loop !0
    ...
    !0 = !{!0, !1}
    !1 = !{!"llvm.loop.unroll.count", i32 4}

'``llvm.loop.vectorize``' and '``llvm.loop.interleave``'
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Metadata prefixed with ``llvm.loop.vectorize`` or ``llvm.loop.interleave`` are
used to control per-loop vectorization and interleaving parameters such as
vectorization width and interleave count. These metadata should be used in
conjunction with ``llvm.loop`` loop identification metadata. The
``llvm.loop.vectorize`` and ``llvm.loop.interleave`` metadata are only
optimization hints and the optimizer will only interleave and vectorize loops if
it believes it is safe to do so. The ``llvm.mem.parallel_loop_access`` metadata,
which contains information about loop-carried memory dependencies, can be
helpful in determining the safety of these transformations.

'``llvm.loop.interleave.count``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata suggests an interleave count to the loop interleaver.
The first operand is the string ``llvm.loop.interleave.count`` and the
second operand is an integer specifying the interleave count. For
example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.interleave.count", i32 4}

Note that setting ``llvm.loop.interleave.count`` to 1 disables interleaving
multiple iterations of the loop. If ``llvm.loop.interleave.count`` is set to 0
then the interleave count will be determined automatically.

'``llvm.loop.vectorize.enable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata selectively enables or disables vectorization for the loop. The
first operand is the string ``llvm.loop.vectorize.enable`` and the second operand
is a bit. If the bit operand value is 1, vectorization is enabled. A value of
0 disables vectorization:

.. code-block:: llvm

    !0 = !{!"llvm.loop.vectorize.enable", i1 0}
    !1 = !{!"llvm.loop.vectorize.enable", i1 1}

'``llvm.loop.vectorize.width``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata sets the target width of the vectorizer. The first
operand is the string ``llvm.loop.vectorize.width`` and the second
operand is an integer specifying the width. For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.vectorize.width", i32 4}

Note that setting ``llvm.loop.vectorize.width`` to 1 disables
vectorization of the loop. If ``llvm.loop.vectorize.width`` is set to
0, or if the loop does not have this metadata, the width will be
determined automatically.

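Putting these hints together, the following sketch shows a complete loop
(function and value names are hypothetical, and ``%n`` is assumed to be
non-zero) whose latch branch carries a loop identifier with both a
vectorization width and an interleave count. As noted above, these remain
hints that the optimizer may decline to follow.

.. code-block:: llvm

    define void @scale(float* %a, i64 %n) {
    entry:
      br label %loop

    loop:
      %i = phi i64 [ 0, %entry ], [ %i.next, %loop ]
      %p = getelementptr inbounds float, float* %a, i64 %i
      %v = load float, float* %p, align 4
      %m = fmul float %v, 2.000000e+00
      store float %m, float* %p, align 4
      %i.next = add i64 %i, 1
      %done = icmp eq i64 %i.next, %n
      ; The latch branch carries the loop identifier.
      br i1 %done, label %exit, label %loop, !llvm.loop !0

    exit:
      ret void
    }

    !0 = !{!0, !1, !2}   ; self-referential loop identifier plus two hints
    !1 = !{!"llvm.loop.vectorize.width", i32 4}
    !2 = !{!"llvm.loop.interleave.count", i32 2}
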
'``llvm.loop.unroll``'
^^^^^^^^^^^^^^^^^^^^^^

Metadata prefixed with ``llvm.loop.unroll`` are loop unrolling
optimization hints such as the unroll factor. ``llvm.loop.unroll``
metadata should be used in conjunction with ``llvm.loop`` loop
identification metadata. The ``llvm.loop.unroll`` metadata are only
optimization hints and the unrolling will only be performed if the
optimizer believes it is safe to do so.

'``llvm.loop.unroll.count``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata suggests an unroll factor to the loop unroller. The
first operand is the string ``llvm.loop.unroll.count`` and the second
operand is a positive integer specifying the unroll factor. For
example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll.count", i32 4}

If the trip count of the loop is less than the unroll count, the loop
will be partially unrolled.

'``llvm.loop.unroll.disable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata disables loop unrolling. The metadata has a single operand
which is the string ``llvm.loop.unroll.disable``. For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll.disable"}

'``llvm.loop.unroll.runtime.disable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata disables runtime loop unrolling. The metadata has a single
operand which is the string ``llvm.loop.unroll.runtime.disable``. For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll.runtime.disable"}

'``llvm.loop.unroll.enable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata suggests that the loop should be fully unrolled if the trip count
is known at compile time and partially unrolled if the trip count is not known
at compile time. The metadata has a single operand which is the string
``llvm.loop.unroll.enable``. For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll.enable"}

'``llvm.loop.unroll.full``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata suggests that the loop should be unrolled fully. The
metadata has a single operand which is the string ``llvm.loop.unroll.full``.
For example:

.. code-block:: llvm

    !0 = !{!"llvm.loop.unroll.full"}

Ashutosh Nemadf6763a2016-02-06 07:47:48 +00005123'``llvm.loop.licm_versioning.disable``' Metadata
Ashutosh Nema5f0e4722016-02-06 09:24:37 +00005124^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Ashutosh Nemadf6763a2016-02-06 07:47:48 +00005125
5126This metadata indicates that the loop should not be versioned for the purpose
5127of enabling loop-invariant code motion (LICM). The metadata has a single operand
5128which is the string ``llvm.loop.licm_versioning.disable``. For example:
5129
5130.. code-block:: llvm
5131
5132 !0 = !{!"llvm.loop.licm_versioning.disable"}
5133
Adam Nemetd2fa4142016-04-27 05:28:18 +00005134'``llvm.loop.distribute.enable``' Metadata
Adam Nemet55dc0af2016-04-27 05:59:51 +00005135^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Adam Nemetd2fa4142016-04-27 05:28:18 +00005136
5137Loop distribution allows splitting a loop into multiple loops. Currently,
5138this is only performed if the entire loop cannot be vectorized due to unsafe
Hiroshi Inoueb93daec2017-07-02 12:44:27 +00005139memory dependencies. The transformation will attempt to isolate the unsafe
Adam Nemetd2fa4142016-04-27 05:28:18 +00005140dependencies into their own loop.
5141
5142This metadata can be used to selectively enable or disable distribution of the
5143loop. The first operand is the string ``llvm.loop.distribute.enable`` and the
5144second operand is a bit. If the bit operand value is 1 distribution is
5145enabled. A value of 0 disables distribution:
5146
5147.. code-block:: llvm
5148
5149 !0 = !{!"llvm.loop.distribute.enable", i1 0}
5150 !1 = !{!"llvm.loop.distribute.enable", i1 1}
5151
5152This metadata should be used in conjunction with ``llvm.loop`` loop
5153identification metadata.
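
As a sketch, the hint could be carried by a loop identifier as follows. The
block and value names are hypothetical, and the self-referential identifier
form is assumed to be the one used for ``llvm.loop`` elsewhere in this
document:

.. code-block:: llvm

   for.body:
     ...
     br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
   ...
   !0 = !{!0, !1}                                   ; loop identifier
   !1 = !{!"llvm.loop.distribute.enable", i1 1}     ; request distribution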

'``llvm.mem``'
^^^^^^^^^^^^^^^

Metadata types used to annotate memory accesses with information helpful
for optimizations are prefixed with ``llvm.mem``.

'``llvm.mem.parallel_loop_access``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The ``llvm.mem.parallel_loop_access`` metadata refers to a loop identifier,
or metadata containing a list of loop identifiers for nested loops.
The metadata is attached to memory accessing instructions and denotes that
no loop carried memory dependence exists between it and other instructions denoted
with the same loop identifier. The metadata on memory reads also implies that
if-conversion (i.e. speculative execution within a loop iteration) is safe.

Precisely, given two instructions ``m1`` and ``m2`` that both have the
``llvm.mem.parallel_loop_access`` metadata, with ``L1`` and ``L2`` being the
set of loops associated with that metadata, respectively, then there is no loop
carried dependence between ``m1`` and ``m2`` for loops in both ``L1`` and
``L2``.

As a special case, if all memory accessing instructions in a loop have
``llvm.mem.parallel_loop_access`` metadata that refers to that loop, then the
loop has no loop carried memory dependences and is considered to be a parallel
loop.

Note that if not all memory access instructions have such metadata referring to
the loop, then the loop is not considered trivially parallel. Additional
memory dependence analysis is required to make that determination. As a fail
safe mechanism, this causes loops that were originally parallel to be considered
sequential (if optimization passes that are unaware of the parallel semantics
insert new memory instructions into the loop body).
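
The fail safe case can be sketched as follows. This is an illustrative
fragment with hypothetical value names: the store carries no
``llvm.mem.parallel_loop_access`` metadata (as if it had been inserted by a
pass unaware of the parallel semantics), so the loop is no longer trivially
parallel:

.. code-block:: llvm

   for.body:
     ...
     %val0 = load i32, i32* %arrayidx, !llvm.mem.parallel_loop_access !0
     ...
     ; No parallel_loop_access metadata on this access, so dependence
     ; analysis is needed before treating the loop as parallel.
     store i32 %val0, i32* %arrayidx1
     ...
     br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0

   for.end:
   ...
   !0 = !{!0}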

Example of a loop that is considered parallel due to its correct use of
both ``llvm.loop`` and ``llvm.mem.parallel_loop_access``
metadata types that refer to the same loop identifier metadata.

.. code-block:: llvm

   for.body:
     ...
     %val0 = load i32, i32* %arrayidx, !llvm.mem.parallel_loop_access !0
     ...
     store i32 %val0, i32* %arrayidx1, !llvm.mem.parallel_loop_access !0
     ...
     br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0

   for.end:
   ...
   !0 = !{!0}

It is also possible to have nested parallel loops. In that case the
memory accesses refer to a list of loop identifier metadata nodes instead of
the loop identifier metadata node directly:

.. code-block:: llvm

   outer.for.body:
     ...
     %val1 = load i32, i32* %arrayidx3, !llvm.mem.parallel_loop_access !2
     ...
     br label %inner.for.body

   inner.for.body:
     ...
     %val0 = load i32, i32* %arrayidx1, !llvm.mem.parallel_loop_access !0
     ...
     store i32 %val0, i32* %arrayidx2, !llvm.mem.parallel_loop_access !0
     ...
     br i1 %exitcond, label %inner.for.end, label %inner.for.body, !llvm.loop !1

   inner.for.end:
     ...
     store i32 %val1, i32* %arrayidx4, !llvm.mem.parallel_loop_access !2
     ...
     br i1 %exitcond, label %outer.for.end, label %outer.for.body, !llvm.loop !2

   outer.for.end:                                          ; preds = %inner.for.end
   ...
   !0 = !{!1, !2} ; a list of loop identifiers
   !1 = !{!1} ; an identifier for the inner loop
   !2 = !{!2} ; an identifier for the outer loop

'``irr_loop``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^

``irr_loop`` metadata may be attached to the terminator instruction of a basic
block that's an irreducible loop header (note that an irreducible loop has more
than one header basic block). If ``irr_loop`` metadata is attached to the
terminator instruction of a basic block that is not really an irreducible loop
header, the behavior is undefined. The intent of this metadata is to improve the
accuracy of the block frequency propagation. For example, in the code below, the
block ``header0`` may have a loop header weight (relative to the other headers of
the irreducible loop) of 100:

.. code-block:: llvm

   header0:
   ...
   br i1 %cmp, label %t1, label %t2, !irr_loop !0

   ...
   !0 = !{!"loop_header_weight", i64 100}

Irreducible loop header weights are typically based on profile data.
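
Because the weight is relative to the other headers of the same loop, a
fuller sketch would annotate each header. The fragment below is illustrative
only: the block names, conditions, and weights are hypothetical:

.. code-block:: llvm

   header0:
     ...
     br i1 %cmp0, label %body, label %exit, !irr_loop !0

   header1:
     ...
     br i1 %cmp1, label %body, label %exit, !irr_loop !1

   ...
   !0 = !{!"loop_header_weight", i64 100}   ; weight of header0
   !1 = !{!"loop_header_weight", i64 300}   ; weight of header1

Under these assumed weights, ``header1`` is weighted three times as heavily as
``header0``.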

'``invariant.group``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The ``invariant.group`` metadata may be attached to ``load``/``store`` instructions.
The existence of the ``invariant.group`` metadata on the instruction tells
the optimizer that every ``load`` and ``store`` to the same pointer operand
within the same invariant group can be assumed to load or store the same
value (but see the ``llvm.invariant.group.barrier`` intrinsic which affects
when two pointers are considered the same). Pointers returned by bitcast or
getelementptr with only zero indices are considered the same.

Examples:

.. code-block:: llvm

   @unknownPtr = external global i8
   ...
   %ptr = alloca i8
   store i8 42, i8* %ptr, !invariant.group !0
   call void @foo(i8* %ptr)

   %a = load i8, i8* %ptr, !invariant.group !0 ; Can assume that value under %ptr didn't change
   call void @foo(i8* %ptr)
   %b = load i8, i8* %ptr, !invariant.group !1 ; Can't assume anything, because group changed

   %newPtr = call i8* @getPointer(i8* %ptr)
   %c = load i8, i8* %newPtr, !invariant.group !0 ; Can't assume anything, because we only have information about %ptr

   %unknownValue = load i8, i8* @unknownPtr
   store i8 %unknownValue, i8* %ptr, !invariant.group !0 ; Can assume that %unknownValue == 42

   call void @foo(i8* %ptr)
   %newPtr2 = call i8* @llvm.invariant.group.barrier(i8* %ptr)
   %d = load i8, i8* %newPtr2, !invariant.group !0 ; Can't step through invariant.group.barrier to get value of %ptr

   ...
   declare void @foo(i8*)
   declare i8* @getPointer(i8*)
   declare i8* @llvm.invariant.group.barrier(i8*)

   !0 = !{!"magic ptr"}
   !1 = !{!"other ptr"}

The ``invariant.group`` metadata must be dropped when replacing one pointer by
another based on aliasing information. This is because ``invariant.group`` is tied
to the SSA value of the pointer operand.

.. code-block:: llvm

   %v = load i8, i8* %x, !invariant.group !0
   ; if %x mustalias %y then we can replace the above instruction with
   %v = load i8, i8* %y


'``type``' Metadata
^^^^^^^^^^^^^^^^^^^

See :doc:`TypeMetadata`.

'``associated``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^

The ``associated`` metadata may be attached to a global object
declaration with a single argument that references another global object.

This metadata prevents discarding of the global object in linker GC
unless the referenced object is also discarded. The linker support for
this feature is spotty. For best compatibility, globals carrying this
metadata may also:

- Be in a comdat with the referenced global.
- Be in ``@llvm.compiler.used``.
- Have an explicit section with a name which is a valid C identifier.

It does not have any effect on non-ELF targets.

Example:

.. code-block:: text

    $a = comdat any
    @a = global i32 1, comdat $a
    @b = internal global i32 2, comdat $a, section "abc", !associated !0
    !0 = !{i32* @a}


'``prof``' Metadata
^^^^^^^^^^^^^^^^^^^

The ``prof`` metadata is used to record profile data in the IR.
The first operand of the metadata node indicates the profile metadata
type. There are currently 3 types:
:ref:`branch_weights<prof_node_branch_weights>`,
:ref:`function_entry_count<prof_node_function_entry_count>`, and
:ref:`VP<prof_node_VP>`.

.. _prof_node_branch_weights:

branch_weights
""""""""""""""

Branch weight metadata attached to a branch, select, switch or call instruction
represents the likelihood of the associated branch being taken.
For more information, see :doc:`BranchWeightMetadata`.
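
For illustration, a conditional branch annotated with branch weights might
look like the sketch below. The operand layout assumed here (the string
followed by one ``i32`` weight per successor) is the one described in
:doc:`BranchWeightMetadata`; the labels and weights are made up:

.. code-block:: llvm

   ; The first weight applies to %likely, the second to %unlikely.
   br i1 %cond, label %likely, label %unlikely, !prof !0
   ...
   !0 = !{!"branch_weights", i32 64, i32 4}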

.. _prof_node_function_entry_count:

function_entry_count
""""""""""""""""""""

Function entry count metadata can be attached to function definitions
to record the number of times the function is called. Used with block
frequency information (BFI), it is also used to derive the basic block
profile count. For more information, see :doc:`BranchWeightMetadata`.
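
A minimal sketch of a function definition carrying an entry count, assuming
the node layout described in :doc:`BranchWeightMetadata` (the string followed
by an ``i64`` count). The function ``@foo`` and the count are hypothetical:

.. code-block:: llvm

   define void @foo() !prof !0 {
     ret void
   }

   !0 = !{!"function_entry_count", i64 1000}   ; @foo was entered 1000 times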

.. _prof_node_VP:

VP
""

VP (value profile) metadata can be attached to instructions that have
value profile information. Currently this is indirect calls (where it
records the hottest callees) and calls to memory intrinsics such as memcpy,
memmove, and memset (where it records the hottest byte lengths).

Each VP metadata node contains the string "VP", then a uint32_t value for the
value profiling kind, a uint64_t value for the total number of times the
instruction is executed, followed by pairs of a uint64_t value and its
execution count.
The value profiling kind is 0 for indirect call targets and 1 for memory
operations. For indirect call targets, each profile value is a hash
of the callee function name, and for memory operations each value is the
byte length.

Note that the value counts do not need to add up to the total count
listed in the third operand (in practice only the top hottest values
are tracked and reported).

Indirect call example:

.. code-block:: llvm

    call void %f(), !prof !1
    !1 = !{!"VP", i32 0, i64 1600, i64 7651369219802541373, i64 1030, i64 -4377547752858689819, i64 410}

Note that the VP type is 0 (the second operand), which indicates that this is
indirect call value profile data. The third operand indicates that the
indirect call executed 1600 times. The 4th and 6th operands give the
hashes of the 2 hottest target functions' names (this is the same hash used
to represent function names in the profile database), and the 5th and 7th
operands give the number of times each of the respective target functions
was called.
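
Memory operation example (a sketch with made-up counts; the annotated call
itself is elided and only the metadata node is shown):

.. code-block:: llvm

    ; Attached via !prof !2 to a call to a memory intrinsic such as memcpy.
    !2 = !{!"VP", i32 1, i64 1000, i64 8, i64 600, i64 16, i64 300}

Here the VP type is 1, so the values are byte lengths: of the 1000 executions,
600 used a length of 8 and 300 used a length of 16. The remaining 100
executions are not itemized, which is permitted because the value counts need
not add up to the total.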

Module Flags Metadata
=====================

Information about the module as a whole is difficult to convey to LLVM's
subsystems. The LLVM IR isn't sufficient to transmit this information.
The ``llvm.module.flags`` named metadata exists in order to facilitate
this. These flags are in the form of key / value pairs --- much like a
dictionary --- making it easy for any subsystem that cares about a flag to
look it up.

The ``llvm.module.flags`` metadata contains a list of metadata triplets.
Each triplet has the following form:

- The first element is a *behavior* flag, which specifies the behavior
  when two (or more) modules are merged together, and the linker encounters
  two (or more) metadata with the same ID. The supported behaviors are
  described below.
- The second element is a metadata string that is a unique ID for the
  metadata. Each module may only have one flag entry for each unique ID (not
  including entries with the **Require** behavior).
- The third element is the value of the flag.

When two (or more) modules are merged together, the resulting
``llvm.module.flags`` metadata is the union of the modules' flags. That is, for
each unique metadata ID string, there will be exactly one entry in the merged
module's ``llvm.module.flags`` metadata table, and the value for that entry will
be determined by the merge behavior flag, as described below. The only exception
is that entries with the **Require** behavior are always preserved.

The following behaviors are supported:

.. list-table::
   :header-rows: 1
   :widths: 10 90

   * - Value
     - Behavior

   * - 1
     - **Error**
           Emits an error if two values disagree, otherwise the resulting value
           is that of the operands.

   * - 2
     - **Warning**
           Emits a warning if two values disagree. The result value will be the
           operand for the flag from the first module being linked.

   * - 3
     - **Require**
           Adds a requirement that another module flag be present and have a
           specified value after linking is performed. The value must be a
           metadata pair, where the first element of the pair is the ID of the
           module flag to be restricted, and the second element of the pair is
           the value the module flag should be restricted to. This behavior can
           be used to restrict the allowable results (via triggering of an
           error) of linking IDs with the **Override** behavior.

   * - 4
     - **Override**
           Uses the specified value, regardless of the behavior or value of the
           other module. If both modules specify **Override**, but the values
           differ, an error will be emitted.

   * - 5
     - **Append**
           Appends the two values, which are required to be metadata nodes.

   * - 6
     - **AppendUnique**
           Appends the two values, which are required to be metadata
           nodes. However, duplicate entries in the second list are dropped
           during the append operation.

   * - 7
     - **Max**
           Takes the max of the two values, which are required to be integers.

It is an error for a particular unique flag ID to have multiple behaviors,
except in the case of **Require** (which adds restrictions on another metadata
value) or **Override**.

An example of module flags:

.. code-block:: llvm

    !0 = !{ i32 1, !"foo", i32 1 }
    !1 = !{ i32 4, !"bar", i32 37 }
    !2 = !{ i32 2, !"qux", i32 42 }
    !3 = !{ i32 3, !"qux",
      !{
        !"foo", i32 1
      }
    }
    !llvm.module.flags = !{ !0, !1, !2, !3 }

-  Metadata ``!0`` has the ID ``!"foo"`` and the value '1'. The behavior
   if two or more ``!"foo"`` flags are seen is to emit an error if their
   values are not equal.

-  Metadata ``!1`` has the ID ``!"bar"`` and the value '37'. The
   behavior if two or more ``!"bar"`` flags are seen is to use the value
   '37'.

-  Metadata ``!2`` has the ID ``!"qux"`` and the value '42'. The
   behavior if two or more ``!"qux"`` flags are seen is to emit a
   warning if their values are not equal.

-  Metadata ``!3`` has the ID ``!"qux"`` and the value:

   ::

       !{ !"foo", i32 1 }

   The behavior is to emit an error if the ``llvm.module.flags`` does not
   contain a flag with the ID ``!"foo"`` that has the value '1' after linking is
   performed.
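
The merge rules can be sketched by showing two input modules and the flags
expected after linking them. This is an illustration under assumptions, not
output from a real link: the flag ID ``!"baz"`` and all values are
hypothetical, and the three fragments would live in separate modules:

.. code-block:: llvm

    ; Module A
    !llvm.module.flags = !{!0, !1}
    !0 = !{i32 1, !"foo", i32 1}     ; Error: values must agree
    !1 = !{i32 7, !"baz", i32 2}     ; Max

    ; Module B
    !llvm.module.flags = !{!0, !1}
    !0 = !{i32 1, !"foo", i32 1}
    !1 = !{i32 7, !"baz", i32 5}

    ; Linked result: exactly one entry per unique ID
    !llvm.module.flags = !{!0, !1}
    !0 = !{i32 1, !"foo", i32 1}     ; both modules agreed on 1
    !1 = !{i32 7, !"baz", i32 5}     ; max(2, 5)

Had the two ``!"foo"`` values differed, the **Error** behavior would have
caused the link to fail instead.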

Objective-C Garbage Collection Module Flags Metadata
----------------------------------------------------

On the Mach-O platform, Objective-C stores metadata about garbage
collection in a special section called "image info". The metadata
consists of a version number and a bitmask specifying what types of
garbage collection are supported (if any) by the file. If two or more
modules are linked together their garbage collection metadata needs to
be merged rather than appended together.

The Objective-C garbage collection module flags metadata consists of the
following key-value pairs:

.. list-table::
   :header-rows: 1
   :widths: 30 70

   * - Key
     - Value

   * - ``Objective-C Version``
     - **[Required]** --- The Objective-C ABI version. Valid values are 1 and 2.

   * - ``Objective-C Image Info Version``
     - **[Required]** --- The version of the image info section. Currently
       always 0.

   * - ``Objective-C Image Info Section``
     - **[Required]** --- The section in which to place the metadata. Valid
       values are ``"__OBJC, __image_info, regular"`` for Objective-C ABI
       version 1, and ``"__DATA,__objc_imageinfo, regular, no_dead_strip"`` for
       Objective-C ABI version 2.

   * - ``Objective-C Garbage Collection``
     - **[Required]** --- Specifies whether garbage collection is supported or
       not. Valid values are 0, for no garbage collection, and 2, for garbage
       collection supported.

   * - ``Objective-C GC Only``
     - **[Optional]** --- Specifies that only garbage collection is supported.
       If present, its value must be 6. This flag requires that the
       ``Objective-C Garbage Collection`` flag have the value 2.

Some important flag interactions:

-  If a module with ``Objective-C Garbage Collection`` set to 0 is
   merged with a module with ``Objective-C Garbage Collection`` set to
   2, then the resulting module has the
   ``Objective-C Garbage Collection`` flag set to 0.
-  A module with ``Objective-C Garbage Collection`` set to 0 cannot be
   merged with a module with ``Objective-C GC Only`` set to 6.
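
As a sketch, a module built for Objective-C ABI version 2 without garbage
collection might carry the required keys as shown below. The keys and values
come from the table above; the behavior value chosen for each flag (the first
operand) is an assumption made for illustration and is not specified by this
section:

.. code-block:: llvm

    !llvm.module.flags = !{!0, !1, !2, !3}
    !0 = !{i32 1, !"Objective-C Version", i32 2}
    !1 = !{i32 1, !"Objective-C Image Info Version", i32 0}
    !2 = !{i32 1, !"Objective-C Image Info Section",
           !"__DATA,__objc_imageinfo, regular, no_dead_strip"}
    !3 = !{i32 1, !"Objective-C Garbage Collection", i32 0}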

C type width Module Flags Metadata
----------------------------------

The ARM backend emits a section into each generated object file describing the
options that it was compiled with (in a compiler-independent way) to prevent
linking incompatible objects, and to allow automatic library selection. Some
of these options are not visible at the IR level, namely wchar_t width and enum
width.

To pass this information to the backend, these options are encoded in module
flags metadata, using the following key-value pairs:

.. list-table::
   :header-rows: 1
   :widths: 30 70

   * - Key
     - Value

   * - short_wchar
     - * 0 --- sizeof(wchar_t) == 4
       * 1 --- sizeof(wchar_t) == 2

   * - short_enum
     - * 0 --- Enums are at least as large as an ``int``.
       * 1 --- Enums are stored in the smallest integer type which can
         represent all of their values.

For example, the following metadata section specifies that the module was
compiled with a ``wchar_t`` width of 4 bytes, and the underlying type of an
enum is the smallest type which can represent all of its values::

    !llvm.module.flags = !{!0, !1}
    !0 = !{i32 1, !"short_wchar", i32 0}
    !1 = !{i32 1, !"short_enum", i32 1}

Automatic Linker Flags Named Metadata
=====================================

Some targets support embedding flags to the linker inside individual object
files. Typically this is used in conjunction with language extensions which
allow source files to explicitly declare the libraries they depend on, and have
these automatically be transmitted to the linker via object files.

These flags are encoded in the IR using named metadata with the name
``!llvm.linker.options``. Each operand is expected to be a metadata node
which should be a list of other metadata nodes, each of which should be a
list of metadata strings defining linker options.

For example, the following metadata section specifies two separate sets of
linker options, presumably to link against ``libz`` and the ``Cocoa``
framework::

    !0 = !{ !"-lz" }
    !1 = !{ !"-framework", !"Cocoa" }
    !llvm.linker.options = !{ !0, !1 }

The metadata encoding as lists of lists of options, as opposed to a collapsed
list of options, is chosen so that the IR encoding can use multiple option
strings to specify e.g., a single library, while still having that specifier be
preserved as an atomic element that can be recognized by a target specific
assembly writer or object file emitter.

Each individual option is required to be either a valid option for the target's
linker, or an option that is reserved by the target specific assembly writer or
object file emitter. No other aspect of these options is defined by the IR.

.. _intrinsicglobalvariables:

Intrinsic Global Variables
==========================

LLVM has a number of "magic" global variables that contain data that
affect code generation or other IR semantics. These are documented here.
All globals of this sort should have a section specified as
"``llvm.metadata``". This section and all globals that start with
"``llvm.``" are reserved for use by LLVM.

.. _gv_llvmused:

The '``llvm.used``' Global Variable
-----------------------------------

The ``@llvm.used`` global is an array which has
:ref:`appending linkage <linkage_appending>`. This array contains a list of
pointers to named global variables, functions and aliases which may optionally
have a pointer cast formed of bitcast or getelementptr. For example, a legal
use of it is:

.. code-block:: llvm

    @X = global i8 4
    @Y = global i32 123

    @llvm.used = appending global [2 x i8*] [
       i8* @X,
       i8* bitcast (i32* @Y to i8*)
    ], section "llvm.metadata"

If a symbol appears in the ``@llvm.used`` list, then the compiler, assembler,
and linker are required to treat the symbol as if there is a reference to the
symbol that they cannot see (which is why the symbols have to be named). For
example, if a variable has internal linkage and no references other than that
from the ``@llvm.used`` list, it cannot be deleted. This is commonly used to
represent references from inline asms and other things the compiler cannot
"see", and corresponds to "``__attribute__((used))``" in GNU C.

On some targets, the code generator must emit a directive to the
assembler or object file to prevent the assembler and linker from
modifying or removing the symbol.

.. _gv_llvmcompilerused:

The '``llvm.compiler.used``' Global Variable
--------------------------------------------

The ``@llvm.compiler.used`` directive is the same as the ``@llvm.used``
directive, except that it only prevents the compiler from touching the
symbol. On targets that support it, this allows an intelligent linker to
optimize references to the symbol without being impeded as it would be
by ``@llvm.used``.

This construct should only be used in rare circumstances,
and should not be exposed to source languages.
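
By analogy with the ``@llvm.used`` example, a use of this variable might look
like the sketch below. The global ``@Z`` is hypothetical:

.. code-block:: llvm

    @Z = internal global i32 7

    ; The compiler must keep @Z, but the linker remains free to discard it.
    @llvm.compiler.used = appending global [1 x i8*] [
       i8* bitcast (i32* @Z to i8*)
    ], section "llvm.metadata"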

.. _gv_llvmglobalctors:

The '``llvm.global_ctors``' Global Variable
-------------------------------------------

.. code-block:: llvm

    %0 = type { i32, void ()*, i8* }
    @llvm.global_ctors = appending global [1 x %0] [%0 { i32 65535, void ()* @ctor, i8* @data }]

The ``@llvm.global_ctors`` array contains a list of constructor
functions, priorities, and an optional associated global or function.
The functions referenced by this array will be called in ascending order
of priority (i.e. lowest first) when the module is loaded. The order of
functions with the same priority is not defined.

If the third field is present, non-null, and points to a global variable
or function, the initializer function will only run if the associated
data from the current module is not discarded.
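
To illustrate the ordering rule, the sketch below registers two constructors
with different priorities. The function names ``@init_early`` and
``@init_late`` and the priority 101 are hypothetical:

.. code-block:: llvm

    %0 = type { i32, void ()*, i8* }

    ; @init_early (priority 101) is called before @init_late (priority 65535).
    ; The null third field means neither entry is tied to associated data.
    @llvm.global_ctors = appending global [2 x %0] [
      %0 { i32 101, void ()* @init_early, i8* null },
      %0 { i32 65535, void ()* @init_late, i8* null }
    ]

    declare void @init_early()
    declare void @init_late()

If the same two entries appeared in ``@llvm.global_dtors``, the order would be
reversed, since destructors run in descending order of priority.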
Sean Silvab084af42012-12-07 10:36:55 +00005728
Eli Bendersky0220e6b2013-06-07 20:24:43 +00005729.. _llvmglobaldtors:
5730
Sean Silvab084af42012-12-07 10:36:55 +00005731The '``llvm.global_dtors``' Global Variable
5732-------------------------------------------
5733
5734.. code-block:: llvm
5735
Reid Klecknerfceb76f2014-05-16 20:39:27 +00005736 %0 = type { i32, void ()*, i8* }
5737 @llvm.global_dtors = appending global [1 x %0] [%0 { i32 65535, void ()* @dtor, i8* @data }]
Sean Silvab084af42012-12-07 10:36:55 +00005738
Reid Klecknerfceb76f2014-05-16 20:39:27 +00005739The ``@llvm.global_dtors`` array contains a list of destructor
5740functions, priorities, and an optional associated global or function.
5741The functions referenced by this array will be called in descending
Reid Klecknerbffbcc52014-05-27 21:35:17 +00005742order of priority (i.e. highest first) when the module is unloaded. The
Reid Klecknerfceb76f2014-05-16 20:39:27 +00005743order of functions with the same priority is not defined.
5744
5745If the third field is present, non-null, and points to a global variable
5746or function, the destructor function will only run if the associated
5747data from the current module is not discarded.
Sean Silvab084af42012-12-07 10:36:55 +00005748
Instruction Reference
=====================

The LLVM instruction set consists of several different classifications
of instructions: :ref:`terminator instructions <terminators>`, :ref:`binary
instructions <binaryops>`, :ref:`bitwise binary
instructions <bitwiseops>`, :ref:`memory instructions <memoryops>`, and
:ref:`other instructions <otherops>`.

.. _terminators:

Terminator Instructions
-----------------------

As mentioned :ref:`previously <functionstructure>`, every basic block in a
program ends with a "Terminator" instruction, which indicates which
block should be executed after the current block is finished. These
terminator instructions typically yield a '``void``' value: they produce
control flow, not values (the one exception being the
':ref:`invoke <i_invoke>`' instruction).

The terminator instructions are: ':ref:`ret <i_ret>`',
':ref:`br <i_br>`', ':ref:`switch <i_switch>`',
':ref:`indirectbr <i_indirectbr>`', ':ref:`invoke <i_invoke>`',
':ref:`resume <i_resume>`', ':ref:`catchswitch <i_catchswitch>`',
':ref:`catchret <i_catchret>`',
':ref:`cleanupret <i_cleanupret>`',
and ':ref:`unreachable <i_unreachable>`'.

.. _i_ret:

'``ret``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      ret <type> <value>       ; Return a value from a non-void function
      ret void                 ; Return from void function

Overview:
"""""""""

The '``ret``' instruction is used to return control flow (and optionally
a value) from a function back to the caller.

There are two forms of the '``ret``' instruction: one that returns a
value and then causes control flow, and one that just causes control
flow to occur.

Arguments:
""""""""""

The '``ret``' instruction optionally accepts a single argument, the
return value. The type of the return value must be a ':ref:`first
class <t_firstclass>`' type.

A function is not :ref:`well formed <wellformed>` if it has a non-void
return type and contains a '``ret``' instruction with no return value or
a return value whose type does not match the function's return type, or
if it has a void return type and contains a '``ret``' instruction with a
return value.

Semantics:
""""""""""

When the '``ret``' instruction is executed, control flow returns back to
the calling function's context. If the caller is a
":ref:`call <i_call>`" instruction, execution continues at the
instruction after the call. If the caller is an
":ref:`invoke <i_invoke>`" instruction, execution continues at the
beginning of the "normal" destination block. If the instruction returns
a value, that value shall set the call or invoke instruction's return
value.

Example:
""""""""

.. code-block:: llvm

    ret i32 5                       ; Return an integer value of 5
    ret void                        ; Return from a void function
    ret { i32, i8 } { i32 4, i8 2 } ; Return a struct of values 4 and 2

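The fragments above can also be read in the context of complete
functions. The following is a minimal sketch; the function names
``@five``, ``@nothing`` and ``@caller`` are hypothetical:

.. code-block:: llvm

    ; Value-returning form: the operand type must match the function's
    ; return type (i32 here).
    define i32 @five() {
      ret i32 5
    }

    ; Void form: no operand is given.
    define void @nothing() {
      ret void
    }

    ; The returned value becomes the result of the call instruction.
    define i32 @caller() {
      %r = call i32 @five()
      ret i32 %r
    }
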
.. _i_br:

'``br``' Instruction
^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      br i1 <cond>, label <iftrue>, label <iffalse>
      br label <dest>          ; Unconditional branch

Overview:
"""""""""

The '``br``' instruction is used to cause control flow to transfer to a
different basic block in the current function. There are two forms of
this instruction, corresponding to a conditional branch and an
unconditional branch.

Arguments:
""""""""""

The conditional branch form of the '``br``' instruction takes a single
'``i1``' value and two '``label``' values. The unconditional form of the
'``br``' instruction takes a single '``label``' value as a target.

Semantics:
""""""""""

Upon execution of a conditional '``br``' instruction, the '``i1``'
argument is evaluated. If the value is ``true``, control flows to the
'``iftrue``' ``label`` argument. If "cond" is ``false``, control flows
to the '``iffalse``' ``label`` argument.

Example:
""""""""

.. code-block:: llvm

    Test:
      %cond = icmp eq i32 %a, %b
      br i1 %cond, label %IfEqual, label %IfUnequal
    IfEqual:
      ret i32 1
    IfUnequal:
      ret i32 0

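Both forms commonly appear together in loops. The sketch below is only
an illustration (the function ``@sum_below`` and its value names are
hypothetical): the entry block uses the unconditional form to enter the
loop, and the loop block uses the conditional form to either repeat or
exit.

.. code-block:: llvm

    ; Adds 0 + 1 + ... while the incremented counter is still below %n.
    define i32 @sum_below(i32 %n) {
    entry:
      br label %loop                          ; unconditional form

    loop:
      %i = phi i32 [ 0, %entry ], [ %i.next, %loop ]
      %acc = phi i32 [ 0, %entry ], [ %acc.next, %loop ]
      %acc.next = add i32 %acc, %i
      %i.next = add i32 %i, 1
      %again = icmp ult i32 %i.next, %n
      br i1 %again, label %loop, label %exit  ; conditional form

    exit:
      ret i32 %acc.next
    }
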
.. _i_switch:

'``switch``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      switch <intty> <value>, label <defaultdest> [ <intty> <val>, label <dest> ... ]

Overview:
"""""""""

The '``switch``' instruction is used to transfer control flow to one of
several different places. It is a generalization of the '``br``'
instruction, allowing a branch to occur to one of many possible
destinations.

Arguments:
""""""""""

The '``switch``' instruction uses three parameters: an integer
comparison value '``value``', a default '``label``' destination, and an
array of pairs of comparison value constants and '``label``'s. The table
is not allowed to contain duplicate constant entries.

Semantics:
""""""""""

The ``switch`` instruction specifies a table of values and destinations.
When the '``switch``' instruction is executed, this table is searched
for the given value. If the value is found, control flow is transferred
to the corresponding destination; otherwise, control flow is transferred
to the default destination.

Implementation:
"""""""""""""""

Depending on properties of the target machine and the particular
``switch`` instruction, this instruction may be code generated in
different ways. For example, it could be generated as a series of
chained conditional branches or with a lookup table.

Example:
""""""""

.. code-block:: llvm

    ; Emulate a conditional br instruction
    %Val = zext i1 %value to i32
    switch i32 %Val, label %truedest [ i32 0, label %falsedest ]

    ; Emulate an unconditional br instruction
    switch i32 0, label %dest [ ]

    ; Implement a jump table:
    switch i32 %val, label %otherwise [ i32 0, label %onzero
                                        i32 1, label %onone
                                        i32 2, label %ontwo ]

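For context, here is a small self-contained sketch of a function built
around a ``switch``. The function ``@classify`` and its block names are
hypothetical. Any value of ``%x`` that is not in the table reaches the
default destination:

.. code-block:: llvm

    ; Maps 0 to 10 and 1 to 20; every other input yields -1.
    define i32 @classify(i32 %x) {
    entry:
      switch i32 %x, label %other [ i32 0, label %zero
                                    i32 1, label %one ]

    zero:
      ret i32 10

    one:
      ret i32 20

    other:                                  ; default destination
      ret i32 -1
    }
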
.. _i_indirectbr:

'``indirectbr``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      indirectbr <somety>* <address>, [ label <dest1>, label <dest2>, ... ]

Overview:
"""""""""

The '``indirectbr``' instruction implements an indirect branch to a
label within the current function, whose address is specified by
"``address``". The address must be derived from a
:ref:`blockaddress <blockaddress>` constant.

Arguments:
""""""""""

The '``address``' argument is the address of the label to jump to. The
rest of the arguments indicate the full set of possible destinations
that the address may point to. Blocks are allowed to occur multiple
times in the destination list, though this isn't particularly useful.

This destination list is required so that dataflow analysis has an
accurate understanding of the CFG.

Semantics:
""""""""""

Control transfers to the block specified in the address argument. All
possible destination blocks must be listed in the label list, otherwise
this instruction has undefined behavior. This implies that jumps to
labels defined in other functions have undefined behavior as well.

Implementation:
"""""""""""""""

This is typically implemented with a jump through a register.

Example:
""""""""

.. code-block:: llvm

    indirectbr i8* %Addr, [ label %bb1, label %bb2, label %bb3 ]

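The one-line example does not show where ``%Addr`` comes from. The
following sketch, with the hypothetical function ``@pick``, derives the
address from ``blockaddress`` constants and lists every block the
address can refer to:

.. code-block:: llvm

    define i32 @pick(i1 %flag) {
    entry:
      ; Choose between the addresses of two blocks of this function.
      %addr = select i1 %flag, i8* blockaddress(@pick, %yes),
                               i8* blockaddress(@pick, %no)
      ; Both possible targets appear in the destination list.
      indirectbr i8* %addr, [ label %yes, label %no ]

    yes:
      ret i32 1

    no:
      ret i32 0
    }
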
.. _i_invoke:

'``invoke``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = invoke [cconv] [ret attrs] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs]
                    [operand bundles] to label <normal label> unwind label <exception label>

Overview:
"""""""""

The '``invoke``' instruction causes control to transfer to a specified
function, with the possibility of control flow transfer to either the
'``normal``' label or the '``exception``' label. If the callee function
returns with the "``ret``" instruction, control flow will return to the
"normal" label. If the callee (or any indirect callees) returns via the
":ref:`resume <i_resume>`" instruction or other exception handling
mechanism, control is interrupted and continued at the dynamically
nearest "exception" label.

The '``exception``' label is a `landing
pad <ExceptionHandling.html#overview>`_ for the exception. As such, the
'``exception``' label is required to have the
":ref:`landingpad <i_landingpad>`" instruction, which contains the
information about the behavior of the program after unwinding happens,
as its first non-PHI instruction. The restrictions on the
"``landingpad``" instruction tightly couple it to the "``invoke``"
instruction, so that the important information contained within the
"``landingpad``" instruction can't be lost through normal code motion.

Arguments:
""""""""""

This instruction requires several arguments:

#. The optional "cconv" marker indicates which :ref:`calling
   convention <callingconv>` the call should use. If none is
   specified, the call defaults to using C calling conventions.
#. The optional :ref:`Parameter Attributes <paramattrs>` list for return
   values. Only '``zeroext``', '``signext``', and '``inreg``' attributes
   are valid here.
#. '``ty``': the type of the call instruction itself which is also the
   type of the return value. Functions that return no value are marked
   ``void``.
#. '``fnty``': shall be the signature of the function being invoked. The
   argument types must match the types implied by this signature. This
   type can be omitted if the function is not varargs.
#. '``fnptrval``': An LLVM value containing a pointer to a function to
   be invoked. In most cases, this is a direct function invocation, but
   indirect ``invoke``'s are just as possible, calling an arbitrary pointer
   to function value.
#. '``function args``': argument list whose types match the function
   signature argument types and parameter attributes. All arguments must
   be of :ref:`first class <t_firstclass>` type. If the function signature
   indicates the function accepts a variable number of arguments, the
   extra arguments can be specified.
#. '``normal label``': the label reached when the called function
   executes a '``ret``' instruction.
#. '``exception label``': the label reached when a callee returns via
   the :ref:`resume <i_resume>` instruction or other exception handling
   mechanism.
#. The optional :ref:`function attributes <fnattrs>` list.
#. The optional :ref:`operand bundles <opbundles>` list.

Semantics:
""""""""""

This instruction is designed to operate as a standard '``call``'
instruction in most regards. The primary difference is that it
establishes an association with a label, which is used by the runtime
library to unwind the stack.

This instruction is used in languages with destructors to ensure that
proper cleanup is performed in the case of either a ``longjmp`` or a
thrown exception. Additionally, this is important for implementation of
'``catch``' clauses in high-level languages that support them.

For the purposes of the SSA form, the definition of the value returned
by the '``invoke``' instruction is deemed to occur on the edge from the
current block to the "normal" label. If the callee unwinds then no
return value is available.

Example:
""""""""

.. code-block:: llvm

    %retval = invoke i32 @Test(i32 15) to label %Continue
                unwind label %TestCleanup              ; i32:retval set
    %retval = invoke coldcc i32 %Testfnptr(i32 15) to label %Continue
                unwind label %TestCleanup              ; i32:retval set

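To show how the two destinations fit together, here is a hedged sketch
of a complete function. It assumes the Itanium C++ personality routine
``@__gxx_personality_v0``; the names ``@may_throw`` and ``@try_call`` are
hypothetical. A real C++ front end would emit additional runtime calls
in the handler, which are omitted here.

.. code-block:: llvm

    declare i32 @may_throw(i32)
    declare i32 @__gxx_personality_v0(...)

    define i32 @try_call() personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
    entry:
      %v = invoke i32 @may_throw(i32 15)
              to label %ok unwind label %lpad

    ok:                                     ; normal edge: %v is defined here
      ret i32 %v

    lpad:                                   ; unwind edge: %v is not available
      %exn = landingpad { i8*, i32 }
              catch i8* null                ; catch-all clause
      ret i32 -1
    }

Note that ``%v`` is only used in ``%ok``, consistent with the rule that
the result is defined on the edge to the "normal" label.
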
.. _i_resume:

'``resume``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      resume <type> <value>

Overview:
"""""""""

The '``resume``' instruction is a terminator instruction that has no
successors.

Arguments:
""""""""""

The '``resume``' instruction requires one argument, which must have the
same type as the result of any '``landingpad``' instruction in the same
function.

Semantics:
""""""""""

The '``resume``' instruction resumes propagation of an existing
(in-flight) exception whose unwinding was interrupted with a
:ref:`landingpad <i_landingpad>` instruction.

Example:
""""""""

.. code-block:: llvm

    resume { i8*, i32 } %exn

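A typical use is a cleanup-only landing pad that performs some work and
then lets the exception continue to propagate. The sketch below assumes
the Itanium C++ personality routine ``@__gxx_personality_v0``; the names
``@may_throw``, ``@release`` and ``@with_cleanup`` are hypothetical.

.. code-block:: llvm

    declare void @may_throw()
    declare void @release()
    declare i32 @__gxx_personality_v0(...)

    define void @with_cleanup() personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
    entry:
      invoke void @may_throw()
              to label %done unwind label %lpad

    done:
      call void @release()
      ret void

    lpad:
      %exn = landingpad { i8*, i32 }
              cleanup
      call void @release()                  ; run the cleanup
      resume { i8*, i32 } %exn              ; continue unwinding
    }

The operand of ``resume`` has the same type as the ``landingpad`` result,
as required by the Arguments section above.
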
.. _i_catchswitch:

'``catchswitch``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <resultval> = catchswitch within <parent> [ label <handler1>, label <handler2>, ... ] unwind to caller
      <resultval> = catchswitch within <parent> [ label <handler1>, label <handler2>, ... ] unwind label <default>

Overview:
"""""""""

The '``catchswitch``' instruction is used by `LLVM's exception handling system
<ExceptionHandling.html#overview>`_ to describe the set of possible catch handlers
that may be executed by the :ref:`EH personality routine <personalityfn>`.

Arguments:
""""""""""

The ``parent`` argument is the token of the funclet that contains the
``catchswitch`` instruction. If the ``catchswitch`` is not inside a funclet,
this operand may be the token ``none``.

The ``default`` argument is the label of another basic block beginning with
either a ``cleanuppad`` or ``catchswitch`` instruction. This unwind destination
must be a legal target with respect to the ``parent`` links, as described in
the `exception handling documentation\ <ExceptionHandling.html#wineh-constraints>`_.

The ``handlers`` are a nonempty list of successor blocks that each begin with a
:ref:`catchpad <i_catchpad>` instruction.

Semantics:
""""""""""

Executing this instruction transfers control to one of the successors in
``handlers``, if appropriate, or continues to unwind via the unwind label if
present.

The ``catchswitch`` is both a terminator and a "pad" instruction, meaning that
it must be both the first non-phi instruction and last instruction in the basic
block. Therefore, it must be the only non-phi instruction in the block.

Example:
""""""""

.. code-block:: text

    dispatch1:
      %cs1 = catchswitch within none [label %handler0, label %handler1] unwind to caller
    dispatch2:
      %cs2 = catchswitch within %parenthandler [label %handler0] unwind label %cleanup

.. _i_catchret:

'``catchret``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      catchret from <token> to label <normal>

Overview:
"""""""""

The '``catchret``' instruction is a terminator instruction that has a
single successor.

Arguments:
""""""""""

The first argument to a '``catchret``' indicates which ``catchpad`` it
exits. It must be a :ref:`catchpad <i_catchpad>`.
The second argument to a '``catchret``' specifies where control will
transfer to next.

Semantics:
""""""""""

The '``catchret``' instruction ends an existing (in-flight) exception whose
unwinding was interrupted with a :ref:`catchpad <i_catchpad>` instruction. The
:ref:`personality function <personalityfn>` gets a chance to execute arbitrary
code to, for example, destroy the active exception. Control then transfers to
``normal``.

The ``token`` argument must be a token produced by a ``catchpad`` instruction.
If the specified ``catchpad`` is not the most-recently-entered not-yet-exited
funclet pad (as described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
the ``catchret``'s behavior is undefined.

Example:
""""""""

.. code-block:: text

    catchret from %catch to label %continue

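The ``catchswitch``, ``catchpad`` and ``catchret`` instructions are
normally used together. The following sketch shows one possible
arrangement, assuming the MSVC C++ personality ``@__CxxFrameHandler3``
and a catch-all ``catchpad`` argument list of the form used in the
exception handling documentation; the names ``@may_throw`` and ``@f`` are
hypothetical.

.. code-block:: llvm

    declare void @may_throw()
    declare i32 @__CxxFrameHandler3(...)

    define i32 @f() personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
    entry:
      invoke void @may_throw()
              to label %done unwind label %dispatch

    dispatch:                               ; only non-phi instruction in block
      %cs = catchswitch within none [label %handler] unwind to caller

    handler:
      %cp = catchpad within %cs [i8* null, i32 64, i8* null]
      catchret from %cp to label %caught    ; leave the funclet

    caught:
      ret i32 1

    done:
      ret i32 0
    }
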
.. _i_cleanupret:

'``cleanupret``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      cleanupret from <value> unwind label <continue>
      cleanupret from <value> unwind to caller

Overview:
"""""""""

The '``cleanupret``' instruction is a terminator instruction that has
an optional successor.

Arguments:
""""""""""

The '``cleanupret``' instruction requires one argument, which indicates
which ``cleanuppad`` it exits, and must be a :ref:`cleanuppad <i_cleanuppad>`.
If the specified ``cleanuppad`` is not the most-recently-entered not-yet-exited
funclet pad (as described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
the ``cleanupret``'s behavior is undefined.

The '``cleanupret``' instruction also has an optional successor, ``continue``,
which must be the label of another basic block beginning with either a
``cleanuppad`` or ``catchswitch`` instruction. This unwind destination must
be a legal target with respect to the ``parent`` links, as described in the
`exception handling documentation\ <ExceptionHandling.html#wineh-constraints>`_.

Semantics:
""""""""""

The '``cleanupret``' instruction indicates to the
:ref:`personality function <personalityfn>` that one
:ref:`cleanuppad <i_cleanuppad>` it transferred control to has ended.
It transfers control to ``continue`` or unwinds out of the function.

Example:
""""""""

.. code-block:: text

    cleanupret from %cleanup unwind to caller
    cleanupret from %cleanup unwind label %continue

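As a hedged illustration of a cleanup funclet, the sketch below runs a
cleanup call on the unwind path and then continues unwinding into the
caller. It assumes the MSVC C++ personality ``@__CxxFrameHandler3``; the
names ``@may_throw``, ``@release`` and ``@g`` are hypothetical.

.. code-block:: llvm

    declare void @may_throw()
    declare void @release()
    declare i32 @__CxxFrameHandler3(...)

    define void @g() personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
    entry:
      invoke void @may_throw()
              to label %done unwind label %cleanup

    cleanup:
      %cu = cleanuppad within none []
      ; The "funclet" operand bundle ties the call to the cleanup pad.
      call void @release() [ "funclet"(token %cu) ]
      cleanupret from %cu unwind to caller

    done:
      call void @release()
      ret void
    }
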
.. _i_unreachable:

'``unreachable``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      unreachable

Overview:
"""""""""

The '``unreachable``' instruction has no defined semantics. This
instruction is used to inform the optimizer that a particular portion of
the code is not reachable. This can be used to indicate that the code
after a no-return function cannot be reached, and other facts.

Semantics:
""""""""""

The '``unreachable``' instruction has no defined semantics.

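The no-return case mentioned in the overview could look like the
following sketch. The function ``@get_or_die`` is hypothetical, and
``@abort`` is assumed to be an external function that never returns.
Because every basic block needs a terminator, ``unreachable`` closes the
block that ends in the no-return call.

.. code-block:: llvm

    declare void @abort() noreturn

    define i32 @get_or_die(i32* %p) {
    entry:
      %is_null = icmp eq i32* %p, null
      br i1 %is_null, label %fail, label %ok

    fail:
      call void @abort()
      unreachable                           ; control never gets past abort

    ok:
      %v = load i32, i32* %p
      ret i32 %v
    }
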
.. _binaryops:

Binary Operations
-----------------

Binary operators are used to do most of the computation in a program.
They require two operands of the same type, execute an operation on
them, and produce a single value. The operands might represent multiple
data, as is the case with the :ref:`vector <t_vector>` data type. The
result value has the same type as its operands.

There are several different binary operators:

.. _i_add:

'``add``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = add <ty> <op1>, <op2>          ; yields ty:result
      <result> = add nuw <ty> <op1>, <op2>      ; yields ty:result
      <result> = add nsw <ty> <op1>, <op2>      ; yields ty:result
      <result> = add nuw nsw <ty> <op1>, <op2>  ; yields ty:result

Overview:
"""""""""

The '``add``' instruction returns the sum of its two operands.

Arguments:
""""""""""

The two arguments to the '``add``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The value produced is the integer sum of the two operands.

If the sum has unsigned overflow, the result returned is the
mathematical result modulo 2\ :sup:`n`\ , where n is the bit width of
the result.

Because LLVM integers use a two's complement representation, this
instruction is appropriate for both signed and unsigned integers.

``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
result value of the ``add`` is a :ref:`poison value <poisonvalues>` if
unsigned and/or signed overflow, respectively, occurs.

Example:
""""""""

.. code-block:: text

    <result> = add i32 4, %var          ; yields i32:result = 4 + %var

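The wrapping and poison rules are easiest to see on a narrow type. The
following sketch uses ``i8`` constants, where ``i8 -1`` has the same bit
pattern as the unsigned value 255; the function and value names are
hypothetical.

.. code-block:: llvm

    define void @add_examples() {
      %wrap   = add i8 -1, 1        ; 255 + 1 wraps to 0 (modulo 2^8)
      %nuw    = add nuw i8 -1, 1    ; poison: 255 + 1 overflows as unsigned
      %nsw.ok = add nsw i8 -1, 1    ; 0: -1 + 1 does not overflow as signed
      %nsw    = add nsw i8 127, 1   ; poison: 127 + 1 overflows as signed
      ret void
    }

The same operands can therefore be fine under one flag and poison under
the other, which is why the two flags are tracked separately.
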
.. _i_fadd:

'``fadd``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = fadd [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``fadd``' instruction returns the sum of its two operands.

Arguments:
""""""""""

The two arguments to the '``fadd``' instruction must be :ref:`floating
point <t_floating>` or :ref:`vector <t_vector>` of floating point values.
Both arguments must have identical types.

Semantics:
""""""""""

The value produced is the floating point sum of the two operands. This
instruction can also take any number of :ref:`fast-math flags <fastmath>`,
which are optimization hints to enable otherwise unsafe floating point
optimizations.

Example:
""""""""

.. code-block:: text

    <result> = fadd float 4.0, %var          ; yields float:result = 4.0 + %var

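As a brief sketch of how fast-math flags are written, the function below
(with hypothetical names) performs the same kind of addition with no
flags, with two individual flags, and with the ``fast`` flag:

.. code-block:: llvm

    define float @fadd_examples(float %x, float %y) {
      %strict  = fadd float %x, %y            ; no flags
      %relaxed = fadd nnan ninf float %x, %y  ; optimizer may assume no NaN/Inf
      %fast    = fadd fast float %strict, %relaxed
      ret float %fast
    }
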
'``sub``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = sub <ty> <op1>, <op2>          ; yields ty:result
      <result> = sub nuw <ty> <op1>, <op2>      ; yields ty:result
      <result> = sub nsw <ty> <op1>, <op2>      ; yields ty:result
      <result> = sub nuw nsw <ty> <op1>, <op2>  ; yields ty:result

Overview:
"""""""""

The '``sub``' instruction returns the difference of its two operands.

Note that the '``sub``' instruction is used to represent the '``neg``'
instruction present in most other intermediate representations.

Arguments:
""""""""""

The two arguments to the '``sub``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The value produced is the integer difference of the two operands.

If the difference has unsigned overflow, the result returned is the
mathematical result modulo 2\ :sup:`n`\ , where n is the bit width of
the result.

Because LLVM integers use a two's complement representation, this
instruction is appropriate for both signed and unsigned integers.

``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
result value of the ``sub`` is a :ref:`poison value <poisonvalues>` if
unsigned and/or signed overflow, respectively, occurs.

Example:
""""""""

.. code-block:: text

    <result> = sub i32 4, %var          ; yields i32:result = 4 - %var
    <result> = sub i32 0, %var          ; yields i32:result = -%var

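The following sketch applies the overflow rules to ``i8`` operands and
shows the negation idiom in a complete function; the names are
hypothetical.

.. code-block:: llvm

    define i8 @sub_examples(i8 %x) {
      %wrap = sub i8 0, 1          ; 0 - 1 wraps to 255 (printed as i8 -1)
      %nuw  = sub nuw i8 0, 1      ; poison: 0 - 1 overflows as unsigned
      %nsw  = sub nsw i8 0, 1      ; -1: no signed overflow
      %neg  = sub i8 0, %x         ; negation of %x
      ret i8 %neg
    }
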
.. _i_fsub:

'``fsub``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = fsub [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``fsub``' instruction returns the difference of its two operands.

Note that the '``fsub``' instruction is used to represent the '``fneg``'
instruction present in most other intermediate representations.

Arguments:
""""""""""

The two arguments to the '``fsub``' instruction must be :ref:`floating
point <t_floating>` or :ref:`vector <t_vector>` of floating point values.
Both arguments must have identical types.

Semantics:
""""""""""

The value produced is the floating point difference of the two operands.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating point optimizations.

Example:
""""""""

.. code-block:: text

    <result> = fsub float 4.0, %var           ; yields float:result = 4.0 - %var
    <result> = fsub float -0.0, %var          ; yields float:result = -%var

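The negation idiom uses ``-0.0`` rather than ``0.0`` as the first operand.
Under IEEE 754 round-to-nearest arithmetic, ``0.0 - 0.0`` is ``+0.0``,
whereas ``-0.0 - 0.0`` is ``-0.0``, so only the ``-0.0`` form flips the
sign of a positive zero. A minimal sketch, with the hypothetical
function name ``@negate``:

.. code-block:: llvm

    define float @negate(float %x) {
      %neg = fsub float -0.0, %x     ; -%x, including the sign of zero
      ret float %neg
    }
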
'``mul``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = mul <ty> <op1>, <op2>          ; yields ty:result
      <result> = mul nuw <ty> <op1>, <op2>      ; yields ty:result
      <result> = mul nsw <ty> <op1>, <op2>      ; yields ty:result
      <result> = mul nuw nsw <ty> <op1>, <op2>  ; yields ty:result

Overview:
"""""""""

The '``mul``' instruction returns the product of its two operands.

Arguments:
""""""""""

The two arguments to the '``mul``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The value produced is the integer product of the two operands.

If the result of the multiplication has unsigned overflow, the result
returned is the mathematical result modulo 2\ :sup:`n`\ , where n is the
bit width of the result.

Because LLVM integers use a two's complement representation, and the
result is the same width as the operands, this instruction returns the
correct result for both signed and unsigned integers. If a full product
(e.g. ``i32`` * ``i32`` -> ``i64``) is needed, the operands should be
sign-extended or zero-extended as appropriate to the width of the full
product.

``nuw`` and ``nsw`` stand for "No Unsigned Wrap" and "No Signed Wrap",
respectively. If the ``nuw`` and/or ``nsw`` keywords are present, the
result value of the ``mul`` is a :ref:`poison value <poisonvalues>` if
unsigned and/or signed overflow, respectively, occurs.

Example:
""""""""

.. code-block:: text

    <result> = mul i32 4, %var          ; yields i32:result = 4 * %var

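The full-product advice above can be sketched as follows. The function
names are hypothetical. Because both operands originate from ``i32``
values, the 64-bit products cannot overflow, so the ``nsw`` and ``nuw``
flags are safe to state here:

.. code-block:: llvm

    ; Signed i32 * i32 -> i64
    define i64 @full_signed_product(i32 %a, i32 %b) {
      %a.wide = sext i32 %a to i64
      %b.wide = sext i32 %b to i64
      %prod = mul nsw i64 %a.wide, %b.wide
      ret i64 %prod
    }

    ; Unsigned i32 * i32 -> i64
    define i64 @full_unsigned_product(i32 %a, i32 %b) {
      %a.wide = zext i32 %a to i64
      %b.wide = zext i32 %b to i64
      %prod = mul nuw i64 %a.wide, %b.wide
      ret i64 %prod
    }
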
.. _i_fmul:

'``fmul``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = fmul [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``fmul``' instruction returns the product of its two operands.

Arguments:
""""""""""

The two arguments to the '``fmul``' instruction must be :ref:`floating
point <t_floating>` or :ref:`vector <t_vector>` of floating point values.
Both arguments must have identical types.

Semantics:
""""""""""

The value produced is the floating point product of the two operands.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating point optimizations.

Example:
""""""""

.. code-block:: text

    <result> = fmul float 4.0, %var          ; yields float:result = 4.0 * %var

'``udiv``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = udiv <ty> <op1>, <op2>         ; yields ty:result
      <result> = udiv exact <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``udiv``' instruction returns the quotient of its two operands.

Arguments:
""""""""""

The two arguments to the '``udiv``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The value produced is the unsigned integer quotient of the two operands.

Note that unsigned integer division and signed integer division are
distinct operations; for signed integer division, use '``sdiv``'.

Division by zero is undefined behavior. For vectors, if any element
of the divisor is zero, the operation has undefined behavior.

If the ``exact`` keyword is present, the result value of the ``udiv`` is
a :ref:`poison value <poisonvalues>` if ``op1`` is not a multiple of ``op2``
(as such, "((a udiv exact b) mul b) == a").

Example:
""""""""

.. code-block:: text

      <result> = udiv i32 4, %var          ; yields i32:result = 4 / %var

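Because a zero divisor is undefined behavior, a front end that cannot prove
the divisor is non-zero has to branch around the ``udiv``. The sketch below
shows one way to do that, together with a use of ``exact``. The function
names, and the choice to return 0 for a zero divisor, are assumptions made
for this example.

.. code-block:: llvm

      ; Hypothetical helper: returns 0 when %d is zero instead of executing the udiv.
      define i32 @safe_udiv(i32 %n, i32 %d) {
      entry:
        %is_zero = icmp eq i32 %d, 0
        br i1 %is_zero, label %done, label %divide

      divide:
        %q = udiv i32 %n, %d                     ; only reached when %d != 0
        br label %done

      done:
        %r = phi i32 [ 0, %entry ], [ %q, %divide ]
        ret i32 %r
      }

      ; Hypothetical helper: converts a byte count to a count of 4-byte words.
      define i32 @bytes_to_words(i32 %bytes) {
        %w = udiv exact i32 %bytes, 4            ; poison unless %bytes is a multiple of 4
        ret i32 %w
      }
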
'``sdiv``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = sdiv <ty> <op1>, <op2>         ; yields ty:result
      <result> = sdiv exact <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``sdiv``' instruction returns the quotient of its two operands.

Arguments:
""""""""""

The two arguments to the '``sdiv``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The value produced is the signed integer quotient of the two operands
rounded towards zero.

Note that signed integer division and unsigned integer division are
distinct operations; for unsigned integer division, use '``udiv``'.

Division by zero is undefined behavior. For vectors, if any element
of the divisor is zero, the operation has undefined behavior.
Overflow also leads to undefined behavior; this is a rare case, but can
occur, for example, by doing a 32-bit division of -2147483648 by -1.

If the ``exact`` keyword is present, the result value of the ``sdiv`` is
a :ref:`poison value <poisonvalues>` if the result would be rounded.

Example:
""""""""

.. code-block:: text

      <result> = sdiv i32 4, %var          ; yields i32:result = 4 / %var

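The rounding rule and the two undefined cases can be made concrete with
constants. This is a minimal sketch; both function names are hypothetical,
and the second function is just one possible way to test for the undefined
cases before dividing.

.. code-block:: llvm

      define i32 @sdiv_rounding() {
        %a = sdiv i32 7, 2             ; 3
        %b = sdiv i32 -7, 2            ; -3: rounded toward zero, not toward negative infinity
        %c = sdiv exact i32 -8, 2      ; -4: no rounding occurs, so 'exact' is satisfied
        %ab = add i32 %a, %b           ; 0
        %r = add i32 %ab, %c           ; -4
        ret i32 %r
      }

      ; Hypothetical guard: true when 'sdiv i32 %n, %d' would be undefined behavior.
      define i1 @sdiv_is_undefined(i32 %n, i32 %d) {
        %d_zero = icmp eq i32 %d, 0
        %n_min = icmp eq i32 %n, -2147483648
        %d_neg1 = icmp eq i32 %d, -1
        %overflow = and i1 %n_min, %d_neg1
        %r = or i1 %d_zero, %overflow
        ret i1 %r
      }
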
.. _i_fdiv:

'``fdiv``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = fdiv [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``fdiv``' instruction returns the quotient of its two operands.

Arguments:
""""""""""

The two arguments to the '``fdiv``' instruction must be :ref:`floating
point <t_floating>` or :ref:`vector <t_vector>` of floating point values.
Both arguments must have identical types.

Semantics:
""""""""""

The value produced is the floating point quotient of the two operands.
This instruction can also take any number of :ref:`fast-math
flags <fastmath>`, which are optimization hints to enable otherwise
unsafe floating point optimizations.

Example:
""""""""

.. code-block:: text

      <result> = fdiv float 4.0, %var          ; yields float:result = 4.0 / %var

'``urem``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = urem <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``urem``' instruction returns the remainder from the unsigned
division of its two arguments.

Arguments:
""""""""""

The two arguments to the '``urem``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

This instruction returns the unsigned integer *remainder* of a division.
This instruction always performs an unsigned division to get the
remainder.

Note that unsigned integer remainder and signed integer remainder are
distinct operations; for signed integer remainder, use '``srem``'.

Taking the remainder of a division by zero is undefined behavior.
For vectors, if any element of the divisor is zero, the operation has
undefined behavior.

Example:
""""""""

.. code-block:: text

      <result> = urem i32 4, %var          ; yields i32:result = 4 % %var

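Since the operands are interpreted as unsigned, a ``urem`` by a power of two
yields the same value as masking off the low bits. The two hypothetical
functions below are a sketch of that equivalence for a divisor of 8:

.. code-block:: llvm

      ; Hypothetical helpers: map an index onto one of 8 slots.
      define i32 @ring_slot(i32 %i) {
        %slot = urem i32 %i, 8         ; always in the range 0..7
        ret i32 %slot
      }

      define i32 @ring_slot_masked(i32 %i) {
        %slot = and i32 %i, 7          ; same result, because 8 is a power of two
        ret i32 %slot
      }

      ; For %i = -1 (4294967295 when read as unsigned) both functions return 7.
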
'``srem``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = srem <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``srem``' instruction returns the remainder from the signed
division of its two operands. This instruction can also take
:ref:`vector <t_vector>` versions of the values, in which case the elements
must be integers.

Arguments:
""""""""""

The two arguments to the '``srem``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

This instruction returns the *remainder* of a division (where the result
is either zero or has the same sign as the dividend, ``op1``), not the
*modulo* operator (where the result is either zero or has the same sign
as the divisor, ``op2``) of a value. For more information about the
difference, see `The Math
Forum <http://mathforum.org/dr.math/problems/anne.4.28.99.html>`_. For a
table of how this is implemented in various languages, please see
`Wikipedia: modulo
operation <http://en.wikipedia.org/wiki/Modulo_operation>`_.

Note that signed integer remainder and unsigned integer remainder are
distinct operations; for unsigned integer remainder, use '``urem``'.

Taking the remainder of a division by zero is undefined behavior.
For vectors, if any element of the divisor is zero, the operation has
undefined behavior.
Overflow also leads to undefined behavior; this is a rare case, but can
occur, for example, by taking the remainder of a 32-bit division of
-2147483648 by -1. (The remainder doesn't actually overflow, but this
rule lets ``srem`` be implemented using instructions that return both the
result of the division and the remainder.)

Example:
""""""""

.. code-block:: text

      <result> = srem i32 4, %var          ; yields i32:result = 4 % %var

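The difference between remainder and modulo shows up only when the operands
have different signs. The sketch below lists a few constant cases and then
derives a floored modulo from ``srem``. Both function names are hypothetical,
and ``@floor_mod`` assumes that ``%m`` is positive.

.. code-block:: llvm

      define i32 @srem_signs() {
        %a = srem i32 7, 3             ; 1
        %b = srem i32 -7, 3            ; -1: the sign follows the dividend
        %c = srem i32 7, -3            ; 1: the sign of the divisor does not matter
        %ab = add i32 %a, %b           ; 0
        %r = add i32 %ab, %c           ; 1
        ret i32 %r
      }

      ; Hypothetical helper: floored modulo, assuming %m > 0.
      define i32 @floor_mod(i32 %x, i32 %m) {
        %rem = srem i32 %x, %m
        %is_neg = icmp slt i32 %rem, 0
        %fixed = add i32 %rem, %m      ; shifts a negative remainder into 0..%m-1
        %r = select i1 %is_neg, i32 %fixed, i32 %rem
        ret i32 %r                     ; floor_mod(-7, 3) is 2, whereas srem gives -1
      }
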
.. _i_frem:

'``frem``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = frem [fast-math flags]* <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``frem``' instruction returns the remainder from the division of
its two operands.

Arguments:
""""""""""

The two arguments to the '``frem``' instruction must be :ref:`floating
point <t_floating>` or :ref:`vector <t_vector>` of floating point values.
Both arguments must have identical types.

Semantics:
""""""""""

Returns the same value as the libm '``fmod``' function, but without trapping
or setting ``errno``.

The remainder has the same sign as the dividend. This instruction can also
take any number of :ref:`fast-math flags <fastmath>`, which are optimization
hints to enable otherwise unsafe floating-point optimizations.

Example:
""""""""

.. code-block:: text

      <result> = frem float 4.0, %var          ; yields float:result = 4.0 % %var

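A small sketch of the sign rule, using a hypothetical function that reduces
an angle in degrees:

.. code-block:: llvm

      ; Hypothetical helper: the result lies strictly between -360.0 and 360.0
      ; and carries the sign of %deg.
      define float @wrap_degrees(float %deg) {
        %r = frem float %deg, 360.0
        ret float %r
      }

      ; wrap_degrees(450.0)  returns  90.0
      ; wrap_degrees(-450.0) returns -90.0
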
.. _bitwiseops:

Bitwise Binary Operations
-------------------------

Bitwise binary operators are used to do various forms of bit-twiddling
in a program. They are generally very efficient instructions and can
commonly be strength reduced from other instructions. They require two
operands of the same type, execute an operation on them, and produce a
single value. The resulting value is the same type as its operands.

'``shl``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = shl <ty> <op1>, <op2>           ; yields ty:result
      <result> = shl nuw <ty> <op1>, <op2>       ; yields ty:result
      <result> = shl nsw <ty> <op1>, <op2>       ; yields ty:result
      <result> = shl nuw nsw <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``shl``' instruction returns the first operand shifted to the left
a specified number of bits.

Arguments:
""""""""""

Both arguments to the '``shl``' instruction must be the same
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
'``op2``' is treated as an unsigned value.

Semantics:
""""""""""

The value produced is ``op1`` \* 2\ :sup:`op2` mod 2\ :sup:`n`,
where ``n`` is the width of the result. If ``op2`` is (statically or
dynamically) equal to or larger than the number of bits in
``op1``, this instruction returns a :ref:`poison value <poisonvalues>`.
If the arguments are vectors, each vector element of ``op1`` is shifted
by the corresponding shift amount in ``op2``.

If the ``nuw`` keyword is present, then the shift produces a poison
value if it shifts out any non-zero bits.
If the ``nsw`` keyword is present, then the shift produces a poison
value if it shifts out any bits that disagree with the resultant sign bit.

Example:
""""""""

.. code-block:: text

      <result> = shl i32 4, %var   ; yields i32: 4 << %var
      <result> = shl i32 4, 2      ; yields i32: 16
      <result> = shl i32 1, 10     ; yields i32: 1024
      <result> = shl i32 1, 32     ; poison value
      <result> = shl <2 x i32> < i32 1, i32 1>, < i32 1, i32 2>   ; yields: result=<2 x i32> < i32 2, i32 4>

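The effect of ``nuw`` and ``nsw`` is easiest to see on a narrow type. The
following sketch uses ``i8`` constants; the function name is hypothetical and
the values marked as poison are left unused.

.. code-block:: llvm

      define i8 @shl_cases() {
        %a = shl i8 3, 4               ; 48
        %b = shl i8 -1, 1              ; -2 (0xFE): the result wraps modulo 2^8
        %c = shl nsw i8 -1, 1          ; -2: the bit shifted out (1) agrees with the resulting sign bit (1)
        %d = shl nuw i8 -1, 1          ; poison: a non-zero bit is shifted out
        %e = shl nsw i8 64, 1          ; poison: the bit shifted out (0) disagrees with the resulting sign bit (1)
        %f = shl i8 1, 8               ; poison: the shift amount equals the bit width
        ret i8 %a
      }
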
'``lshr``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = lshr <ty> <op1>, <op2>         ; yields ty:result
      <result> = lshr exact <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``lshr``' instruction (logical shift right) returns the first
operand shifted to the right a specified number of bits with zero fill.

Arguments:
""""""""""

Both arguments to the '``lshr``' instruction must be the same
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
'``op2``' is treated as an unsigned value.

Semantics:
""""""""""

This instruction always performs a logical shift right operation. The
most significant bits of the result will be filled with zero bits after
the shift. If ``op2`` is (statically or dynamically) equal to or larger
than the number of bits in ``op1``, this instruction returns a :ref:`poison
value <poisonvalues>`. If the arguments are vectors, each vector element
of ``op1`` is shifted by the corresponding shift amount in ``op2``.

If the ``exact`` keyword is present, the result value of the ``lshr`` is
a poison value if any of the bits shifted out are non-zero.

Example:
""""""""

.. code-block:: text

      <result> = lshr i32 4, 1   ; yields i32:result = 2
      <result> = lshr i32 4, 2   ; yields i32:result = 1
      <result> = lshr i8  4, 3   ; yields i8:result = 0
      <result> = lshr i8 -2, 1   ; yields i8:result = 0x7F
      <result> = lshr i32 1, 32  ; poison value
      <result> = lshr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 2>   ; yields: result=<2 x i32> < i32 0x7FFFFFFF, i32 1>

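Two typical uses, sketched with hypothetical function names: extracting a
high-order field, and halving a value that is known to be even.

.. code-block:: llvm

      ; Hypothetical helper: returns the most significant byte of %x (0..255).
      define i32 @high_byte(i32 %x) {
        %r = lshr i32 %x, 24           ; the upper 24 bits of the result are zero
        ret i32 %r
      }

      ; Hypothetical helper: the caller promises that %x is even.
      define i32 @halve_even(i32 %x) {
        %r = lshr exact i32 %x, 1      ; poison if the low bit of %x is set
        ret i32 %r
      }
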
'``ashr``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = ashr <ty> <op1>, <op2>         ; yields ty:result
      <result> = ashr exact <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``ashr``' instruction (arithmetic shift right) returns the first
operand shifted to the right a specified number of bits with sign
extension.

Arguments:
""""""""""

Both arguments to the '``ashr``' instruction must be the same
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer type.
'``op2``' is treated as an unsigned value.

Semantics:
""""""""""

This instruction always performs an arithmetic shift right operation.
The most significant bits of the result will be filled with the sign bit
of ``op1``. If ``op2`` is (statically or dynamically) equal to or larger
than the number of bits in ``op1``, this instruction returns a :ref:`poison
value <poisonvalues>`. If the arguments are vectors, each vector element
of ``op1`` is shifted by the corresponding shift amount in ``op2``.

If the ``exact`` keyword is present, the result value of the ``ashr`` is
a poison value if any of the bits shifted out are non-zero.

Example:
""""""""

.. code-block:: text

      <result> = ashr i32 4, 1   ; yields i32:result = 2
      <result> = ashr i32 4, 2   ; yields i32:result = 1
      <result> = ashr i8  4, 3   ; yields i8:result = 0
      <result> = ashr i8 -2, 1   ; yields i8:result = -1
      <result> = ashr i32 1, 32  ; poison value
      <result> = ashr <2 x i32> < i32 -2, i32 4>, < i32 1, i32 3>   ; yields: result=<2 x i32> < i32 -1, i32 0>

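For negative operands, ``ashr`` differs from both ``lshr`` and ``sdiv``. The
constant cases below are a sketch of those differences; the function name is
hypothetical.

.. code-block:: llvm

      define i32 @shift_right_cases() {
        %l = lshr i32 -16, 2           ; 1073741820 (0x3FFFFFFC): zero fill
        %a = ashr i32 -16, 2           ; -4: sign fill
        %s = ashr i32 -7, 1            ; -4: rounds toward negative infinity
        %d = sdiv i32 -7, 2            ; -3: rounds toward zero
        %e = ashr exact i32 -8, 1      ; -4: no non-zero bits are shifted out
        ret i32 %a
      }
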
'``and``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = and <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``and``' instruction returns the bitwise logical and of its two
operands.

Arguments:
""""""""""

The two arguments to the '``and``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The truth table used for the '``and``' instruction is:

+-----+-----+-----+
| In0 | In1 | Out |
+-----+-----+-----+
|   0 |   0 |   0 |
+-----+-----+-----+
|   0 |   1 |   0 |
+-----+-----+-----+
|   1 |   0 |   0 |
+-----+-----+-----+
|   1 |   1 |   1 |
+-----+-----+-----+

Example:
""""""""

.. code-block:: text

      <result> = and i32 4, %var         ; yields i32:result = 4 & %var
      <result> = and i32 15, 40          ; yields i32:result = 8
      <result> = and i32 4, 8            ; yields i32:result = 0

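The truth table is applied to each bit position independently, which makes
``and`` the usual tool for masking. A brief sketch with hypothetical function
names:

.. code-block:: llvm

      ; Hypothetical helper: keeps only the low 8 bits of %x.
      define i32 @low_byte(i32 %x) {
        %r = and i32 %x, 255
        ret i32 %r
      }

      ; Hypothetical helper: true when %addr is a multiple of 16.
      define i1 @is_aligned16(i64 %addr) {
        %low = and i64 %addr, 15
        %r = icmp eq i64 %low, 0
        ret i1 %r
      }
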
'``or``' Instruction
^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = or <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``or``' instruction returns the bitwise logical inclusive or of its
two operands.

Arguments:
""""""""""

The two arguments to the '``or``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The truth table used for the '``or``' instruction is:

+-----+-----+-----+
| In0 | In1 | Out |
+-----+-----+-----+
|   0 |   0 |   0 |
+-----+-----+-----+
|   0 |   1 |   1 |
+-----+-----+-----+
|   1 |   0 |   1 |
+-----+-----+-----+
|   1 |   1 |   1 |
+-----+-----+-----+

Example:
""""""""

.. code-block:: text

      <result> = or i32 4, %var         ; yields i32:result = 4 | %var
      <result> = or i32 15, 40          ; yields i32:result = 47
      <result> = or i32 4, 8            ; yields i32:result = 12

'``xor``' Instruction
^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = xor <ty> <op1>, <op2>   ; yields ty:result

Overview:
"""""""""

The '``xor``' instruction returns the bitwise logical exclusive or of
its two operands. The ``xor`` is used to implement the "one's
complement" operation, which is the "~" operator in C.

Arguments:
""""""""""

The two arguments to the '``xor``' instruction must be
:ref:`integer <t_integer>` or :ref:`vector <t_vector>` of integer values. Both
arguments must have identical types.

Semantics:
""""""""""

The truth table used for the '``xor``' instruction is:

+-----+-----+-----+
| In0 | In1 | Out |
+-----+-----+-----+
|   0 |   0 |   0 |
+-----+-----+-----+
|   0 |   1 |   1 |
+-----+-----+-----+
|   1 |   0 |   1 |
+-----+-----+-----+
|   1 |   1 |   0 |
+-----+-----+-----+

Example:
""""""""

.. code-block:: text

      <result> = xor i32 4, %var         ; yields i32:result = 4 ^ %var
      <result> = xor i32 15, 40          ; yields i32:result = 39
      <result> = xor i32 4, 8            ; yields i32:result = 12
      <result> = xor i32 %V, -1          ; yields i32:result = ~%V

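LLVM has no separate "not" instruction, so inversion is written as an
``xor`` with an all-ones constant. The functions below sketch this for a
scalar, a single bit, and a vector of ``i1`` lanes; their names are
hypothetical.

.. code-block:: llvm

      ; Hypothetical helper: one's complement of %x.
      define i32 @bitwise_not(i32 %x) {
        %r = xor i32 %x, -1
        ret i32 %r
      }

      ; Hypothetical helper: flips bit 3 of %flags and leaves the other bits alone.
      define i32 @toggle_bit3(i32 %flags) {
        %r = xor i32 %flags, 8
        ret i32 %r
      }

      ; Hypothetical helper: inverts every lane of a boolean mask.
      define <4 x i1> @invert_mask(<4 x i1> %m) {
        %r = xor <4 x i1> %m, <i1 true, i1 true, i1 true, i1 true>
        ret <4 x i1> %r
      }
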
Vector Operations
-----------------

LLVM supports several instructions to represent vector operations in a
target-independent manner. These instructions cover the element-access
and vector-specific operations needed to process vectors effectively.
While LLVM does directly support these vector operations, many
sophisticated algorithms will want to use target-specific intrinsics to
take full advantage of a specific target.

.. _i_extractelement:

'``extractelement``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = extractelement <n x <ty>> <val>, <ty2> <idx>  ; yields <ty>

Overview:
"""""""""

The '``extractelement``' instruction extracts a single scalar element
from a vector at a specified index.

Arguments:
""""""""""

The first operand of an '``extractelement``' instruction is a value of
:ref:`vector <t_vector>` type. The second operand is an index indicating
the position from which to extract the element. The index may be a
variable of any integer type.

Semantics:
""""""""""

The result is a scalar of the same type as the element type of ``val``.
Its value is the value at position ``idx`` of ``val``. If ``idx``
exceeds the length of ``val``, the results are undefined.

Example:
""""""""

.. code-block:: text

      <result> = extractelement <4 x i32> %vec, i32 0    ; yields i32

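The sketch below reads lanes with constant and variable indices. The
function names are hypothetical; the second function relies on its caller to
pass a valid lane number.

.. code-block:: llvm

      ; Hypothetical helper: adds lanes 0 and 1 of %vec.
      define i32 @sum_first_two(<4 x i32> %vec) {
        %e0 = extractelement <4 x i32> %vec, i32 0
        %e1 = extractelement <4 x i32> %vec, i64 1     ; the index may be of any integer type
        %sum = add i32 %e0, %e1
        ret i32 %sum
      }

      ; Hypothetical helper: %idx is expected to be in the range 0..3.
      define float @lane(<4 x float> %vec, i32 %idx) {
        %e = extractelement <4 x float> %vec, i32 %idx
        ret float %e
      }
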
.. _i_insertelement:

'``insertelement``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = insertelement <n x <ty>> <val>, <ty> <elt>, <ty2> <idx>    ; yields <n x <ty>>

Overview:
"""""""""

The '``insertelement``' instruction inserts a scalar element into a
vector at a specified index.

Arguments:
""""""""""

The first operand of an '``insertelement``' instruction is a value of
:ref:`vector <t_vector>` type. The second operand is a scalar value whose
type must equal the element type of the first operand. The third operand
is an index indicating the position at which to insert the value. The
index may be a variable of any integer type.

Semantics:
""""""""""

The result is a vector of the same type as ``val``. Its element values
are those of ``val`` except at position ``idx``, where it gets the value
``elt``. If ``idx`` exceeds the length of ``val``, the results are
undefined.

Example:
""""""""

.. code-block:: text

      <result> = insertelement <4 x i32> %vec, i32 1, i32 0    ; yields <4 x i32>

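Because each ``insertelement`` yields a new vector value rather than
modifying its operand, a vector is built from scalars by chaining the
instruction, starting from ``undef``. A minimal sketch with a hypothetical
function name:

.. code-block:: llvm

      ; Hypothetical helper: packs four scalars into one vector.
      define <4 x i32> @make_vec(i32 %a, i32 %b, i32 %c, i32 %d) {
        %v0 = insertelement <4 x i32> undef, i32 %a, i32 0   ; lanes 1..3 are still undef
        %v1 = insertelement <4 x i32> %v0, i32 %b, i32 1
        %v2 = insertelement <4 x i32> %v1, i32 %c, i32 2
        %v3 = insertelement <4 x i32> %v2, i32 %d, i32 3     ; all four lanes are now defined
        ret <4 x i32> %v3
      }
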
.. _i_shufflevector:

'``shufflevector``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <m x i32> <mask>    ; yields <m x <ty>>

Overview:
"""""""""

The '``shufflevector``' instruction constructs a permutation of elements
from two input vectors, returning a vector with the same element type as
the input and length that is the same as the shuffle mask.

Arguments:
""""""""""

The first two operands of a '``shufflevector``' instruction are vectors
with the same type. The third argument is a shuffle mask whose element
type is always 'i32'. The result of the instruction is a vector whose
length is the same as the shuffle mask and whose element type is the
same as the element type of the first two operands.

The shuffle mask operand is required to be a constant vector with either
constant integer or undef values.

Semantics:
""""""""""

The elements of the two input vectors are numbered from left to right
across both of the vectors. The shuffle mask operand specifies, for each
element of the result vector, which element of the two input vectors the
result element gets. If the shuffle mask is undef, the result vector is
undef. If any element of the mask operand is undef, that element of the
result is undef. If the shuffle mask selects an undef element from one
of the input vectors, the resulting element is undef.

Example:
""""""""

.. code-block:: text

      <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
                              <4 x i32> <i32 0, i32 4, i32 1, i32 5>  ; yields <4 x i32>
      <result> = shufflevector <4 x i32> %v1, <4 x i32> undef,
                              <4 x i32> <i32 0, i32 1, i32 2, i32 3>  ; yields <4 x i32> - Identity shuffle.
      <result> = shufflevector <8 x i32> %v1, <8 x i32> undef,
                              <4 x i32> <i32 0, i32 1, i32 2, i32 3>  ; yields <4 x i32>
      <result> = shufflevector <4 x i32> %v1, <4 x i32> %v2,
                              <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7 >  ; yields <8 x i32>

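With two 4-element inputs, mask values 0 to 3 select from the first vector
and 4 to 7 from the second. The following sketch applies that numbering to
three common patterns; the function names are hypothetical.

.. code-block:: llvm

      ; Hypothetical helper: broadcasts %x to all four lanes.
      define <4 x i32> @splat(i32 %x) {
        %v = insertelement <4 x i32> undef, i32 %x, i32 0
        ; An all-zero mask selects lane 0 of %v for every result lane.
        %r = shufflevector <4 x i32> %v, <4 x i32> undef, <4 x i32> zeroinitializer
        ret <4 x i32> %r
      }

      ; Hypothetical helper: reverses the lane order of %v.
      define <4 x i32> @reverse(<4 x i32> %v) {
        %r = shufflevector <4 x i32> %v, <4 x i32> undef,
                           <4 x i32> <i32 3, i32 2, i32 1, i32 0>
        ret <4 x i32> %r
      }

      ; Hypothetical helper: result is <a[2], a[3], b[2], b[3]>.
      define <4 x i32> @high_halves(<4 x i32> %a, <4 x i32> %b) {
        %r = shufflevector <4 x i32> %a, <4 x i32> %b,
                           <4 x i32> <i32 2, i32 3, i32 6, i32 7>
        ret <4 x i32> %r
      }
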
Aggregate Operations
--------------------

LLVM supports several instructions for working with
:ref:`aggregate <t_aggregate>` values.

.. _i_extractvalue:

'``extractvalue``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = extractvalue <aggregate type> <val>, <idx>{, <idx>}*

Overview:
"""""""""

The '``extractvalue``' instruction extracts the value of a member field
from an :ref:`aggregate <t_aggregate>` value.

Arguments:
""""""""""

The first operand of an '``extractvalue``' instruction is a value of
:ref:`struct <t_struct>` or :ref:`array <t_array>` type. The other operands are
constant indices to specify which value to extract in a similar manner
as indices in a '``getelementptr``' instruction.

The major differences from ``getelementptr`` indexing are:

-  Since the value being indexed is not a pointer, the first index is
   omitted and assumed to be zero.
-  At least one index must be specified.
-  Not only struct indices but also array indices must be in bounds.

Semantics:
""""""""""

The result is the value at the position in the aggregate specified by
the index operands.

Example:
""""""""

.. code-block:: text

      <result> = extractvalue {i32, float} %agg, 0    ; yields i32

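Two sketches of typical use follow: indexing into a nested aggregate with
more than one index, and unpacking the struct returned by the existing
``llvm.sadd.with.overflow.i32`` intrinsic. The names of the two defined
functions are hypothetical.

.. code-block:: llvm

      ; Hypothetical helper: index 1 selects the array, the next index 1 its second element.
      define float @second_float({i32, [2 x float]} %agg) {
        %f = extractvalue {i32, [2 x float]} %agg, 1, 1
        ret float %f
      }

      declare {i32, i1} @llvm.sadd.with.overflow.i32(i32, i32)

      ; Hypothetical helper: true when the signed addition %a + %b overflows.
      define i1 @add_overflows(i32 %a, i32 %b) {
        %pair = call {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
        %ovf = extractvalue {i32, i1} %pair, 1         ; field 0 holds the sum, field 1 the flag
        ret i1 %ovf
      }
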
.. _i_insertvalue:

'``insertvalue``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      <result> = insertvalue <aggregate type> <val>, <ty> <elt>, <idx>{, <idx>}*    ; yields <aggregate type>

Overview:
"""""""""

The '``insertvalue``' instruction inserts a value into a member field in
an :ref:`aggregate <t_aggregate>` value.

Arguments:
""""""""""

The first operand of an '``insertvalue``' instruction is a value of
:ref:`struct <t_struct>` or :ref:`array <t_array>` type. The second operand is
a first-class value to insert. The following operands are constant
indices indicating the position at which to insert the value in a
similar manner as indices in an '``extractvalue``' instruction. The value
to insert must have the same type as the value identified by the
indices.

Semantics:
""""""""""

The result is an aggregate of the same type as ``val``. Its value is
that of ``val`` except that the value at the position specified by the
indices is that of ``elt``.

Example:
""""""""

.. code-block:: llvm

      %agg1 = insertvalue {i32, float} undef, i32 1, 0              ; yields {i32 1, float undef}
      %agg2 = insertvalue {i32, float} %agg1, float %val, 1         ; yields {i32 1, float %val}
      %agg3 = insertvalue {i32, {float}} undef, float %val, 1, 0    ; yields {i32 undef, {float %val}}

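One common use is returning several results from a function as a single
struct value. The sketch below assumes a hypothetical ``@divmod`` function
whose caller guarantees a non-zero divisor:

.. code-block:: llvm

      ; Hypothetical helper: returns {quotient, remainder}; %d must not be zero.
      define {i32, i32} @divmod(i32 %n, i32 %d) {
        %q = udiv i32 %n, %d
        %r = urem i32 %n, %d
        %t0 = insertvalue {i32, i32} undef, i32 %q, 0  ; {%q, undef}
        %t1 = insertvalue {i32, i32} %t0, i32 %r, 1    ; {%q, %r}
        ret {i32, i32} %t1
      }
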
7435.. _memoryops:
7436
7437Memory Access and Addressing Operations
7438---------------------------------------
7439
7440A key design point of an SSA-based representation is how it represents
7441memory. In LLVM, no memory locations are in SSA form, which makes things
7442very simple. This section describes how to read, write, and allocate
7443memory in LLVM.
7444
7445.. _i_alloca:
7446
7447'``alloca``' Instruction
7448^^^^^^^^^^^^^^^^^^^^^^^^
7449
7450Syntax:
7451"""""""
7452
7453::
7454
Matt Arsenault3c1fc762017-04-10 22:27:50 +00007455 <result> = alloca [inalloca] <type> [, <ty> <NumElements>] [, align <alignment>] [, addrspace(<num>)] ; yields type addrspace(num)*:result
Sean Silvab084af42012-12-07 10:36:55 +00007456
7457Overview:
7458"""""""""
7459
7460The '``alloca``' instruction allocates memory on the stack frame of the
7461currently executing function, to be automatically released when this
7462function returns to its caller. The object is always allocated in the
Matt Arsenault3c1fc762017-04-10 22:27:50 +00007463address space for allocas indicated in the datalayout.
Sean Silvab084af42012-12-07 10:36:55 +00007464
7465Arguments:
7466""""""""""
7467
7468The '``alloca``' instruction allocates ``sizeof(<type>)*NumElements``
7469bytes of memory on the runtime stack, returning a pointer of the
7470appropriate type to the program. If "NumElements" is specified, it is
7471the number of elements allocated, otherwise "NumElements" is defaulted
7472to be one. If a constant alignment is specified, the value result of the
Reid Kleckner15fe7a52014-07-15 01:16:09 +00007473allocation is guaranteed to be aligned to at least that boundary. The
7474alignment may not be greater than ``1 << 29``. If not specified, or if
7475zero, the target can choose to align the allocation on any convenient
7476boundary compatible with the type.
Sean Silvab084af42012-12-07 10:36:55 +00007477
7478'``type``' may be any sized type.
7479
7480Semantics:
7481""""""""""
7482
Memory is allocated; a pointer is returned. The operation is undefined
if there is insufficient stack space for the allocation. The
'``alloca``' instruction is commonly used to represent automatic
variables that must have an address available. When the function returns
(either with the ``ret`` or ``resume`` instructions), the memory is
automatically reclaimed. Allocating zero bytes is legal, but the result
is undefined. The order in which memory is allocated (i.e., which way the
stack grows) is not specified.
7492
7493Example:
7494""""""""
7495
.. code-block:: llvm

      %ptr = alloca i32                             ; yields i32*:ptr
      %ptr = alloca i32, i32 4                      ; yields i32*:ptr
      %ptr = alloca i32, i32 4, align 1024          ; yields i32*:ptr
      %ptr = alloca i32, align 1024                 ; yields i32*:ptr
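
As a slightly fuller sketch, the following shows a fixed-size slot and a
runtime-sized allocation inside complete functions. The function names
``@roundtrip`` and ``@scratch`` are hypothetical and chosen only for
illustration; in both cases the memory is released when the function returns.

.. code-block:: llvm

    define i32 @roundtrip(i32 %x) {
    entry:
      %slot = alloca i32, align 4            ; one i32 in the current frame
      store i32 %x, i32* %slot, align 4
      %v = load i32, i32* %slot, align 4
      ret i32 %v                             ; %slot must not be used after return
    }

    define void @scratch(i32 %n) {
    entry:
      %buf = alloca i8, i32 %n               ; NumElements need not be a constant
      store i8 0, i8* %buf                   ; assumes %n is at least 1
      ret void
    }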
Sean Silvab084af42012-12-07 10:36:55 +00007502
7503.. _i_load:
7504
7505'``load``' Instruction
7506^^^^^^^^^^^^^^^^^^^^^^
7507
7508Syntax:
7509"""""""
7510
::

      <result> = load [volatile] <ty>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<index>][, !invariant.load !<index>][, !invariant.group !<index>][, !nonnull !<index>][, !dereferenceable !<deref_bytes_node>][, !dereferenceable_or_null !<deref_bytes_node>][, !align !<align_node>]
      <result> = load atomic [volatile] <ty>, <ty>* <pointer> [syncscope("<target-scope>")] <ordering>, align <alignment> [, !invariant.group !<index>]
      !<index> = !{ i32 1 }
      !<deref_bytes_node> = !{i64 <dereferenceable_bytes>}
      !<align_node> = !{ i64 <value_alignment> }
Sean Silvab084af42012-12-07 10:36:55 +00007518
7519Overview:
7520"""""""""
7521
7522The '``load``' instruction is used to read from memory.
7523
7524Arguments:
7525""""""""""
7526
Sanjoy Dasc2cf6ef2016-06-01 16:13:10 +00007527The argument to the ``load`` instruction specifies the memory address from which
7528to load. The type specified must be a :ref:`first class <t_firstclass>` type of
7529known size (i.e. not containing an :ref:`opaque structural type <t_opaque>`). If
7530the ``load`` is marked as ``volatile``, then the optimizer is not allowed to
7531modify the number or order of execution of this ``load`` with other
7532:ref:`volatile operations <volatile>`.
Sean Silvab084af42012-12-07 10:36:55 +00007533
JF Bastiend1fb5852015-12-17 22:09:19 +00007534If the ``load`` is marked as ``atomic``, it takes an extra :ref:`ordering
Konstantin Zhuravlyovbb80d3e2017-07-11 22:23:00 +00007535<ordering>` and optional ``syncscope("<target-scope>")`` argument. The
7536``release`` and ``acq_rel`` orderings are not valid on ``load`` instructions.
7537Atomic loads produce :ref:`defined <memmodel>` results when they may see
7538multiple atomic stores. The type of the pointee must be an integer, pointer, or
7539floating-point type whose bit width is a power of two greater than or equal to
7540eight and less than or equal to a target-specific size limit. ``align`` must be
7541explicitly specified on atomic loads, and the load has undefined behavior if the
7542alignment is not set to a value which is at least the size in bytes of the
JF Bastiend1fb5852015-12-17 22:09:19 +00007543pointee. ``!nontemporal`` does not have any defined semantics for atomic loads.
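
A minimal sketch of atomic loads under these rules follows. The function
``@read_flag`` and its parameter are hypothetical; the pointer is assumed to
be at least 4-byte aligned, which matches the size of the ``i32`` pointee.

.. code-block:: llvm

    define i32 @read_flag(i32* %flag) {
    entry:
      ; 'align' is mandatory on atomic loads and must be at least 4 here.
      %a = load atomic i32, i32* %flag acquire, align 4
      ; 'volatile' and 'atomic' may be combined.
      %b = load atomic volatile i32, i32* %flag monotonic, align 4
      %sum = add i32 %a, %b
      ret i32 %sum
    }

Using ``release`` or ``acq_rel`` in place of ``acquire`` above would be
rejected, since those orderings are not valid on loads.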
Sean Silvab084af42012-12-07 10:36:55 +00007544
The optional constant ``align`` argument specifies the alignment of the
operation (that is, the alignment of the memory address). A value of 0
or an omitted ``align`` argument means that the operation has the ABI
alignment for the target. It is the responsibility of the code emitter
to ensure that the alignment information is correct. Overestimating the
alignment results in undefined behavior. Underestimating the alignment
may produce less efficient code. An alignment of 1 is always safe. The
maximum possible alignment is ``1 << 29``. An alignment value higher
than the size of the loaded type implies that memory up to the alignment
value in bytes can be safely loaded without trapping in the default
address space. Accessing the high bytes can interfere with debugging
tools, so they should not be accessed if the function has the
``sanitize_thread`` or ``sanitize_address`` attributes.
Sean Silvab084af42012-12-07 10:36:55 +00007558
7559The optional ``!nontemporal`` metadata must reference a single
Stefanus Du Toit736e2e22013-06-20 14:02:44 +00007560metadata name ``<index>`` corresponding to a metadata node with one
Sean Silvab084af42012-12-07 10:36:55 +00007561``i32`` entry of value 1. The existence of the ``!nontemporal``
Stefanus Du Toit736e2e22013-06-20 14:02:44 +00007562metadata on the instruction tells the optimizer and code generator
Sean Silvab084af42012-12-07 10:36:55 +00007563that this load is not expected to be reused in the cache. The code
7564generator may select special instructions to save cache bandwidth, such
7565as the ``MOVNT`` instruction on x86.
7566
7567The optional ``!invariant.load`` metadata must reference a single
Stefanus Du Toit736e2e22013-06-20 14:02:44 +00007568metadata name ``<index>`` corresponding to a metadata node with no
Geoff Berry4bda5762016-08-31 17:39:21 +00007569entries. If a load instruction tagged with the ``!invariant.load``
7570metadata is executed, the optimizer may assume the memory location
7571referenced by the load contains the same value at all points in the
7572program where the memory location is known to be dereferenceable.
Sean Silvab084af42012-12-07 10:36:55 +00007573
The optional ``!invariant.group`` metadata must reference a single metadata name
``<index>`` corresponding to a metadata node. See ``invariant.group`` metadata.
7576
Philip Reamescdb72f32014-10-20 22:40:55 +00007577The optional ``!nonnull`` metadata must reference a single
7578metadata name ``<index>`` corresponding to a metadata node with no
7579entries. The existence of the ``!nonnull`` metadata on the
7580instruction tells the optimizer that the value loaded is known to
Piotr Padlewskid97846e2015-09-02 20:33:16 +00007581never be null. This is analogous to the ``nonnull`` attribute
Sean Silvaa1190322015-08-06 22:56:48 +00007582on parameters and return values. This metadata can only be applied
Mehdi Amini4a121fa2015-03-14 22:04:06 +00007583to loads of a pointer type.
Philip Reamescdb72f32014-10-20 22:40:55 +00007584
The optional ``!dereferenceable`` metadata must reference a single metadata
name ``<deref_bytes_node>`` corresponding to a metadata node with one ``i64``
entry. The existence of the ``!dereferenceable`` metadata on the instruction
tells the optimizer that the value loaded is known to be dereferenceable.
The number of bytes known to be dereferenceable is specified by the integer
value in the metadata node. This is analogous to the ``dereferenceable``
attribute on parameters and return values. This metadata can only be applied
to loads of a pointer type.

The optional ``!dereferenceable_or_null`` metadata must reference a single
metadata name ``<deref_bytes_node>`` corresponding to a metadata node with one
``i64`` entry. The existence of the ``!dereferenceable_or_null`` metadata on the
instruction tells the optimizer that the value loaded is known to be either
dereferenceable or null. The number of bytes known to be dereferenceable is
specified by the integer value in the metadata node. This is analogous to the
``dereferenceable_or_null`` attribute on parameters and return values. This
metadata can only be applied to loads of a pointer type.

The optional ``!align`` metadata must reference a single metadata name
``<align_node>`` corresponding to a metadata node with one ``i64`` entry.
The existence of the ``!align`` metadata on the instruction tells the
optimizer that the value loaded is known to be aligned to a boundary specified
by the integer value in the metadata node. The alignment must be a power of 2.
This is analogous to the ``align`` attribute on parameters and return values.
This metadata can only be applied to loads of a pointer type.
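
The metadata forms described above can be combined on a single load. The
following is only a sketch: the function ``@deref`` and the metadata numbering
are hypothetical, and the facts asserted by the metadata are assumed to be true
of the program.

.. code-block:: llvm

    define i32 @deref(i32** %pp) {
    entry:
      ; The loaded pointer is asserted to be non-null, to have at least
      ; 4 dereferenceable bytes, and to be 4-byte aligned.
      %p = load i32*, i32** %pp, align 8, !nonnull !0, !dereferenceable !1, !align !1
      ; The pointee is asserted not to change, and is not expected to be
      ; reused in the cache.
      %v = load i32, i32* %p, align 4, !invariant.load !0, !nontemporal !2
      ret i32 %v
    }

    !0 = !{}           ; empty node for !nonnull and !invariant.load
    !1 = !{i64 4}      ; byte count for !dereferenceable, alignment for !align
    !2 = !{i32 1}      ; required payload for !nontemporal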
7611
Sean Silvab084af42012-12-07 10:36:55 +00007612Semantics:
7613""""""""""
7614
7615The location of memory pointed to is loaded. If the value being loaded
7616is of scalar type then the number of bytes read does not exceed the
7617minimum number of bytes needed to hold all bits of the type. For
7618example, loading an ``i24`` reads at most three bytes. When loading a
7619value of a type like ``i20`` with a size that is not an integral number
7620of bytes, the result is undefined if the value was not originally
7621written using a store of the same type.
7622
7623Examples:
7624"""""""""
7625
.. code-block:: llvm

      %ptr = alloca i32                               ; yields i32*:ptr
      store i32 3, i32* %ptr                          ; yields void
      %val = load i32, i32* %ptr                      ; yields i32:val = i32 3
Sean Silvab084af42012-12-07 10:36:55 +00007631
7632.. _i_store:
7633
7634'``store``' Instruction
7635^^^^^^^^^^^^^^^^^^^^^^^
7636
7637Syntax:
7638"""""""
7639
::

      store [volatile] <ty> <value>, <ty>* <pointer>[, align <alignment>][, !nontemporal !<index>][, !invariant.group !<index>]        ; yields void
      store atomic [volatile] <ty> <value>, <ty>* <pointer> [syncscope("<target-scope>")] <ordering>, align <alignment> [, !invariant.group !<index>] ; yields void
Sean Silvab084af42012-12-07 10:36:55 +00007644
7645Overview:
7646"""""""""
7647
7648The '``store``' instruction is used to write to memory.
7649
7650Arguments:
7651""""""""""
7652
Sanjoy Dasc2cf6ef2016-06-01 16:13:10 +00007653There are two arguments to the ``store`` instruction: a value to store and an
7654address at which to store it. The type of the ``<pointer>`` operand must be a
7655pointer to the :ref:`first class <t_firstclass>` type of the ``<value>``
7656operand. If the ``store`` is marked as ``volatile``, then the optimizer is not
7657allowed to modify the number or order of execution of this ``store`` with other
7658:ref:`volatile operations <volatile>`. Only values of :ref:`first class
7659<t_firstclass>` types of known size (i.e. not containing an :ref:`opaque
7660structural type <t_opaque>`) can be stored.
Sean Silvab084af42012-12-07 10:36:55 +00007661
JF Bastiend1fb5852015-12-17 22:09:19 +00007662If the ``store`` is marked as ``atomic``, it takes an extra :ref:`ordering
Konstantin Zhuravlyovbb80d3e2017-07-11 22:23:00 +00007663<ordering>` and optional ``syncscope("<target-scope>")`` argument. The
7664``acquire`` and ``acq_rel`` orderings aren't valid on ``store`` instructions.
7665Atomic loads produce :ref:`defined <memmodel>` results when they may see
7666multiple atomic stores. The type of the pointee must be an integer, pointer, or
7667floating-point type whose bit width is a power of two greater than or equal to
7668eight and less than or equal to a target-specific size limit. ``align`` must be
7669explicitly specified on atomic stores, and the store has undefined behavior if
7670the alignment is not set to a value which is at least the size in bytes of the
JF Bastiend1fb5852015-12-17 22:09:19 +00007671pointee. ``!nontemporal`` does not have any defined semantics for atomic stores.
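
A minimal sketch of atomic stores under these rules follows. The function
``@publish`` and its parameter are hypothetical; the pointer is assumed to be
at least 4-byte aligned.

.. code-block:: llvm

    define void @publish(i32* %flag) {
    entry:
      ; 'align' is mandatory on atomic stores and must be at least 4 here.
      store atomic i32 1, i32* %flag release, align 4
      ; 'volatile' and 'atomic' may be combined.
      store atomic volatile i32 2, i32* %flag monotonic, align 4
      ret void
    }

Using ``acquire`` or ``acq_rel`` in place of ``release`` above would be
rejected, since those orderings are not valid on stores.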
Sean Silvab084af42012-12-07 10:36:55 +00007672
Eli Benderskyca380842013-04-17 17:17:20 +00007673The optional constant ``align`` argument specifies the alignment of the
Sean Silvab084af42012-12-07 10:36:55 +00007674operation (that is, the alignment of the memory address). A value of 0
Eli Benderskyca380842013-04-17 17:17:20 +00007675or an omitted ``align`` argument means that the operation has the ABI
Sean Silvab084af42012-12-07 10:36:55 +00007676alignment for the target. It is the responsibility of the code emitter
7677to ensure that the alignment information is correct. Overestimating the
Eli Benderskyca380842013-04-17 17:17:20 +00007678alignment results in undefined behavior. Underestimating the
Sean Silvab084af42012-12-07 10:36:55 +00007679alignment may produce less efficient code. An alignment of 1 is always
Matt Arsenault7020f252016-06-16 16:33:41 +00007680safe. The maximum possible alignment is ``1 << 29``. An alignment
7681value higher than the size of the stored type implies memory up to the
7682alignment value bytes can be stored to without trapping in the default
7683address space. Storing to the higher bytes however may result in data
7684races if another thread can access the same address. Introducing a
7685data race is not allowed. Storing to the extra bytes is not allowed
7686even in situations where a data race is known to not exist if the
7687function has the ``sanitize_address`` attribute.
Sean Silvab084af42012-12-07 10:36:55 +00007688
The optional ``!nontemporal`` metadata must reference a single metadata
name ``<index>`` corresponding to a metadata node with one ``i32`` entry of
value 1. The existence of the ``!nontemporal`` metadata on the instruction
tells the optimizer and code generator that this store is not expected to
be reused in the cache. The code generator may select special
instructions to save cache bandwidth, such as the ``MOVNT`` instruction on
x86.
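
For illustration, a streaming vector store might be written as in the sketch
below. The function ``@stream`` is hypothetical, and whether a non-temporal
machine instruction is actually selected depends on the target.

.. code-block:: llvm

    define void @stream(<4 x float>* %dst, <4 x float> %v) {
    entry:
      ; Hint that the stored data will not be reused soon.
      store <4 x float> %v, <4 x float>* %dst, align 16, !nontemporal !0
      ret void
    }

    !0 = !{i32 1}      ; required payload: a single i32 of value 1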
7696
Jonas Devlieghereaaecdc42017-11-06 11:47:24 +00007697The optional ``!invariant.group`` metadata must reference a
Piotr Padlewski6c15ec42015-09-15 18:32:14 +00007698single metadata name ``<index>``. See ``invariant.group`` metadata.
7699
Sean Silvab084af42012-12-07 10:36:55 +00007700Semantics:
7701""""""""""
7702
Eli Benderskyca380842013-04-17 17:17:20 +00007703The contents of memory are updated to contain ``<value>`` at the
7704location specified by the ``<pointer>`` operand. If ``<value>`` is
Sean Silvab084af42012-12-07 10:36:55 +00007705of scalar type then the number of bytes written does not exceed the
7706minimum number of bytes needed to hold all bits of the type. For
7707example, storing an ``i24`` writes at most three bytes. When writing a
7708value of a type like ``i20`` with a size that is not an integral number
7709of bytes, it is unspecified what happens to the extra bits that do not
7710belong to the type, but they will typically be overwritten.
7711
7712Example:
7713""""""""
7714
.. code-block:: llvm

      %ptr = alloca i32                               ; yields i32*:ptr
      store i32 3, i32* %ptr                          ; yields void
      %val = load i32, i32* %ptr                      ; yields i32:val = i32 3
Sean Silvab084af42012-12-07 10:36:55 +00007720
7721.. _i_fence:
7722
7723'``fence``' Instruction
7724^^^^^^^^^^^^^^^^^^^^^^^
7725
7726Syntax:
7727"""""""
7728
::

      fence [syncscope("<target-scope>")] <ordering>  ; yields void
Sean Silvab084af42012-12-07 10:36:55 +00007732
7733Overview:
7734"""""""""
7735
7736The '``fence``' instruction is used to introduce happens-before edges
7737between operations.
7738
7739Arguments:
7740""""""""""
7741
7742'``fence``' instructions take an :ref:`ordering <ordering>` argument which
7743defines what *synchronizes-with* edges they add. They can only be given
7744``acquire``, ``release``, ``acq_rel``, and ``seq_cst`` orderings.
7745
7746Semantics:
7747""""""""""
7748
7749A fence A which has (at least) ``release`` ordering semantics
7750*synchronizes with* a fence B with (at least) ``acquire`` ordering
7751semantics if and only if there exist atomic operations X and Y, both
7752operating on some atomic object M, such that A is sequenced before X, X
7753modifies M (either directly or through some side effect of a sequence
7754headed by X), Y is sequenced before B, and Y observes M. This provides a
7755*happens-before* dependency between A and B. Rather than an explicit
7756``fence``, one (but not both) of the atomic operations X or Y might
7757provide a ``release`` or ``acquire`` (resp.) ordering constraint and
7758still *synchronize-with* the explicit ``fence`` and establish the
7759*happens-before* edge.
7760
7761A ``fence`` which has ``seq_cst`` ordering, in addition to having both
7762``acquire`` and ``release`` semantics specified above, participates in
7763the global program order of other ``seq_cst`` operations and/or fences.
7764
Konstantin Zhuravlyovbb80d3e2017-07-11 22:23:00 +00007765A ``fence`` instruction can also take an optional
7766":ref:`syncscope <syncscope>`" argument.
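
The fence-to-fence rule above can be sketched as a message-passing pattern.
The globals ``@data`` and ``@ready`` and both function names are hypothetical,
and the sketch assumes no other thread writes to either global. The comments
map each instruction to the names A, B, X, Y, and M used in the text.

.. code-block:: llvm

    @data = global i32 0
    @ready = global i32 0

    define void @producer() {
    entry:
      store i32 42, i32* @data, align 4                      ; plain write
      fence release                                          ; fence A
      store atomic i32 1, i32* @ready monotonic, align 4     ; X modifies M (@ready)
      ret void
    }

    define i32 @consumer() {
    entry:
      br label %spin

    spin:
      %r = load atomic i32, i32* @ready monotonic, align 4   ; Y observes M
      %seen = icmp eq i32 %r, 1
      br i1 %seen, label %done, label %spin

    done:
      fence acquire                                          ; fence B
      %v = load i32, i32* @data, align 4                     ; ordered after the write
      ret i32 %v
    }

Once Y observes the value written by X, fence A synchronizes with fence B, so
the plain write to ``@data`` happens before the plain read in ``%done``.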
Sean Silvab084af42012-12-07 10:36:55 +00007767
7768Example:
7769""""""""
7770
.. code-block:: text

      fence acquire                                        ; yields void
      fence syncscope("singlethread") seq_cst              ; yields void
      fence syncscope("agent") seq_cst                     ; yields void
Sean Silvab084af42012-12-07 10:36:55 +00007776
7777.. _i_cmpxchg:
7778
7779'``cmpxchg``' Instruction
7780^^^^^^^^^^^^^^^^^^^^^^^^^
7781
7782Syntax:
7783"""""""
7784
::

      cmpxchg [weak] [volatile] <ty>* <pointer>, <ty> <cmp>, <ty> <new> [syncscope("<target-scope>")] <success ordering> <failure ordering> ; yields  { ty, i1 }
Sean Silvab084af42012-12-07 10:36:55 +00007788
7789Overview:
7790"""""""""
7791
7792The '``cmpxchg``' instruction is used to atomically modify memory. It
7793loads a value in memory and compares it to a given value. If they are
Tim Northover420a2162014-06-13 14:24:07 +00007794equal, it tries to store a new value into the memory.
Sean Silvab084af42012-12-07 10:36:55 +00007795
7796Arguments:
7797""""""""""
7798
There are three arguments to the '``cmpxchg``' instruction: an address
to operate on, a value to compare to the value currently at that
address, and a new value to place at that address if the compared values
are equal. The type of '``<cmp>``' must be an integer or pointer type whose
bit width is a power of two greater than or equal to eight and less
than or equal to a target-specific size limit. '``<cmp>``' and '``<new>``' must
have the same type, and the type of '``<pointer>``' must be a pointer to
that type. If the ``cmpxchg`` is marked as ``volatile``, then the
optimizer is not allowed to modify the number or order of execution of
this ``cmpxchg`` with other :ref:`volatile operations <volatile>`.
Sean Silvab084af42012-12-07 10:36:55 +00007809
Tim Northovere94a5182014-03-11 10:48:52 +00007810The success and failure :ref:`ordering <ordering>` arguments specify how this
Tim Northover1dcc9f92014-06-13 14:24:16 +00007811``cmpxchg`` synchronizes with other atomic operations. Both ordering parameters
7812must be at least ``monotonic``, the ordering constraint on failure must be no
7813stronger than that on success, and the failure ordering cannot be either
7814``release`` or ``acq_rel``.
Sean Silvab084af42012-12-07 10:36:55 +00007815
Konstantin Zhuravlyovbb80d3e2017-07-11 22:23:00 +00007816A ``cmpxchg`` instruction can also take an optional
7817":ref:`syncscope <syncscope>`" argument.
Sean Silvab084af42012-12-07 10:36:55 +00007818
7819The pointer passed into cmpxchg must have alignment greater than or
7820equal to the size in memory of the operand.
7821
7822Semantics:
7823""""""""""
7824
Tim Northover420a2162014-06-13 14:24:07 +00007825The contents of memory at the location specified by the '``<pointer>``' operand
Matthias Braun93f2b4b2017-08-09 22:22:04 +00007826is read and compared to '``<cmp>``'; if the values are equal, '``<new>``' is
7827written to the location. The original value at the location is returned,
7828together with a flag indicating success (true) or failure (false).
Tim Northover420a2162014-06-13 14:24:07 +00007829
7830If the cmpxchg operation is marked as ``weak`` then a spurious failure is
7831permitted: the operation may not write ``<new>`` even if the comparison
7832matched.
7833
7834If the cmpxchg operation is strong (the default), the i1 value is 1 if and only
7835if the value loaded equals ``cmp``.
Sean Silvab084af42012-12-07 10:36:55 +00007836
A successful ``cmpxchg`` is a read-modify-write instruction for the purpose of
identifying release sequences. A failed ``cmpxchg`` is equivalent to an atomic
load with an ordering parameter determined by the second ordering parameter.
Sean Silvab084af42012-12-07 10:36:55 +00007840
7841Example:
7842""""""""
7843
.. code-block:: llvm

    entry:
      %orig = load atomic i32, i32* %ptr unordered, align 4                      ; yields i32
      br label %loop

    loop:
      %cmp = phi i32 [ %orig, %entry ], [ %value_loaded, %loop ]
      %squared = mul i32 %cmp, %cmp
      %val_success = cmpxchg i32* %ptr, i32 %cmp, i32 %squared acq_rel monotonic ; yields  { i32, i1 }
      %value_loaded = extractvalue { i32, i1 } %val_success, 0
      %success = extractvalue { i32, i1 } %val_success, 1
      br i1 %success, label %done, label %loop

    done:
      ...
7860
7861.. _i_atomicrmw:
7862
7863'``atomicrmw``' Instruction
7864^^^^^^^^^^^^^^^^^^^^^^^^^^^
7865
7866Syntax:
7867"""""""
7868
::

      atomicrmw [volatile] <operation> <ty>* <pointer>, <ty> <value> [syncscope("<target-scope>")] <ordering>                   ; yields ty
Sean Silvab084af42012-12-07 10:36:55 +00007872
7873Overview:
7874"""""""""
7875
7876The '``atomicrmw``' instruction is used to atomically modify memory.
7877
7878Arguments:
7879""""""""""
7880
There are three arguments to the '``atomicrmw``' instruction: an
operation to apply, an address whose value to modify, and an argument to
the operation. The operation must be one of the following keywords:
7884
7885- xchg
7886- add
7887- sub
7888- and
7889- nand
7890- or
7891- xor
7892- max
7893- min
7894- umax
7895- umin
7896
7897The type of '<value>' must be an integer type whose bit width is a power
7898of two greater than or equal to eight and less than or equal to a
7899target-specific size limit. The type of the '``<pointer>``' operand must
7900be a pointer to that type. If the ``atomicrmw`` is marked as
7901``volatile``, then the optimizer is not allowed to modify the number or
7902order of execution of this ``atomicrmw`` with other :ref:`volatile
7903operations <volatile>`.
7904
An ``atomicrmw`` instruction can also take an optional
":ref:`syncscope <syncscope>`" argument.
7907
Sean Silvab084af42012-12-07 10:36:55 +00007908Semantics:
7909""""""""""
7910
7911The contents of memory at the location specified by the '``<pointer>``'
7912operand are atomically read, modified, and written back. The original
7913value at the location is returned. The modification is specified by the
7914operation argument:
7915
7916- xchg: ``*ptr = val``
7917- add: ``*ptr = *ptr + val``
7918- sub: ``*ptr = *ptr - val``
7919- and: ``*ptr = *ptr & val``
7920- nand: ``*ptr = ~(*ptr & val)``
7921- or: ``*ptr = *ptr | val``
7922- xor: ``*ptr = *ptr ^ val``
7923- max: ``*ptr = *ptr > val ? *ptr : val`` (using a signed comparison)
7924- min: ``*ptr = *ptr < val ? *ptr : val`` (using a signed comparison)
7925- umax: ``*ptr = *ptr > val ? *ptr : val`` (using an unsigned
7926 comparison)
7927- umin: ``*ptr = *ptr < val ? *ptr : val`` (using an unsigned
7928 comparison)
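
Since the instruction yields the original value, a caller that wants the
updated value has to recompute it. The sketch below illustrates this with two
hypothetical functions, ``@next_id`` and ``@raise_high_water``.

.. code-block:: llvm

    define i32 @next_id(i32* %counter) {
    entry:
      %old = atomicrmw add i32* %counter, i32 1 seq_cst   ; yields the old value
      %new = add i32 %old, 1                              ; value now in memory
      ret i32 %new
    }

    define i1 @raise_high_water(i32* %mark, i32 %sample) {
    entry:
      ; Signed maximum: *mark = *mark > sample ? *mark : sample
      %old = atomicrmw max i32* %mark, i32 %sample monotonic
      %raised = icmp sgt i32 %sample, %old                ; did this call raise it?
      ret i1 %raised
    }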
7929
7930Example:
7931""""""""
7932
7933.. code-block:: llvm
7934
Tim Northover675a0962014-06-13 14:24:23 +00007935 %old = atomicrmw add i32* %ptr, i32 1 acquire ; yields i32
Sean Silvab084af42012-12-07 10:36:55 +00007936
7937.. _i_getelementptr:
7938
7939'``getelementptr``' Instruction
7940^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
7941
7942Syntax:
7943"""""""
7944
::

      <result> = getelementptr <ty>, <ty>* <ptrval>{, [inrange] <ty> <idx>}*
      <result> = getelementptr inbounds <ty>, <ty>* <ptrval>{, [inrange] <ty> <idx>}*
      <result> = getelementptr <ty>, <ptr vector> <ptrval>, [inrange] <vector index type> <idx>
Sean Silvab084af42012-12-07 10:36:55 +00007950
7951Overview:
7952"""""""""
7953
7954The '``getelementptr``' instruction is used to get the address of a
7955subelement of an :ref:`aggregate <t_aggregate>` data structure. It performs
Elena Demikhovsky37a4da82015-07-09 07:42:48 +00007956address calculation only and does not access memory. The instruction can also
7957be used to calculate a vector of such addresses.
Sean Silvab084af42012-12-07 10:36:55 +00007958
7959Arguments:
7960""""""""""
7961
David Blaikie16a97eb2015-03-04 22:02:58 +00007962The first argument is always a type used as the basis for the calculations.
7963The second argument is always a pointer or a vector of pointers, and is the
7964base address to start from. The remaining arguments are indices
Sean Silvab084af42012-12-07 10:36:55 +00007965that indicate which of the elements of the aggregate object are indexed.
7966The interpretation of each index is dependent on the type being indexed
7967into. The first index always indexes the pointer value given as the
David Blaikief91b0302017-06-19 05:34:21 +00007968second argument, the second index indexes a value of the type pointed to
Sean Silvab084af42012-12-07 10:36:55 +00007969(not necessarily the value directly pointed to, since the first index
7970can be non-zero), etc. The first type indexed into must be a pointer
7971value, subsequent types can be arrays, vectors, and structs. Note that
7972subsequent types being indexed into can never be pointers, since that
7973would require loading the pointer before continuing calculation.
7974
7975The type of each index argument depends on the type it is indexing into.
7976When indexing into a (optionally packed) structure, only ``i32`` integer
7977**constants** are allowed (when using a vector of indices they must all
7978be the **same** ``i32`` integer constant). When indexing into an array,
7979pointer or vector, integers of any width are allowed, and they are not
7980required to be constant. These integers are treated as signed values
7981where relevant.
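
The index rules can be sketched as follows. The type ``%pair`` and the function
``@elem`` are hypothetical: the structure index has to be an ``i32`` constant,
while the array index may be a non-constant integer of any width, here an
``i8`` that is treated as signed.

.. code-block:: llvm

    %pair = type { i32, [8 x i16] }

    define i16* @elem(%pair* %p, i8 %i) {
    entry:
      ; i64 0 : index through the pointer itself
      ; i32 1 : constant structure index selecting the [8 x i16] field
      ; i8 %i : variable array index
      %q = getelementptr inbounds %pair, %pair* %p, i64 0, i32 1, i8 %i
      ret i16* %q
    }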
7982
7983For example, let's consider a C code fragment and how it gets compiled
7984to LLVM:
7985
.. code-block:: c

    struct RT {
      char A;
      int B[10][20];
      char C;
    };
    struct ST {
      int X;
      double Y;
      struct RT Z;
    };

    int *foo(struct ST *s) {
      return &s[1].Z.B[5][13];
    }
8002
8003The LLVM code generated by Clang is:
8004
.. code-block:: llvm

    %struct.RT = type { i8, [10 x [20 x i32]], i8 }
    %struct.ST = type { i32, double, %struct.RT }

    define i32* @foo(%struct.ST* %s) nounwind uwtable readnone optsize ssp {
    entry:
      %arrayidx = getelementptr inbounds %struct.ST, %struct.ST* %s, i64 1, i32 2, i32 1, i64 5, i64 13
      ret i32* %arrayidx
    }
8015
8016Semantics:
8017""""""""""
8018
8019In the example above, the first index is indexing into the
8020'``%struct.ST*``' type, which is a pointer, yielding a '``%struct.ST``'
8021= '``{ i32, double, %struct.RT }``' type, a structure. The second index
8022indexes into the third element of the structure, yielding a
8023'``%struct.RT``' = '``{ i8 , [10 x [20 x i32]], i8 }``' type, another
8024structure. The third index indexes into the second element of the
8025structure, yielding a '``[10 x [20 x i32]]``' type, an array. The two
8026dimensions of the array are subscripted into, yielding an '``i32``'
8027type. The '``getelementptr``' instruction returns a pointer to this
8028element, thus computing a value of '``i32*``' type.
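
To make the address arithmetic concrete, the same computation can be written
as a single byte offset. This is only a sketch: the function ``@foo_bytes`` is
hypothetical, and the numbers assume a typical x86-64 data layout (``i32``
aligned to 4 bytes, ``double`` aligned to 8 bytes). Other targets may lay the
structures out differently.

.. code-block:: llvm

    %struct.RT = type { i8, [10 x [20 x i32]], i8 }
    %struct.ST = type { i32, double, %struct.RT }

    ; Assumed layout:
    ;   sizeof(%struct.ST)  = 824     offset of field 2 in %struct.ST = 16
    ;   sizeof([20 x i32])  = 80      offset of field 1 in %struct.RT = 4
    ;   sizeof(i32)         = 4
    ; Offset for indices (1, 2, 1, 5, 13):
    ;   1*824 + 16 + 4 + 5*80 + 13*4 = 1296
    define i32* @foo_bytes(%struct.ST* %s) {
    entry:
      %base = bitcast %struct.ST* %s to i8*
      %addr = getelementptr inbounds i8, i8* %base, i64 1296
      %res = bitcast i8* %addr to i32*
      ret i32* %res
    }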
8029
8030Note that it is perfectly legal to index partially through a structure,
8031returning a pointer to an inner element. Because of this, the LLVM code
8032for the given testcase is equivalent to:
8033
.. code-block:: llvm

    define i32* @foo(%struct.ST* %s) {
      %t1 = getelementptr %struct.ST, %struct.ST* %s, i32 1                        ; yields %struct.ST*:%t1
      %t2 = getelementptr %struct.ST, %struct.ST* %t1, i32 0, i32 2                ; yields %struct.RT*:%t2
      %t3 = getelementptr %struct.RT, %struct.RT* %t2, i32 0, i32 1                ; yields [10 x [20 x i32]]*:%t3
      %t4 = getelementptr [10 x [20 x i32]], [10 x [20 x i32]]* %t3, i32 0, i32 5  ; yields [20 x i32]*:%t4
      %t5 = getelementptr [20 x i32], [20 x i32]* %t4, i32 0, i32 13               ; yields i32*:%t5
      ret i32* %t5
    }
8044
8045If the ``inbounds`` keyword is present, the result value of the
8046``getelementptr`` is a :ref:`poison value <poisonvalues>` if the base
8047pointer is not an *in bounds* address of an allocated object, or if any
8048of the addresses that would be formed by successive addition of the
8049offsets implied by the indices to the base address with infinitely
8050precise signed arithmetic are not an *in bounds* address of that
8051allocated object. The *in bounds* addresses for an allocated object are
8052all the addresses that point into the object, plus the address one byte
Eli Friedman13f2e352017-02-23 00:48:18 +00008053past the end. The only *in bounds* address for a null pointer in the
8054default address-space is the null pointer itself. In cases where the
8055base is a vector of pointers the ``inbounds`` keyword applies to each
8056of the computations element-wise.
Sean Silvab084af42012-12-07 10:36:55 +00008057
8058If the ``inbounds`` keyword is not present, the offsets are added to the
8059base address with silently-wrapping two's complement arithmetic. If the
8060offsets have a different width from the pointer, they are sign-extended
8061or truncated to the width of the pointer. The result value of the
8062``getelementptr`` may be outside the object pointed to by the base
8063pointer. The result value may not necessarily be used to access memory
8064though, even if it happens to point into allocated storage. See the
8065:ref:`Pointer Aliasing Rules <pointeraliasing>` section for more
8066information.
8067
If the ``inrange`` keyword is present before any index, loading from or
storing to any pointer derived from the ``getelementptr`` has undefined
behavior if the load or store would access memory outside of the bounds of
the element selected by the index marked as ``inrange``. The result of a
pointer comparison or ``ptrtoint`` (including ``ptrtoint``-like operations
involving memory) involving a pointer derived from a ``getelementptr`` with
the ``inrange`` keyword is undefined, with the exception of comparisons
in the case where both operands are in the range of the element selected
by the ``inrange`` keyword, inclusive of the address one past the end of
that element. Note that the ``inrange`` keyword is currently only allowed
in constant ``getelementptr`` expressions.

The getelementptr instruction is often confusing. For some more insight
into how it works, see :doc:`the getelementptr FAQ <GetElementPtr>`.

Example:
""""""""

.. code-block:: llvm

    ; yields [12 x i8]*:aptr
    %aptr = getelementptr {i32, [12 x i8]}, {i32, [12 x i8]}* %saptr, i64 0, i32 1
    ; yields i8*:vptr
    %vptr = getelementptr {i32, <2 x i8>}, {i32, <2 x i8>}* %svptr, i64 0, i32 1, i32 1
    ; yields i8*:eptr
    %eptr = getelementptr [12 x i8], [12 x i8]* %aptr, i64 0, i32 1
    ; yields i32*:iptr
    %iptr = getelementptr [10 x i32], [10 x i32]* @arr, i16 0, i16 0

Vector of pointers:
"""""""""""""""""""

The ``getelementptr`` returns a vector of pointers, instead of a single address,
when one or more of its arguments is a vector. In such cases, all vector
arguments should have the same number of elements, and every scalar argument
will be effectively broadcast into a vector during address calculation.

.. code-block:: llvm

    ; All arguments are vectors:
    ;   A[i] = ptrs[i] + offsets[i]*sizeof(i8)
    %A = getelementptr i8, <4 x i8*> %ptrs, <4 x i64> %offsets

    ; Add the same scalar offset to each pointer of a vector:
    ;   A[i] = ptrs[i] + offset*sizeof(i8)
    %A = getelementptr i8, <4 x i8*> %ptrs, i64 %offset

    ; Add distinct offsets to the same pointer:
    ;   A[i] = ptr + offsets[i]*sizeof(i8)
    %A = getelementptr i8, i8* %ptr, <4 x i64> %offsets

    ; In all cases described above the type of the result is <4 x i8*>

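The broadcast rule can be pictured with a small C model. This is only a
sketch: the helper names, the fixed lane count of 4, and the use of
64-bit integers as stand-ins for addresses are assumptions made for
illustration. A scalar operand is first copied into every lane, after
which the address computation proceeds lane by lane.

.. code-block:: c

    #include <stdint.h>

    #define LANES 4

    /* Hypothetical model, not an LLVM API: a scalar operand is broadcast
     * into every lane before the address computation. */
    static void broadcast_u64(uint64_t out[LANES], uint64_t scalar)
    {
        for (int i = 0; i < LANES; ++i)
            out[i] = scalar;
    }

    /* Element-wise address computation over i8, where sizeof(i8) is 1:
     * out[i] = ptrs[i] + offsets[i]. Addresses are modeled as integers. */
    static void vgep_i8_model(uint64_t out[LANES],
                              const uint64_t ptrs[LANES],
                              const int64_t offsets[LANES])
    {
        for (int i = 0; i < LANES; ++i)
            out[i] = ptrs[i] + (uint64_t)offsets[i];
    }
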
The two following instructions are equivalent:

.. code-block:: llvm

    getelementptr  %struct.ST, <4 x %struct.ST*> %s, <4 x i64> %ind1,
      <4 x i32> <i32 2, i32 2, i32 2, i32 2>,
      <4 x i32> <i32 1, i32 1, i32 1, i32 1>,
      <4 x i32> %ind4,
      <4 x i64> <i64 13, i64 13, i64 13, i64 13>

    getelementptr  %struct.ST, <4 x %struct.ST*> %s, <4 x i64> %ind1,
      i32 2, i32 1, <4 x i32> %ind4, i64 13

Consider the following C code, for which the vector form of
``getelementptr`` is a natural fit:

.. code-block:: c

    // Let's assume that we vectorize the following loop:
    double *A, *B; int *C;
    for (int i = 0; i < size; ++i) {
      A[i] = B[C[i]];
    }

.. code-block:: llvm

    ; get pointers for 8 elements from array B
    %ptrs = getelementptr double, double* %B, <8 x i32> %C
    ; load 8 elements from array B into A
    %A = call <8 x double> @llvm.masked.gather.v8f64.v8p0f64(<8 x double*> %ptrs,
         i32 8, <8 x i1> %mask, <8 x double> %passthru)

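For readers who prefer scalar code, one 8-lane step of the vectorized
loop can be modeled in C as follows. The function name
``gather8_model`` is hypothetical, and the sketch assumes that lanes
whose mask bit is clear receive the corresponding pass-through value
rather than a loaded one.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical scalar model of the vector getelementptr plus masked
     * gather for 8 lanes: each enabled lane loads B[C[lane]], each
     * disabled lane takes its pass-through value. */
    static void gather8_model(double out[8], const double *B,
                              const int32_t C[8], const bool mask[8],
                              const double passthru[8])
    {
        for (int lane = 0; lane < 8; ++lane)
            out[lane] = mask[lane] ? B[C[lane]] : passthru[lane];
    }
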
Conversion Operations
---------------------

The instructions in this category are the conversion instructions
(casting) which all take a single operand and a type. They perform
various bit conversions on the operand.

.. _i_trunc:

'``trunc .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = trunc <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``trunc``' instruction truncates its operand to the type ``ty2``.

Arguments:
""""""""""

The '``trunc``' instruction takes a value to truncate and a type to
truncate it to. Both types must be :ref:`integer <t_integer>` types, or
vectors of integers with the same number of elements. The bit size of
the ``value`` must be larger than the bit size of the destination type,
``ty2``. Equal sized types are not allowed.

Semantics:
""""""""""

The '``trunc``' instruction discards the high order bits of ``value``
and converts the remaining bits to ``ty2``. Since the source size must
be larger than the destination size, ``trunc`` cannot be a *no-op cast*.
It will always truncate bits.

Example:
""""""""

.. code-block:: llvm

    %X = trunc i32 257 to i8                        ; yields i8:1
    %Y = trunc i32 123 to i1                        ; yields i1:true
    %Z = trunc i32 122 to i1                        ; yields i1:false
    %W = trunc <2 x i16> <i16 8, i16 7> to <2 x i8> ; yields <i8 8, i8 7>

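The scalar results in the example can be reproduced with a small C
sketch that keeps only the low bits of the source value. The helper
names are hypothetical and exist only for this illustration.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical helpers, not LLVM APIs. */

    /* trunc i32 -> i8: keep the low 8 bits and drop the high 24. */
    static uint8_t trunc_i32_to_i8(uint32_t value)
    {
        return (uint8_t)(value & 0xFFu);
    }

    /* trunc i32 -> i1: only the lowest bit survives. */
    static bool trunc_i32_to_i1(uint32_t value)
    {
        return (value & 1u) != 0;
    }
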
.. _i_zext:

'``zext .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = zext <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``zext``' instruction zero extends its operand to type ``ty2``.

Arguments:
""""""""""

The '``zext``' instruction takes a value to cast, and a type to cast it
to. Both types must be :ref:`integer <t_integer>` types, or vectors of
integers with the same number of elements. The bit size of the ``value``
must be smaller than the bit size of the destination type, ``ty2``.

Semantics:
""""""""""

The ``zext`` fills the high order bits of the ``value`` with zero bits
until it reaches the size of the destination type, ``ty2``.

When zero extending from i1, the result will always be either 0 or 1.

Example:
""""""""

.. code-block:: llvm

    %X = zext i32 257 to i64              ; yields i64:257
    %Y = zext i1 true to i32              ; yields i32:1
    %Z = zext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>

.. _i_sext:

'``sext .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = sext <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``sext``' instruction sign extends ``value`` to the type ``ty2``.

Arguments:
""""""""""

The '``sext``' instruction takes a value to cast, and a type to cast it
to. Both types must be :ref:`integer <t_integer>` types, or vectors of
integers with the same number of elements. The bit size of the ``value``
must be smaller than the bit size of the destination type, ``ty2``.

Semantics:
""""""""""

The '``sext``' instruction performs a sign extension by copying the sign
bit (highest order bit) of the ``value`` until it reaches the bit size
of the type ``ty2``.

When sign extending from i1, the extension always results in -1 or 0.

Example:
""""""""

.. code-block:: llvm

    %X = sext i8  -1 to i16              ; yields i16:-1 (0xFFFF, i.e. 65535 if read as unsigned)
    %Y = sext i1 true to i32             ; yields i32:-1
    %Z = sext <2 x i16> <i16 8, i16 7> to <2 x i32> ; yields <i32 8, i32 7>

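The difference between ``zext`` and ``sext`` is easiest to see on raw
bit patterns. The C sketch below works on unsigned containers so that
every step is portable. The helper names are hypothetical and the
widths (i8 to i16) were picked arbitrarily for the illustration.

.. code-block:: c

    #include <stdint.h>

    /* Hypothetical helpers, not LLVM APIs. */

    /* zext i8 -> i16: the new high bits are zero. */
    static uint16_t zext_i8_to_i16(uint8_t value)
    {
        return (uint16_t)value;
    }

    /* sext i8 -> i16: the new high bits are copies of bit 7. */
    static uint16_t sext_i8_to_i16(uint8_t value)
    {
        return (value & 0x80u) ? (uint16_t)(value | 0xFF00u)
                               : (uint16_t)value;
    }
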
'``fptrunc .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = fptrunc <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``fptrunc``' instruction truncates ``value`` to type ``ty2``.

Arguments:
""""""""""

The '``fptrunc``' instruction takes a :ref:`floating point <t_floating>`
value to cast and a :ref:`floating point <t_floating>` type to cast it to.
The size of ``value`` must be larger than the size of ``ty2``. This
implies that ``fptrunc`` cannot be used to make a *no-op cast*.

Semantics:
""""""""""

The '``fptrunc``' instruction casts a ``value`` from a larger
:ref:`floating point <t_floating>` type to a smaller :ref:`floating
point <t_floating>` type. If the value cannot fit (i.e. overflows) within the
destination type, ``ty2``, then the results are undefined. If the cast produces
an inexact result, how rounding is performed (e.g. truncation, also known as
round to zero) is undefined.

Example:
""""""""

.. code-block:: llvm

    %X = fptrunc double 123.0 to float         ; yields float:123.0
    %Y = fptrunc double 1.0E+300 to float      ; yields undefined

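Whether a particular narrowing is exact, and therefore unaffected by the
unspecified rounding, can be checked with a round trip. The following C
sketch assumes IEEE-754 ``float`` and ``double`` and an input within the
range of ``float``. The name ``narrows_exactly`` is hypothetical.

.. code-block:: c

    #include <stdbool.h>

    /* Hypothetical helper, not an LLVM API: reports whether a double
     * survives narrowing to float and widening back unchanged, i.e.
     * whether the narrowing lost no information. */
    static bool narrows_exactly(double value)
    {
        float narrowed = (float)value;
        return (double)narrowed == value;
    }
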
'``fpext .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = fpext <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``fpext``' extends a floating point ``value`` to a larger floating
point value.

Arguments:
""""""""""

The '``fpext``' instruction takes a :ref:`floating point <t_floating>`
``value`` to cast, and a :ref:`floating point <t_floating>` type to cast it
to. The source type must be smaller than the destination type.

Semantics:
""""""""""

The '``fpext``' instruction extends the ``value`` from a smaller
:ref:`floating point <t_floating>` type to a larger :ref:`floating
point <t_floating>` type. The ``fpext`` cannot be used to make a
*no-op cast* because it always changes bits. Use ``bitcast`` to make a
*no-op cast* for a floating point cast.

Example:
""""""""

.. code-block:: llvm

    %X = fpext float 3.125 to double         ; yields double:3.125000e+00
    %Y = fpext double %X to fp128            ; yields fp128:0xL00000000000000004000900000000000

'``fptoui .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = fptoui <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``fptoui``' instruction converts a floating point ``value`` to its
unsigned integer equivalent of type ``ty2``.

Arguments:
""""""""""

The '``fptoui``' instruction takes a value to cast, which must be a
scalar or vector :ref:`floating point <t_floating>` value, and a type to
cast it to, ``ty2``, which must be an :ref:`integer <t_integer>` type. If
``ty`` is a vector floating point type, ``ty2`` must be a vector integer
type with the same number of elements as ``ty``.

Semantics:
""""""""""

The '``fptoui``' instruction converts its :ref:`floating
point <t_floating>` operand into the nearest (rounding towards zero)
unsigned integer value. If the value cannot fit in ``ty2``, the results
are undefined.

Example:
""""""""

.. code-block:: llvm

    %X = fptoui double 123.0 to i32      ; yields i32:123
    %Y = fptoui double 1.0E+300 to i1    ; yields undefined
    %Z = fptoui double 1.04E+17 to i8    ; yields undefined

'``fptosi .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = fptosi <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``fptosi``' instruction converts :ref:`floating point <t_floating>`
``value`` to type ``ty2``.

Arguments:
""""""""""

The '``fptosi``' instruction takes a value to cast, which must be a
scalar or vector :ref:`floating point <t_floating>` value, and a type to
cast it to, ``ty2``, which must be an :ref:`integer <t_integer>` type. If
``ty`` is a vector floating point type, ``ty2`` must be a vector integer
type with the same number of elements as ``ty``.

Semantics:
""""""""""

The '``fptosi``' instruction converts its :ref:`floating
point <t_floating>` operand into the nearest (rounding towards zero)
signed integer value. If the value cannot fit in ``ty2``, the results
are undefined.

Example:
""""""""

.. code-block:: llvm

    %X = fptosi double -123.0 to i32      ; yields i32:-123
    %Y = fptosi double 1.0E-247 to i1     ; yields i1:false (rounds towards zero to 0)
    %Z = fptosi double 1.04E+17 to i8     ; yields undefined

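The two rules shared by ``fptoui`` and ``fptosi`` (round towards zero,
and no defined result when the value does not fit) can be sketched in C
for the ``double`` to ``i32`` case. The function name and the choice to
signal the undefined case through a ``false`` return value are
assumptions of this sketch. LLVM itself performs no such check.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical helper, not an LLVM API: models fptosi double -> i32.
     * Returns false when the rounded value would not fit in i32 or the
     * input is NaN, which is the case the text calls undefined. */
    static bool fptosi_f64_to_i32(double value, int32_t *result)
    {
        /* Written so that NaN is rejected: every comparison with NaN
         * is false. */
        if (!(value > -2147483649.0 && value < 2147483648.0))
            return false;
        /* A C floating-to-integer conversion discards the fraction,
         * i.e. it rounds towards zero. */
        *result = (int32_t)value;
        return true;
    }
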
'``uitofp .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = uitofp <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``uitofp``' instruction regards ``value`` as an unsigned integer
and converts that value to the ``ty2`` type.

Arguments:
""""""""""

The '``uitofp``' instruction takes a value to cast, which must be a
scalar or vector :ref:`integer <t_integer>` value, and a type to cast it
to, ``ty2``, which must be a :ref:`floating point <t_floating>` type. If
``ty`` is a vector integer type, ``ty2`` must be a vector floating point
type with the same number of elements as ``ty``.

Semantics:
""""""""""

The '``uitofp``' instruction interprets its operand as an unsigned
integer quantity and converts it to the corresponding floating point
value. If the value cannot fit in the floating point type, the results
are undefined.

Example:
""""""""

.. code-block:: llvm

    %X = uitofp i32 257 to float         ; yields float:257.0
    %Y = uitofp i8 -1 to double          ; yields double:255.0

'``sitofp .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = sitofp <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``sitofp``' instruction regards ``value`` as a signed integer and
converts that value to the ``ty2`` type.

Arguments:
""""""""""

The '``sitofp``' instruction takes a value to cast, which must be a
scalar or vector :ref:`integer <t_integer>` value, and a type to cast it
to, ``ty2``, which must be a :ref:`floating point <t_floating>` type. If
``ty`` is a vector integer type, ``ty2`` must be a vector floating point
type with the same number of elements as ``ty``.

Semantics:
""""""""""

The '``sitofp``' instruction interprets its operand as a signed integer
quantity and converts it to the corresponding floating point value. If
the value cannot fit in the floating point type, the results are
undefined.

Example:
""""""""

.. code-block:: llvm

    %X = sitofp i32 257 to float         ; yields float:257.0
    %Y = sitofp i8 -1 to double          ; yields double:-1.0

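``uitofp`` and ``sitofp`` differ only in how they read the same bits, as
the ``i8 -1`` examples show. A minimal C sketch of that difference,
using hypothetical helper names and an ``i8`` source held in an unsigned
container:

.. code-block:: c

    #include <stdint.h>

    /* Hypothetical helpers, not LLVM APIs. */

    /* uitofp i8 -> double: the bits are read as a value in 0..255. */
    static double uitofp_i8(uint8_t bits)
    {
        return (double)bits;
    }

    /* sitofp i8 -> double: the same bits are read as -128..127. */
    static double sitofp_i8(uint8_t bits)
    {
        return bits >= 0x80u ? (double)bits - 256.0 : (double)bits;
    }
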
.. _i_ptrtoint:

'``ptrtoint .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = ptrtoint <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``ptrtoint``' instruction converts the pointer or a vector of
pointers ``value`` to the integer (or vector of integers) type ``ty2``.

Arguments:
""""""""""

The '``ptrtoint``' instruction takes a ``value`` to cast, which must be
a value of type :ref:`pointer <t_pointer>` or a vector of pointers, and a
type to cast it to ``ty2``, which must be an :ref:`integer <t_integer>` or
a vector of integers type.

Semantics:
""""""""""

The '``ptrtoint``' instruction converts ``value`` to integer type
``ty2`` by interpreting the pointer value as an integer and either
truncating or zero extending that value to the size of the integer type.
If ``value`` is smaller than ``ty2`` then a zero extension is done. If
``value`` is larger than ``ty2`` then a truncation is done. If they are
the same size, then nothing is done (*no-op cast*) other than a type
change.

Example:
""""""""

.. code-block:: llvm

    %X = ptrtoint i32* %P to i8                         ; yields truncation on 32-bit architecture
    %Y = ptrtoint i32* %P to i64                        ; yields zero extension on 32-bit architecture
    %Z = ptrtoint <4 x i32*> %P to <4 x i64>            ; yields vector zero extension for a vector of addresses on 32-bit architecture

.. _i_inttoptr:

'``inttoptr .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = inttoptr <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``inttoptr``' instruction converts an integer ``value`` to a
pointer type, ``ty2``.

Arguments:
""""""""""

The '``inttoptr``' instruction takes an :ref:`integer <t_integer>` value to
cast, and a type to cast it to, which must be a :ref:`pointer <t_pointer>`
type.

Semantics:
""""""""""

The '``inttoptr``' instruction converts ``value`` to type ``ty2`` by
applying either a zero extension or a truncation depending on the size
of the integer ``value``. If ``value`` is larger than the size of a
pointer then a truncation is done. If ``value`` is smaller than the size
of a pointer then a zero extension is done. If they are the same size,
nothing is done (*no-op cast*).

Example:
""""""""

.. code-block:: llvm

    %X = inttoptr i32 255 to i32*            ; yields zero extension on 64-bit architecture
    %Y = inttoptr i32 255 to i32*            ; yields no-op on 32-bit architecture
    %Z = inttoptr i64 0 to i32*              ; yields truncation on 32-bit architecture
    %W = inttoptr <4 x i32> %G to <4 x i8*>  ; yields truncation of vector G to four pointers

.. _i_bitcast:

'``bitcast .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = bitcast <ty> <value> to <ty2>             ; yields ty2

Overview:
"""""""""

The '``bitcast``' instruction converts ``value`` to type ``ty2`` without
changing any bits.

Arguments:
""""""""""

The '``bitcast``' instruction takes a value to cast, which must be a
non-aggregate first class value, and a type to cast it to, which must
also be a non-aggregate :ref:`first class <t_firstclass>` type. The
bit sizes of ``value`` and the destination type, ``ty2``, must be
identical. If the source type is a pointer, the destination type must
also be a pointer of the same size. This instruction supports bitwise
conversion of vectors to integers and to vectors of other types (as
long as they have the same size).

Semantics:
""""""""""

The '``bitcast``' instruction converts ``value`` to type ``ty2``. It
is always a *no-op cast* because no bits change with this
conversion. The conversion is done as if the ``value`` had been stored
to memory and read back as type ``ty2``. Pointer (or vector of
pointers) types may only be converted to other pointer (or vector of
pointers) types with the same address space through this instruction.
To convert pointers to other types, use the :ref:`inttoptr <i_inttoptr>`
or :ref:`ptrtoint <i_ptrtoint>` instructions first.

Example:
""""""""

.. code-block:: text

    %X = bitcast i8 255 to i8                  ; yields i8:-1
    %Y = bitcast i32* %x to i16*               ; yields i16*:%x
    %Z = bitcast <2 x i32> %V to i64           ; yields i64:%V
    %W = bitcast <2 x i32*> %V to <2 x i64*>   ; yields <2 x i64*>

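The "stored to memory and read back" description corresponds closely to
copying an object's bytes in C. The sketch below models
``bitcast float %value to i32`` under the assumption that ``float`` is
a 32-bit IEEE-754 type. The helper name is hypothetical.

.. code-block:: c

    #include <stdint.h>
    #include <string.h>

    /* Hypothetical helper, not an LLVM API: reinterprets the bits of a
     * float as a 32-bit integer without changing any of them. */
    static uint32_t bitcast_f32_to_i32(float value)
    {
        uint32_t bits;
        /* "Store" the float and "load" the same bytes back as an i32. */
        memcpy(&bits, &value, sizeof bits);
        return bits;
    }
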
.. _i_addrspacecast:

'``addrspacecast .. to``' Instruction
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = addrspacecast <pty> <ptrval> to <pty2>       ; yields pty2

Overview:
"""""""""

The '``addrspacecast``' instruction converts ``ptrval`` from ``pty`` in
address space ``n`` to type ``pty2`` in address space ``m``.

Arguments:
""""""""""

The '``addrspacecast``' instruction takes a pointer or vector of pointers
value to cast and a pointer type to cast it to, which must have a
different address space.

Semantics:
""""""""""

The '``addrspacecast``' instruction converts the pointer value
``ptrval`` to type ``pty2``. It can be a *no-op cast* or a complex
value modification, depending on the target and the address space
pair. Pointer conversions within the same address space must be
performed with the ``bitcast`` instruction. Note that if the address space
conversion is legal then both result and operand refer to the same memory
location.

Example:
""""""""

.. code-block:: llvm

    %X = addrspacecast i32* %x to i32 addrspace(1)*    ; yields i32 addrspace(1)*:%x
    %Y = addrspacecast i32 addrspace(1)* %y to i64 addrspace(2)*    ; yields i64 addrspace(2)*:%y
    %Z = addrspacecast <4 x i32*> %z to <4 x float addrspace(3)*>   ; yields <4 x float addrspace(3)*>:%z

.. _otherops:

Other Operations
----------------

The instructions in this category are the "miscellaneous" instructions,
which defy better classification.

.. _i_icmp:

'``icmp``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = icmp <cond> <ty> <op1>, <op2>   ; yields i1 or <N x i1>:result

Overview:
"""""""""

The '``icmp``' instruction returns a boolean value or a vector of
boolean values based on comparison of its two integer, integer vector,
pointer, or pointer vector operands.

Arguments:
""""""""""

The '``icmp``' instruction takes three operands. The first operand is
the condition code indicating the kind of comparison to perform. It is
not a value, just a keyword. The possible condition codes are:

#. ``eq``: equal
#. ``ne``: not equal
#. ``ugt``: unsigned greater than
#. ``uge``: unsigned greater or equal
#. ``ult``: unsigned less than
#. ``ule``: unsigned less or equal
#. ``sgt``: signed greater than
#. ``sge``: signed greater or equal
#. ``slt``: signed less than
#. ``sle``: signed less or equal

The remaining two arguments must be of :ref:`integer <t_integer>`,
:ref:`pointer <t_pointer>`, or integer :ref:`vector <t_vector>` type, and
both must have the same type.

Semantics:
""""""""""

The '``icmp``' instruction compares ``op1`` and ``op2`` according to the
condition code given as ``cond``. The comparison performed always yields
either an :ref:`i1 <t_integer>` or vector of ``i1`` result, as follows:

#. ``eq``: yields ``true`` if the operands are equal, ``false``
   otherwise. No sign interpretation is necessary or performed.
#. ``ne``: yields ``true`` if the operands are unequal, ``false``
   otherwise. No sign interpretation is necessary or performed.
#. ``ugt``: interprets the operands as unsigned values and yields
   ``true`` if ``op1`` is greater than ``op2``.
#. ``uge``: interprets the operands as unsigned values and yields
   ``true`` if ``op1`` is greater than or equal to ``op2``.
#. ``ult``: interprets the operands as unsigned values and yields
   ``true`` if ``op1`` is less than ``op2``.
#. ``ule``: interprets the operands as unsigned values and yields
   ``true`` if ``op1`` is less than or equal to ``op2``.
#. ``sgt``: interprets the operands as signed values and yields ``true``
   if ``op1`` is greater than ``op2``.
#. ``sge``: interprets the operands as signed values and yields ``true``
   if ``op1`` is greater than or equal to ``op2``.
#. ``slt``: interprets the operands as signed values and yields ``true``
   if ``op1`` is less than ``op2``.
#. ``sle``: interprets the operands as signed values and yields ``true``
   if ``op1`` is less than or equal to ``op2``.

If the operands are :ref:`pointer <t_pointer>` typed, the pointer values
are compared as if they were integers.

If the operands are integer vectors, then they are compared element by
element. The result is an ``i1`` vector with the same number of elements
as the values being compared. Otherwise, the result is an ``i1``.

Example:
""""""""

.. code-block:: text

    <result> = icmp eq i32 4, 5          ; yields: result=false
    <result> = icmp ne float* %X, %X     ; yields: result=false
    <result> = icmp ult i16  4, 5        ; yields: result=true
    <result> = icmp sgt i16  4, 5        ; yields: result=false
    <result> = icmp ule i16 -4, 5        ; yields: result=false
    <result> = icmp sge i16  4, 5        ; yields: result=false

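The ``icmp ule i16 -4, 5`` line shows that the condition code, not the
operand type, decides how the bits are interpreted. The C sketch below
models the unsigned and signed "less or equal" comparisons on the same
16-bit patterns. The helper names are hypothetical, and the signed
version uses the common trick of flipping the sign bit so that an
unsigned comparison yields the signed ordering.

.. code-block:: c

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical helpers, not LLVM APIs. */

    /* icmp ule i16: compare the bit patterns as unsigned numbers. */
    static bool icmp_ule_i16(uint16_t op1, uint16_t op2)
    {
        return op1 <= op2;
    }

    /* icmp sle i16: flipping the sign bit maps the signed order onto
     * the unsigned order, so the same bits compare as signed values. */
    static bool icmp_sle_i16(uint16_t op1, uint16_t op2)
    {
        return (uint16_t)(op1 ^ 0x8000u) <= (uint16_t)(op2 ^ 0x8000u);
    }
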
.. _i_fcmp:

'``fcmp``' Instruction
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

    <result> = fcmp [fast-math flags]* <cond> <ty> <op1>, <op2>     ; yields i1 or <N x i1>:result

Overview:
"""""""""

The '``fcmp``' instruction returns a boolean value or vector of boolean
values based on comparison of its operands.

If the operands are floating point scalars, then the result type is a
boolean (:ref:`i1 <t_integer>`).

If the operands are floating point vectors, then the result type is a
vector of boolean with the same number of elements as the operands being
compared.

Arguments:
""""""""""

The '``fcmp``' instruction takes three operands. The first operand is
the condition code indicating the kind of comparison to perform. It is
not a value, just a keyword. The possible condition codes are:

#. ``false``: no comparison, always returns false
#. ``oeq``: ordered and equal
#. ``ogt``: ordered and greater than
#. ``oge``: ordered and greater than or equal
#. ``olt``: ordered and less than
#. ``ole``: ordered and less than or equal
#. ``one``: ordered and not equal
#. ``ord``: ordered (no nans)
#. ``ueq``: unordered or equal
#. ``ugt``: unordered or greater than
#. ``uge``: unordered or greater than or equal
#. ``ult``: unordered or less than
#. ``ule``: unordered or less than or equal
#. ``une``: unordered or not equal
#. ``uno``: unordered (either nans)
#. ``true``: no comparison, always returns true

*Ordered* means that neither operand is a QNAN while *unordered* means
that either operand may be a QNAN.

The ``op1`` and ``op2`` arguments must each be of either a
:ref:`floating point <t_floating>` type or a :ref:`vector <t_vector>` of
floating point type. They must have identical types.

Semantics:
""""""""""

The '``fcmp``' instruction compares ``op1`` and ``op2`` according to the
condition code given as ``cond``. If the operands are vectors, then the
vectors are compared element by element. Each comparison performed
always yields an :ref:`i1 <t_integer>` result, as follows:

#. ``false``: always yields ``false``, regardless of operands.
#. ``oeq``: yields ``true`` if both operands are not a QNAN and ``op1``
   is equal to ``op2``.
#. ``ogt``: yields ``true`` if both operands are not a QNAN and ``op1``
   is greater than ``op2``.
#. ``oge``: yields ``true`` if both operands are not a QNAN and ``op1``
   is greater than or equal to ``op2``.
#. ``olt``: yields ``true`` if both operands are not a QNAN and ``op1``
   is less than ``op2``.
#. ``ole``: yields ``true`` if both operands are not a QNAN and ``op1``
   is less than or equal to ``op2``.
#. ``one``: yields ``true`` if both operands are not a QNAN and ``op1``
   is not equal to ``op2``.
#. ``ord``: yields ``true`` if both operands are not a QNAN.
#. ``ueq``: yields ``true`` if either operand is a QNAN or ``op1`` is
   equal to ``op2``.
#. ``ugt``: yields ``true`` if either operand is a QNAN or ``op1`` is
   greater than ``op2``.
#. ``uge``: yields ``true`` if either operand is a QNAN or ``op1`` is
   greater than or equal to ``op2``.
#. ``ult``: yields ``true`` if either operand is a QNAN or ``op1`` is
   less than ``op2``.
#. ``ule``: yields ``true`` if either operand is a QNAN or ``op1`` is
   less than or equal to ``op2``.
#. ``une``: yields ``true`` if either operand is a QNAN or ``op1`` is
   not equal to ``op2``.
#. ``uno``: yields ``true`` if either operand is a QNAN.
#. ``true``: always yields ``true``, regardless of operands.

The ``fcmp`` instruction can also optionally take any number of
:ref:`fast-math flags <fastmath>`, which are optimization hints to enable
otherwise unsafe floating point optimizations.

Any set of fast-math flags is legal on an ``fcmp`` instruction, but the
only flags that have any effect on its semantics are those that allow
assumptions to be made about the values of input arguments; namely
``nnan``, ``ninf``, and ``nsz``. See :ref:`fastmath` for more information.

Example:
""""""""

.. code-block:: text

    <result> = fcmp oeq float 4.0, 5.0    ; yields: result=false
    <result> = fcmp one float 4.0, 5.0    ; yields: result=true
    <result> = fcmp olt float 4.0, 5.0    ; yields: result=true
    <result> = fcmp ueq double 1.0, 2.0   ; yields: result=false

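The ordered and unordered families only differ when a NaN is involved.
The following C sketch models four of the predicates on ``double``
operands. The helper names are hypothetical, and NaN detection relies on
the IEEE-754 property that a NaN compares unequal to itself, so the
sketch assumes the code is not compiled with options that discard NaN
semantics.

.. code-block:: c

    #include <stdbool.h>

    /* Hypothetical helpers, not LLVM APIs. */

    /* Only a NaN compares unequal to itself. */
    static bool is_nan(double x)
    {
        return x != x;
    }

    /* fcmp oeq: neither operand is a NaN and op1 equals op2. */
    static bool fcmp_oeq(double op1, double op2)
    {
        return !is_nan(op1) && !is_nan(op2) && op1 == op2;
    }

    /* fcmp one: neither operand is a NaN and op1 differs from op2. */
    static bool fcmp_one(double op1, double op2)
    {
        return !is_nan(op1) && !is_nan(op2) && op1 != op2;
    }

    /* fcmp ueq: either operand is a NaN, or op1 equals op2. */
    static bool fcmp_ueq(double op1, double op2)
    {
        return is_nan(op1) || is_nan(op2) || op1 == op2;
    }

    /* fcmp une: either operand is a NaN, or op1 differs from op2. */
    static bool fcmp_une(double op1, double op2)
    {
        return is_nan(op1) || is_nan(op2) || op1 != op2;
    }
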
Sean Silvab084af42012-12-07 10:36:55 +00008932.. _i_phi:
8933
8934'``phi``' Instruction
8935^^^^^^^^^^^^^^^^^^^^^
8936
8937Syntax:
8938"""""""
8939
8940::
8941
8942 <result> = phi <ty> [ <val0>, <label0>], ...
8943
8944Overview:
8945"""""""""
8946
8947The '``phi``' instruction is used to implement the φ node in the SSA
8948graph representing the function.
8949
8950Arguments:
8951""""""""""
8952
8953The type of the incoming values is specified with the first type field.
8954After this, the '``phi``' instruction takes a list of pairs as
8955arguments, with one pair for each predecessor basic block of the current
8956block. Only values of :ref:`first class <t_firstclass>` type may be used as
8957the value arguments to the PHI node. Only labels may be used as the
8958label arguments.
8959
8960There must be no non-phi instructions between the start of a basic block
8961and the PHI instructions: i.e. PHI instructions must be first in a basic
8962block.
8963
8964For the purposes of the SSA form, the use of each incoming value is
8965deemed to occur on the edge from the corresponding predecessor block to
8966the current block (but after any definition of an '``invoke``'
8967instruction's return value on the same edge).
8968
8969Semantics:
8970""""""""""
8971
8972At runtime, the '``phi``' instruction logically takes on the value
8973specified by the pair corresponding to the predecessor basic block that
8974executed just prior to the current block.
8975
8976Example:
8977""""""""
8978
8979.. code-block:: llvm
8980
8981 Loop: ; Infinite loop that counts from 0 on up...
8982 %indvar = phi i32 [ 0, %LoopHeader ], [ %nextindvar, %Loop ]
8983 %nextindvar = add i32 %indvar, 1
8984 br label %Loop
8985
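As a further illustration, the sketch below uses a ``phi`` at a join point
rather than around a loop. The function and label names (``@max``,
``take_a``, ``take_b``, ``merge``) are made up for this example; the ``phi``
has one pair per predecessor of ``merge`` and takes the value from whichever
predecessor actually ran.

.. code-block:: llvm

      define i32 @max(i32 %a, i32 %b) {
      entry:
        %cmp = icmp sgt i32 %a, %b
        br i1 %cmp, label %take_a, label %take_b

      take_a:                         ; reached when %a > %b
        br label %merge

      take_b:                         ; reached otherwise
        br label %merge

      merge:
        ; One [ value, label ] pair for each predecessor of this block.
        %result = phi i32 [ %a, %take_a ], [ %b, %take_b ]
        ret i32 %result
      }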
8986.. _i_select:
8987
8988'``select``' Instruction
8989^^^^^^^^^^^^^^^^^^^^^^^^
8990
8991Syntax:
8992"""""""
8993
8994::
8995
8996 <result> = select selty <cond>, <ty> <val1>, <ty> <val2> ; yields ty
8997
8998 selty is either i1 or {<N x i1>}
8999
9000Overview:
9001"""""""""
9002
The '``select``' instruction is used to choose one value based on a
condition, without IR-level branching.

Arguments:
""""""""""

The '``select``' instruction requires an 'i1' value or a vector of 'i1'
values indicating the condition, and two values of the same :ref:`first
class <t_firstclass>` type.

Semantics:
""""""""""
9015
9016If the condition is an i1 and it evaluates to 1, the instruction returns
9017the first value argument; otherwise, it returns the second value
9018argument.
9019
9020If the condition is a vector of i1, then the value arguments must be
9021vectors of the same size, and the selection is done element by element.
9022
If the condition is an i1 and the value arguments are vectors of the
same size, then an entire vector is selected.

Example:
""""""""
9028
9029.. code-block:: llvm
9030
9031 %X = select i1 true, i8 17, i8 42 ; yields i8:17
9032
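The two vector cases described under Semantics can be sketched as follows.
The function name ``@blend`` and its parameters are hypothetical and only
serve to make the fragment self-contained.

.. code-block:: llvm

      define <4 x i32> @blend(i1 %c, <4 x i32> %x, <4 x i32> %y) {
        ; Vector condition: each lane is chosen independently, so the
        ; result is <x0, y1, x2, y3>.
        %lanes = select <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x i32> %x, <4 x i32> %y
        ; Scalar i1 condition with vector operands: one whole vector is chosen.
        %whole = select i1 %c, <4 x i32> %lanes, <4 x i32> %y
        ret <4 x i32> %whole
      }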
9033.. _i_call:
9034
9035'``call``' Instruction
9036^^^^^^^^^^^^^^^^^^^^^^
9037
9038Syntax:
9039"""""""
9040
9041::
9042
      <result> = [tail | musttail | notail ] call [fast-math flags] [cconv] [ret attrs] <ty>|<fnty> <fnptrval>(<function args>) [fn attrs]
                   [ operand bundles ]

Overview:
"""""""""
9048
9049The '``call``' instruction represents a simple function call.
9050
9051Arguments:
9052""""""""""
9053
9054This instruction requires several arguments:
9055
#. The optional ``tail`` and ``musttail`` markers indicate that the optimizers
   should perform tail call optimization. The ``tail`` marker is a hint that
   `can be ignored <CodeGenerator.html#sibcallopt>`_. The ``musttail`` marker
   means that the call must be tail call optimized in order for the program to
   be correct. The ``musttail`` marker provides these guarantees:

   #. The call will not cause unbounded stack growth if it is part of a
      recursive cycle in the call graph.
   #. Arguments with the :ref:`inalloca <attr_inalloca>` attribute are
      forwarded in place.

   Both markers imply that the callee does not access allocas from the caller.
   The ``tail`` marker additionally implies that the callee does not access
   varargs from the caller, while ``musttail`` implies that varargs from the
   caller are passed to the callee. Calls marked ``musttail`` must obey the
   following additional rules:

   - The call must immediately precede a :ref:`ret <i_ret>` instruction,
     or a pointer bitcast followed by a ret instruction.
   - The ret instruction must return the (possibly bitcasted) value
     produced by the call or void.
   - The caller and callee prototypes must match. Pointer types of
     parameters or return types may differ in pointee type, but not
     in address space.
   - The calling conventions of the caller and callee must match.
   - All ABI-impacting function attributes, such as sret, byval, inreg,
     returned, and inalloca, must match.
   - The callee must be varargs iff the caller is varargs. Bitcasting a
     non-varargs function to the appropriate varargs type is legal so
     long as the non-varargs prefixes obey the other rules.

   Tail call optimization for calls marked ``tail`` is guaranteed to occur if
   the following conditions are met:

   - Caller and callee both have the calling convention ``fastcc``.
   - The call is in tail position (ret immediately follows call and ret
     uses value of call or is void).
   - Option ``-tailcallopt`` is enabled, or
     ``llvm::GuaranteedTailCallOpt`` is ``true``.
   - `Platform-specific constraints are
     met. <CodeGenerator.html#tailcallopt>`_

#. The optional ``notail`` marker indicates that the optimizers should not add
   ``tail`` or ``musttail`` markers to the call. It is used to prevent tail
   call optimization from being performed on the call.

#. The optional ``fast-math flags`` marker indicates that the call has one or more
   :ref:`fast-math flags <fastmath>`, which are optimization hints to enable
   otherwise unsafe floating-point optimizations. Fast-math flags are only valid
   for calls that return a floating-point scalar or vector type.

#. The optional "cconv" marker indicates which :ref:`calling
   convention <callingconv>` the call should use. If none is
   specified, the call defaults to using C calling conventions. The
   calling convention of the call must match the calling convention of
   the target function, or else the behavior is undefined.
#. The optional :ref:`Parameter Attributes <paramattrs>` list for return
   values. Only '``zeroext``', '``signext``', and '``inreg``' attributes
   are valid here.
#. '``ty``': the type of the call instruction itself which is also the
   type of the return value. Functions that return no value are marked
   ``void``.
#. '``fnty``': shall be the signature of the function being called. The
   argument types must match the types implied by this signature. This
   type can be omitted if the function is not varargs.
#. '``fnptrval``': An LLVM value containing a pointer to a function to
   be called. In most cases, this is a direct function call, but
   indirect ``call``'s are just as possible, calling an arbitrary pointer
   to function value.
#. '``function args``': argument list whose types match the function
   signature argument types and parameter attributes. All arguments must
   be of :ref:`first class <t_firstclass>` type. If the function signature
   indicates the function accepts a variable number of arguments, the
   extra arguments can be specified.
#. The optional :ref:`function attributes <fnattrs>` list.
#. The optional :ref:`operand bundles <opbundles>` list.

Semantics:
""""""""""
9135
9136The '``call``' instruction is used to cause control flow to transfer to
9137a specified function, with its incoming arguments bound to the specified
9138values. Upon a '``ret``' instruction in the called function, control
9139flow continues with the instruction after the function call, and the
9140return value of the function is bound to the result argument.
9141
9142Example:
9143""""""""
9144
9145.. code-block:: llvm
9146
      %retval = call i32 @test(i32 %argc)
      call i32 (i8*, ...) @printf(i8* %msg, i32 12, i8 42)        ; yields i32
      %X = tail call i32 @foo()                                    ; yields i32
      %Y = tail call fastcc i32 @foo()  ; yields i32
      call void %foo(i8 signext 97)

      %struct.A = type { i32, i8 }
      %r = call %struct.A @foo()                        ; yields { i32, i8 }
      %gr = extractvalue %struct.A %r, 0                ; yields i32
      %gr1 = extractvalue %struct.A %r, 1               ; yields i8
      call void @foo() noreturn                         ; indicates that @foo never returns normally
      %ZZ = call zeroext i32 @bar()                     ; return value is zero extended
9159
LLVM treats calls to some functions with names and arguments that match
the standard C99 library as being the C99 library functions, and may
perform optimizations or generate code for them under that assumption.
This is something we'd like to change in the future to provide better
support for freestanding environments and non-C-based languages.
9165
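The ``musttail`` rules and the ``fastcc`` conditions for ``tail`` listed
under Arguments can be sketched as below. All function names here
(``@callee``, ``@caller``, ``@fast_callee``, ``@fast_caller``) are
hypothetical. Whether the second call is actually lowered as a tail call
still depends on ``-tailcallopt`` and the platform constraints mentioned
above.

.. code-block:: llvm

      declare i32 @callee(i32, i8*)

      define i32 @caller(i32 %x, i8* %p) {
        ; Caller and callee have the same prototype and calling convention,
        ; and the call immediately precedes a ret of the call's value.
        %r = musttail call i32 @callee(i32 %x, i8* %p)
        ret i32 %r
      }

      declare fastcc i32 @fast_callee(i32)

      define fastcc i32 @fast_caller(i32 %x) {
        ; Both sides use fastcc and the call is in tail position.
        %r = tail call fastcc i32 @fast_callee(i32 %x)
        ret i32 %r
      }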
9166.. _i_va_arg:
9167
9168'``va_arg``' Instruction
9169^^^^^^^^^^^^^^^^^^^^^^^^
9170
9171Syntax:
9172"""""""
9173
9174::
9175
9176 <resultval> = va_arg <va_list*> <arglist>, <argty>
9177
9178Overview:
9179"""""""""
9180
9181The '``va_arg``' instruction is used to access arguments passed through
9182the "variable argument" area of a function call. It is used to implement
9183the ``va_arg`` macro in C.
9184
9185Arguments:
9186""""""""""
9187
9188This instruction takes a ``va_list*`` value and the type of the
9189argument. It returns a value of the specified argument type and
9190increments the ``va_list`` to point to the next argument. The actual
9191type of ``va_list`` is target specific.
9192
9193Semantics:
9194""""""""""
9195
9196The '``va_arg``' instruction loads an argument of the specified type
9197from the specified ``va_list`` and causes the ``va_list`` to point to
9198the next argument. For more information, see the variable argument
9199handling :ref:`Intrinsic Functions <int_varargs>`.
9200
9201It is legal for this instruction to be called in a function which does
9202not take a variable number of arguments, for example, the ``vfprintf``
9203function.
9204
9205``va_arg`` is an LLVM instruction instead of an :ref:`intrinsic
9206function <intrinsics>` because it takes a type as an argument.
9207
9208Example:
9209""""""""
9210
9211See the :ref:`variable argument processing <int_varargs>` section.
9212
9213Note that the code generator does not yet fully support va\_arg on many
9214targets. Also, it does not currently support va\_arg with aggregate
9215types on any target.
9216
9217.. _i_landingpad:
9218
9219'``landingpad``' Instruction
9220^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9221
9222Syntax:
9223"""""""
9224
9225::
9226
      <resultval> = landingpad <resultty> <clause>+
      <resultval> = landingpad <resultty> cleanup <clause>*

      <clause> := catch <type> <value>
      <clause> := filter <array constant type> <array constant>
9232
9233Overview:
9234"""""""""
9235
The '``landingpad``' instruction is used by `LLVM's exception handling
system <ExceptionHandling.html#overview>`_ to specify that a basic block
is a landing pad --- one where the exception lands, and corresponds to the
code found in the ``catch`` portion of a ``try``/``catch`` sequence. It
defines values supplied by the :ref:`personality function <personalityfn>` upon
re-entry to the function. The ``resultval`` has the type ``resultty``.
9242
9243Arguments:
9244""""""""""
9245
The optional ``cleanup`` flag indicates that the landing pad block is a
cleanup.

A ``clause`` begins with the clause type --- ``catch`` or ``filter`` --- and
contains the global variable representing the "type" that may be caught
or filtered respectively. Unlike the ``catch`` clause, the ``filter``
clause takes an array constant as its argument. Use
"``[0 x i8**] undef``" for a filter which cannot throw. The
'``landingpad``' instruction must contain *at least* one ``clause`` or
the ``cleanup`` flag.
9256
9257Semantics:
9258""""""""""
9259
The '``landingpad``' instruction defines the values which are set by the
:ref:`personality function <personalityfn>` upon re-entry to the function, and
therefore the "result type" of the ``landingpad`` instruction. As with
calling conventions, how the personality function results are
represented in LLVM IR is target specific.
9265
9266The clauses are applied in order from top to bottom. If two
9267``landingpad`` instructions are merged together through inlining, the
9268clauses from the calling function are appended to the list of clauses.
9269When the call stack is being unwound due to an exception being thrown,
9270the exception is compared against each ``clause`` in turn. If it doesn't
9271match any of the clauses, and the ``cleanup`` flag is not set, then
9272unwinding continues further up the call stack.
9273
9274The ``landingpad`` instruction has several restrictions:
9275
- A landing pad block is a basic block which is the unwind destination
  of an '``invoke``' instruction.
- A landing pad block must have a '``landingpad``' instruction as its
  first non-PHI instruction.
- There can be only one '``landingpad``' instruction within the landing
  pad block.
- A basic block that is not a landing pad block may not include a
  '``landingpad``' instruction.

Example:
""""""""
9287
9288.. code-block:: llvm
9289
      ;; A landing pad which can catch an integer.
      %res = landingpad { i8*, i32 }
               catch i8** @_ZTIi
      ;; A landing pad that is a cleanup.
      %res = landingpad { i8*, i32 }
               cleanup
      ;; A landing pad which can catch an integer and can only throw a double.
      %res = landingpad { i8*, i32 }
               catch i8** @_ZTIi
               filter [1 x i8**] [i8** @_ZTId]
9300
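To show how the restrictions above fit together, here is a minimal sketch of
a complete function whose ``invoke`` unwinds to a landing pad block. It
assumes the Itanium C++ personality routine ``__gxx_personality_v0`` and the
``{ i8*, i32 }`` result type used in the examples above; ``@may_throw`` and
``@f`` are hypothetical names.

.. code-block:: llvm

      @_ZTIi = external constant i8*
      declare void @may_throw()
      declare i32 @__gxx_personality_v0(...)

      define void @f() personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
      entry:
        invoke void @may_throw()
                to label %cont unwind label %lpad

      cont:
        ret void

      lpad:                           ; unwind destination of the invoke
        ; The landingpad is the first non-PHI instruction of the block.
        %res = landingpad { i8*, i32 }
                 catch i8** @_ZTIi
        %exn = extractvalue { i8*, i32 } %res, 0   ; exception pointer
        %sel = extractvalue { i8*, i32 } %res, 1   ; selector value
        ; This sketch does not handle the exception; it keeps unwinding.
        resume { i8*, i32 } %res
      }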
.. _i_catchpad:
9302
9303'``catchpad``' Instruction
9304^^^^^^^^^^^^^^^^^^^^^^^^^^
9305
9306Syntax:
9307"""""""
9308
9309::
9310
9311 <resultval> = catchpad within <catchswitch> [<args>*]
9312
9313Overview:
9314"""""""""
9315
9316The '``catchpad``' instruction is used by `LLVM's exception handling
9317system <ExceptionHandling.html#overview>`_ to specify that a basic block
9318begins a catch handler --- one where a personality routine attempts to transfer
9319control to catch an exception.
9320
9321Arguments:
9322""""""""""
9323
9324The ``catchswitch`` operand must always be a token produced by a
9325:ref:`catchswitch <i_catchswitch>` instruction in a predecessor block. This
9326ensures that each ``catchpad`` has exactly one predecessor block, and it always
9327terminates in a ``catchswitch``.
9328
9329The ``args`` correspond to whatever information the personality routine
9330requires to know if this is an appropriate handler for the exception. Control
9331will transfer to the ``catchpad`` if this is the first appropriate handler for
9332the exception.
9333
9334The ``resultval`` has the type :ref:`token <t_token>` and is used to match the
9335``catchpad`` to corresponding :ref:`catchrets <i_catchret>` and other nested EH
9336pads.
9337
9338Semantics:
9339""""""""""
9340
9341When the call stack is being unwound due to an exception being thrown, the
9342exception is compared against the ``args``. If it doesn't match, control will
9343not reach the ``catchpad`` instruction. The representation of ``args`` is
9344entirely target and personality function-specific.
9345
9346Like the :ref:`landingpad <i_landingpad>` instruction, the ``catchpad``
9347instruction must be the first non-phi of its parent basic block.
9348
9349The meaning of the tokens produced and consumed by ``catchpad`` and other "pad"
9350instructions is described in the
9351`Windows exception handling documentation\ <ExceptionHandling.html#wineh>`_.
9352
9353When a ``catchpad`` has been "entered" but not yet "exited" (as
9354described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
9355it is undefined behavior to execute a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
9356that does not carry an appropriate :ref:`"funclet" bundle <ob_funclet>`.
9357
9358Example:
9359""""""""
9360
.. code-block:: text

    dispatch:
      %cs = catchswitch within none [label %handler0] unwind to caller
      ;; A catch block which can catch an integer.
    handler0:
      %tok = catchpad within %cs [i8** @_ZTIi]
9368
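The sketch below places a ``catchpad`` in a whole function and shows the
"funclet" bundle required on calls made while the pad is entered. It assumes
the MSVC C++ personality ``__CxxFrameHandler3``, for which the argument list
``[i8* null, i32 64, i8* null]`` is commonly used to describe a catch-all
handler; as noted above, the meaning of ``args`` is personality-specific.
``@may_throw``, ``@handle`` and ``@f`` are hypothetical names.

.. code-block:: llvm

      declare void @may_throw()
      declare void @handle()
      declare i32 @__CxxFrameHandler3(...)

      define void @f() personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
      entry:
        invoke void @may_throw()
                to label %cont unwind label %dispatch

      dispatch:
        %cs = catchswitch within none [label %handler0] unwind to caller

      handler0:
        ; First non-phi instruction of the handler block.
        %tok = catchpad within %cs [i8* null, i32 64, i8* null]
        ; Calls inside the entered pad carry a "funclet" bundle naming it.
        call void @handle() [ "funclet"(token %tok) ]
        catchret from %tok to label %cont

      cont:
        ret void
      }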
.. _i_cleanuppad:
9370
9371'``cleanuppad``' Instruction
9372^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9373
9374Syntax:
9375"""""""
9376
9377::
9378
      <resultval> = cleanuppad within <parent> [<args>*]

Overview:
"""""""""
9383
The '``cleanuppad``' instruction is used by `LLVM's exception handling
system <ExceptionHandling.html#overview>`_ to specify that a basic block
is a cleanup block --- one where a personality routine attempts to
transfer control to run cleanup actions.
The ``args`` correspond to whatever additional
information the :ref:`personality function <personalityfn>` requires to
execute the cleanup.
The ``resultval`` has the type :ref:`token <t_token>` and is used to
match the ``cleanuppad`` to corresponding :ref:`cleanuprets <i_cleanupret>`.
The ``parent`` argument is the token of the funclet that contains the
``cleanuppad`` instruction. If the ``cleanuppad`` is not inside a funclet,
this operand may be the token ``none``.

Arguments:
""""""""""
9399
9400The instruction takes a list of arbitrary values which are interpreted
9401by the :ref:`personality function <personalityfn>`.
9402
9403Semantics:
9404""""""""""
9405
When the call stack is being unwound due to an exception being thrown,
the :ref:`personality function <personalityfn>` transfers control to the
``cleanuppad`` with the aid of the personality-specific arguments.
As with calling conventions, how the personality function results are
represented in LLVM IR is target specific.

The ``cleanuppad`` instruction has several restrictions:
9413
- A cleanup block is a basic block which is the unwind destination of
  an exceptional instruction.
- A cleanup block must have a '``cleanuppad``' instruction as its
  first non-PHI instruction.
- There can be only one '``cleanuppad``' instruction within the
  cleanup block.
- A basic block that is not a cleanup block may not include a
  '``cleanuppad``' instruction.

When a ``cleanuppad`` has been "entered" but not yet "exited" (as
described in the `EH documentation\ <ExceptionHandling.html#wineh-constraints>`_),
it is undefined behavior to execute a :ref:`call <i_call>` or :ref:`invoke <i_invoke>`
that does not carry an appropriate :ref:`"funclet" bundle <ob_funclet>`.

Example:
""""""""
9430
.. code-block:: text

      %tok = cleanuppad within %cs []
David Majnemer654e1302015-07-31 17:58:14 +00009434
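A slightly fuller sketch shows a cleanup block that is not nested in any
funclet (so its parent is ``none``), runs one cleanup call, and then
continues unwinding with ``cleanupret``. It assumes the MSVC C++ personality
``__CxxFrameHandler3``; ``@may_throw``, ``@release`` and ``@g`` are
hypothetical names.

.. code-block:: llvm

      declare void @may_throw()
      declare void @release()
      declare i32 @__CxxFrameHandler3(...)

      define void @g() personality i8* bitcast (i32 (...)* @__CxxFrameHandler3 to i8*) {
      entry:
        invoke void @may_throw()
                to label %done unwind label %cleanup

      cleanup:
        ; First non-PHI instruction of the cleanup block.
        %tok = cleanuppad within none []
        ; Calls made while the pad is entered carry a "funclet" bundle.
        call void @release() [ "funclet"(token %tok) ]
        cleanupret from %tok unwind to caller

      done:
        call void @release()
        ret void
      }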
.. _intrinsics:
9436
9437Intrinsic Functions
9438===================
9439
9440LLVM supports the notion of an "intrinsic function". These functions
9441have well known names and semantics and are required to follow certain
9442restrictions. Overall, these intrinsics represent an extension mechanism
9443for the LLVM language that does not require changing all of the
9444transformations in LLVM when adding to the language (or the bitcode
9445reader/writer, the parser, etc...).
9446
9447Intrinsic function names must all start with an "``llvm.``" prefix. This
9448prefix is reserved in LLVM for intrinsic names; thus, function names may
9449not begin with this prefix. Intrinsic functions must always be external
9450functions: you cannot define the body of intrinsic functions. Intrinsic
9451functions may only be used in call or invoke instructions: it is illegal
to take the address of an intrinsic function. Additionally, because
intrinsic functions are part of the LLVM language, any intrinsic that is
added must be documented here.
9455
9456Some intrinsic functions can be overloaded, i.e., the intrinsic
9457represents a family of functions that perform the same operation but on
9458different data types. Because LLVM can represent over 8 million
9459different integer types, overloading is used commonly to allow an
9460intrinsic function to operate on any integer type. One or more of the
9461argument types or the result type can be overloaded to accept any
9462integer type. Argument types may also be defined as exactly matching a
9463previous argument's type or the result type. This allows an intrinsic
9464function which accepts multiple arguments, but needs all of them to be
9465of the same type, to only be overloaded with respect to a single
9466argument or the result.
9467
An overloaded intrinsic has the names of its overloaded argument
types encoded into its function name, each preceded by a period. Only
9470those types which are overloaded result in a name suffix. Arguments
9471whose type is matched against another type do not. For example, the
9472``llvm.ctpop`` function can take an integer of any width and returns an
9473integer of exactly the same integer width. This leads to a family of
9474functions such as ``i8 @llvm.ctpop.i8(i8 %val)`` and
9475``i29 @llvm.ctpop.i29(i29 %val)``. Only one type, the return type, is
9476overloaded, and only one type suffix is required. Because the argument's
9477type is matched against the return type, it does not require its own
9478name suffix.
9479
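The naming rule can be illustrated with a few members of the ``llvm.ctpop``
family. Each declaration below carries exactly one type suffix because only
one type is overloaded; the wrapper function ``@popcount8`` is a hypothetical
name used to show a call.

.. code-block:: llvm

      declare i8 @llvm.ctpop.i8(i8)
      declare i29 @llvm.ctpop.i29(i29)
      declare <4 x i32> @llvm.ctpop.v4i32(<4 x i32>)

      define i8 @popcount8(i8 %val) {
        ; The suffix .i8 selects the i8 member of the family.
        %n = call i8 @llvm.ctpop.i8(i8 %val)
        ret i8 %n
      }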
9480To learn how to add an intrinsic function, please see the `Extending
9481LLVM Guide <ExtendingLLVM.html>`_.
9482
9483.. _int_varargs:
9484
9485Variable Argument Handling Intrinsics
9486-------------------------------------
9487
9488Variable argument support is defined in LLVM with the
9489:ref:`va_arg <i_va_arg>` instruction and these three intrinsic
9490functions. These functions are related to the similarly named macros
9491defined in the ``<stdarg.h>`` header file.
9492
9493All of these functions operate on arguments that use a target-specific
9494value type "``va_list``". The LLVM assembly language reference manual
9495does not define what this type is, so all transformations should be
9496prepared to handle these functions regardless of the type used.
9497
9498This example shows how the :ref:`va_arg <i_va_arg>` instruction and the
9499variable argument handling intrinsic functions are used.
9500
9501.. code-block:: llvm
9502
      ; This struct is different for every platform. For most platforms,
      ; it is merely an i8*.
      %struct.va_list = type { i8* }

      ; For Unix x86_64 platforms, va_list is the following struct:
      ; %struct.va_list = type { i32, i32, i8*, i8* }

      define i32 @test(i32 %X, ...) {
        ; Initialize variable argument processing
        %ap = alloca %struct.va_list
        %ap2 = bitcast %struct.va_list* %ap to i8*
        call void @llvm.va_start(i8* %ap2)

        ; Read a single integer argument
        %tmp = va_arg i8* %ap2, i32

        ; Demonstrate usage of llvm.va_copy and llvm.va_end
        %aq = alloca i8*
        %aq2 = bitcast i8** %aq to i8*
        call void @llvm.va_copy(i8* %aq2, i8* %ap2)
        call void @llvm.va_end(i8* %aq2)

        ; Stop processing of arguments.
        call void @llvm.va_end(i8* %ap2)
        ret i32 %tmp
      }

      declare void @llvm.va_start(i8*)
      declare void @llvm.va_copy(i8*, i8*)
      declare void @llvm.va_end(i8*)
9533
9534.. _int_va_start:
9535
9536'``llvm.va_start``' Intrinsic
9537^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9538
9539Syntax:
9540"""""""
9541
9542::
9543
      declare void @llvm.va_start(i8* <arglist>)

Overview:
"""""""""
9548
9549The '``llvm.va_start``' intrinsic initializes ``*<arglist>`` for
9550subsequent use by ``va_arg``.
9551
9552Arguments:
9553""""""""""
9554
9555The argument is a pointer to a ``va_list`` element to initialize.
9556
9557Semantics:
9558""""""""""
9559
9560The '``llvm.va_start``' intrinsic works just like the ``va_start`` macro
9561available in C. In a target-dependent way, it initializes the
9562``va_list`` element to which the argument points, so that the next call
9563to ``va_arg`` will produce the first variable argument passed to the
9564function. Unlike the C ``va_start`` macro, this intrinsic does not need
9565to know the last argument of the function as the compiler can figure
9566that out.
9567
9568'``llvm.va_end``' Intrinsic
9569^^^^^^^^^^^^^^^^^^^^^^^^^^^
9570
9571Syntax:
9572"""""""
9573
9574::
9575
9576 declare void @llvm.va_end(i8* <arglist>)
9577
9578Overview:
9579"""""""""
9580
9581The '``llvm.va_end``' intrinsic destroys ``*<arglist>``, which has been
9582initialized previously with ``llvm.va_start`` or ``llvm.va_copy``.
9583
9584Arguments:
9585""""""""""
9586
9587The argument is a pointer to a ``va_list`` to destroy.
9588
9589Semantics:
9590""""""""""
9591
9592The '``llvm.va_end``' intrinsic works just like the ``va_end`` macro
9593available in C. In a target-dependent way, it destroys the ``va_list``
9594element to which the argument points. Calls to
9595:ref:`llvm.va_start <int_va_start>` and
9596:ref:`llvm.va_copy <int_va_copy>` must be matched exactly with calls to
9597``llvm.va_end``.
9598
9599.. _int_va_copy:
9600
9601'``llvm.va_copy``' Intrinsic
9602^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9603
9604Syntax:
9605"""""""
9606
9607::
9608
9609 declare void @llvm.va_copy(i8* <destarglist>, i8* <srcarglist>)
9610
9611Overview:
9612"""""""""
9613
9614The '``llvm.va_copy``' intrinsic copies the current argument position
9615from the source argument list to the destination argument list.
9616
9617Arguments:
9618""""""""""
9619
9620The first argument is a pointer to a ``va_list`` element to initialize.
9621The second argument is a pointer to a ``va_list`` element to copy from.
9622
9623Semantics:
9624""""""""""
9625
9626The '``llvm.va_copy``' intrinsic works just like the ``va_copy`` macro
9627available in C. In a target-dependent way, it copies the source
9628``va_list`` element into the destination ``va_list`` element. This
intrinsic is necessary because the ``llvm.va_start`` intrinsic may be
arbitrarily complex and require, for example, memory allocation.
9631
9632Accurate Garbage Collection Intrinsics
9633--------------------------------------
9634
LLVM's support for `Accurate Garbage Collection <GarbageCollection.html>`_
(GC) requires the frontend to generate code containing appropriate intrinsic
calls and select an appropriate GC strategy which knows how to lower these
intrinsics in a manner which is appropriate for the target collector.

These intrinsics allow identification of :ref:`GC roots on the
stack <int_gcroot>`, as well as garbage collector implementations that
require :ref:`read <int_gcread>` and :ref:`write <int_gcwrite>` barriers.
Frontends for type-safe garbage collected languages should generate
these intrinsics to make use of the LLVM garbage collectors. For more
details, see `Garbage Collection with LLVM <GarbageCollection.html>`_.

Experimental Statepoint Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

LLVM provides a second experimental set of intrinsics for describing garbage
collection safepoints in compiled code. These intrinsics are an alternative
to the ``llvm.gcroot`` intrinsics, but are compatible with the ones for
:ref:`read <int_gcread>` and :ref:`write <int_gcwrite>` barriers. The
differences in approach are covered in the `Garbage Collection with LLVM
<GarbageCollection.html>`_ documentation. The intrinsics themselves are
described in :doc:`Statepoints`.
Sean Silvab084af42012-12-07 10:36:55 +00009657
9658.. _int_gcroot:
9659
9660'``llvm.gcroot``' Intrinsic
9661^^^^^^^^^^^^^^^^^^^^^^^^^^^
9662
9663Syntax:
9664"""""""
9665
9666::
9667
9668 declare void @llvm.gcroot(i8** %ptrloc, i8* %metadata)
9669
9670Overview:
9671"""""""""
9672
9673The '``llvm.gcroot``' intrinsic declares the existence of a GC root to
9674the code generator, and allows some metadata to be associated with it.
9675
9676Arguments:
9677""""""""""
9678
9679The first argument specifies the address of a stack object that contains
9680the root pointer. The second pointer (which must be either a constant or
9681a global value address) contains the meta-data to be associated with the
9682root.
9683
9684Semantics:
9685""""""""""
9686
9687At runtime, a call to this intrinsic stores a null pointer into the
9688"ptrloc" location. At compile-time, the code generator generates
9689information to allow the runtime to find the pointer at GC safe points.
9690The '``llvm.gcroot``' intrinsic may only be used in a function which
9691:ref:`specifies a GC algorithm <gc>`.
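
As a minimal sketch, a frontend might register a single root with no
metadata as shown below. The function name ``@f`` and the choice of the
``shadow-stack`` strategy are illustrative assumptions, not requirements.

.. code-block:: llvm

   declare void @llvm.gcroot(i8** %ptrloc, i8* %metadata)

   define void @f() gc "shadow-stack" {
   entry:
     ; Stack object that will hold the root pointer.
     %root = alloca i8*
     ; Null is a valid constant for the metadata argument. At runtime
     ; this call also stores a null pointer into %root.
     call void @llvm.gcroot(i8** %root, i8* null)
     ret void
   }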

.. _int_gcread:

'``llvm.gcread``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

   declare i8* @llvm.gcread(i8* %ObjPtr, i8** %Ptr)

Overview:
"""""""""

The '``llvm.gcread``' intrinsic identifies reads of references from heap
locations, allowing garbage collector implementations that require read
barriers.

Arguments:
""""""""""

The second argument is the address to read from, which should be an
address allocated from the garbage collector. The first argument is a
pointer to the start of the referenced object, if needed by the language
runtime (otherwise null).

Semantics:
""""""""""

The '``llvm.gcread``' intrinsic has the same semantics as a load
instruction, but may be replaced with substantially more complex code by
the garbage collector runtime, as needed. The '``llvm.gcread``'
intrinsic may only be used in a function which :ref:`specifies a GC
algorithm <gc>`.

.. _int_gcwrite:

'``llvm.gcwrite``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

   declare void @llvm.gcwrite(i8* %P1, i8* %Obj, i8** %P2)

Overview:
"""""""""

The '``llvm.gcwrite``' intrinsic identifies writes of references to heap
locations, allowing garbage collector implementations that require write
barriers (such as generational or reference counting collectors).

Arguments:
""""""""""

The first argument is the reference to store, the second is the start of
the object to store it to, and the third is the address of the field of
Obj to store to. If the runtime does not require a pointer to the
object, Obj may be null.

Semantics:
""""""""""

The '``llvm.gcwrite``' intrinsic has the same semantics as a store
instruction, but may be replaced with substantially more complex code by
the garbage collector runtime, as needed. The '``llvm.gcwrite``'
intrinsic may only be used in a function which :ref:`specifies a GC
algorithm <gc>`.
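
The following sketch copies one reference field to another through the
barrier intrinsics instead of a plain load and store. The function
``@copy_field`` and its parameters are hypothetical, and ``shadow-stack``
is used only as an example strategy name.

.. code-block:: llvm

   declare i8* @llvm.gcread(i8*, i8**)
   declare void @llvm.gcwrite(i8*, i8*, i8**)

   define void @copy_field(i8* %obj, i8** %src, i8** %dst)
       gc "shadow-stack" {
   entry:
     ; Read barrier: object start first, field address second.
     %ref = call i8* @llvm.gcread(i8* %obj, i8** %src)
     ; Write barrier: value, object start, field address.
     call void @llvm.gcwrite(i8* %ref, i8* %obj, i8** %dst)
     ret void
   }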

Code Generator Intrinsics
-------------------------

These intrinsics are provided by LLVM to expose special features that
may only be implemented with code generator support.

'``llvm.returnaddress``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

   declare i8* @llvm.returnaddress(i32 <level>)

Overview:
"""""""""

The '``llvm.returnaddress``' intrinsic attempts to compute a
target-specific value indicating the return address of the current
function or one of its callers.

Arguments:
""""""""""

The argument to this intrinsic indicates which function to return the
address for. Zero indicates the calling function, one indicates its
caller, etc. The argument is **required** to be a constant integer
value.

Semantics:
""""""""""

The '``llvm.returnaddress``' intrinsic either returns a pointer
indicating the return address of the specified call frame, or zero if it
cannot be identified. The value returned by this intrinsic is likely to
be incorrect or 0 for arguments other than zero, so it should only be
used for debugging purposes.

Note that calling this intrinsic does not prevent function inlining or
other aggressive transformations, so the value returned may not be that
of the obvious source-language caller.

'``llvm.addressofreturnaddress``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

   declare i8* @llvm.addressofreturnaddress()

Overview:
"""""""""

The '``llvm.addressofreturnaddress``' intrinsic returns a target-specific
pointer to the place in the stack frame where the return address of the
current function is stored.

Semantics:
""""""""""

Note that calling this intrinsic does not prevent function inlining or
other aggressive transformations, so the value returned may not be that
of the obvious source-language caller.

This intrinsic is only implemented for x86.

'``llvm.frameaddress``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

   declare i8* @llvm.frameaddress(i32 <level>)

Overview:
"""""""""

The '``llvm.frameaddress``' intrinsic attempts to return the
target-specific frame pointer value for the specified stack frame.

Arguments:
""""""""""

The argument to this intrinsic indicates which function to return the
frame pointer for. Zero indicates the calling function, one indicates
its caller, etc. The argument is **required** to be a constant integer
value.

Semantics:
""""""""""

The '``llvm.frameaddress``' intrinsic either returns a pointer
indicating the frame address of the specified call frame, or zero if it
cannot be identified. The value returned by this intrinsic is likely to
be incorrect or 0 for arguments other than zero, so it should only be
used for debugging purposes.

Note that calling this intrinsic does not prevent function inlining or
other aggressive transformations, so the value returned may not be that
of the obvious source-language caller.
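
A short sketch of both ``llvm.returnaddress`` and ``llvm.frameaddress``
with a level of zero, the only level the text above describes as likely to
be reliable. The wrapper function names are hypothetical.

.. code-block:: llvm

   declare i8* @llvm.returnaddress(i32)
   declare i8* @llvm.frameaddress(i32)

   define i8* @my_return_address() {
   entry:
     ; Level 0: the function containing this call. Must be a constant.
     %ra = call i8* @llvm.returnaddress(i32 0)
     ret i8* %ra
   }

   define i8* @my_frame_address() {
   entry:
     %fp = call i8* @llvm.frameaddress(i32 0)
     ret i8* %fp
   }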

'``llvm.localescape``' and '``llvm.localrecover``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

   declare void @llvm.localescape(...)
   declare i8* @llvm.localrecover(i8* %func, i8* %fp, i32 %idx)

Overview:
"""""""""

The '``llvm.localescape``' intrinsic escapes offsets of a collection of static
allocas, and the '``llvm.localrecover``' intrinsic applies those offsets to a
live frame pointer to recover the address of the allocation. The offset is
computed during frame layout of the caller of ``llvm.localescape``.

Arguments:
""""""""""

All arguments to '``llvm.localescape``' must be pointers to static allocas or
casts of static allocas. Each function can only call '``llvm.localescape``'
once, and it can only do so from the entry block.

The ``func`` argument to '``llvm.localrecover``' must be a constant
bitcasted pointer to a function defined in the current module. The code
generator cannot determine the frame allocation offset of functions defined in
other modules.

The ``fp`` argument to '``llvm.localrecover``' must be a frame pointer of a
call frame that is currently live. The return value of '``llvm.localaddress``'
is one way to produce such a value, but various runtimes also expose a suitable
pointer in platform-specific ways.

The ``idx`` argument to '``llvm.localrecover``' indicates which alloca passed to
'``llvm.localescape``' to recover. It is zero-indexed.

Semantics:
""""""""""

These intrinsics allow a group of functions to share access to a set of local
stack allocations of one parent function. The parent function may call the
'``llvm.localescape``' intrinsic once from the function entry block, and the
child functions can use '``llvm.localrecover``' to access the escaped allocas.
The '``llvm.localescape``' intrinsic blocks inlining, as inlining changes where
the escaped allocas are allocated, which would break attempts to use
'``llvm.localrecover``'.
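
A sketch of the parent/child pattern described above. The functions
``@parent`` and ``@child`` are hypothetical, and passing the result of
``llvm.localaddress`` as an ordinary argument is just one assumed way of
handing the parent's frame pointer to the child.

.. code-block:: llvm

   declare void @llvm.localescape(...)
   declare i8* @llvm.localrecover(i8*, i8*, i32)
   declare i8* @llvm.localaddress()

   define void @parent() {
   entry:
     %x = alloca i32
     ; Single call, in the entry block, naming only static allocas.
     call void (...) @llvm.localescape(i32* %x)
     store i32 0, i32* %x
     %fp = call i8* @llvm.localaddress()
     call void @child(i8* %fp)
     ret void
   }

   define internal void @child(i8* %fp) {
   entry:
     ; Index 0 selects the first alloca passed to llvm.localescape.
     %x.raw = call i8* @llvm.localrecover(
                  i8* bitcast (void ()* @parent to i8*), i8* %fp, i32 0)
     %x = bitcast i8* %x.raw to i32*
     store i32 42, i32* %x
     ret void
   }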

.. _int_read_register:
.. _int_write_register:

'``llvm.read_register``' and '``llvm.write_register``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

   declare i32 @llvm.read_register.i32(metadata)
   declare i64 @llvm.read_register.i64(metadata)
   declare void @llvm.write_register.i32(metadata, i32 %value)
   declare void @llvm.write_register.i64(metadata, i64 %value)
   !0 = !{!"sp\00"}

Overview:
"""""""""

The '``llvm.read_register``' and '``llvm.write_register``' intrinsics
provide access to the named register. The register must be valid on
the architecture being compiled to. The type needs to be compatible
with the register being read.

Semantics:
""""""""""

The '``llvm.read_register``' intrinsic returns the current value of the
register, where possible. The '``llvm.write_register``' intrinsic sets
the current value of the register, where possible.

This is useful to implement named register global variables that need
to always be mapped to a specific register, as is common practice on
bare-metal programs including OS kernels.

The compiler doesn't check for register availability or use of the used
register in surrounding code, including inline assembly. Because of that,
allocatable registers are not supported.

Warning: So far it only works with the stack pointer on selected
architectures (ARM, AArch64, PowerPC and x86_64). A significant amount of
work is needed to support other registers and even more so, allocatable
registers.
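
A minimal sketch of reading the stack pointer, assuming a 64-bit target on
which the register is named ``sp`` (other targets may spell the name
differently). The function ``@get_sp`` is hypothetical.

.. code-block:: llvm

   declare i64 @llvm.read_register.i64(metadata)

   define i64 @get_sp() {
   entry:
     ; The register is identified by a metadata string node.
     %sp = call i64 @llvm.read_register.i64(metadata !0)
     ret i64 %sp
   }

   !0 = !{!"sp\00"}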
9966
Sean Silvab084af42012-12-07 10:36:55 +00009967.. _int_stacksave:
9968
9969'``llvm.stacksave``' Intrinsic
9970^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9971
9972Syntax:
9973"""""""
9974
9975::
9976
9977 declare i8* @llvm.stacksave()
9978
9979Overview:
9980"""""""""
9981
9982The '``llvm.stacksave``' intrinsic is used to remember the current state
9983of the function stack, for use with
9984:ref:`llvm.stackrestore <int_stackrestore>`. This is useful for
9985implementing language features like scoped automatic variable sized
9986arrays in C99.
9987
9988Semantics:
9989""""""""""
9990
9991This intrinsic returns a opaque pointer value that can be passed to
9992:ref:`llvm.stackrestore <int_stackrestore>`. When an
9993``llvm.stackrestore`` intrinsic is executed with a value saved from
9994``llvm.stacksave``, it effectively restores the state of the stack to
9995the state it was in when the ``llvm.stacksave`` intrinsic executed. In
9996practice, this pops any :ref:`alloca <i_alloca>` blocks from the stack that
9997were allocated after the ``llvm.stacksave`` was executed.
9998
9999.. _int_stackrestore:
10000
10001'``llvm.stackrestore``' Intrinsic
10002^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
10003
10004Syntax:
10005"""""""
10006
10007::
10008
10009 declare void @llvm.stackrestore(i8* %ptr)
10010
10011Overview:
10012"""""""""
10013
10014The '``llvm.stackrestore``' intrinsic is used to restore the state of
10015the function stack to the state it was in when the corresponding
10016:ref:`llvm.stacksave <int_stacksave>` intrinsic executed. This is
10017useful for implementing language features like scoped automatic variable
10018sized arrays in C99.
10019
10020Semantics:
10021""""""""""
10022
10023See the description for :ref:`llvm.stacksave <int_stacksave>`.
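
The save/restore pairing can be sketched as the lowering of a scoped
variable-length array. The functions ``@scoped_vla`` and ``@use`` are
hypothetical and stand in for whatever the frontend emits.

.. code-block:: llvm

   declare i8* @llvm.stacksave()
   declare void @llvm.stackrestore(i8*)
   declare void @use(i32*)

   define void @scoped_vla(i32 %n) {
   entry:
     ; Remember the stack state on entry to the scope.
     %saved = call i8* @llvm.stacksave()
     ; Dynamic alloca for the variable-length array.
     %vla = alloca i32, i32 %n
     call void @use(i32* %vla)
     ; Leaving the scope pops the alloca made after the save.
     call void @llvm.stackrestore(i8* %saved)
     ret void
   }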

.. _int_get_dynamic_area_offset:

'``llvm.get.dynamic.area.offset``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

   declare i32 @llvm.get.dynamic.area.offset.i32()
   declare i64 @llvm.get.dynamic.area.offset.i64()

Overview:
"""""""""

The '``llvm.get.dynamic.area.offset.*``' intrinsic family is used to
get the offset from the native stack pointer to the address of the most
recent dynamic alloca on the caller's stack. These intrinsics are
intended for use in combination with
:ref:`llvm.stacksave <int_stacksave>` to get a
pointer to the most recent dynamic alloca. This is useful, for example,
for AddressSanitizer's stack unpoisoning routines.

Semantics:
""""""""""

These intrinsics return a non-negative integer value that can be used to
get the address of the most recent dynamic alloca, allocated by
:ref:`alloca <i_alloca>` on the caller's stack. In particular, for targets
where the stack grows downwards, adding this offset to the native stack
pointer would get the address of the most recent dynamic alloca. For targets
where the stack grows upwards, the situation is a bit more complicated,
because subtracting this value from the stack pointer would get the address
one past the end of the most recent dynamic alloca.

Although for most targets
:ref:`llvm.get.dynamic.area.offset <int_get_dynamic_area_offset>` returns
just a zero, for others, such as PowerPC and PowerPC64, it returns a
compile-time-known constant value.

The return value type of
:ref:`llvm.get.dynamic.area.offset <int_get_dynamic_area_offset>` must match
the target's default address space's (address space 0) pointer type.
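
A sketch of the combination with ``llvm.stacksave`` described above,
assuming a 64-bit target whose stack grows downwards. The function name is
hypothetical.

.. code-block:: llvm

   declare i8* @llvm.stacksave()
   declare i64 @llvm.get.dynamic.area.offset.i64()

   define i64 @most_recent_dynamic_alloca() {
   entry:
     %sp = call i8* @llvm.stacksave()
     %sp.int = ptrtoint i8* %sp to i64
     %off = call i64 @llvm.get.dynamic.area.offset.i64()
     ; Downward-growing stack: stack pointer plus offset.
     %addr = add i64 %sp.int, %off
     ret i64 %addr
   }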

'``llvm.prefetch``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

   declare void @llvm.prefetch(i8* <address>, i32 <rw>, i32 <locality>, i32 <cache type>)

Overview:
"""""""""

The '``llvm.prefetch``' intrinsic is a hint to the code generator to
insert a prefetch instruction if supported; otherwise, it is a no-op.
Prefetches have no effect on the behavior of the program but can change
its performance characteristics.

Arguments:
""""""""""

``address`` is the address to be prefetched, ``rw`` is the specifier
determining if the fetch should be for a read (0) or write (1), and
``locality`` is a temporal locality specifier ranging from (0), no
locality, to (3), extremely local (keep in cache). The ``cache type``
specifies whether the prefetch is performed on the data (1) or
instruction (0) cache. The ``rw``, ``locality`` and ``cache type``
arguments must be constant integers.

Semantics:
""""""""""

This intrinsic does not modify the behavior of the program. In
particular, prefetches cannot trap and do not produce a value. On
targets that support this intrinsic, the prefetch can provide hints to
the processor cache for better performance.
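
For illustration, the call below requests a read prefetch into the data
cache with the highest locality. The function ``@warm`` is hypothetical.

.. code-block:: llvm

   declare void @llvm.prefetch(i8*, i32, i32, i32)

   define void @warm(i8* %p) {
   entry:
     ; rw = 0 (read), locality = 3 (keep in cache), cache type = 1 (data)
     call void @llvm.prefetch(i8* %p, i32 0, i32 3, i32 1)
     ret void
   }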

'``llvm.pcmarker``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

   declare void @llvm.pcmarker(i32 <id>)

Overview:
"""""""""

The '``llvm.pcmarker``' intrinsic is a method to export a Program
Counter (PC) in a region of code to simulators and other tools. The
method is target specific, but it is expected that the marker will use
exported symbols to transmit the PC of the marker. The marker makes no
guarantees that it will remain with any specific instruction after
optimizations. It is possible that the presence of a marker will inhibit
optimizations. The intended use is to be inserted after optimizations to
allow correlations of simulation runs.

Arguments:
""""""""""

``id`` is a numerical id identifying the marker.

Semantics:
""""""""""

This intrinsic does not modify the behavior of the program. Backends
that do not support this intrinsic may ignore it.

'``llvm.readcyclecounter``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

   declare i64 @llvm.readcyclecounter()

Overview:
"""""""""

The '``llvm.readcyclecounter``' intrinsic provides access to the cycle
counter register (or similar low latency, high accuracy clocks) on those
targets that support it. On X86, it should map to RDTSC. On Alpha, it
should map to RPCC. As the backing counters overflow quickly (on the
order of 9 seconds on Alpha), this should only be used for small
timings.

Semantics:
""""""""""

When directly supported, reading the cycle counter should not modify any
memory. Implementations are allowed to either return an
application-specific value or a system-wide value. On backends without
support, this is lowered to a constant 0.

Note that runtime support may be conditional on the privilege level the
code is running at and on the host platform.
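
A sketch of timing a short region with ``llvm.readcyclecounter``. The
functions ``@time_work`` and ``@work`` are hypothetical, and the result is
only meaningful on targets that support the counter.

.. code-block:: llvm

   declare i64 @llvm.readcyclecounter()
   declare void @work()

   define i64 @time_work() {
   entry:
     %start = call i64 @llvm.readcyclecounter()
     call void @work()
     %end = call i64 @llvm.readcyclecounter()
     ; Elapsed ticks; both reads are 0 on backends without support.
     %elapsed = sub i64 %end, %start
     ret i64 %elapsed
   }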

'``llvm.clear_cache``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

   declare void @llvm.clear_cache(i8*, i8*)

Overview:
"""""""""

The '``llvm.clear_cache``' intrinsic ensures visibility of modifications
in the specified range to the execution unit of the processor. On
targets with non-unified instruction and data cache, the implementation
flushes the instruction cache.

Semantics:
""""""""""

On platforms with coherent instruction and data caches (e.g. x86), this
intrinsic is a nop. On platforms with non-coherent instruction and data
cache (e.g. ARM, MIPS), the intrinsic is lowered either to appropriate
instructions or a system call, if cache flushing requires special
privileges.

The default behavior is to emit a call to ``__clear_cache`` from the
runtime library.

This intrinsic does *not* empty the instruction pipeline. Modifications
of the current function are outside the scope of the intrinsic.
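
A sketch of a call made after writing freshly generated code into a
buffer. The function ``@publish_code`` and the parameter names ``%begin``
and ``%end`` are illustrative; the two pointers are assumed to delimit the
modified range, as with ``__clear_cache``.

.. code-block:: llvm

   declare void @llvm.clear_cache(i8*, i8*)

   define void @publish_code(i8* %begin, i8* %end) {
   entry:
     ; Make the bytes in the range visible to instruction fetch.
     call void @llvm.clear_cache(i8* %begin, i8* %end)
     ret void
   }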

'``llvm.instrprof.increment``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

   declare void @llvm.instrprof.increment(i8* <name>, i64 <hash>,
                                          i32 <num-counters>, i32 <index>)

Overview:
"""""""""

The '``llvm.instrprof.increment``' intrinsic can be emitted by a
frontend for use with instrumentation based profiling. These will be
lowered by the ``-instrprof`` pass to generate execution counts of a
program at runtime.

Arguments:
""""""""""

The first argument is a pointer to a global variable containing the
name of the entity being instrumented. This should generally be the
(mangled) function name for a set of counters.

The second argument is a hash value that can be used by the consumer
of the profile data to detect changes to the instrumented source, and
the third is the number of counters associated with ``name``. It is an
error if ``hash`` or ``num-counters`` differ between two instances of
``instrprof.increment`` that refer to the same name.

The last argument refers to which of the counters for ``name`` should
be incremented. It should be a value of at least 0 and less than
``num-counters``.

Semantics:
""""""""""

This intrinsic represents an increment of a profiling counter. It will
cause the ``-instrprof`` pass to generate the appropriate data
structures and the code to increment the appropriate value, in a
format that can be written out by a compiler runtime and consumed via
the ``llvm-profdata`` tool.
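
A sketch of what a frontend might emit for a function with a single
counter. The global ``@__profn_foo``, the function ``@foo``, and the hash
value 0 are placeholders chosen for illustration.

.. code-block:: llvm

   @__profn_foo = private constant [3 x i8] c"foo"

   declare void @llvm.instrprof.increment(i8*, i64, i32, i32)

   define void @foo() {
   entry:
     ; name = "foo", hash = 0, one counter in total, increment counter 0
     call void @llvm.instrprof.increment(
         i8* getelementptr inbounds ([3 x i8], [3 x i8]* @__profn_foo,
                                     i32 0, i32 0),
         i64 0, i32 1, i32 0)
     ret void
   }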

'``llvm.instrprof.increment.step``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

   declare void @llvm.instrprof.increment.step(i8* <name>, i64 <hash>,
                                               i32 <num-counters>,
                                               i32 <index>, i64 <step>)

Overview:
"""""""""

The '``llvm.instrprof.increment.step``' intrinsic is an extension to
the '``llvm.instrprof.increment``' intrinsic with an additional fifth
argument to specify the step of the increment.

Arguments:
""""""""""

The first four arguments are the same as for the
'``llvm.instrprof.increment``' intrinsic.

The last argument specifies the value of the increment of the counter variable.

Semantics:
""""""""""

See the description of the '``llvm.instrprof.increment``' intrinsic.


'``llvm.instrprof.value.profile``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

   declare void @llvm.instrprof.value.profile(i8* <name>, i64 <hash>,
                                              i64 <value>, i32 <value_kind>,
                                              i32 <index>)

Overview:
"""""""""

The '``llvm.instrprof.value.profile``' intrinsic can be emitted by a
frontend for use with instrumentation based profiling. This will be
lowered by the ``-instrprof`` pass to find out the target values that
instrumented expressions take in a program at runtime.

Arguments:
""""""""""

The first argument is a pointer to a global variable containing the
name of the entity being instrumented. ``name`` should generally be the
(mangled) function name for a set of counters.

The second argument is a hash value that can be used by the consumer
of the profile data to detect changes to the instrumented source. It
is an error if ``hash`` differs between two instances of
``llvm.instrprof.*`` that refer to the same name.

The third argument is the value of the expression being profiled. The profiled
expression's value should be representable as an unsigned 64-bit value. The
fourth argument represents the kind of value profiling that is being done. The
supported value profiling kinds are enumerated through the
``InstrProfValueKind`` type declared in the
``<include/llvm/ProfileData/InstrProf.h>`` header file. The last argument is the
index of the instrumented expression within ``name``. It should be >= 0.

Semantics:
""""""""""

This intrinsic represents the point where a call to a runtime routine
should be inserted for value profiling of target expressions. The
``-instrprof`` pass will generate the appropriate data structures and
replace the ``llvm.instrprof.value.profile`` intrinsic with a call to the
profile runtime library with the proper arguments.

'``llvm.thread.pointer``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

   declare i8* @llvm.thread.pointer()

Overview:
"""""""""

The '``llvm.thread.pointer``' intrinsic returns the value of the thread
pointer.

Semantics:
""""""""""

The '``llvm.thread.pointer``' intrinsic returns a pointer to the TLS area
for the current thread. The exact semantics of this value are target
specific: it may point to the start of TLS area, to the end, or somewhere
in the middle. Depending on the target, this intrinsic may read a register,
call a helper function, read from an alternate memory space, or perform
other operations necessary to locate the TLS area. Not all targets support
this intrinsic.

Standard C Library Intrinsics
-----------------------------

LLVM provides intrinsics for a few important standard C library
functions. These intrinsics allow source-language front-ends to pass
information about the alignment of the pointer arguments to the code
generator, providing opportunity for more efficient code generation.

.. _int_memcpy:

'``llvm.memcpy``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.memcpy`` on any
integer bit width and for different address spaces. Not all targets
support all bit widths however.

::

   declare void @llvm.memcpy.p0i8.p0i8.i32(i8* <dest>, i8* <src>,
                                           i32 <len>, i1 <isvolatile>)
   declare void @llvm.memcpy.p0i8.p0i8.i64(i8* <dest>, i8* <src>,
                                           i64 <len>, i1 <isvolatile>)

Overview:
"""""""""

The '``llvm.memcpy.*``' intrinsics copy a block of memory from the
source location to the destination location.

Note that, unlike the standard libc function, the ``llvm.memcpy.*``
intrinsics do not return a value, take an extra ``isvolatile``
argument, and the pointers can be in specified address spaces.

Arguments:
""""""""""

The first argument is a pointer to the destination, the second is a
pointer to the source. The third argument is an integer argument
specifying the number of bytes to copy, and the fourth is a
boolean indicating a volatile access.

The :ref:`align <attr_align>` parameter attribute can be provided
for the first and second arguments.

If the ``isvolatile`` parameter is ``true``, the ``llvm.memcpy`` call is
a :ref:`volatile operation <volatile>`. The detailed access behavior is not
very cleanly specified and it is unwise to depend on it.

Semantics:
""""""""""

The '``llvm.memcpy.*``' intrinsics copy a block of memory from the
source location to the destination location, which are not allowed to
overlap. It copies "len" bytes of memory over. If the argument is known
to be aligned to some boundary, this can be specified as an attribute on
the argument.
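
A sketch of a non-volatile 16-byte copy in which both pointers are known to
be 8-byte aligned, using the ``align`` parameter attribute described under
Arguments. The function ``@copy16`` is hypothetical.

.. code-block:: llvm

   declare void @llvm.memcpy.p0i8.p0i8.i64(i8*, i8*, i64, i1)

   define void @copy16(i8* %dst, i8* %src) {
   entry:
     ; dest, src, length in bytes, isvolatile
     call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 %dst,
                                          i8* align 8 %src,
                                          i64 16, i1 false)
     ret void
   }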

.. _int_memmove:

'``llvm.memmove``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use llvm.memmove on any integer
bit width and for different address spaces. However, not all targets
support all bit widths.

::

      declare void @llvm.memmove.p0i8.p0i8.i32(i8* <dest>, i8* <src>,
                                               i32 <len>, i1 <isvolatile>)
      declare void @llvm.memmove.p0i8.p0i8.i64(i8* <dest>, i8* <src>,
                                               i64 <len>, i1 <isvolatile>)

Overview:
"""""""""

The '``llvm.memmove.*``' intrinsics move a block of memory from the
source location to the destination location. It is similar to the
'``llvm.memcpy``' intrinsic but allows the two memory locations to
overlap.

Note that, unlike the standard libc function, the ``llvm.memmove.*``
intrinsics do not return a value, take an extra ``isvolatile``
argument, and the pointers can be in specified address spaces.

Arguments:
""""""""""

The first argument is a pointer to the destination, the second is a
pointer to the source. The third argument is an integer argument
specifying the number of bytes to copy, and the fourth is a
boolean indicating a volatile access.

The :ref:`align <attr_align>` parameter attribute can be provided
for the first and second arguments.

If the ``isvolatile`` parameter is ``true``, the ``llvm.memmove`` call
is a :ref:`volatile operation <volatile>`. The detailed access behavior is
not very cleanly specified and it is unwise to depend on it.

Semantics:
""""""""""

The '``llvm.memmove.*``' intrinsics copy a block of memory from the
source location to the destination location, which may overlap. It
copies "len" bytes of memory over. If the argument is known to be
aligned to some boundary, this can be specified as an attribute on
the argument.
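
As an illustration, the following sketch moves the first 12 bytes of a
buffer up by four bytes, so the source and destination ranges overlap.
The function name ``@shift_up_by_4`` and the assumption that ``%buf``
points to at least 16 bytes are made up for this example; the call form
follows the declarations shown above.

.. code-block:: llvm

      declare void @llvm.memmove.p0i8.p0i8.i64(i8*, i8*, i64, i1)

      ; Hypothetical helper: %buf is assumed to point to at least 16 bytes.
      define void @shift_up_by_4(i8* %buf) {
        ; The destination starts 4 bytes into the same buffer, so the byte
        ; ranges [0, 12) and [4, 16) overlap. memmove permits this.
        %dst = getelementptr inbounds i8, i8* %buf, i64 4
        call void @llvm.memmove.p0i8.p0i8.i64(i8* align 1 %dst, i8* align 1 %buf,
                                              i64 12, i1 false)
        ret void
      }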

.. _int_memset:

'``llvm.memset.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use llvm.memset on any integer
bit width and for different address spaces. However, not all targets
support all bit widths.

::

      declare void @llvm.memset.p0i8.i32(i8* <dest>, i8 <val>,
                                         i32 <len>, i1 <isvolatile>)
      declare void @llvm.memset.p0i8.i64(i8* <dest>, i8 <val>,
                                         i64 <len>, i1 <isvolatile>)

Overview:
"""""""""

The '``llvm.memset.*``' intrinsics fill a block of memory with a
particular byte value.

Note that, unlike the standard libc function, the ``llvm.memset``
intrinsic does not return a value and takes an extra ``isvolatile``
argument. Also, the destination can be in an arbitrary address space.

Arguments:
""""""""""

The first argument is a pointer to the destination to fill, the second
is the byte value with which to fill it, the third argument is an
integer argument specifying the number of bytes to fill, and the fourth
is a boolean indicating a volatile access.

The :ref:`align <attr_align>` parameter attribute can be provided
for the first argument.

If the ``isvolatile`` parameter is ``true``, the ``llvm.memset`` call is
a :ref:`volatile operation <volatile>`. The detailed access behavior is not
very cleanly specified and it is unwise to depend on it.

Semantics:
""""""""""

The '``llvm.memset.*``' intrinsics fill "len" bytes of memory starting
at the destination location.
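
A minimal sketch of a call, assuming a hypothetical function
``@zero_64_bytes`` whose argument points to at least 64 bytes aligned to
8 bytes. The alignment is conveyed with the ``align`` parameter attribute
on the destination pointer.

.. code-block:: llvm

      declare void @llvm.memset.p0i8.i32(i8*, i8, i32, i1)

      ; Hypothetical helper: fills 64 bytes at %p with the byte value 0.
      define void @zero_64_bytes(i8* %p) {
        ; Arguments: destination, fill byte, length in bytes, isvolatile.
        call void @llvm.memset.p0i8.i32(i8* align 8 %p, i8 0, i32 64, i1 false)
        ret void
      }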

'``llvm.sqrt.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.sqrt`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float @llvm.sqrt.f32(float %Val)
      declare double @llvm.sqrt.f64(double %Val)
      declare x86_fp80 @llvm.sqrt.f80(x86_fp80 %Val)
      declare fp128 @llvm.sqrt.f128(fp128 %Val)
      declare ppc_fp128 @llvm.sqrt.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.sqrt``' intrinsics return the square root of the specified value.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``sqrt``' function but without
trapping or setting ``errno``. For types specified by IEEE-754, the result
matches a conforming libm implementation.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.
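
The sketch below shows one scalar call carrying the 'afn' flag and one
call on a vector type. The function names ``@sqrt_approx`` and
``@sqrt_lanes`` are hypothetical, and the ``v4f32`` suffix is the
overload for ``<4 x float>``.

.. code-block:: llvm

      declare float @llvm.sqrt.f32(float)
      declare <4 x float> @llvm.sqrt.v4f32(<4 x float>)

      ; Hypothetical helper: 'afn' allows an approximate result here.
      define float @sqrt_approx(float %x) {
        %r = call afn float @llvm.sqrt.f32(float %x)
        ret float %r
      }

      ; Hypothetical helper: the square root is taken per vector element.
      define <4 x float> @sqrt_lanes(<4 x float> %v) {
        %r = call <4 x float> @llvm.sqrt.v4f32(<4 x float> %v)
        ret <4 x float> %r
      }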

'``llvm.powi.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.powi`` on any
floating point or vector of floating point type. Not all targets support
all types however.

::

      declare float @llvm.powi.f32(float %Val, i32 %power)
      declare double @llvm.powi.f64(double %Val, i32 %power)
      declare x86_fp80 @llvm.powi.f80(x86_fp80 %Val, i32 %power)
      declare fp128 @llvm.powi.f128(fp128 %Val, i32 %power)
      declare ppc_fp128 @llvm.powi.ppcf128(ppc_fp128 %Val, i32 %power)

Overview:
"""""""""

The '``llvm.powi.*``' intrinsics return the first operand raised to the
specified (positive or negative) power. The order of evaluation of
multiplications is not defined. When a vector of floating point type is
used, the second argument remains a scalar integer value.

Arguments:
""""""""""

The second argument is an integer power, and the first is a value to
raise to that power.

Semantics:
""""""""""

This function returns the first value raised to the second power with an
unspecified sequence of rounding operations.
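
As a sketch, the calls below raise a scalar to the third power and each
element of a vector to the power -2. The function names are hypothetical,
and the intrinsic name suffixes are assumed to follow the declarations
listed above; note that the power stays a scalar ``i32`` in the vector
case.

.. code-block:: llvm

      declare double @llvm.powi.f64(double, i32)
      declare <2 x double> @llvm.powi.v2f64(<2 x double>, i32)

      ; Hypothetical helper: x raised to the power 3.
      define double @cube(double %x) {
        %r = call double @llvm.powi.f64(double %x, i32 3)
        ret double %r
      }

      ; Hypothetical helper: each element raised to the power -2.
      define <2 x double> @inverse_square(<2 x double> %v) {
        %r = call <2 x double> @llvm.powi.v2f64(<2 x double> %v, i32 -2)
        ret <2 x double> %r
      }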

'``llvm.sin.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.sin`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float @llvm.sin.f32(float %Val)
      declare double @llvm.sin.f64(double %Val)
      declare x86_fp80 @llvm.sin.f80(x86_fp80 %Val)
      declare fp128 @llvm.sin.f128(fp128 %Val)
      declare ppc_fp128 @llvm.sin.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.sin.*``' intrinsics return the sine of the operand.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``sin``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.cos.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.cos`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float @llvm.cos.f32(float %Val)
      declare double @llvm.cos.f64(double %Val)
      declare x86_fp80 @llvm.cos.f80(x86_fp80 %Val)
      declare fp128 @llvm.cos.f128(fp128 %Val)
      declare ppc_fp128 @llvm.cos.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.cos.*``' intrinsics return the cosine of the operand.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``cos``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.pow.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.pow`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float @llvm.pow.f32(float %Val, float %Power)
      declare double @llvm.pow.f64(double %Val, double %Power)
      declare x86_fp80 @llvm.pow.f80(x86_fp80 %Val, x86_fp80 %Power)
      declare fp128 @llvm.pow.f128(fp128 %Val, fp128 %Power)
      declare ppc_fp128 @llvm.pow.ppcf128(ppc_fp128 %Val, ppc_fp128 %Power)

Overview:
"""""""""

The '``llvm.pow.*``' intrinsics return the first operand raised to the
specified (positive or negative) power.

Arguments:
""""""""""

The arguments and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``pow``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.exp.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.exp`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float @llvm.exp.f32(float %Val)
      declare double @llvm.exp.f64(double %Val)
      declare x86_fp80 @llvm.exp.f80(x86_fp80 %Val)
      declare fp128 @llvm.exp.f128(fp128 %Val)
      declare ppc_fp128 @llvm.exp.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.exp.*``' intrinsics compute the base-e exponential of the specified
value.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``exp``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.exp2.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.exp2`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float @llvm.exp2.f32(float %Val)
      declare double @llvm.exp2.f64(double %Val)
      declare x86_fp80 @llvm.exp2.f80(x86_fp80 %Val)
      declare fp128 @llvm.exp2.f128(fp128 %Val)
      declare ppc_fp128 @llvm.exp2.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.exp2.*``' intrinsics compute the base-2 exponential of the
specified value.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``exp2``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.log.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.log`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float @llvm.log.f32(float %Val)
      declare double @llvm.log.f64(double %Val)
      declare x86_fp80 @llvm.log.f80(x86_fp80 %Val)
      declare fp128 @llvm.log.f128(fp128 %Val)
      declare ppc_fp128 @llvm.log.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.log.*``' intrinsics compute the base-e logarithm of the specified
value.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``log``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.log10.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.log10`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float @llvm.log10.f32(float %Val)
      declare double @llvm.log10.f64(double %Val)
      declare x86_fp80 @llvm.log10.f80(x86_fp80 %Val)
      declare fp128 @llvm.log10.f128(fp128 %Val)
      declare ppc_fp128 @llvm.log10.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.log10.*``' intrinsics compute the base-10 logarithm of the
specified value.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``log10``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.log2.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.log2`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float @llvm.log2.f32(float %Val)
      declare double @llvm.log2.f64(double %Val)
      declare x86_fp80 @llvm.log2.f80(x86_fp80 %Val)
      declare fp128 @llvm.log2.f128(fp128 %Val)
      declare ppc_fp128 @llvm.log2.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.log2.*``' intrinsics compute the base-2 logarithm of the specified
value.

Arguments:
""""""""""

The argument and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``log2``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.

'``llvm.fma.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.fma`` on any
floating-point or vector of floating-point type. Not all targets support
all types however.

::

      declare float @llvm.fma.f32(float %a, float %b, float %c)
      declare double @llvm.fma.f64(double %a, double %b, double %c)
      declare x86_fp80 @llvm.fma.f80(x86_fp80 %a, x86_fp80 %b, x86_fp80 %c)
      declare fp128 @llvm.fma.f128(fp128 %a, fp128 %b, fp128 %c)
      declare ppc_fp128 @llvm.fma.ppcf128(ppc_fp128 %a, ppc_fp128 %b, ppc_fp128 %c)

Overview:
"""""""""

The '``llvm.fma.*``' intrinsics perform the fused multiply-add operation.

Arguments:
""""""""""

The arguments and return value are floating-point numbers of the same type.

Semantics:
""""""""""

Return the same value as a corresponding libm '``fma``' function but without
trapping or setting ``errno``.

When specified with the fast-math-flag 'afn', the result may be approximated
using a less accurate calculation.
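
For illustration, a hypothetical function ``@axpy`` that computes
``a * x + y`` through the intrinsic. As with the libm ``fma`` function,
the product is not rounded separately before the addition.

.. code-block:: llvm

      declare double @llvm.fma.f64(double, double, double)

      ; Hypothetical helper: returns (%a * %x) + %y as one fused operation.
      define double @axpy(double %a, double %x, double %y) {
        %r = call double @llvm.fma.f64(double %a, double %x, double %y)
        ret double %r
      }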

'``llvm.fabs.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.fabs`` on any
floating point or vector of floating point type. Not all targets support
all types however.

::

      declare float @llvm.fabs.f32(float %Val)
      declare double @llvm.fabs.f64(double %Val)
      declare x86_fp80 @llvm.fabs.f80(x86_fp80 %Val)
      declare fp128 @llvm.fabs.f128(fp128 %Val)
      declare ppc_fp128 @llvm.fabs.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.fabs.*``' intrinsics return the absolute value of the
operand.

Arguments:
""""""""""

The argument and return value are floating point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``fabs`` functions
would, and handles error conditions in the same way.

'``llvm.minnum.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.minnum`` on any
floating point or vector of floating point type. Not all targets support
all types however.

::

      declare float @llvm.minnum.f32(float %Val0, float %Val1)
      declare double @llvm.minnum.f64(double %Val0, double %Val1)
      declare x86_fp80 @llvm.minnum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
      declare fp128 @llvm.minnum.f128(fp128 %Val0, fp128 %Val1)
      declare ppc_fp128 @llvm.minnum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)

Overview:
"""""""""

The '``llvm.minnum.*``' intrinsics return the minimum of the two
arguments.


Arguments:
""""""""""

The arguments and return value are floating point numbers of the same
type.

Semantics:
""""""""""

Follows the IEEE-754 semantics for minNum, which also match libm's
fmin.

If either operand is a NaN, returns the other non-NaN operand. Returns
NaN only if both operands are NaN. If the operands compare equal,
returns a value that compares equal to both operands. This means that
fmin(+/-0.0, +/-0.0) could return either -0.0 or 0.0.
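
The NaN rule above can be sketched as follows. The function names are
hypothetical, and ``0x7FF8000000000000`` is used here as a quiet NaN
constant.

.. code-block:: llvm

      declare float @llvm.minnum.f32(float, float)

      ; Hypothetical helper: the smaller of %x and %limit.
      define float @clamp_upper(float %x, float %limit) {
        %r = call float @llvm.minnum.f32(float %x, float %limit)
        ret float %r
      }

      ; Hypothetical helper: one operand is NaN, so by the rule stated
      ; above the other operand (1.0) is the result.
      define float @min_with_nan() {
        %r = call float @llvm.minnum.f32(float 0x7FF8000000000000,
                                         float 1.000000e+00)
        ret float %r
      }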

'``llvm.maxnum.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.maxnum`` on any
floating point or vector of floating point type. Not all targets support
all types however.

::

      declare float @llvm.maxnum.f32(float %Val0, float %Val1)
      declare double @llvm.maxnum.f64(double %Val0, double %Val1)
      declare x86_fp80 @llvm.maxnum.f80(x86_fp80 %Val0, x86_fp80 %Val1)
      declare fp128 @llvm.maxnum.f128(fp128 %Val0, fp128 %Val1)
      declare ppc_fp128 @llvm.maxnum.ppcf128(ppc_fp128 %Val0, ppc_fp128 %Val1)

Overview:
"""""""""

The '``llvm.maxnum.*``' intrinsics return the maximum of the two
arguments.


Arguments:
""""""""""

The arguments and return value are floating point numbers of the same
type.

Semantics:
""""""""""

Follows the IEEE-754 semantics for maxNum, which also match libm's
fmax.

If either operand is a NaN, returns the other non-NaN operand. Returns
NaN only if both operands are NaN. If the operands compare equal,
returns a value that compares equal to both operands. This means that
fmax(+/-0.0, +/-0.0) could return either -0.0 or 0.0.

'``llvm.copysign.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.copysign`` on any
floating point or vector of floating point type. Not all targets support
all types however.

::

      declare float @llvm.copysign.f32(float %Mag, float %Sgn)
      declare double @llvm.copysign.f64(double %Mag, double %Sgn)
      declare x86_fp80 @llvm.copysign.f80(x86_fp80 %Mag, x86_fp80 %Sgn)
      declare fp128 @llvm.copysign.f128(fp128 %Mag, fp128 %Sgn)
      declare ppc_fp128 @llvm.copysign.ppcf128(ppc_fp128 %Mag, ppc_fp128 %Sgn)

Overview:
"""""""""

The '``llvm.copysign.*``' intrinsics return a value with the magnitude of the
first operand and the sign of the second operand.

Arguments:
""""""""""

The arguments and return value are floating point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``copysign``
functions would, and handles error conditions in the same way.
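
A small sketch, using a hypothetical function ``@negative_abs`` that
forces the sign of its argument to negative while keeping the magnitude.

.. code-block:: llvm

      declare double @llvm.copysign.f64(double, double)

      ; Hypothetical helper: magnitude of %x combined with the sign of
      ; -1.0, so the result is -|x|.
      define double @negative_abs(double %x) {
        %r = call double @llvm.copysign.f64(double %x, double -1.000000e+00)
        ret double %r
      }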

'``llvm.floor.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.floor`` on any
floating point or vector of floating point type. Not all targets support
all types however.

::

      declare float @llvm.floor.f32(float %Val)
      declare double @llvm.floor.f64(double %Val)
      declare x86_fp80 @llvm.floor.f80(x86_fp80 %Val)
      declare fp128 @llvm.floor.f128(fp128 %Val)
      declare ppc_fp128 @llvm.floor.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.floor.*``' intrinsics return the floor of the operand.

Arguments:
""""""""""

The argument and return value are floating point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``floor`` functions
would, and handles error conditions in the same way.

'``llvm.ceil.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.ceil`` on any
floating point or vector of floating point type. Not all targets support
all types however.

::

      declare float @llvm.ceil.f32(float %Val)
      declare double @llvm.ceil.f64(double %Val)
      declare x86_fp80 @llvm.ceil.f80(x86_fp80 %Val)
      declare fp128 @llvm.ceil.f128(fp128 %Val)
      declare ppc_fp128 @llvm.ceil.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.ceil.*``' intrinsics return the ceiling of the operand.

Arguments:
""""""""""

The argument and return value are floating point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``ceil`` functions
would, and handles error conditions in the same way.

'``llvm.trunc.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.trunc`` on any
floating point or vector of floating point type. Not all targets support
all types however.

::

      declare float @llvm.trunc.f32(float %Val)
      declare double @llvm.trunc.f64(double %Val)
      declare x86_fp80 @llvm.trunc.f80(x86_fp80 %Val)
      declare fp128 @llvm.trunc.f128(fp128 %Val)
      declare ppc_fp128 @llvm.trunc.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.trunc.*``' intrinsics return the operand rounded to the
nearest integer not larger in magnitude than the operand.

Arguments:
""""""""""

The argument and return value are floating point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``trunc`` functions
would, and handles error conditions in the same way.

'``llvm.rint.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.rint`` on any
floating point or vector of floating point type. Not all targets support
all types however.

::

      declare float @llvm.rint.f32(float %Val)
      declare double @llvm.rint.f64(double %Val)
      declare x86_fp80 @llvm.rint.f80(x86_fp80 %Val)
      declare fp128 @llvm.rint.f128(fp128 %Val)
      declare ppc_fp128 @llvm.rint.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.rint.*``' intrinsics return the operand rounded to the
nearest integer. They may raise an inexact floating-point exception if the
operand isn't an integer.

Arguments:
""""""""""

The argument and return value are floating point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``rint`` functions
would, and handles error conditions in the same way.

'``llvm.nearbyint.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.nearbyint`` on any
floating point or vector of floating point type. Not all targets support
all types however.

::

      declare float @llvm.nearbyint.f32(float %Val)
      declare double @llvm.nearbyint.f64(double %Val)
      declare x86_fp80 @llvm.nearbyint.f80(x86_fp80 %Val)
      declare fp128 @llvm.nearbyint.f128(fp128 %Val)
      declare ppc_fp128 @llvm.nearbyint.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.nearbyint.*``' intrinsics return the operand rounded to the
nearest integer.

Arguments:
""""""""""

The argument and return value are floating point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``nearbyint``
functions would, and handles error conditions in the same way.

'``llvm.round.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.round`` on any
floating point or vector of floating point type. Not all targets support
all types however.

::

      declare float @llvm.round.f32(float %Val)
      declare double @llvm.round.f64(double %Val)
      declare x86_fp80 @llvm.round.f80(x86_fp80 %Val)
      declare fp128 @llvm.round.f128(fp128 %Val)
      declare ppc_fp128 @llvm.round.ppcf128(ppc_fp128 %Val)

Overview:
"""""""""

The '``llvm.round.*``' intrinsics return the operand rounded to the
nearest integer.

Arguments:
""""""""""

The argument and return value are floating point numbers of the same
type.

Semantics:
""""""""""

This function returns the same values as the libm ``round``
functions would, and handles error conditions in the same way.
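
To contrast ``llvm.round`` with the ``llvm.floor``, ``llvm.ceil`` and
``llvm.trunc`` intrinsics described earlier, the sketch below applies all
four to one input and stores the results. The function name
``@round_variants`` and the four-element output buffer are assumptions
for this example. Because the results match the libm functions, an input
of -2.5 gives -3.0, -2.0, -2.0 and -3.0 respectively.

.. code-block:: llvm

      declare double @llvm.floor.f64(double)
      declare double @llvm.ceil.f64(double)
      declare double @llvm.trunc.f64(double)
      declare double @llvm.round.f64(double)

      ; Hypothetical helper: %out is assumed to point to four doubles.
      define void @round_variants(double %x, double* %out) {
        %f = call double @llvm.floor.f64(double %x)  ; toward -infinity
        %c = call double @llvm.ceil.f64(double %x)   ; toward +infinity
        %t = call double @llvm.trunc.f64(double %x)  ; toward zero
        %r = call double @llvm.round.f64(double %x)  ; nearest, ties away from zero
        %p1 = getelementptr inbounds double, double* %out, i64 1
        %p2 = getelementptr inbounds double, double* %out, i64 2
        %p3 = getelementptr inbounds double, double* %out, i64 3
        store double %f, double* %out
        store double %c, double* %p1
        store double %t, double* %p2
        store double %r, double* %p3
        ret void
      }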
11302
Bit Manipulation Intrinsics
---------------------------

LLVM provides intrinsics for a few important bit manipulation
operations. These allow efficient code generation for some algorithms.

'``llvm.bitreverse.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic function. You can use bitreverse on any
integer type.
11317
11318::
11319
11320 declare i16 @llvm.bitreverse.i16(i16 <id>)
11321 declare i32 @llvm.bitreverse.i32(i32 <id>)
11322 declare i64 @llvm.bitreverse.i64(i64 <id>)
11323
11324Overview:
11325"""""""""
11326
The '``llvm.bitreverse``' family of intrinsics is used to reverse the
bit pattern of an integer value; for example ``0b10110110`` becomes
``0b01101101``.

Semantics:
""""""""""

The ``llvm.bitreverse.iN`` intrinsic returns an ``iN`` value that has bit
``M`` in the input moved to bit ``N-M-1`` in the output, where bits are
numbered from 0 (the least significant bit).
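
A minimal sketch of the example from the overview, using the ``i8`` overload
(not listed in the declarations above, but permitted since any integer type
may be used). The function name ``@reverse_example`` is hypothetical.

.. code-block:: llvm

      declare i8 @llvm.bitreverse.i8(i8)

      define i8 @reverse_example() {
        ; 182 is 0b10110110; reversing its bits gives 0b01101101, which is 109
        %r = call i8 @llvm.bitreverse.i8(i8 182)
        ret i8 %r
      }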
11336
Sean Silvab084af42012-12-07 10:36:55 +000011337'``llvm.bswap.*``' Intrinsics
11338^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11339
11340Syntax:
11341"""""""
11342
11343This is an overloaded intrinsic function. You can use bswap on any
11344integer type that is an even number of bytes (i.e. BitWidth % 16 == 0).
11345
11346::
11347
11348 declare i16 @llvm.bswap.i16(i16 <id>)
11349 declare i32 @llvm.bswap.i32(i32 <id>)
11350 declare i64 @llvm.bswap.i64(i64 <id>)
11351
11352Overview:
11353"""""""""
11354
11355The '``llvm.bswap``' family of intrinsics is used to byte swap integer
11356values with an even number of bytes (positive multiple of 16 bits).
11357These are useful for performing operations on data that is not in the
11358target's native byte order.
11359
11360Semantics:
11361""""""""""
11362
11363The ``llvm.bswap.i16`` intrinsic returns an i16 value that has the high
11364and low byte of the input i16 swapped. Similarly, the ``llvm.bswap.i32``
11365intrinsic returns an i32 value that has the four bytes of the input i32
11366swapped, so that if the input bytes are numbered 0, 1, 2, 3 then the
11367returned i32 will have its bytes in 3, 2, 1, 0 order. The
11368``llvm.bswap.i48``, ``llvm.bswap.i64`` and other intrinsics extend this
11369concept to additional even-byte lengths (6 bytes, 8 bytes and more,
11370respectively).
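
The byte ordering described above can be sketched with a constant input. The
function name ``@swap_example`` is hypothetical, and the constants are written
in decimal with their hexadecimal values in the comment.

.. code-block:: llvm

      declare i32 @llvm.bswap.i32(i32)

      define i32 @swap_example() {
        ; 305419896 is 0x12345678; the byte-swapped value is 0x78563412,
        ; which is 2018915346
        %r = call i32 @llvm.bswap.i32(i32 305419896)
        ret i32 %r
      }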
11371
11372'``llvm.ctpop.*``' Intrinsic
11373^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11374
11375Syntax:
11376"""""""
11377
11378This is an overloaded intrinsic. You can use llvm.ctpop on any integer
11379bit width, or on any vector with integer elements. Not all targets
11380support all bit widths or vector types, however.
11381
11382::
11383
11384 declare i8 @llvm.ctpop.i8(i8 <src>)
11385 declare i16 @llvm.ctpop.i16(i16 <src>)
11386 declare i32 @llvm.ctpop.i32(i32 <src>)
11387 declare i64 @llvm.ctpop.i64(i64 <src>)
11388 declare i256 @llvm.ctpop.i256(i256 <src>)
11389 declare <2 x i32> @llvm.ctpop.v2i32(<2 x i32> <src>)
11390
11391Overview:
11392"""""""""
11393
11394The '``llvm.ctpop``' family of intrinsics counts the number of bits set
11395in a value.
11396
11397Arguments:
11398""""""""""
11399
11400The only argument is the value to be counted. The argument may be of any
11401integer type, or a vector with integer elements. The return type must
11402match the argument type.
11403
11404Semantics:
11405""""""""""
11406
The '``llvm.ctpop``' intrinsic counts the bits set to 1 in its argument, or
within each element of a vector argument.
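
A short sketch of the scalar and vector forms follows. The function names are
hypothetical, and the ``i8`` overload is assumed to be available on the target.

.. code-block:: llvm

      declare i8 @llvm.ctpop.i8(i8)
      declare <2 x i32> @llvm.ctpop.v2i32(<2 x i32>)

      define i8 @popcount_scalar() {
        ; 182 is 0b10110110, which has five bits set, so the result is 5
        %r = call i8 @llvm.ctpop.i8(i8 182)
        ret i8 %r
      }

      define <2 x i32> @popcount_vector() {
        ; each element is counted separately: 7 has three set bits and 8 has
        ; one, so the result is <i32 3, i32 1>
        %r = call <2 x i32> @llvm.ctpop.v2i32(<2 x i32> <i32 7, i32 8>)
        ret <2 x i32> %r
      }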
11409
11410'``llvm.ctlz.*``' Intrinsic
11411^^^^^^^^^^^^^^^^^^^^^^^^^^^
11412
11413Syntax:
11414"""""""
11415
11416This is an overloaded intrinsic. You can use ``llvm.ctlz`` on any
11417integer bit width, or any vector whose elements are integers. Not all
11418targets support all bit widths or vector types, however.
11419
11420::
11421
      declare i8   @llvm.ctlz.i8  (i8   <src>, i1 <is_zero_undef>)
      declare i16  @llvm.ctlz.i16 (i16  <src>, i1 <is_zero_undef>)
      declare i32  @llvm.ctlz.i32 (i32  <src>, i1 <is_zero_undef>)
      declare i64  @llvm.ctlz.i64 (i64  <src>, i1 <is_zero_undef>)
      declare i256 @llvm.ctlz.i256(i256 <src>, i1 <is_zero_undef>)
      declare <2 x i32> @llvm.ctlz.v2i32(<2 x i32> <src>, i1 <is_zero_undef>)

Overview:
"""""""""

The '``llvm.ctlz``' family of intrinsic functions counts the number of
leading zeros in a variable.
11434
11435Arguments:
11436""""""""""
11437
The first argument is the value to be counted. This argument may be of
any integer type, or a vector with integer element type. The return
type must match the first argument type.

The second argument must be a constant and is a flag that indicates whether
the result is defined when the first argument is zero: if the flag is true,
a zero first argument produces an undefined result; if it is false, the
result is defined. Historically some architectures did not provide a
defined result for zero values as efficiently, and many algorithms are
now predicated on avoiding zero-value inputs.
11447
11448Semantics:
11449""""""""""
11450
11451The '``llvm.ctlz``' intrinsic counts the leading (most significant)
11452zeros in a variable, or within each element of the vector. If
11453``src == 0`` then the result is the size in bits of the type of ``src``
11454if ``is_zero_undef == 0`` and ``undef`` otherwise. For example,
11455``llvm.ctlz(i32 2) = 30``.
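
The effect of the ``is_zero_undef`` flag can be sketched as follows. Both
function names are hypothetical; the second assumes its caller guarantees a
nonzero argument.

.. code-block:: llvm

      declare i32 @llvm.ctlz.i32(i32, i1)

      define i32 @leading_zeros(i32 %x) {
        ; is_zero_undef is false, so %x == 0 gives the defined result 32
        %n = call i32 @llvm.ctlz.i32(i32 %x, i1 false)
        ret i32 %n
      }

      define i32 @leading_zeros_nonzero(i32 %x) {
        ; is_zero_undef is true, so the result is undef if %x == 0; the
        ; caller is assumed to pass a nonzero %x
        %n = call i32 @llvm.ctlz.i32(i32 %x, i1 true)
        ret i32 %n
      }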
11456
11457'``llvm.cttz.*``' Intrinsic
11458^^^^^^^^^^^^^^^^^^^^^^^^^^^
11459
11460Syntax:
11461"""""""
11462
11463This is an overloaded intrinsic. You can use ``llvm.cttz`` on any
11464integer bit width, or any vector of integer elements. Not all targets
11465support all bit widths or vector types, however.
11466
11467::
11468
      declare i8   @llvm.cttz.i8  (i8   <src>, i1 <is_zero_undef>)
      declare i16  @llvm.cttz.i16 (i16  <src>, i1 <is_zero_undef>)
      declare i32  @llvm.cttz.i32 (i32  <src>, i1 <is_zero_undef>)
      declare i64  @llvm.cttz.i64 (i64  <src>, i1 <is_zero_undef>)
      declare i256 @llvm.cttz.i256(i256 <src>, i1 <is_zero_undef>)
      declare <2 x i32> @llvm.cttz.v2i32(<2 x i32> <src>, i1 <is_zero_undef>)

Overview:
"""""""""

The '``llvm.cttz``' family of intrinsic functions counts the number of
trailing zeros.
11481
11482Arguments:
11483""""""""""
11484
The first argument is the value to be counted. This argument may be of
any integer type, or a vector with integer element type. The return
type must match the first argument type.

The second argument must be a constant and is a flag that indicates whether
the result is defined when the first argument is zero: if the flag is true,
a zero first argument produces an undefined result; if it is false, the
result is defined. Historically some architectures did not provide a
defined result for zero values as efficiently, and many algorithms are
now predicated on avoiding zero-value inputs.
11494
11495Semantics:
11496""""""""""
11497
11498The '``llvm.cttz``' intrinsic counts the trailing (least significant)
11499zeros in a variable, or within each element of a vector. If ``src == 0``
11500then the result is the size in bits of the type of ``src`` if
11501``is_zero_undef == 0`` and ``undef`` otherwise. For example,
11502``llvm.cttz(2) = 1``.
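
A small sketch with constant inputs, including the zero case with
``is_zero_undef`` set to false. The function names are hypothetical.

.. code-block:: llvm

      declare i32 @llvm.cttz.i32(i32, i1)

      define i32 @trailing_zeros_example() {
        ; 40 is 0b101000, which has three trailing zeros, so the result is 3
        %n = call i32 @llvm.cttz.i32(i32 40, i1 false)
        ret i32 %n
      }

      define i32 @trailing_zeros_of_zero() {
        ; with is_zero_undef false, a zero input gives the bit width, 32
        %n = call i32 @llvm.cttz.i32(i32 0, i1 false)
        ret i32 %n
      }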
11503
.. _int_overflow:

Arithmetic with Overflow Intrinsics
-----------------------------------

LLVM provides intrinsics for fast arithmetic overflow checking.
11510
11511Each of these intrinsics returns a two-element struct. The first
11512element of this struct contains the result of the corresponding
11513arithmetic operation modulo 2\ :sup:`n`\ , where n is the bit width of
11514the result. Therefore, for example, the first element of the struct
11515returned by ``llvm.sadd.with.overflow.i32`` is always the same as the
11516result of a 32-bit ``add`` instruction with the same operands, where
11517the ``add`` is *not* modified by an ``nsw`` or ``nuw`` flag.
11518
11519The second element of the result is an ``i1`` that is 1 if the
11520arithmetic operation overflowed and 0 otherwise. An operation
11521overflows if, for any values of its operands ``A`` and ``B`` and for
11522any ``N`` larger than the operands' width, ``ext(A op B) to iN`` is
11523not equal to ``(ext(A) to iN) op (ext(B) to iN)`` where ``ext`` is
11524``sext`` for signed overflow and ``zext`` for unsigned overflow, and
11525``op`` is the underlying arithmetic operation.
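
To make the signed and unsigned definitions concrete, the sketch below adds
the same ``i8`` operands both ways. The function names are hypothetical, and
the ``i8`` overloads are used only to keep the numbers small.

.. code-block:: llvm

      declare {i8, i1} @llvm.sadd.with.overflow.i8(i8, i8)
      declare {i8, i1} @llvm.uadd.with.overflow.i8(i8, i8)

      define {i8, i1} @signed_case() {
        ; 127 + 1 = 128 does not fit in a signed i8, so the result is
        ; {i8 -128, i1 true}
        %s = call {i8, i1} @llvm.sadd.with.overflow.i8(i8 127, i8 1)
        ret {i8, i1} %s
      }

      define {i8, i1} @unsigned_case() {
        ; 128 fits in an unsigned i8, so the result has the same first
        ; element but no overflow: {i8 -128, i1 false}
        %u = call {i8, i1} @llvm.uadd.with.overflow.i8(i8 127, i8 1)
        ret {i8, i1} %u
      }

In both cases the first element holds the same bits as a plain ``add``; only
the overflow bit depends on the signedness of the intrinsic.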
11526
The behavior of these intrinsics is well-defined for all argument
values.

'``llvm.sadd.with.overflow.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11532
11533Syntax:
11534"""""""
11535
11536This is an overloaded intrinsic. You can use ``llvm.sadd.with.overflow``
11537on any integer bit width.
11538
11539::
11540
11541 declare {i16, i1} @llvm.sadd.with.overflow.i16(i16 %a, i16 %b)
11542 declare {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
11543 declare {i64, i1} @llvm.sadd.with.overflow.i64(i64 %a, i64 %b)
11544
11545Overview:
11546"""""""""
11547
11548The '``llvm.sadd.with.overflow``' family of intrinsic functions perform
11549a signed addition of the two arguments, and indicate whether an overflow
11550occurred during the signed summation.
11551
11552Arguments:
11553""""""""""
11554
11555The arguments (%a and %b) and the first element of the result structure
11556may be of integer types of any bit width, but they must have the same
11557bit width. The second element of the result structure must be of type
11558``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
11559addition.
11560
11561Semantics:
11562""""""""""
11563
The '``llvm.sadd.with.overflow``' family of intrinsic functions performs
a signed addition of the two arguments. Each returns a structure whose
first element is the signed sum and whose second element is a bit
specifying whether the signed addition overflowed.
11569
11570Examples:
11571"""""""""
11572
11573.. code-block:: llvm
11574
11575 %res = call {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
11576 %sum = extractvalue {i32, i1} %res, 0
11577 %obit = extractvalue {i32, i1} %res, 1
11578 br i1 %obit, label %overflow, label %normal
11579
11580'``llvm.uadd.with.overflow.*``' Intrinsics
11581^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11582
11583Syntax:
11584"""""""
11585
11586This is an overloaded intrinsic. You can use ``llvm.uadd.with.overflow``
11587on any integer bit width.
11588
11589::
11590
11591 declare {i16, i1} @llvm.uadd.with.overflow.i16(i16 %a, i16 %b)
11592 declare {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b)
11593 declare {i64, i1} @llvm.uadd.with.overflow.i64(i64 %a, i64 %b)
11594
11595Overview:
11596"""""""""
11597
11598The '``llvm.uadd.with.overflow``' family of intrinsic functions perform
11599an unsigned addition of the two arguments, and indicate whether a carry
11600occurred during the unsigned summation.
11601
11602Arguments:
11603""""""""""
11604
11605The arguments (%a and %b) and the first element of the result structure
11606may be of integer types of any bit width, but they must have the same
11607bit width. The second element of the result structure must be of type
11608``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
11609addition.
11610
11611Semantics:
11612""""""""""
11613
The '``llvm.uadd.with.overflow``' family of intrinsic functions performs
an unsigned addition of the two arguments. Each returns a structure whose
first element is the sum and whose second element is a bit specifying
whether the unsigned addition resulted in a carry.
11618
11619Examples:
11620"""""""""
11621
11622.. code-block:: llvm
11623
11624 %res = call {i32, i1} @llvm.uadd.with.overflow.i32(i32 %a, i32 %b)
11625 %sum = extractvalue {i32, i1} %res, 0
11626 %obit = extractvalue {i32, i1} %res, 1
11627 br i1 %obit, label %carry, label %normal
11628
11629'``llvm.ssub.with.overflow.*``' Intrinsics
11630^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11631
11632Syntax:
11633"""""""
11634
11635This is an overloaded intrinsic. You can use ``llvm.ssub.with.overflow``
11636on any integer bit width.
11637
11638::
11639
11640 declare {i16, i1} @llvm.ssub.with.overflow.i16(i16 %a, i16 %b)
11641 declare {i32, i1} @llvm.ssub.with.overflow.i32(i32 %a, i32 %b)
11642 declare {i64, i1} @llvm.ssub.with.overflow.i64(i64 %a, i64 %b)
11643
11644Overview:
11645"""""""""
11646
11647The '``llvm.ssub.with.overflow``' family of intrinsic functions perform
11648a signed subtraction of the two arguments, and indicate whether an
11649overflow occurred during the signed subtraction.
11650
11651Arguments:
11652""""""""""
11653
11654The arguments (%a and %b) and the first element of the result structure
11655may be of integer types of any bit width, but they must have the same
11656bit width. The second element of the result structure must be of type
11657``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
11658subtraction.
11659
11660Semantics:
11661""""""""""
11662
The '``llvm.ssub.with.overflow``' family of intrinsic functions performs
a signed subtraction of the two arguments. Each returns a structure whose
first element is the difference and whose second element is a bit
specifying whether the signed subtraction overflowed.
11668
11669Examples:
11670"""""""""
11671
11672.. code-block:: llvm
11673
      %res = call {i32, i1} @llvm.ssub.with.overflow.i32(i32 %a, i32 %b)
      %diff = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal
11678
11679'``llvm.usub.with.overflow.*``' Intrinsics
11680^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11681
11682Syntax:
11683"""""""
11684
11685This is an overloaded intrinsic. You can use ``llvm.usub.with.overflow``
11686on any integer bit width.
11687
11688::
11689
11690 declare {i16, i1} @llvm.usub.with.overflow.i16(i16 %a, i16 %b)
11691 declare {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b)
11692 declare {i64, i1} @llvm.usub.with.overflow.i64(i64 %a, i64 %b)
11693
11694Overview:
11695"""""""""
11696
11697The '``llvm.usub.with.overflow``' family of intrinsic functions perform
11698an unsigned subtraction of the two arguments, and indicate whether an
11699overflow occurred during the unsigned subtraction.
11700
11701Arguments:
11702""""""""""
11703
11704The arguments (%a and %b) and the first element of the result structure
11705may be of integer types of any bit width, but they must have the same
11706bit width. The second element of the result structure must be of type
11707``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
11708subtraction.
11709
11710Semantics:
11711""""""""""
11712
The '``llvm.usub.with.overflow``' family of intrinsic functions performs
an unsigned subtraction of the two arguments. Each returns a structure whose
first element is the difference and whose second element is a bit
specifying whether the unsigned subtraction overflowed.
11718
11719Examples:
11720"""""""""
11721
11722.. code-block:: llvm
11723
      %res = call {i32, i1} @llvm.usub.with.overflow.i32(i32 %a, i32 %b)
      %diff = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal
11728
11729'``llvm.smul.with.overflow.*``' Intrinsics
11730^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11731
11732Syntax:
11733"""""""
11734
11735This is an overloaded intrinsic. You can use ``llvm.smul.with.overflow``
11736on any integer bit width.
11737
11738::
11739
11740 declare {i16, i1} @llvm.smul.with.overflow.i16(i16 %a, i16 %b)
11741 declare {i32, i1} @llvm.smul.with.overflow.i32(i32 %a, i32 %b)
11742 declare {i64, i1} @llvm.smul.with.overflow.i64(i64 %a, i64 %b)
11743
11744Overview:
11745"""""""""
11746
11747The '``llvm.smul.with.overflow``' family of intrinsic functions perform
11748a signed multiplication of the two arguments, and indicate whether an
11749overflow occurred during the signed multiplication.
11750
11751Arguments:
11752""""""""""
11753
11754The arguments (%a and %b) and the first element of the result structure
11755may be of integer types of any bit width, but they must have the same
11756bit width. The second element of the result structure must be of type
11757``i1``. ``%a`` and ``%b`` are the two values that will undergo signed
11758multiplication.
11759
11760Semantics:
11761""""""""""
11762
The '``llvm.smul.with.overflow``' family of intrinsic functions performs
a signed multiplication of the two arguments. Each returns a structure whose
first element is the product and whose second element is a bit
specifying whether the signed multiplication overflowed.
11768
11769Examples:
11770"""""""""
11771
11772.. code-block:: llvm
11773
      %res = call {i32, i1} @llvm.smul.with.overflow.i32(i32 %a, i32 %b)
      %prod = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal
11778
11779'``llvm.umul.with.overflow.*``' Intrinsics
11780^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11781
11782Syntax:
11783"""""""
11784
11785This is an overloaded intrinsic. You can use ``llvm.umul.with.overflow``
11786on any integer bit width.
11787
11788::
11789
11790 declare {i16, i1} @llvm.umul.with.overflow.i16(i16 %a, i16 %b)
11791 declare {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
11792 declare {i64, i1} @llvm.umul.with.overflow.i64(i64 %a, i64 %b)
11793
11794Overview:
11795"""""""""
11796
The '``llvm.umul.with.overflow``' family of intrinsic functions performs
an unsigned multiplication of the two arguments, and indicates whether an
overflow occurred during the unsigned multiplication.
11800
11801Arguments:
11802""""""""""
11803
11804The arguments (%a and %b) and the first element of the result structure
11805may be of integer types of any bit width, but they must have the same
11806bit width. The second element of the result structure must be of type
11807``i1``. ``%a`` and ``%b`` are the two values that will undergo unsigned
11808multiplication.
11809
11810Semantics:
11811""""""""""
11812
The '``llvm.umul.with.overflow``' family of intrinsic functions performs
an unsigned multiplication of the two arguments. Each returns a structure
whose first element is the product and whose second element is a bit
specifying whether the unsigned multiplication overflowed.
11818
11819Examples:
11820"""""""""
11821
11822.. code-block:: llvm
11823
      %res = call {i32, i1} @llvm.umul.with.overflow.i32(i32 %a, i32 %b)
      %prod = extractvalue {i32, i1} %res, 0
      %obit = extractvalue {i32, i1} %res, 1
      br i1 %obit, label %overflow, label %normal
11828
11829Specialised Arithmetic Intrinsics
11830---------------------------------
11831
'``llvm.canonicalize.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.canonicalize.f32(float %a)
      declare double @llvm.canonicalize.f64(double %b)

Overview:
"""""""""

The '``llvm.canonicalize.*``' intrinsic returns the platform-specific canonical
encoding of a floating point number. This canonicalization is useful for
implementing certain numeric primitives such as ``frexp``. The canonical
encoding is defined by IEEE-754-2008 to be:

::

      2.1.8 canonical encoding: The preferred encoding of a floating-point
      representation in a format. Applied to declets, significands of finite
      numbers, infinities, and NaNs, especially in decimal formats.

This operation can also be considered equivalent to the IEEE-754-2008
conversion of a floating-point value to the same format. NaNs are handled
according to section 6.2.

Examples of non-canonical encodings:

- x87 pseudo denormals, pseudo NaNs, pseudo Infinity, Unnormals. These are
  converted to a canonical representation per hardware-specific protocol.
- Many normal decimal floating point numbers have non-canonical alternative
  encodings.
- Some machines, like GPUs or ARMv7 NEON, do not support subnormal values.
  These are treated as non-canonical encodings of zero and will be flushed to
  a zero of the same sign by this operation.

Note that per IEEE-754-2008 6.2, systems that support signaling NaNs with
default exception handling must signal an invalid exception, and produce a
quiet NaN result.

This function should always be implementable as multiplication by 1.0, provided
that the compiler does not constant fold the operation. Likewise, division by
1.0 and ``llvm.minnum(x, x)`` are possible implementations. Addition with
-0.0 is also sufficient provided that the rounding mode is not -Infinity.

``@llvm.canonicalize`` must preserve the equality relation. That is:

- ``(@llvm.canonicalize(x) == x)`` is equivalent to ``(x == x)``
- ``(@llvm.canonicalize(x) == @llvm.canonicalize(y))`` is equivalent to
  ``(x == y)``

Additionally, the sign of zero must be conserved:
``@llvm.canonicalize(-0.0) = -0.0`` and ``@llvm.canonicalize(+0.0) = +0.0``

The payload bits of a NaN must be conserved, with two exceptions.
First, environments which use only a single canonical representation of NaN
must perform said canonicalization. Second, SNaNs must be quieted per the
usual methods.
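
As an informal illustration of the equality rule stated earlier in this
section, the sketch below compares a value with its canonicalized form. The
function name ``@canonical_equals_input`` is hypothetical, and the choice of
an ordered comparison is an assumption made for the example.

.. code-block:: llvm

      declare float @llvm.canonicalize.f32(float)

      define i1 @canonical_equals_input(float %x) {
        %c = call float @llvm.canonicalize.f32(float %x)
        ; by the equality rule, this behaves like fcmp oeq float %x, %x:
        ; true unless %x is a NaN
        %eq = fcmp oeq float %c, %x
        ret i1 %eq
      }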
11893
The canonicalization operation may be optimized away if:

- The input is known to be canonical. For example, it was produced by a
  floating-point operation that is required by the standard to be canonical.
- The result is consumed only by (or fused with) other floating-point
  operations. That is, the bits of the floating point value are not examined.

'``llvm.fmuladd.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11903
11904Syntax:
11905"""""""
11906
11907::
11908
11909 declare float @llvm.fmuladd.f32(float %a, float %b, float %c)
11910 declare double @llvm.fmuladd.f64(double %a, double %b, double %c)
11911
11912Overview:
11913"""""""""
11914
The '``llvm.fmuladd.*``' intrinsic functions represent multiply-add
expressions that can be fused if the code generator determines that (a) the
target instruction set has support for a fused operation, and (b) that the
fused operation is more efficient than the equivalent, separate pair of mul
and add instructions.

Arguments:
""""""""""
11923
11924The '``llvm.fmuladd.*``' intrinsics each take three arguments: two
11925multiplicands, a and b, and an addend c.
11926
11927Semantics:
11928""""""""""
11929
11930The expression:
11931
11932::
11933
      %0 = call float @llvm.fmuladd.f32(float %a, float %b, float %c)
11935
is equivalent to the expression a \* b + c, except that rounding will
not be performed between the multiplication and addition steps if the
code generator fuses the operations. Fusion is not guaranteed, even if
the target platform supports it. If a fused multiply-add is required, the
corresponding '``llvm.fma.*``' intrinsic function should be used
instead. Like '``llvm.fma.*``', this intrinsic never sets errno.

Examples:
"""""""""
11945
11946.. code-block:: llvm
11947
      %r2 = call float @llvm.fmuladd.f32(float %a, float %b, float %c) ; yields float:r2 = (a * b) + c


Experimental Vector Reduction Intrinsics
----------------------------------------
11953
11954Horizontal reductions of vectors can be expressed using the following
11955intrinsics. Each one takes a vector operand as an input and applies its
11956respective operation across all elements of the vector, returning a single
11957scalar result of the same element type.
11958
11959
11960'``llvm.experimental.vector.reduce.add.*``' Intrinsic
11961^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11962
11963Syntax:
11964"""""""
11965
11966::
11967
11968 declare i32 @llvm.experimental.vector.reduce.add.i32.v4i32(<4 x i32> %a)
11969 declare i64 @llvm.experimental.vector.reduce.add.i64.v2i64(<2 x i64> %a)
11970
11971Overview:
11972"""""""""
11973
11974The '``llvm.experimental.vector.reduce.add.*``' intrinsics do an integer ``ADD``
11975reduction of a vector, returning the result as a scalar. The return type matches
11976the element-type of the vector input.
11977
11978Arguments:
11979""""""""""
11980The argument to this intrinsic must be a vector of integer values.
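
The sketch below shows a call next to a hand-written scalarized equivalent.
Both function names are hypothetical, and the pairwise order of the scalar
adds is one arbitrary choice among several that give the same result.

.. code-block:: llvm

      declare i32 @llvm.experimental.vector.reduce.add.i32.v4i32(<4 x i32>)

      define i32 @sum_with_intrinsic(<4 x i32> %v) {
        %sum = call i32 @llvm.experimental.vector.reduce.add.i32.v4i32(<4 x i32> %v)
        ret i32 %sum
      }

      ; A scalarized equivalent, written out by hand. Integer addition is
      ; associative, so the order of the adds does not affect the result.
      define i32 @sum_scalarized(<4 x i32> %v) {
        %e0 = extractelement <4 x i32> %v, i32 0
        %e1 = extractelement <4 x i32> %v, i32 1
        %e2 = extractelement <4 x i32> %v, i32 2
        %e3 = extractelement <4 x i32> %v, i32 3
        %s01 = add i32 %e0, %e1
        %s23 = add i32 %e2, %e3
        %sum = add i32 %s01, %s23
        ret i32 %sum
      }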
11981
11982'``llvm.experimental.vector.reduce.fadd.*``' Intrinsic
11983^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11984
11985Syntax:
11986"""""""
11987
11988::
11989
11990 declare float @llvm.experimental.vector.reduce.fadd.f32.v4f32(float %acc, <4 x float> %a)
11991 declare double @llvm.experimental.vector.reduce.fadd.f64.v2f64(double %acc, <2 x double> %a)
11992
11993Overview:
11994"""""""""
11995
11996The '``llvm.experimental.vector.reduce.fadd.*``' intrinsics do a floating point
11997``ADD`` reduction of a vector, returning the result as a scalar. The return type
11998matches the element-type of the vector input.
11999
12000If the intrinsic call has fast-math flags, then the reduction will not preserve
12001the associativity of an equivalent scalarized counterpart. If it does not have
12002fast-math flags, then the reduction will be *ordered*, implying that the
12003operation respects the associativity of a scalarized reduction.
12004
12005
12006Arguments:
12007""""""""""
12008The first argument to this intrinsic is a scalar accumulator value, which is
12009only used when there are no fast-math flags attached. This argument may be undef
12010when fast-math flags are used.
12011
12012The second argument must be a vector of floating point values.
12013
12014Examples:
12015"""""""""
12016
12017.. code-block:: llvm
12018
12019 %fast = call fast float @llvm.experimental.vector.reduce.fadd.f32.v4f32(float undef, <4 x float> %input) ; fast reduction
12020 %ord = call float @llvm.experimental.vector.reduce.fadd.f32.v4f32(float %acc, <4 x float> %input) ; ordered reduction
12021
12022
12023'``llvm.experimental.vector.reduce.mul.*``' Intrinsic
12024^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12025
12026Syntax:
12027"""""""
12028
12029::
12030
12031 declare i32 @llvm.experimental.vector.reduce.mul.i32.v4i32(<4 x i32> %a)
12032 declare i64 @llvm.experimental.vector.reduce.mul.i64.v2i64(<2 x i64> %a)
12033
12034Overview:
12035"""""""""
12036
12037The '``llvm.experimental.vector.reduce.mul.*``' intrinsics do an integer ``MUL``
12038reduction of a vector, returning the result as a scalar. The return type matches
12039the element-type of the vector input.
12040
12041Arguments:
12042""""""""""
12043The argument to this intrinsic must be a vector of integer values.
12044
12045'``llvm.experimental.vector.reduce.fmul.*``' Intrinsic
12046^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12047
12048Syntax:
12049"""""""
12050
12051::
12052
12053 declare float @llvm.experimental.vector.reduce.fmul.f32.v4f32(float %acc, <4 x float> %a)
12054 declare double @llvm.experimental.vector.reduce.fmul.f64.v2f64(double %acc, <2 x double> %a)
12055
12056Overview:
12057"""""""""
12058
12059The '``llvm.experimental.vector.reduce.fmul.*``' intrinsics do a floating point
12060``MUL`` reduction of a vector, returning the result as a scalar. The return type
12061matches the element-type of the vector input.
12062
12063If the intrinsic call has fast-math flags, then the reduction will not preserve
12064the associativity of an equivalent scalarized counterpart. If it does not have
12065fast-math flags, then the reduction will be *ordered*, implying that the
12066operation respects the associativity of a scalarized reduction.
12067
12068
12069Arguments:
12070""""""""""
12071The first argument to this intrinsic is a scalar accumulator value, which is
12072only used when there are no fast-math flags attached. This argument may be undef
12073when fast-math flags are used.
12074
12075The second argument must be a vector of floating point values.
12076
12077Examples:
12078"""""""""
12079
12080.. code-block:: llvm
12081
12082 %fast = call fast float @llvm.experimental.vector.reduce.fmul.f32.v4f32(float undef, <4 x float> %input) ; fast reduction
12083 %ord = call float @llvm.experimental.vector.reduce.fmul.f32.v4f32(float %acc, <4 x float> %input) ; ordered reduction
12084
12085'``llvm.experimental.vector.reduce.and.*``' Intrinsic
12086^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12087
12088Syntax:
12089"""""""
12090
12091::
12092
12093 declare i32 @llvm.experimental.vector.reduce.and.i32.v4i32(<4 x i32> %a)
12094
12095Overview:
12096"""""""""
12097
12098The '``llvm.experimental.vector.reduce.and.*``' intrinsics do a bitwise ``AND``
12099reduction of a vector, returning the result as a scalar. The return type matches
12100the element-type of the vector input.
12101
12102Arguments:
12103""""""""""
12104The argument to this intrinsic must be a vector of integer values.
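
A minimal sketch with a constant vector, so that the result can be checked by
hand. The function name ``@common_bits`` is hypothetical.

.. code-block:: llvm

      declare i32 @llvm.experimental.vector.reduce.and.i32.v4i32(<4 x i32>)

      define i32 @common_bits() {
        ; 0b1100 & 0b1010 & 0b1110 & 0b1000 = 0b1000, so the result is 8
        %r = call i32 @llvm.experimental.vector.reduce.and.i32.v4i32(<4 x i32> <i32 12, i32 10, i32 14, i32 8>)
        ret i32 %r
      }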
12105
'``llvm.experimental.vector.reduce.or.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.experimental.vector.reduce.or.i32.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.experimental.vector.reduce.or.*``' intrinsics do a bitwise ``OR`` reduction
of a vector, returning the result as a scalar. The return type matches the
element-type of the vector input.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of integer values.

'``llvm.experimental.vector.reduce.xor.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.experimental.vector.reduce.xor.i32.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.experimental.vector.reduce.xor.*``' intrinsics do a bitwise ``XOR``
reduction of a vector, returning the result as a scalar. The return type matches
the element-type of the vector input.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of integer values.

'``llvm.experimental.vector.reduce.smax.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.experimental.vector.reduce.smax.i32.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.experimental.vector.reduce.smax.*``' intrinsics do a signed integer
``MAX`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of integer values.

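
To make the signed comparison concrete, the sketch below reduces a two-element
vector and shows a compare-and-select sequence that is assumed to give the same
value. The ``v2i32`` overload name is formed by analogy with the ``v4i32``
declaration above, and the value names are invented for the example. The
``smin``, ``umax`` and ``umin`` reductions differ only in the comparison
predicate (``slt``, ``ugt`` and ``ult`` respectively).

.. code-block:: llvm

      %max = call i32 @llvm.experimental.vector.reduce.smax.i32.v2i32(<2 x i32> %a)

      ; Scalar sequence assumed to produce the same value as %max
      %a0 = extractelement <2 x i32> %a, i32 0
      %a1 = extractelement <2 x i32> %a, i32 1
      %gt = icmp sgt i32 %a0, %a1            ; signed comparison
      %max.scalar = select i1 %gt, i32 %a0, i32 %a1
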
'``llvm.experimental.vector.reduce.smin.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.experimental.vector.reduce.smin.i32.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.experimental.vector.reduce.smin.*``' intrinsics do a signed integer
``MIN`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of integer values.

'``llvm.experimental.vector.reduce.umax.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.experimental.vector.reduce.umax.i32.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.experimental.vector.reduce.umax.*``' intrinsics do an unsigned
integer ``MAX`` reduction of a vector, returning the result as a scalar. The
return type matches the element-type of the vector input.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of integer values.

'``llvm.experimental.vector.reduce.umin.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.experimental.vector.reduce.umin.i32.v4i32(<4 x i32> %a)

Overview:
"""""""""

The '``llvm.experimental.vector.reduce.umin.*``' intrinsics do an unsigned
integer ``MIN`` reduction of a vector, returning the result as a scalar. The
return type matches the element-type of the vector input.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of integer values.

'``llvm.experimental.vector.reduce.fmax.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.experimental.vector.reduce.fmax.f32.v4f32(<4 x float> %a)
      declare double @llvm.experimental.vector.reduce.fmax.f64.v2f64(<2 x double> %a)

Overview:
"""""""""

The '``llvm.experimental.vector.reduce.fmax.*``' intrinsics do a floating point
``MAX`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.

If the intrinsic call has the ``nnan`` fast-math flag then the operation can
assume that NaNs are not present in the input vector.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of floating point values.

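
The two calls below illustrate where the ``nnan`` flag is written. They are a
minimal sketch using the ``v4f32`` declaration above, with invented value
names. The ``fmin`` reductions accept the flag in the same position.

.. code-block:: llvm

      %m = call float @llvm.experimental.vector.reduce.fmax.f32.v4f32(<4 x float> %a)        ; no assumption about NaNs
      %m.nnan = call nnan float @llvm.experimental.vector.reduce.fmax.f32.v4f32(<4 x float> %a) ; may assume no NaN lanes
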
'``llvm.experimental.vector.reduce.fmin.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.experimental.vector.reduce.fmin.f32.v4f32(<4 x float> %a)
      declare double @llvm.experimental.vector.reduce.fmin.f64.v2f64(<2 x double> %a)

Overview:
"""""""""

The '``llvm.experimental.vector.reduce.fmin.*``' intrinsics do a floating point
``MIN`` reduction of a vector, returning the result as a scalar. The return type
matches the element-type of the vector input.

If the intrinsic call has the ``nnan`` fast-math flag then the operation can
assume that NaNs are not present in the input vector.

Arguments:
""""""""""
The argument to this intrinsic must be a vector of floating point values.

Half Precision Floating Point Intrinsics
----------------------------------------

For most target platforms, half precision floating point is a
storage-only format. This means that it is a dense encoding (in memory)
but does not support computation in the format.

Consequently, code must first load the half-precision floating point
value as an ``i16``, then convert it to float with
:ref:`llvm.convert.from.fp16 <int_convert_from_fp16>`. Computation can
then be performed on the float value (including extending to double
etc). To store the value back to memory, it is first converted to float
if needed, then converted to ``i16`` with
:ref:`llvm.convert.to.fp16 <int_convert_to_fp16>`, and then stored as an
``i16`` value.

.. _int_convert_to_fp16:

'``llvm.convert.to.fp16``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i16 @llvm.convert.to.fp16.f32(float %a)
      declare i16 @llvm.convert.to.fp16.f64(double %a)

Overview:
"""""""""

The '``llvm.convert.to.fp16``' intrinsic function performs a conversion from a
conventional floating point type to half precision floating point format.

Arguments:
""""""""""

The intrinsic function takes a single argument: the value to be
converted.

Semantics:
""""""""""

The '``llvm.convert.to.fp16``' intrinsic function performs a conversion from a
conventional floating point format to half precision floating point format. The
return value is an ``i16`` which contains the converted number.

Examples:
"""""""""

.. code-block:: llvm

      %res = call i16 @llvm.convert.to.fp16.f32(float %a)
      store i16 %res, i16* @x, align 2

.. _int_convert_from_fp16:

'``llvm.convert.from.fp16``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare float @llvm.convert.from.fp16.f32(i16 %a)
      declare double @llvm.convert.from.fp16.f64(i16 %a)

Overview:
"""""""""

The '``llvm.convert.from.fp16``' intrinsic function performs a
conversion from half precision floating point format to a conventional
floating point format (``float`` or ``double``).

Arguments:
""""""""""

The intrinsic function takes a single argument: the value to be
converted.

Semantics:
""""""""""

The '``llvm.convert.from.fp16``' intrinsic function performs a
conversion from half precision floating point format to the floating
point format of the return type. The input half-float value is
represented by an ``i16`` value.

Examples:
"""""""""

.. code-block:: llvm

      %a = load i16, i16* @x, align 2
      %res = call float @llvm.convert.from.fp16.f32(i16 %a)

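
The load, convert, compute, convert, store sequence described at the start of
this section can be sketched as a small module. The global ``@x`` and the
function ``@double_x`` are hypothetical names chosen for the example; only the
two intrinsic declarations come from the syntax sections above.

.. code-block:: llvm

      @x = global i16 0                       ; half value stored as i16

      declare float @llvm.convert.from.fp16.f32(i16)
      declare i16 @llvm.convert.to.fp16.f32(float)

      define void @double_x() {
        %h = load i16, i16* @x, align 2
        %f = call float @llvm.convert.from.fp16.f32(i16 %h) ; widen to float
        %sum = fadd float %f, %f                            ; compute in float
        %r = call i16 @llvm.convert.to.fp16.f32(float %sum) ; narrow to half
        store i16 %r, i16* @x, align 2
        ret void
      }
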
.. _dbg_intrinsics:

Debugger Intrinsics
-------------------

The LLVM debugger intrinsics (which all start with the ``llvm.dbg.``
prefix) are described in the `LLVM Source Level
Debugging <SourceLevelDebugging.html#format-common-intrinsics>`_
document.

Exception Handling Intrinsics
-----------------------------

The LLVM exception handling intrinsics (which all start with the
``llvm.eh.`` prefix) are described in the `LLVM Exception
Handling <ExceptionHandling.html#format-common-intrinsics>`_ document.

.. _int_trampoline:

Trampoline Intrinsics
---------------------

These intrinsics make it possible to excise one parameter, marked with
the :ref:`nest <nest>` attribute, from a function. The result is a
callable function pointer lacking the nest parameter - the caller does
not need to provide a value for it. Instead, the value to use is stored
in advance in a "trampoline", a block of memory usually allocated on the
stack, which also contains code to splice the nest value into the
argument list. This is used to implement the GCC nested function address
extension.

For example, if the function is ``i32 f(i8* nest %c, i32 %x, i32 %y)``
then the resulting function pointer has signature ``i32 (i32, i32)*``.
It can be created as follows:

.. code-block:: llvm

      %tramp = alloca [10 x i8], align 4 ; size and alignment only correct for X86
      %tramp1 = getelementptr [10 x i8], [10 x i8]* %tramp, i32 0, i32 0
      call i8* @llvm.init.trampoline(i8* %tramp1, i8* bitcast (i32 (i8*, i32, i32)* @f to i8*), i8* %nval)
      %p = call i8* @llvm.adjust.trampoline(i8* %tramp1)
      %fp = bitcast i8* %p to i32 (i32, i32)*

The call ``%val = call i32 %fp(i32 %x, i32 %y)`` is then equivalent to
``%val = call i32 @f(i8* %nval, i32 %x, i32 %y)``.

.. _int_it:

'``llvm.init.trampoline``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.init.trampoline(i8* <tramp>, i8* <func>, i8* <nval>)

Overview:
"""""""""

This fills the memory pointed to by ``tramp`` with executable code,
turning it into a trampoline.

Arguments:
""""""""""

The ``llvm.init.trampoline`` intrinsic takes three arguments, all
pointers. The ``tramp`` argument must point to a sufficiently large and
sufficiently aligned block of memory; this memory is written to by the
intrinsic. Note that the size and the alignment are target-specific -
LLVM currently provides no portable way of determining them, so a
front-end that generates this intrinsic needs to have some
target-specific knowledge. The ``func`` argument must hold a function
bitcast to an ``i8*``. The ``nval`` argument is the value that will be
supplied for the ``nest`` parameter of ``func``.

Semantics:
""""""""""

The block of memory pointed to by ``tramp`` is filled with
target-dependent code, turning it into a function. Then ``tramp`` needs to be
passed to :ref:`llvm.adjust.trampoline <int_at>` to get a pointer which can
be :ref:`bitcast (to a new function) and called <int_trampoline>`. The new
function's signature is the same as that of ``func`` with any arguments
marked with the ``nest`` attribute removed. At most one such ``nest``
argument is allowed, and it must be of pointer type. Calling the new
function is equivalent to calling ``func`` with the same argument list,
but with ``nval`` used for the missing ``nest`` argument. If, after
calling ``llvm.init.trampoline``, the memory pointed to by ``tramp`` is
modified, then the effect of any later call to the returned function
pointer is undefined.

.. _int_at:

'``llvm.adjust.trampoline``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.adjust.trampoline(i8* <tramp>)

Overview:
"""""""""

This performs any required machine-specific adjustment to the address of
a trampoline (passed as ``tramp``).

Arguments:
""""""""""

``tramp`` must point to a block of memory which already has trampoline
code filled in by a previous call to
:ref:`llvm.init.trampoline <int_it>`.

Semantics:
""""""""""

On some architectures the address of the code to be executed needs to be
different than the address where the trampoline is actually stored. This
intrinsic returns the executable address corresponding to ``tramp``
after performing the required machine-specific adjustments. The pointer
returned can then be :ref:`bitcast and executed <int_trampoline>`.

.. _int_mload_mstore:

Masked Vector Load and Store Intrinsics
---------------------------------------

LLVM provides intrinsics for predicated vector load and store operations. The
predicate is specified by a mask operand, which holds one bit per vector
element, switching the associated vector lane on or off. The memory addresses
corresponding to the "off" lanes are not accessed. When all bits of the mask
are on, the intrinsic is identical to a regular vector load or store. When all
bits are off, no memory is accessed.

.. _int_mload:

'``llvm.masked.load.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The loaded data is a vector of any integer,
floating point or pointer data type.

::

      declare <16 x float> @llvm.masked.load.v16f32.p0v16f32 (<16 x float>* <ptr>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>)
      declare <2 x double> @llvm.masked.load.v2f64.p0v2f64 (<2 x double>* <ptr>, i32 <alignment>, <2 x i1> <mask>, <2 x double> <passthru>)
      ;; The data is a vector of pointers to double
      declare <8 x double*> @llvm.masked.load.v8p0f64.p0v8p0f64 (<8 x double*>* <ptr>, i32 <alignment>, <8 x i1> <mask>, <8 x double*> <passthru>)
      ;; The data is a vector of function pointers
      declare <8 x i32 ()*> @llvm.masked.load.v8p0f_i32f.p0v8p0f_i32f (<8 x i32 ()*>* <ptr>, i32 <alignment>, <8 x i1> <mask>, <8 x i32 ()*> <passthru>)

Overview:
"""""""""

Reads a vector from memory according to the provided mask. The mask holds a bit
for each vector lane, and is used to prevent memory accesses to the masked-off
lanes. The masked-off lanes in the result vector are taken from the
corresponding lanes of the '``passthru``' operand.


Arguments:
""""""""""

The first operand is the base pointer for the load. The second operand is the
alignment of the source location. It must be a constant integer value. The
third operand, mask, is a vector of boolean values with the same number of
elements as the return type. The fourth is a pass-through value that is used to
fill the masked-off lanes of the result. The return type, underlying type of
the base pointer and the type of the '``passthru``' operand are the same vector
types.


Semantics:
""""""""""

The '``llvm.masked.load``' intrinsic is designed for conditional reading of
selected vector elements in a single IR operation. It is useful for targets
that support vector masked loads and allows vectorizing predicated basic blocks
on these targets. Other targets may support this intrinsic differently, for
example by lowering it into a sequence of branches that guard scalar load
operations.
The result of this operation is equivalent to a regular vector load instruction
followed by a 'select' between the loaded and the passthru values, predicated
on the same mask. However, using this intrinsic prevents exceptions on memory
access to masked-off lanes.


::

      %res = call <16 x float> @llvm.masked.load.v16f32.p0v16f32 (<16 x float>* %ptr, i32 4, <16 x i1> %mask, <16 x float> %passthru)

      ;; The result of the two following instructions is identical aside from potential memory access exception
      %loadlal = load <16 x float>, <16 x float>* %ptr, align 4
      %res = select <16 x i1> %mask, <16 x float> %loadlal, <16 x float> %passthru

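
The effect of a partially enabled mask can be seen in the following sketch,
which loads lanes 0 and 2 of a four-element vector and leaves lanes 1 and 3 to
the pass-through value. The ``v4i32`` overload name is formed by analogy with
the declarations above, and the scalar sequence is an assumed equivalent with
invented value names, not a description of how any target lowers the call.

.. code-block:: llvm

      %res = call <4 x i32> @llvm.masked.load.v4i32.p0v4i32 (<4 x i32>* %ptr, i32 4, <4 x i1> <i1 true, i1 false, i1 true, i1 false>, <4 x i32> %passthru)

      ;; Assumed scalar equivalent: elements 1 and 3 of the memory are never read
      %base = bitcast <4 x i32>* %ptr to i32*
      %p2 = getelementptr i32, i32* %base, i32 2
      %v0 = load i32, i32* %base, align 4
      %v2 = load i32, i32* %p2, align 4
      %t0 = insertelement <4 x i32> %passthru, i32 %v0, i32 0
      %res.scalar = insertelement <4 x i32> %t0, i32 %v2, i32 2
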
.. _int_mstore:

'``llvm.masked.store.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The data stored in memory is a vector of any
integer, floating point or pointer data type.

::

      declare void @llvm.masked.store.v8i32.p0v8i32 (<8 x i32> <value>, <8 x i32>* <ptr>, i32 <alignment>, <8 x i1> <mask>)
      declare void @llvm.masked.store.v16f32.p0v16f32 (<16 x float> <value>, <16 x float>* <ptr>, i32 <alignment>, <16 x i1> <mask>)
      ;; The data is a vector of pointers to double
      declare void @llvm.masked.store.v8p0f64.p0v8p0f64 (<8 x double*> <value>, <8 x double*>* <ptr>, i32 <alignment>, <8 x i1> <mask>)
      ;; The data is a vector of function pointers
      declare void @llvm.masked.store.v4p0f_i32f.p0v4p0f_i32f (<4 x i32 ()*> <value>, <4 x i32 ()*>* <ptr>, i32 <alignment>, <4 x i1> <mask>)

Overview:
"""""""""

Writes a vector to memory according to the provided mask. The mask holds a bit
for each vector lane, and is used to prevent memory accesses to the masked-off
lanes.

Arguments:
""""""""""

The first operand is the vector value to be written to memory. The second
operand is the base pointer for the store; it has the same underlying type as
the value operand. The third operand is the alignment of the destination
location. The fourth operand, mask, is a vector of boolean values. The types of
the mask and the value operand must have the same number of vector elements.


Semantics:
""""""""""

The '``llvm.masked.store``' intrinsic is designed for conditional writing of
selected vector elements in a single IR operation. It is useful for targets
that support vector masked stores and allows vectorizing predicated basic
blocks on these targets. Other targets may support this intrinsic differently,
for example by lowering it into a sequence of branches that guard scalar store
operations.
The result of this operation is equivalent to a load-modify-store sequence.
However, using this intrinsic prevents exceptions and data races on memory
access to masked-off lanes.

::

      call void @llvm.masked.store.v16f32.p0v16f32(<16 x float> %value, <16 x float>* %ptr, i32 4, <16 x i1> %mask)

      ;; The result of the following instructions is identical aside from potential data races and memory access exceptions
      %oldval = load <16 x float>, <16 x float>* %ptr, align 4
      %res = select <16 x i1> %mask, <16 x float> %value, <16 x float> %oldval
      store <16 x float> %res, <16 x float>* %ptr, align 4


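
As a further illustration, the sketch below stores only the first three lanes
of a four-element vector, as might be done for the tail of a loop. The
``v4i32`` overload name is formed by analogy with the declarations above, and
the scalar sequence is an assumed equivalent with invented value names.

.. code-block:: llvm

      call void @llvm.masked.store.v4i32.p0v4i32 (<4 x i32> %value, <4 x i32>* %ptr, i32 4, <4 x i1> <i1 true, i1 true, i1 true, i1 false>)

      ;; Assumed scalar equivalent: element 3 of the memory is neither read nor written
      %base = bitcast <4 x i32>* %ptr to i32*
      %p1 = getelementptr i32, i32* %base, i32 1
      %p2 = getelementptr i32, i32* %base, i32 2
      %v0 = extractelement <4 x i32> %value, i32 0
      %v1 = extractelement <4 x i32> %value, i32 1
      %v2 = extractelement <4 x i32> %value, i32 2
      store i32 %v0, i32* %base, align 4
      store i32 %v1, i32* %p1, align 4
      store i32 %v2, i32* %p2, align 4
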
Masked Vector Gather and Scatter Intrinsics
-------------------------------------------

LLVM provides intrinsics for vector gather and scatter operations. They are
similar to :ref:`Masked Vector Load and Store <int_mload_mstore>`, except they
are designed for arbitrary memory accesses, rather than sequential memory
accesses. Gather and scatter also employ a mask operand, which holds one bit
per vector element, switching the associated vector lane on or off. The memory
addresses corresponding to the "off" lanes are not accessed. When all bits are
off, no memory is accessed.

.. _int_mgather:

'``llvm.masked.gather.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The loaded data are multiple scalar values of
any integer, floating point or pointer data type gathered together into one
vector.

::

      declare <16 x float> @llvm.masked.gather.v16f32.v16p0f32 (<16 x float*> <ptrs>, i32 <alignment>, <16 x i1> <mask>, <16 x float> <passthru>)
      declare <2 x double> @llvm.masked.gather.v2f64.v2p1f64 (<2 x double addrspace(1)*> <ptrs>, i32 <alignment>, <2 x i1> <mask>, <2 x double> <passthru>)
      declare <8 x float*> @llvm.masked.gather.v8p0f32.v8p0p0f32 (<8 x float**> <ptrs>, i32 <alignment>, <8 x i1> <mask>, <8 x float*> <passthru>)

Overview:
"""""""""

Reads scalar values from arbitrary memory locations and gathers them into one
vector. The memory locations are provided in the vector of pointers
'``ptrs``'. The memory is accessed according to the provided mask. The mask
holds a bit for each vector lane, and is used to prevent memory accesses to the
masked-off lanes. The masked-off lanes in the result vector are taken from the
corresponding lanes of the '``passthru``' operand.


Arguments:
""""""""""

The first operand is a vector of pointers which holds all memory addresses to
read. The second operand is the alignment of the source addresses. It must be a
constant integer value. The third operand, mask, is a vector of boolean values
with the same number of elements as the return type. The fourth is a
pass-through value that is used to fill the masked-off lanes of the result. The
return type, underlying type of the vector of pointers and the type of the
'``passthru``' operand are the same vector types.


Semantics:
""""""""""

The '``llvm.masked.gather``' intrinsic is designed for conditional reading of
multiple scalar values from arbitrary memory locations in a single IR
operation. It is useful for targets that support vector masked gathers and
allows vectorizing basic blocks with data and control divergence. Other targets
may support this intrinsic differently, for example by lowering it into a
sequence of scalar load operations.
The semantics of this operation are equivalent to a sequence of conditional
scalar loads followed by gathering all loaded values into a single vector. The
mask restricts memory access to certain lanes and facilitates vectorization of
predicated basic blocks.


::

      %res = call <4 x double> @llvm.masked.gather.v4f64.v4p0f64 (<4 x double*> %ptrs, i32 8, <4 x i1> <i1 true, i1 true, i1 true, i1 true>, <4 x double> undef)

      ;; The gather with all-true mask is equivalent to the following instruction sequence
      %ptr0 = extractelement <4 x double*> %ptrs, i32 0
      %ptr1 = extractelement <4 x double*> %ptrs, i32 1
      %ptr2 = extractelement <4 x double*> %ptrs, i32 2
      %ptr3 = extractelement <4 x double*> %ptrs, i32 3

      %val0 = load double, double* %ptr0, align 8
      %val1 = load double, double* %ptr1, align 8
      %val2 = load double, double* %ptr2, align 8
      %val3 = load double, double* %ptr3, align 8

      %vec0 = insertelement <4 x double> undef, double %val0, i32 0
      %vec01 = insertelement <4 x double> %vec0, double %val1, i32 1
      %vec012 = insertelement <4 x double> %vec01, double %val2, i32 2
      %vec0123 = insertelement <4 x double> %vec012, double %val3, i32 3

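
When only some mask bits are set, the masked-off lanes come from the
pass-through operand and their pointers are never dereferenced. The sketch
below enables lanes 0 and 3 only; the scalar sequence is an assumed equivalent
with invented value names.

.. code-block:: llvm

      %res = call <4 x double> @llvm.masked.gather.v4f64.v4p0f64 (<4 x double*> %ptrs, i32 8, <4 x i1> <i1 true, i1 false, i1 false, i1 true>, <4 x double> %passthru)

      ;; Assumed scalar equivalent: the pointers in lanes 1 and 2 are not accessed
      %ptr0 = extractelement <4 x double*> %ptrs, i32 0
      %ptr3 = extractelement <4 x double*> %ptrs, i32 3
      %val0 = load double, double* %ptr0, align 8
      %val3 = load double, double* %ptr3, align 8
      %vec0 = insertelement <4 x double> %passthru, double %val0, i32 0
      %res.scalar = insertelement <4 x double> %vec0, double %val3, i32 3
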
.. _int_mscatter:

'``llvm.masked.scatter.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The data stored in memory is a vector of any
integer, floating point or pointer data type. Each vector element is stored at
an arbitrary memory address. Scatter with overlapping addresses is guaranteed
to be ordered from least-significant to most-significant element.

::

      declare void @llvm.masked.scatter.v8i32.v8p0i32 (<8 x i32> <value>, <8 x i32*> <ptrs>, i32 <alignment>, <8 x i1> <mask>)
      declare void @llvm.masked.scatter.v16f32.v16p1f32 (<16 x float> <value>, <16 x float addrspace(1)*> <ptrs>, i32 <alignment>, <16 x i1> <mask>)
      declare void @llvm.masked.scatter.v4p0f64.v4p0p0f64 (<4 x double*> <value>, <4 x double**> <ptrs>, i32 <alignment>, <4 x i1> <mask>)

Overview:
"""""""""

Writes each element from the value vector to the corresponding memory address.
The memory addresses are represented as a vector of pointers. Writing is done
according to the provided mask. The mask holds a bit for each vector lane, and
is used to prevent memory accesses to the masked-off lanes.

Arguments:
""""""""""

The first operand is a vector value to be written to memory. The second operand
is a vector of pointers, pointing to where the value elements should be stored.
It has the same underlying type as the value operand. The third operand is the
alignment of the destination addresses. The fourth operand, mask, is a vector
of boolean values. The types of the mask and the value operand must have the
same number of vector elements.


Semantics:
""""""""""

The '``llvm.masked.scatter``' intrinsic is designed for writing selected vector
elements to arbitrary memory addresses in a single IR operation. The operation
may be conditional, when not all bits in the mask are switched on. It is useful
for targets that support vector masked scatter and allows vectorizing basic
blocks with data and control divergence. Other targets may support this
intrinsic differently, for example by lowering it into a sequence of branches
that guard scalar store operations.

::

      ;; This instruction unconditionally stores data vector in multiple addresses
      call void @llvm.masked.scatter.v8i32.v8p0i32 (<8 x i32> %value, <8 x i32*> %ptrs, i32 4, <8 x i1> <true, true, .. true>)

      ;; It is equivalent to a list of scalar stores
      %val0 = extractelement <8 x i32> %value, i32 0
      %val1 = extractelement <8 x i32> %value, i32 1
      ..
      %val7 = extractelement <8 x i32> %value, i32 7
      %ptr0 = extractelement <8 x i32*> %ptrs, i32 0
      %ptr1 = extractelement <8 x i32*> %ptrs, i32 1
      ..
      %ptr7 = extractelement <8 x i32*> %ptrs, i32 7
      ;; Note: the order of the following stores is important when they overlap:
      store i32 %val0, i32* %ptr0, align 4
      store i32 %val1, i32* %ptr1, align 4
      ..
      store i32 %val7, i32* %ptr7, align 4


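
The ordering guarantee for overlapping addresses can be illustrated with a
two-lane scatter whose pointers are identical. This is a sketch only: the
``v2i32`` overload name is formed by analogy with the declarations above, and
``%p``, ``%ptrs0``, ``%ptrs`` and ``%v`` are invented names.

.. code-block:: llvm

      ;; Both lanes of %ptrs refer to the same address %p
      %ptrs0 = insertelement <2 x i32*> undef, i32* %p, i32 0
      %ptrs = insertelement <2 x i32*> %ptrs0, i32* %p, i32 1
      call void @llvm.masked.scatter.v2i32.v2p0i32 (<2 x i32> <i32 1, i32 2>, <2 x i32*> %ptrs, i32 4, <2 x i1> <i1 true, i1 true>)

      ;; Lane 0 is stored first and lane 1 last, so under the stated ordering
      ;; this load observes the value written by lane 1
      %v = load i32, i32* %p, align 4
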
Memory Use Markers
------------------

This class of intrinsics provides information about the lifetime of
memory objects and ranges where variables are immutable.

.. _int_lifestart:

'``llvm.lifetime.start``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.lifetime.start(i64 <size>, i8* nocapture <ptr>)

Overview:
"""""""""

The '``llvm.lifetime.start``' intrinsic specifies the start of a memory
object's lifetime.

Arguments:
""""""""""

The first argument is a constant integer representing the size of the
object, or -1 if it is variable sized. The second argument is a pointer
to the object.

Semantics:
""""""""""

This intrinsic indicates that before this point in the code, the value
of the memory pointed to by ``ptr`` is dead. This means that it is known
to never be used and has an undefined value. A load from the pointer
that precedes this intrinsic can be replaced with ``'undef'``.

Reid Klecknera534a382013-12-19 02:14:12 +000012753.. _int_lifeend:
12754
Sean Silvab084af42012-12-07 10:36:55 +000012755'``llvm.lifetime.end``' Intrinsic
12756^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
12757
12758Syntax:
12759"""""""
12760
12761::
12762
12763 declare void @llvm.lifetime.end(i64 <size>, i8* nocapture <ptr>)
12764
12765Overview:
12766"""""""""
12767
12768The '``llvm.lifetime.end``' intrinsic specifies the end of a memory
12769object's lifetime.
12770
12771Arguments:
12772""""""""""
12773
12774The first argument is a constant integer representing the size of the
12775object, or -1 if it is variable sized. The second argument is a pointer
12776to the object.
12777
12778Semantics:
12779""""""""""
12780
12781This intrinsic indicates that after this point in the code, the value of
12782the memory pointed to by ``ptr`` is dead. This means that it is known to
12783never be used and has an undefined value. Any stores into the memory
12784object following this intrinsic may be removed as dead.
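
As an illustration of how the two lifetime markers are typically paired, the
following sketch brackets the uses of a stack slot. The function
``@lifetime_example``, the callee ``@use`` and the 16-byte buffer are
hypothetical; the intrinsic declarations follow the Syntax blocks of the two
lifetime intrinsics.

.. code-block:: llvm

      declare void @llvm.lifetime.start(i64, i8* nocapture)
      declare void @llvm.lifetime.end(i64, i8* nocapture)
      declare void @use(i8*)            ; hypothetical consumer of the buffer

      define void @lifetime_example() {
      entry:
        %buf = alloca [16 x i8]
        %ptr = bitcast [16 x i8]* %buf to i8*
        ; The 16 bytes at %ptr are dead before this call.
        call void @llvm.lifetime.start(i64 16, i8* %ptr)
        call void @use(i8* %ptr)
        ; The 16 bytes at %ptr are dead again after this call.
        call void @llvm.lifetime.end(i64 16, i8* %ptr)
        ret void
      }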

'``llvm.invariant.start``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The memory object can belong to any address space.

::

      declare {}* @llvm.invariant.start.p0i8(i64 <size>, i8* nocapture <ptr>)

Overview:
"""""""""

The '``llvm.invariant.start``' intrinsic specifies that the contents of
a memory object will not change.

Arguments:
""""""""""

The first argument is a constant integer representing the size of the
object, or -1 if it is variable sized. The second argument is a pointer
to the object.

Semantics:
""""""""""

This intrinsic indicates that until an ``llvm.invariant.end`` that uses
the return value, the referenced memory location is constant and
unchanging.

'``llvm.invariant.end``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The memory object can belong to any address space.

::

      declare void @llvm.invariant.end.p0i8({}* <start>, i64 <size>, i8* nocapture <ptr>)

Overview:
"""""""""

The '``llvm.invariant.end``' intrinsic specifies that the contents of a
memory object are mutable.

Arguments:
""""""""""

The first argument is the matching ``llvm.invariant.start`` intrinsic.
The second argument is a constant integer representing the size of the
object, or -1 if it is variable sized. The third argument is a pointer
to the object.

Semantics:
""""""""""

This intrinsic indicates that the memory is mutable again.
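
The pairing of ``llvm.invariant.start`` and ``llvm.invariant.end`` can be
sketched as follows. The function ``@invariant_example`` and the one-byte
object are hypothetical; the point is only that the token returned by the
start call is passed to the matching end call.

.. code-block:: llvm

      declare {}* @llvm.invariant.start.p0i8(i64, i8* nocapture)
      declare void @llvm.invariant.end.p0i8({}*, i64, i8* nocapture)

      define i8 @invariant_example(i8* %p) {
      entry:
        store i8 42, i8* %p
        %inv = call {}* @llvm.invariant.start.p0i8(i64 1, i8* %p)
        ; Between the two markers the byte at %p is known not to change.
        %a = load i8, i8* %p
        %b = load i8, i8* %p
        call void @llvm.invariant.end.p0i8({}* %inv, i64 1, i8* %p)
        ; The byte at %p is mutable again, so this store is allowed.
        store i8 0, i8* %p
        %sum = add i8 %a, %b
        ret i8 %sum
      }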

'``llvm.invariant.group.barrier``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
This is an overloaded intrinsic. The memory object can belong to any address
space. The returned pointer must belong to the same address space as the
argument.

::

      declare i8* @llvm.invariant.group.barrier.p0i8(i8* <ptr>)

Overview:
"""""""""

The '``llvm.invariant.group.barrier``' intrinsic can be used when an invariant
established by ``invariant.group`` metadata no longer holds, to obtain a new
pointer value that does not carry the invariant information.


Arguments:
""""""""""

The ``llvm.invariant.group.barrier`` intrinsic takes only one argument, which is
the pointer to the memory for which the ``invariant.group`` no longer holds.

Semantics:
""""""""""

Returns another pointer that aliases its argument but which is considered different
for the purposes of ``load``/``store`` ``invariant.group`` metadata.
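
A minimal sketch of the barrier in use is shown below. The function
``@barrier_example`` and the metadata node ``!0`` are hypothetical; the sketch
assumes the object behind ``%p`` is logically replaced between the two groups
of accesses, so the second store and load go through the pointer returned by
the barrier.

.. code-block:: llvm

      declare i8* @llvm.invariant.group.barrier.p0i8(i8*)

      define i8 @barrier_example(i8* %p) {
      entry:
        store i8 1, i8* %p, !invariant.group !0
        %a = load i8, i8* %p, !invariant.group !0
        ; %q aliases %p but is a different pointer for invariant.group purposes.
        %q = call i8* @llvm.invariant.group.barrier.p0i8(i8* %p)
        store i8 2, i8* %q, !invariant.group !0
        %b = load i8, i8* %q, !invariant.group !0
        %sum = add i8 %a, %b
        ret i8 %sum
      }

      !0 = !{!"example.group"}          ; hypothetical group node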

Constrained Floating Point Intrinsics
-------------------------------------

These intrinsics are used to provide special handling of floating point
operations when specific rounding mode or floating point exception behavior is
required. By default, LLVM optimization passes assume that the rounding mode is
round-to-nearest and that floating point exceptions will not be monitored.
Constrained FP intrinsics are used to support non-default rounding modes and
accurately preserve exception behavior without compromising LLVM's ability to
optimize FP code when the default behavior is used.

Each of these intrinsics corresponds to a normal floating point operation. The
first two arguments and the return value are the same as the corresponding FP
operation.

The third argument is a metadata argument specifying the rounding mode to be
assumed. This argument must be one of the following strings:

::

      "round.dynamic"
      "round.tonearest"
      "round.downward"
      "round.upward"
      "round.towardzero"

If this argument is "round.dynamic", optimization passes must assume that the
rounding mode is unknown and may change at runtime. No transformations that
depend on rounding mode may be performed in this case.

The other possible values for the rounding mode argument correspond to the
similarly named IEEE rounding modes. If the argument is any of these values,
optimization passes may perform transformations as long as they are consistent
with the specified rounding mode.

For example, 'x-0'->'x' is not a valid transformation if the rounding mode is
"round.downward" or "round.dynamic" because if the value of 'x' is +0 then
'x-0' should evaluate to '-0' when rounding downward. However, this
transformation is legal for all other rounding modes.
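
To make the 'x-0' case concrete, the sketch below subtracts zero under
"round.downward". The function name ``@sub_zero`` is hypothetical, and the
``.f64`` suffix assumes the usual name mangling for overloaded intrinsics.

.. code-block:: llvm

      declare double @llvm.experimental.constrained.fsub.f64(double, double,
                                                             metadata, metadata)

      define double @sub_zero(double %x) {
      entry:
        ; Must not be folded to %x: if %x is +0, the result is -0 when
        ; rounding downward.
        %r = call double @llvm.experimental.constrained.fsub.f64(
                             double %x, double 0.0,
                             metadata !"round.downward",
                             metadata !"fpexcept.ignore")
        ret double %r
      }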

For values other than "round.dynamic", optimization passes may assume that the
actual runtime rounding mode (as defined in a target-specific manner) matches
the specified rounding mode, but this is not guaranteed. Using a specific
non-dynamic rounding mode which does not match the actual rounding mode at
runtime results in undefined behavior.

The fourth argument to the constrained floating point intrinsics specifies the
required exception behavior. This argument must be one of the following
strings:

::

      "fpexcept.ignore"
      "fpexcept.maytrap"
      "fpexcept.strict"

If this argument is "fpexcept.ignore", optimization passes may assume that the
exception status flags will not be read and that floating point exceptions will
be masked. This allows transformations to be performed that may change the
exception semantics of the original code. For example, FP operations may be
speculatively executed in this case whereas they must not be for either of the
other possible values of this argument.

If the exception behavior argument is "fpexcept.maytrap", optimization passes
must avoid transformations that may raise exceptions that would not have been
raised by the original code (such as speculatively executing FP operations), but
passes are not required to preserve all exceptions that are implied by the
original code. For example, exceptions may be potentially hidden by constant
folding.

If the exception behavior argument is "fpexcept.strict", all transformations must
strictly preserve the floating point exception semantics of the original code.
Any FP exception that would have been raised by the original code must be raised
by the transformed code, and the transformed code must not raise any FP
exceptions that would not have been raised by the original code. This is the
exception behavior argument that will be used if the code being compiled reads
the FP exception status flags, but this mode can also be used with code that
unmasks FP exceptions.

The number and order of floating point exceptions are NOT guaranteed. For
example, a series of FP operations that each may raise exceptions may be
vectorized into a single instruction that raises each unique exception a single
time.
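
The restriction on speculative execution can be sketched with a guarded
division. The function ``@guarded_div`` is hypothetical, and the ``.f64``
suffix assumes the usual name mangling for overloaded intrinsics. With
"fpexcept.strict" (or "fpexcept.maytrap") the call must stay in ``do_div``;
only with "fpexcept.ignore" could it be hoisted into ``entry``.

.. code-block:: llvm

      declare double @llvm.experimental.constrained.fdiv.f64(double, double,
                                                             metadata, metadata)

      define double @guarded_div(double %a, double %b, i1 %c) {
      entry:
        br i1 %c, label %do_div, label %done

      do_div:
        ; May raise divide-by-zero or invalid; executed only when %c is true.
        %q = call double @llvm.experimental.constrained.fdiv.f64(
                             double %a, double %b,
                             metadata !"round.dynamic",
                             metadata !"fpexcept.strict")
        br label %done

      done:
        %r = phi double [ %q, %do_div ], [ 0.0, %entry ]
        ret double %r
      }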


'``llvm.experimental.constrained.fadd``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.fadd(<type> <op1>, <type> <op2>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fadd``' intrinsic returns the sum of its
two operands.


Arguments:
""""""""""

The first two arguments to the '``llvm.experimental.constrained.fadd``'
intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector <t_vector>`
of floating point values. Both arguments must have identical types.

The third and fourth arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

The value produced is the floating point sum of the two value operands and has
the same type as the operands.
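
Since the operands may also be vectors of floating point values, a vector call
might look like the sketch below. The function ``@vector_add`` is hypothetical,
and the ``.v4f32`` suffix assumes the usual name mangling for overloaded
intrinsics.

.. code-block:: llvm

      declare <4 x float> @llvm.experimental.constrained.fadd.v4f32(
                              <4 x float>, <4 x float>, metadata, metadata)

      define <4 x float> @vector_add(<4 x float> %a, <4 x float> %b) {
      entry:
        ; Element-wise sum; the rounding mode is assumed to be round-to-nearest
        ; and exception flags are assumed not to be observed.
        %sum = call <4 x float> @llvm.experimental.constrained.fadd.v4f32(
                                    <4 x float> %a, <4 x float> %b,
                                    metadata !"round.tonearest",
                                    metadata !"fpexcept.ignore")
        ret <4 x float> %sum
      }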


'``llvm.experimental.constrained.fsub``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.fsub(<type> <op1>, <type> <op2>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fsub``' intrinsic returns the difference
of its two operands.


Arguments:
""""""""""

The first two arguments to the '``llvm.experimental.constrained.fsub``'
intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector <t_vector>`
of floating point values. Both arguments must have identical types.

The third and fourth arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

The value produced is the floating point difference of the two value operands
and has the same type as the operands.


'``llvm.experimental.constrained.fmul``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.fmul(<type> <op1>, <type> <op2>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fmul``' intrinsic returns the product of
its two operands.


Arguments:
""""""""""

The first two arguments to the '``llvm.experimental.constrained.fmul``'
intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector <t_vector>`
of floating point values. Both arguments must have identical types.

The third and fourth arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

The value produced is the floating point product of the two value operands and
has the same type as the operands.


'``llvm.experimental.constrained.fdiv``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.fdiv(<type> <op1>, <type> <op2>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fdiv``' intrinsic returns the quotient of
its two operands.


Arguments:
""""""""""

The first two arguments to the '``llvm.experimental.constrained.fdiv``'
intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector <t_vector>`
of floating point values. Both arguments must have identical types.

The third and fourth arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

The value produced is the floating point quotient of the two value operands and
has the same type as the operands.


'``llvm.experimental.constrained.frem``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.frem(<type> <op1>, <type> <op2>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.frem``' intrinsic returns the remainder
from the division of its two operands.


Arguments:
""""""""""

The first two arguments to the '``llvm.experimental.constrained.frem``'
intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector <t_vector>`
of floating point values. Both arguments must have identical types.

The third and fourth arguments specify the rounding mode and exception
behavior as described above. The rounding mode argument has no effect, since
the result of frem is never rounded, but the argument is included for
consistency with the other constrained floating point intrinsics.

Semantics:
""""""""""

The value produced is the floating point remainder from the division of the two
value operands and has the same type as the operands. The remainder has the
same sign as the dividend.

'``llvm.experimental.constrained.fma``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.fma(<type> <op1>, <type> <op2>, <type> <op3>,
                                         metadata <rounding mode>,
                                         metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.fma``' intrinsic returns the result of a
fused-multiply-add operation on its operands.

Arguments:
""""""""""

The first three arguments to the '``llvm.experimental.constrained.fma``'
intrinsic must be :ref:`floating point <t_floating>` or :ref:`vector
<t_vector>` of floating point values. All arguments must have identical types.

The fourth and fifth arguments specify the rounding mode and exception behavior
as described above.

Semantics:
""""""""""

The result produced is the product of the first two operands added to the third
operand computed with infinite precision, and then rounded to the target
precision.
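
A call with three value operands followed by the two metadata operands might
look like the sketch below. The function ``@fused`` is hypothetical, and the
``.f64`` suffix assumes the usual name mangling for overloaded intrinsics.

.. code-block:: llvm

      declare double @llvm.experimental.constrained.fma.f64(
                         double, double, double, metadata, metadata)

      define double @fused(double %a, double %b, double %c) {
      entry:
        ; Computes (%a * %b) + %c with a single rounding step at the end.
        %r = call double @llvm.experimental.constrained.fma.f64(
                             double %a, double %b, double %c,
                             metadata !"round.dynamic",
                             metadata !"fpexcept.strict")
        ret double %r
      }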

Constrained libm-equivalent Intrinsics
--------------------------------------

In addition to the basic floating point operations for which constrained
intrinsics are described above, there are constrained versions of various
operations which provide equivalent behavior to a corresponding libm function.
These intrinsics allow the precise behavior of these operations with respect to
rounding mode and exception behavior to be controlled.

As with the basic constrained floating point intrinsics, the rounding mode
and exception behavior arguments only control the behavior of the optimizer.
They do not change the runtime floating point environment.


'``llvm.experimental.constrained.sqrt``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.sqrt(<type> <op1>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.sqrt``' intrinsic returns the square root
of the specified value, returning the same value as the libm '``sqrt``'
functions would, but without setting ``errno``.

Arguments:
""""""""""

The first argument and the return value are floating point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the nonnegative square root of the specified value.
If the value is less than negative zero, a floating point exception occurs
and the return value is architecture specific.
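
For the unary libm-equivalent intrinsics the two metadata operands directly
follow the single value operand, as in this sketch. The function
``@strict_sqrt`` is hypothetical, and the ``.f64`` suffix assumes the usual
name mangling for overloaded intrinsics.

.. code-block:: llvm

      declare double @llvm.experimental.constrained.sqrt.f64(double,
                                                             metadata, metadata)

      define double @strict_sqrt(double %x) {
      entry:
        ; A negative %x raises a floating point exception; with
        ; "fpexcept.strict" that exception must be preserved.
        %r = call double @llvm.experimental.constrained.sqrt.f64(
                             double %x,
                             metadata !"round.dynamic",
                             metadata !"fpexcept.strict")
        ret double %r
      }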


'``llvm.experimental.constrained.pow``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.pow(<type> <op1>, <type> <op2>,
                                         metadata <rounding mode>,
                                         metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.pow``' intrinsic returns the first operand
raised to the (positive or negative) power specified by the second operand.

Arguments:
""""""""""

The first two arguments and the return value are floating point numbers of the
same type. The second argument specifies the power to which the first argument
should be raised.

The third and fourth arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the first value raised to the power given by the second
value, returning the same values as the libm ``pow`` functions would, and
handles error conditions in the same way.


'``llvm.experimental.constrained.powi``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.powi(<type> <op1>, i32 <op2>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.powi``' intrinsic returns the first operand
raised to the (positive or negative) power specified by the second operand. The
order of evaluation of multiplications is not defined. When a vector of floating
point type is used, the second argument remains a scalar integer value.


Arguments:
""""""""""

The first argument and the return value are floating point numbers of the same
type. The second argument is a 32-bit signed integer specifying the power to
which the first argument should be raised.

The third and fourth arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the first value raised to the power given by the second
value, with an unspecified sequence of rounding operations.
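
The difference in operand types between ``pow`` and ``powi`` can be seen in a
short sketch: the exponent of ``powi`` is an ``i32``, so a negative integer
power is passed directly. The function ``@inverse_square`` is hypothetical,
and the ``.f64`` suffix assumes the usual name mangling for overloaded
intrinsics.

.. code-block:: llvm

      declare double @llvm.experimental.constrained.powi.f64(double, i32,
                                                             metadata, metadata)

      define double @inverse_square(double %x) {
      entry:
        ; Raises %x to the power -2; the order of the underlying
        ; multiplications and roundings is unspecified.
        %r = call double @llvm.experimental.constrained.powi.f64(
                             double %x, i32 -2,
                             metadata !"round.dynamic",
                             metadata !"fpexcept.strict")
        ret double %r
      }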


'``llvm.experimental.constrained.sin``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.sin(<type> <op1>,
                                         metadata <rounding mode>,
                                         metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.sin``' intrinsic returns the sine of the
first operand.

Arguments:
""""""""""

The first argument and the return value are floating point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the sine of the specified operand, returning the
same values as the libm ``sin`` functions would, and handles error
conditions in the same way.


'``llvm.experimental.constrained.cos``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.cos(<type> <op1>,
                                         metadata <rounding mode>,
                                         metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.cos``' intrinsic returns the cosine of the
first operand.

Arguments:
""""""""""

The first argument and the return value are floating point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the cosine of the specified operand, returning the
same values as the libm ``cos`` functions would, and handles error
conditions in the same way.


'``llvm.experimental.constrained.exp``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.exp(<type> <op1>,
                                         metadata <rounding mode>,
                                         metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.exp``' intrinsic computes the base-e
exponential of the specified value.

Arguments:
""""""""""

The first argument and the return value are floating point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``exp`` functions
would, and handles error conditions in the same way.


'``llvm.experimental.constrained.exp2``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.exp2(<type> <op1>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.exp2``' intrinsic computes the base-2
exponential of the specified value.


Arguments:
""""""""""

The first argument and the return value are floating point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``exp2`` functions
would, and handles error conditions in the same way.


'``llvm.experimental.constrained.log``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.log(<type> <op1>,
                                         metadata <rounding mode>,
                                         metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.log``' intrinsic computes the base-e
logarithm of the specified value.

Arguments:
""""""""""

The first argument and the return value are floating point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.


Semantics:
""""""""""

This function returns the same values as the libm ``log`` functions
would, and handles error conditions in the same way.


'``llvm.experimental.constrained.log10``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.log10(<type> <op1>,
                                           metadata <rounding mode>,
                                           metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.log10``' intrinsic computes the base-10
logarithm of the specified value.

Arguments:
""""""""""

The first argument and the return value are floating point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``log10`` functions
would, and handles error conditions in the same way.


'``llvm.experimental.constrained.log2``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.log2(<type> <op1>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.log2``' intrinsic computes the base-2
logarithm of the specified value.

Arguments:
""""""""""

The first argument and the return value are floating point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``log2`` functions
would, and handles error conditions in the same way.


'``llvm.experimental.constrained.rint``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.rint(<type> <op1>,
                                          metadata <rounding mode>,
                                          metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.rint``' intrinsic returns the first
operand rounded to the nearest integer. It may raise an inexact floating point
exception if the operand is not an integer.

Arguments:
""""""""""

The first argument and the return value are floating point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``rint`` functions
would, and handles error conditions in the same way. The rounding mode is
described, not determined, by the rounding mode argument. The actual rounding
mode is determined by the runtime floating point environment. The rounding
mode argument is only intended as information to the compiler.


'``llvm.experimental.constrained.nearbyint``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare <type>
      @llvm.experimental.constrained.nearbyint(<type> <op1>,
                                               metadata <rounding mode>,
                                               metadata <exception behavior>)

Overview:
"""""""""

The '``llvm.experimental.constrained.nearbyint``' intrinsic returns the first
operand rounded to the nearest integer. It will not raise an inexact floating
point exception if the operand is not an integer.


Arguments:
""""""""""

The first argument and the return value are floating point numbers of the same
type.

The second and third arguments specify the rounding mode and exception
behavior as described above.

Semantics:
""""""""""

This function returns the same values as the libm ``nearbyint`` functions
would, and handles error conditions in the same way. The rounding mode is
described, not determined, by the rounding mode argument. The actual rounding
mode is determined by the runtime floating point environment. The rounding
mode argument is only intended as information to the compiler.
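
The contrast between ``rint`` and ``nearbyint`` can be sketched by calling
both on the same operand. The function ``@round_both`` is hypothetical, and
the ``.f64`` suffixes assume the usual name mangling for overloaded
intrinsics. "round.dynamic" is used because, as noted above, the metadata only
describes the rounding mode and the runtime environment determines it.

.. code-block:: llvm

      declare double @llvm.experimental.constrained.rint.f64(double,
                                                             metadata, metadata)
      declare double @llvm.experimental.constrained.nearbyint.f64(double,
                                                             metadata, metadata)

      define double @round_both(double %x) {
      entry:
        ; May raise the inexact exception when %x is not an integer.
        %a = call double @llvm.experimental.constrained.rint.f64(
                             double %x,
                             metadata !"round.dynamic",
                             metadata !"fpexcept.strict")
        ; Does not raise the inexact exception for the same operand.
        %b = call double @llvm.experimental.constrained.nearbyint.f64(
                             double %x,
                             metadata !"round.dynamic",
                             metadata !"fpexcept.strict")
        ; Both calls see the same runtime rounding mode, so %a and %b hold
        ; the same value; only the exception side effects differ.
        %r = call double @llvm.experimental.constrained.fsub.f64(
                             double %a, double %b,
                             metadata !"round.dynamic",
                             metadata !"fpexcept.strict")
        ret double %r
      }

      declare double @llvm.experimental.constrained.fsub.f64(double, double,
                                                             metadata, metadata)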


General Intrinsics
------------------

This class of intrinsics is designed to be generic and has no specific
purpose.

'``llvm.var.annotation``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.var.annotation(i8* <val>, i8* <str>, i8* <str>, i32 <int>)

Overview:
"""""""""

The '``llvm.var.annotation``' intrinsic annotates a local variable with an
arbitrary string.

Arguments:
""""""""""

The first argument is a pointer to a value, the second is a pointer to a
global string, the third is a pointer to a global string which is the
source file name, and the last argument is the line number.

Semantics:
""""""""""

This intrinsic allows annotation of local variables with arbitrary
strings. This can be useful for special purpose optimizations that want
to look for these annotations. These have no other defined use; they are
ignored by code generation and optimization.
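
A call that attaches a string to a stack variable might look like the sketch
below. The globals ``@.note`` and ``@.file``, their contents, the function
``@annotated`` and the line number ``3`` are all hypothetical.

.. code-block:: llvm

      @.note = private unnamed_addr constant [8 x i8] c"my_note\00"
      @.file = private unnamed_addr constant [7 x i8] c"demo.c\00"

      declare void @llvm.var.annotation(i8*, i8*, i8*, i32)

      define void @annotated() {
      entry:
        %x = alloca i32
        %x.i8 = bitcast i32* %x to i8*
        ; Operands: annotated value, annotation string, file name, line number.
        call void @llvm.var.annotation(
            i8* %x.i8,
            i8* getelementptr inbounds ([8 x i8], [8 x i8]* @.note, i32 0, i32 0),
            i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.file, i32 0, i32 0),
            i32 3)
        ret void
      }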

'``llvm.ptr.annotation.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use '``llvm.ptr.annotation``' on a
pointer to an integer of any width. *NOTE*: you must specify an address space
for the pointer. The identifier for the default address space is the integer
'``0``'.

::

      declare i8*   @llvm.ptr.annotation.p<address space>i8(i8* <val>, i8* <str>, i8* <str>, i32  <int>)
      declare i16*  @llvm.ptr.annotation.p<address space>i16(i16* <val>, i8* <str>, i8* <str>, i32  <int>)
      declare i32*  @llvm.ptr.annotation.p<address space>i32(i32* <val>, i8* <str>, i8* <str>, i32  <int>)
      declare i64*  @llvm.ptr.annotation.p<address space>i64(i64* <val>, i8* <str>, i8* <str>, i32  <int>)
      declare i256* @llvm.ptr.annotation.p<address space>i256(i256* <val>, i8* <str>, i8* <str>, i32  <int>)

Overview:
"""""""""

The '``llvm.ptr.annotation``' intrinsic annotates a pointer to an integer with
an arbitrary string.

Arguments:
""""""""""

The first argument is a pointer to an integer value of arbitrary bitwidth
(result of some expression), the second is a pointer to a global string, the
third is a pointer to a global string which is the source file name, and the
last argument is the line number. It returns the value of the first argument.

Semantics:
""""""""""

This intrinsic allows annotation of a pointer to an integer with arbitrary
strings. This can be useful for special purpose optimizations that want to look
for these annotations. These have no other defined use; they are ignored by code
generation and optimization.

'``llvm.annotation.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use '``llvm.annotation``' on
any integer bit width.

::

      declare i8 @llvm.annotation.i8(i8 <val>, i8* <str>, i8* <str>, i32 <int>)
      declare i16 @llvm.annotation.i16(i16 <val>, i8* <str>, i8* <str>, i32 <int>)
      declare i32 @llvm.annotation.i32(i32 <val>, i8* <str>, i8* <str>, i32 <int>)
      declare i64 @llvm.annotation.i64(i64 <val>, i8* <str>, i8* <str>, i32 <int>)
      declare i256 @llvm.annotation.i256(i256 <val>, i8* <str>, i8* <str>, i32 <int>)

Overview:
"""""""""

The '``llvm.annotation``' intrinsic attaches an arbitrary string annotation
to an integer expression and returns the value of that expression.

Arguments:
""""""""""

The first argument is an integer value (result of some expression), the
second is a pointer to a global string, the third is a pointer to a
global string which is the source file name, and the last argument is
the line number. It returns the value of the first argument.

Semantics:
""""""""""

This intrinsic allows annotations to be put on arbitrary expressions
with arbitrary strings. This can be useful for special purpose
optimizations that want to look for these annotations. These have no
other defined use; they are ignored by code generation and optimization.

'``llvm.codeview.annotation``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This annotation emits a label at its program point and an associated
``S_ANNOTATION`` codeview record with some additional string metadata. This is
used to implement MSVC's ``__annotation`` intrinsic. It is marked
``noduplicate``, so calls to this intrinsic prevent inlining and should be
considered expensive.

::

      declare void @llvm.codeview.annotation(metadata)

Arguments:
""""""""""

The argument should be an MDTuple containing any number of MDStrings.

'``llvm.trap``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.trap() noreturn nounwind

Overview:
"""""""""

The '``llvm.trap``' intrinsic requests a trap. It is declared ``noreturn``,
so control does not continue past the call.

Arguments:
""""""""""

None.

Semantics:
""""""""""

This intrinsic is lowered to the target dependent trap instruction. If
the target does not have a trap instruction, this intrinsic will be
lowered to a call of the ``abort()`` function.
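
A minimal sketch of a typical use, assuming a hypothetical function
``@checked_udiv`` that must never divide by zero. Because the intrinsic is
``noreturn``, the call is followed by ``unreachable``:

.. code-block:: llvm

      declare void @llvm.trap() noreturn nounwind

      ; @checked_udiv is a name invented for this example.
      define i32 @checked_udiv(i32 %a, i32 %b) {
      entry:
        %is.zero = icmp eq i32 %b, 0
        br i1 %is.zero, label %fail, label %ok

      fail:
        ; Stop execution instead of performing an undefined division.
        call void @llvm.trap()
        unreachable

      ok:
        %q = udiv i32 %a, %b
        ret i32 %q
      }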

'``llvm.debugtrap``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.debugtrap() nounwind

Overview:
"""""""""

The '``llvm.debugtrap``' intrinsic requests the attention of a debugger.

Arguments:
""""""""""

None.

Semantics:
""""""""""

This intrinsic is lowered to code which is intended to cause an
execution trap with the intention of requesting the attention of a
debugger.

'``llvm.stackprotector``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.stackprotector(i8* <guard>, i8** <slot>)

Overview:
"""""""""

The ``llvm.stackprotector`` intrinsic takes the ``guard`` and stores it
onto the stack at ``slot``. The stack slot is adjusted to ensure that it
is placed on the stack before local variables.

Arguments:
""""""""""

The ``llvm.stackprotector`` intrinsic requires two pointer arguments.
The first argument is the value loaded from the stack guard
``@__stack_chk_guard``. The second argument is an ``alloca`` that has
enough space to hold the value of the guard.

Semantics:
""""""""""

This intrinsic causes the prologue/epilogue inserter to force the position of
the ``AllocaInst`` stack slot to be before local variables on the stack. This is
to ensure that if a local variable on the stack is overwritten, it will destroy
the value of the guard. When the function exits, the guard on the stack is
checked against the original guard by ``llvm.stackprotectorcheck``. If they are
different, then ``llvm.stackprotectorcheck`` causes the program to abort by
calling the ``__stack_chk_fail()`` function.

'``llvm.stackguard``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.stackguard()

Overview:
"""""""""

The ``llvm.stackguard`` intrinsic returns the system stack guard value.

It should not be generated by frontends, since it is only for internal use.
This intrinsic exists because the IR form of the stack protector is still
supported in FastISel.

Arguments:
""""""""""

None.

Semantics:
""""""""""

On some platforms, the value returned by this intrinsic remains unchanged
between loads in the same thread. On other platforms, it returns the same
global variable value, if any, e.g. ``@__stack_chk_guard``.

Currently, some platforms (e.g. X86 Linux) have IR-level customized stack
guard loading that is not handled by ``llvm.stackguard()``, although it should
be in the future.

'``llvm.objectsize``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i32 @llvm.objectsize.i32(i8* <object>, i1 <min>, i1 <nullunknown>)
      declare i64 @llvm.objectsize.i64(i8* <object>, i1 <min>, i1 <nullunknown>)

Overview:
"""""""""

The ``llvm.objectsize`` intrinsic is designed to provide information to
the optimizers to determine at compile time whether a) an operation
(like memcpy) will overflow a buffer that corresponds to an object, or
b) that a runtime check for overflow isn't necessary. An object in this
context means an allocation of a specific class, structure, array, or
other object.

Arguments:
""""""""""

The ``llvm.objectsize`` intrinsic takes three arguments. The first argument is
a pointer to or into the ``object``. The second argument determines whether
``llvm.objectsize`` returns 0 (if true) or -1 (if false) when the object size
is unknown. The third argument controls how ``llvm.objectsize`` acts when
``null`` is used as its pointer argument. If it's true and the pointer is in
address space 0, ``null`` is treated as an opaque value with an unknown number
of bytes. Otherwise, ``llvm.objectsize`` reports 0 bytes available when given
``null``.

The second and third arguments only accept constants.

Semantics:
""""""""""

The ``llvm.objectsize`` intrinsic is lowered to a constant representing
the size of the object concerned. If the size cannot be determined at
compile time, ``llvm.objectsize`` returns a constant of its result type
(``i32`` or ``i64``): 0 if the ``min`` argument is true, and -1 if it is
false.
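
As a sketch, the call below asks how many bytes remain in a 16-byte stack
buffer starting from a pointer four bytes into it. The function name
``@remaining`` and the buffer layout are assumptions for this example. An
optimizer that can see the allocation would be expected to fold the call to
12; if it could not determine the size, the result would be -1 because
``min`` is false:

.. code-block:: llvm

      declare i64 @llvm.objectsize.i64(i8*, i1, i1)

      ; @remaining is a name invented for this example.
      define i64 @remaining() {
      entry:
        %buf = alloca [16 x i8]
        ; Pointer into the object, 4 bytes past its start.
        %p = getelementptr inbounds [16 x i8], [16 x i8]* %buf, i64 0, i64 4
        ; min = false, nullunknown = false; both must be constants.
        %n = call i64 @llvm.objectsize.i64(i8* %p, i1 false, i1 false)
        ret i64 %n
      }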

'``llvm.expect``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.expect`` on any
integer bit width.

::

      declare i1 @llvm.expect.i1(i1 <val>, i1 <expected_val>)
      declare i32 @llvm.expect.i32(i32 <val>, i32 <expected_val>)
      declare i64 @llvm.expect.i64(i64 <val>, i64 <expected_val>)

Overview:
"""""""""

The ``llvm.expect`` intrinsic provides information about the expected (most
probable) value of ``val``, which can be used by optimizers.

Arguments:
""""""""""

The ``llvm.expect`` intrinsic takes two arguments. The first argument is
a value. The second argument is the expected value, which must be a
constant; variables are not allowed.

Semantics:
""""""""""

This intrinsic is lowered to ``val``.
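
For illustration, the sketch below marks a comparison as expected to be
false, so that the ``%rare`` block can be treated as unlikely. The function
and block names are assumptions for this example:

.. code-block:: llvm

      declare i1 @llvm.expect.i1(i1, i1)

      ; @classify and its block names are invented for this example.
      define i32 @classify(i32 %x) {
      entry:
        %is.zero = icmp eq i32 %x, 0
        ; The second argument is a constant: %is.zero is expected to be false.
        %hint = call i1 @llvm.expect.i1(i1 %is.zero, i1 false)
        br i1 %hint, label %rare, label %common

      rare:
        ret i32 -1

      common:
        ret i32 %x
      }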

.. _int_assume:

'``llvm.assume``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.assume(i1 %cond)

Overview:
"""""""""

The ``llvm.assume`` intrinsic allows the optimizer to assume that the provided
condition is true. This information can then be used in simplifying other parts
of the code.

Arguments:
""""""""""

The condition which the optimizer may assume is always true.

Semantics:
""""""""""

The intrinsic allows the optimizer to assume that the provided condition is
always true whenever the control flow reaches the intrinsic call. No code is
generated for this intrinsic, and instructions that contribute only to the
provided condition are not used for code generation. If the condition is
violated during execution, the behavior is undefined.

Note that the optimizer might limit the transformations performed on values
used by the ``llvm.assume`` intrinsic in order to preserve the instructions
only used to form the intrinsic's input argument. This might prove undesirable
if the extra information provided by the ``llvm.assume`` intrinsic does not cause
sufficient overall improvement in code quality. For this reason,
``llvm.assume`` should not be used to document basic mathematical invariants
that the optimizer can otherwise deduce or facts that are of little use to the
optimizer.
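
A minimal sketch, assuming a hypothetical function ``@load_aligned`` whose
caller guarantees a 16-byte aligned pointer. The ``ptrtoint``, ``and`` and
``icmp`` instructions exist only to form the condition:

.. code-block:: llvm

      declare void @llvm.assume(i1)

      ; @load_aligned is a name invented for this example.
      define i32 @load_aligned(i32* %p) {
      entry:
        %addr = ptrtoint i32* %p to i64
        %low = and i64 %addr, 15
        %aligned = icmp eq i64 %low, 0
        ; Behavior is undefined if %p is not 16-byte aligned at this point.
        call void @llvm.assume(i1 %aligned)
        %v = load i32, i32* %p
        ret i32 %v
      }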

.. _int_ssa_copy:

'``llvm.ssa_copy``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare type @llvm.ssa_copy(type %operand) returned(1) readnone

Arguments:
""""""""""

The first argument is an operand which is used as the returned value.

Overview:
"""""""""

The ``llvm.ssa_copy`` intrinsic can be used to attach information to
operations by copying them and giving them new names. For example,
the PredicateInfo utility uses it to build Extended SSA form, and
attach various forms of information to operands that dominate specific
uses. It is not meant for general use, only for building temporary
renaming forms that require value splits at certain points.

.. _type.test:

'``llvm.type.test``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i1 @llvm.type.test(i8* %ptr, metadata %type) nounwind readnone


Arguments:
""""""""""

The first argument is a pointer to be tested. The second argument is a
metadata object representing a :doc:`type identifier <TypeMetadata>`.

Overview:
"""""""""

The ``llvm.type.test`` intrinsic tests whether the given pointer is associated
with the given type identifier.
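
As a sketch of how the result might be consumed, the example below checks a
function pointer before an indirect call and traps on failure. The type
identifier ``!"typeid1"`` and the function name ``@call_checked`` are
assumptions for this example:

.. code-block:: llvm

      declare i1 @llvm.type.test(i8*, metadata) nounwind readnone
      declare void @llvm.trap() noreturn nounwind

      ; @call_checked and "typeid1" are invented for this example.
      define void @call_checked(void ()* %fp) {
      entry:
        %fp.i8 = bitcast void ()* %fp to i8*
        %ok = call i1 @llvm.type.test(i8* %fp.i8, metadata !"typeid1")
        br i1 %ok, label %cont, label %fail

      fail:
        call void @llvm.trap()
        unreachable

      cont:
        call void %fp()
        ret void
      }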

'``llvm.type.checked.load``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare {i8*, i1} @llvm.type.checked.load(i8* %ptr, i32 %offset, metadata %type) argmemonly nounwind readonly


Arguments:
""""""""""

The first argument is a pointer from which to load a function pointer. The
second argument is the byte offset from which to load the function pointer. The
third argument is a metadata object representing a :doc:`type identifier
<TypeMetadata>`.

Overview:
"""""""""

The ``llvm.type.checked.load`` intrinsic safely loads a function pointer from a
virtual table pointer using type metadata. This intrinsic is used to implement
control flow integrity in conjunction with virtual call optimization. The
virtual call optimization pass will optimize away ``llvm.type.checked.load``
intrinsics associated with devirtualized calls, thereby removing the type
check in cases where it is not needed to enforce the control flow integrity
constraint.

If the given pointer is associated with a type metadata identifier, this
function returns true as the second element of its return value. (Note that
the function may also return true if the given pointer is not associated
with a type metadata identifier.) If the function's return value's second
element is true, the following rules apply to the first element:

- If the given pointer is associated with the given type metadata identifier,
  it is the function pointer loaded from the given byte offset from the given
  pointer.

- If the given pointer is not associated with the given type metadata
  identifier, it is one of the following (the choice of which is unspecified):

  1. The function pointer that would have been loaded from an arbitrarily chosen
     (through an unspecified mechanism) pointer associated with the type
     metadata.

  2. If the function has a non-void return type, a pointer to a function that
     returns an unspecified value without causing side effects.

If the function's return value's second element is false, the value of the
first element is undefined.
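
The rules above might be applied as in the following sketch, which loads the
function pointer at byte offset 0 of a virtual table and uses it only when
the second element of the result is true. The identifier ``!"typeid1"``, the
function name ``@vcall`` and the callee signature are assumptions for this
example:

.. code-block:: llvm

      declare {i8*, i1} @llvm.type.checked.load(i8*, i32, metadata)
      declare void @llvm.trap() noreturn nounwind

      ; @vcall and "typeid1" are invented for this example.
      define void @vcall(i8* %vtable) {
      entry:
        %pair = call {i8*, i1} @llvm.type.checked.load(i8* %vtable, i32 0, metadata !"typeid1")
        %ok = extractvalue {i8*, i1} %pair, 1
        br i1 %ok, label %cont, label %fail

      fail:
        ; The first element is undefined here, so it must not be used.
        call void @llvm.trap()
        unreachable

      cont:
        %fptr = extractvalue {i8*, i1} %pair, 0
        %fn = bitcast i8* %fptr to void ()*
        call void %fn()
        ret void
      }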


'``llvm.donothing``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.donothing() nounwind readnone

Overview:
"""""""""

The ``llvm.donothing`` intrinsic doesn't perform any operation. It's one of only
three intrinsics (besides ``llvm.experimental.patchpoint`` and
``llvm.experimental.gc.statepoint``) that can be called with an invoke
instruction.

Arguments:
""""""""""

None.

Semantics:
""""""""""

This intrinsic does nothing, and it's removed by optimizers and ignored
by codegen.
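
A sketch of the ``invoke`` form mentioned above. The personality function
``@__gxx_personality_v0`` and the function name ``@f`` are assumptions chosen
for this example; any valid personality would do:

.. code-block:: llvm

      declare void @llvm.donothing() nounwind readnone
      declare i32 @__gxx_personality_v0(...)

      ; @f is a name invented for this example.
      define void @f() personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
      entry:
        invoke void @llvm.donothing()
                to label %cont unwind label %lpad

      cont:
        ret void

      lpad:
        %lp = landingpad { i8*, i32 }
                cleanup
        resume { i8*, i32 } %lp
      }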

'``llvm.experimental.deoptimize``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare type @llvm.experimental.deoptimize(...) [ "deopt"(...) ]

Overview:
"""""""""

This intrinsic, together with :ref:`deoptimization operand bundles
<deopt_opbundles>`, allows frontends to express transfer of control and
frame-local state from the currently executing (typically more specialized,
hence faster) version of a function into another (typically more generic, hence
slower) version.

In languages with a fully integrated managed runtime like Java and JavaScript
this intrinsic can be used to implement "uncommon trap" or "side exit" like
functionality. In unmanaged languages like C and C++, this intrinsic can be
used to represent the slow paths of specialized functions.


Arguments:
""""""""""

The intrinsic takes an arbitrary number of arguments, whose meaning is
decided by the :ref:`lowering strategy<deoptimize_lowering>`.

Semantics:
""""""""""

The ``@llvm.experimental.deoptimize`` intrinsic executes an attached
deoptimization continuation (denoted using a :ref:`deoptimization
operand bundle <deopt_opbundles>`) and returns the value returned by
the deoptimization continuation. Defining the semantic properties of
the continuation itself is out of scope of the language reference --
as far as LLVM is concerned, the deoptimization continuation can
invoke arbitrary side effects, including reading from and writing to
the entire heap.

Deoptimization continuations expressed using ``"deopt"`` operand bundles always
continue execution to the end of the physical frame containing them, so all
calls to ``@llvm.experimental.deoptimize`` must be in "tail position":

   - ``@llvm.experimental.deoptimize`` cannot be invoked.
   - The call must immediately precede a :ref:`ret <i_ret>` instruction.
   - The ``ret`` instruction must return the value produced by the
     ``@llvm.experimental.deoptimize`` call if there is one, or void.

Note that the above restrictions imply that the return type for a call to
``@llvm.experimental.deoptimize`` will match the return type of its immediate
caller.

The inliner composes the ``"deopt"`` continuations of the caller into the
``"deopt"`` continuations present in the inlinee, and also updates calls to this
intrinsic to return directly from the frame of the function it inlined into.

All declarations of ``@llvm.experimental.deoptimize`` must share the
same calling convention.

.. _deoptimize_lowering:

Lowering:
"""""""""

Calls to ``@llvm.experimental.deoptimize`` are lowered to calls to the
symbol ``__llvm_deoptimize`` (it is the frontend's responsibility to
ensure that this symbol is defined). The call arguments to
``@llvm.experimental.deoptimize`` are lowered as if they were formal
arguments of the specified types, and not as varargs.
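
To illustrate the "tail position" rules, the sketch below deoptimizes on an
uncommon path. The function name ``@fast_path``, the ``.i32`` suffix on the
declaration, and the contents of the ``"deopt"`` bundle are assumptions for
this example; what the arguments and bundle operands mean is up to the
frontend and its runtime:

.. code-block:: llvm

      declare i32 @llvm.experimental.deoptimize.i32(...)

      ; @fast_path is a name invented for this example.
      define i32 @fast_path(i32 %x) {
      entry:
        %is.common = icmp sgt i32 %x, 0
        br i1 %is.common, label %fast, label %slow

      fast:
        %r = add i32 %x, 1
        ret i32 %r

      slow:
        ; The call is not invoked, immediately precedes the ret, and its
        ; result is the value returned.
        %v = call i32 (...) @llvm.experimental.deoptimize.i32(i32 %x) [ "deopt"(i32 %x) ]
        ret i32 %v
      }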


'``llvm.experimental.guard``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.experimental.guard(i1, ...) [ "deopt"(...) ]

Overview:
"""""""""

This intrinsic, together with :ref:`deoptimization operand bundles
<deopt_opbundles>`, allows frontends to express guards or checks on
optimistic assumptions made during compilation. The semantics of
``@llvm.experimental.guard`` are defined in terms of
``@llvm.experimental.deoptimize`` -- its body is defined to be
equivalent to:

.. code-block:: text

  define void @llvm.experimental.guard(i1 %pred, <args...>) {
    %realPred = and i1 %pred, undef
    br i1 %realPred, label %continue, label %leave [, !make.implicit !{}]

  leave:
    call void @llvm.experimental.deoptimize(<args...>) [ "deopt"() ]
    ret void

  continue:
    ret void
  }


with the optional ``[, !make.implicit !{}]`` present if and only if it
is present on the call site. For more details on ``!make.implicit``,
see :doc:`FaultMaps`.

In words, ``@llvm.experimental.guard`` executes the attached
``"deopt"`` continuation if (but **not** only if) its first argument
is ``false``. Since the optimizer is allowed to replace the ``undef``
with an arbitrary value, it can optimize the guard to fail "spuriously",
i.e. without the original condition being false (hence the "not only
if"); and this allows for "check widening" type optimizations.

``@llvm.experimental.guard`` cannot be invoked.
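
A minimal sketch of a guarded array access, assuming a hypothetical function
``@load_elt`` and a ``"deopt"`` bundle that carries only the index. What the
bundle operands mean is up to the frontend and its runtime:

.. code-block:: llvm

      declare void @llvm.experimental.guard(i1, ...)

      ; @load_elt is a name invented for this example.
      define i32 @load_elt(i32* %base, i32 %idx, i32 %len) {
      entry:
        %in.bounds = icmp ult i32 %idx, %len
        ; Deoptimizes if %in.bounds is false (and possibly spuriously).
        call void (i1, ...) @llvm.experimental.guard(i1 %in.bounds) [ "deopt"(i32 %idx) ]
        %idx.ext = zext i32 %idx to i64
        %addr = getelementptr inbounds i32, i32* %base, i64 %idx.ext
        %v = load i32, i32* %addr
        ret i32 %v
      }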


'``llvm.load.relative``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare i8* @llvm.load.relative.iN(i8* %ptr, iN %offset) argmemonly nounwind readonly

Overview:
"""""""""

This intrinsic loads a 32-bit value from the address ``%ptr + %offset``,
adds ``%ptr`` to that value and returns it. The constant folder specifically
recognizes the form of this intrinsic and the constant initializers it may
load from; if a loaded constant initializer is known to have the form
``i32 trunc(x - %ptr)``, the intrinsic call is folded to ``x``.

LLVM provides that the calculation of such a constant initializer will
not overflow at link time under the medium code model if ``x`` is an
``unnamed_addr`` function. However, it does not provide this guarantee for
a constant initializer folded into a function body. This intrinsic can be
used to avoid the possibility of overflows when loading from such a constant.
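
As a sketch, the one-entry table below stores the 32-bit difference between
a function and the table itself, which has the ``i32 trunc(x - %ptr)`` form
described above. The names ``@f``, ``@table`` and ``@first_entry`` are
assumptions for this example:

.. code-block:: llvm

      declare i8* @llvm.load.relative.i32(i8*, i32)

      ; @f, @table and @first_entry are names invented for this example.
      define void @f() unnamed_addr {
        ret void
      }

      ; Each entry is the target address minus the address of the table.
      @table = constant [1 x i32] [
        i32 trunc (i64 sub (i64 ptrtoint (void ()* @f to i64),
                            i64 ptrtoint ([1 x i32]* @table to i64)) to i32)
      ]

      define i8* @first_entry() {
      entry:
        ; Loads the i32 at @table + 0 and adds the address of @table to it.
        %target = call i8* @llvm.load.relative.i32(
            i8* bitcast ([1 x i32]* @table to i8*), i32 0)
        ret i8* %target
      }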

'``llvm.sideeffect``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare void @llvm.sideeffect() inaccessiblememonly nounwind

Overview:
"""""""""

The ``llvm.sideeffect`` intrinsic doesn't perform any operation. Optimizers
treat it as having side effects, so it can be inserted into a loop to
indicate that the loop shouldn't be assumed to terminate (which could
potentially lead to the loop being optimized away entirely), even if it's
an infinite loop with no other side effects.

Arguments:
""""""""""

None.

Semantics:
""""""""""

This intrinsic actually does nothing, but optimizers must assume that it
has externally observable side effects.
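
For illustration, a frontend for a language in which infinite loops are well
defined might emit something like the following sketch. The function name
``@spin`` is an assumption for this example:

.. code-block:: llvm

      declare void @llvm.sideeffect() inaccessiblememonly nounwind

      ; @spin is a name invented for this example.
      define void @spin() {
      entry:
        br label %loop

      loop:
        ; Keeps the otherwise empty infinite loop from being assumed to
        ; terminate.
        call void @llvm.sideeffect()
        br label %loop
      }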

Stack Map Intrinsics
--------------------

LLVM provides experimental intrinsics to support runtime patching
mechanisms commonly desired in dynamic language JITs. These intrinsics
are described in :doc:`StackMaps`.

Element Wise Atomic Memory Intrinsics
-------------------------------------

These intrinsics are similar to the standard library memory intrinsics except
that they perform memory transfer as a sequence of atomic memory accesses.

.. _int_memcpy_element_unordered_atomic:

'``llvm.memcpy.element.unordered.atomic``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use ``llvm.memcpy.element.unordered.atomic`` on
any integer bit width and for different address spaces. Not all targets
support all bit widths, however.

::

      declare void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i32(i8* <dest>,
                                                                       i8* <src>,
                                                                       i32 <len>,
                                                                       i32 <element_size>)
      declare void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i64(i8* <dest>,
                                                                       i8* <src>,
                                                                       i64 <len>,
                                                                       i32 <element_size>)

Overview:
"""""""""

The '``llvm.memcpy.element.unordered.atomic.*``' intrinsic is a specialization of the
'``llvm.memcpy.*``' intrinsic. It differs in that the ``dest`` and ``src`` are treated
as arrays with elements that are exactly ``element_size`` bytes, and the copy between
buffers uses a sequence of :ref:`unordered atomic <ordering>` load/store operations
that are a positive integer multiple of the ``element_size`` in size.

Arguments:
""""""""""

The first three arguments are the same as they are in the :ref:`@llvm.memcpy <int_memcpy>`
intrinsic, with the added constraint that ``len`` is required to be a positive integer
multiple of the ``element_size``. If ``len`` is not a positive integer multiple of
``element_size``, then the behaviour of the intrinsic is undefined.

``element_size`` must be a compile-time constant positive power of two no greater than
a target-specific atomic access size limit.

For each of the input pointers, the ``align`` parameter attribute must be specified. It
must be a power of two no less than the ``element_size``. The caller guarantees that
both the source and destination pointers are aligned to that boundary.

Semantics:
""""""""""

The '``llvm.memcpy.element.unordered.atomic.*``' intrinsic copies ``len`` bytes of
memory from the source location to the destination location. These locations are not
allowed to overlap. The memory copy is performed as a sequence of load/store operations
where each access is guaranteed to be a multiple of ``element_size`` bytes wide and
aligned at an ``element_size`` boundary.

The order of the copy is unspecified. The same value may be read from the source
buffer many times, but only one write is issued to the destination buffer per
element. It is well defined to have concurrent reads and writes to both source and
destination provided those reads and writes are unordered atomic when specified.

This intrinsic does not provide any additional ordering guarantees over those
provided by a set of unordered loads from the source location and stores to the
destination.

Lowering:
"""""""""

In the most general case, a call to the '``llvm.memcpy.element.unordered.atomic.*``'
intrinsic is lowered to a call to the symbol
``__llvm_memcpy_element_unordered_atomic_*``, where '*' is replaced with the actual
element size.

The optimizer is allowed to inline the memory copy when it's profitable to do so.
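
A minimal sketch of a call that satisfies the constraints above: it copies
16 bytes as 4-byte elements, with both pointers carrying an ``align``
attribute no smaller than the element size. The function name ``@copy16`` is
an assumption for this example:

.. code-block:: llvm

      declare void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i32(i8*, i8*, i32, i32)

      ; @copy16 is a name invented for this example.
      define void @copy16(i8* %dst, i8* %src) {
      entry:
        ; len = 16 is a positive multiple of element_size = 4, and
        ; element_size is a constant power of two.
        call void @llvm.memcpy.element.unordered.atomic.p0i8.p0i8.i32(
            i8* align 4 %dst, i8* align 4 %src, i32 16, i32 4)
        ret void
      }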
14421
14422'``llvm.memmove.element.unordered.atomic``' Intrinsic
14423^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
14424
14425Syntax:
14426"""""""
14427
14428This is an overloaded intrinsic. You can use
14429``llvm.memmove.element.unordered.atomic`` on any integer bit width and for
14430different address spaces. Not all targets support all bit widths however.
14431
14432::
14433
14434 declare void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i32(i8* <dest>,
14435 i8* <src>,
14436 i32 <len>,
14437 i32 <element_size>)
14438 declare void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i64(i8* <dest>,
14439 i8* <src>,
14440 i64 <len>,
14441 i32 <element_size>)
14442
14443Overview:
14444"""""""""
14445
14446The '``llvm.memmove.element.unordered.atomic.*``' intrinsic is a specialization
14447of the '``llvm.memmove.*``' intrinsic. It differs in that the ``dest`` and
14448``src`` are treated as arrays with elements that are exactly ``element_size``
14449bytes, and the copy between buffers uses a sequence of
14450:ref:`unordered atomic <ordering>` load/store operations that are a positive
14451integer multiple of the ``element_size`` in size.
14452
Arguments:
""""""""""

The first three arguments are the same as they are in the
:ref:`@llvm.memmove <int_memmove>` intrinsic, with the added constraint that
``len`` is required to be a positive integer multiple of the ``element_size``.
If ``len`` is not a positive integer multiple of ``element_size``, then the
behaviour of the intrinsic is undefined.

``element_size`` must be a compile-time constant positive power of two no
greater than a target-specific atomic access size limit.

For each of the input pointers, the ``align`` parameter attribute must be
specified. It must be a power of two no less than the ``element_size``. The
caller guarantees that both the source and destination pointers are aligned to
that boundary.

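As a minimal sketch of these constraints, the following call moves 16 bytes as
four 4-byte elements. The function name ``@move_four_elements`` and the value
names are hypothetical, chosen only for illustration; the intrinsic is the
``i32`` variant from the Syntax section, written in the same typed-pointer
notation.

.. code-block:: llvm

      declare void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i32(i8*, i8*, i32, i32)

      ; Hypothetical caller, for illustration only.
      define void @move_four_elements(i8* %dest, i8* %src) {
        ; len (16) is a positive multiple of element_size (4), element_size is
        ; a constant power of two, and both pointers carry an 'align'
        ; attribute that is no smaller than element_size.
        call void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i32(
            i8* align 4 %dest, i8* align 4 %src, i32 16, i32 4)
        ret void
      }
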
Semantics:
""""""""""

The '``llvm.memmove.element.unordered.atomic.*``' intrinsic copies ``len`` bytes
of memory from the source location to the destination location. These locations
are allowed to overlap. The memory copy is performed as a sequence of load/store
operations where each access is guaranteed to be a multiple of ``element_size``
bytes wide and aligned at an ``element_size`` boundary.

The order of the copy is unspecified. The same value may be read from the source
buffer many times, but only one write is issued to the destination buffer per
element. It is well defined to have concurrent reads and writes to both source
and destination provided those reads and writes are unordered atomic when
specified.

This intrinsic does not provide any additional ordering guarantees over those
provided by a set of unordered loads from the source location and stores to the
destination.

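One way to picture these semantics is the hand-written expansion below, which
assumes a fixed ``len`` of 8 bytes and an ``element_size`` of 4. It is only an
illustrative sketch (the function name is hypothetical) and is not necessarily
what the optimizer or a runtime library produces. It loads both elements before
storing either one, so the result stays correct when the buffers overlap, and
it issues exactly one store per destination element.

.. code-block:: llvm

      ; Hypothetical expansion of an 8-byte move with element_size 4.
      define void @move_two_elements(i32* %dest, i32* %src) {
        %src1 = getelementptr inbounds i32, i32* %src, i64 1
        %dest1 = getelementptr inbounds i32, i32* %dest, i64 1
        ; Unordered atomic loads, each element_size wide and aligned.
        %v0 = load atomic i32, i32* %src unordered, align 4
        %v1 = load atomic i32, i32* %src1 unordered, align 4
        ; One unordered atomic store per destination element.
        store atomic i32 %v0, i32* %dest unordered, align 4
        store atomic i32 %v1, i32* %dest1 unordered, align 4
        ret void
      }
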
Lowering:
"""""""""

In the most general case, a call to the
'``llvm.memmove.element.unordered.atomic.*``' intrinsic is lowered to a call to
the symbol ``__llvm_memmove_element_unordered_atomic_*``, where '*' is replaced
with the actual element size.

The optimizer is allowed to inline the memory copy when it's profitable to do so.

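To illustrate the naming scheme, the sketch below uses an ``element_size`` of 8
with a length that is not known at compile time. If the call is not inlined,
the symbol it is lowered to would carry the ``_8`` suffix. The function and
value names are hypothetical.

.. code-block:: llvm

      declare void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i64(i8*, i8*, i64, i32)

      ; Hypothetical caller, for illustration only.
      define void @move_elements(i8* %dest, i8* %src, i64 %len) {
        ; element_size is 8, so '*' in the library symbol is replaced with 8:
        ;   __llvm_memmove_element_unordered_atomic_8
        ; The caller must ensure that %len is a positive multiple of 8.
        call void @llvm.memmove.element.unordered.atomic.p0i8.p0i8.i64(
            i8* align 8 %dest, i8* align 8 %src, i64 %len, i32 8)
        ret void
      }
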
.. _int_memset_element_unordered_atomic:

'``llvm.memset.element.unordered.atomic``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

This is an overloaded intrinsic. You can use
``llvm.memset.element.unordered.atomic`` on any integer bit width and for
different address spaces. Not all targets support all bit widths, however.

::

      declare void @llvm.memset.element.unordered.atomic.p0i8.i32(i8* <dest>,
                                                                  i8 <value>,
                                                                  i32 <len>,
                                                                  i32 <element_size>)
      declare void @llvm.memset.element.unordered.atomic.p0i8.i64(i8* <dest>,
                                                                  i8 <value>,
                                                                  i64 <len>,
                                                                  i32 <element_size>)

Overview:
"""""""""

The '``llvm.memset.element.unordered.atomic.*``' intrinsic is a specialization
of the '``llvm.memset.*``' intrinsic. It differs in that the ``dest`` is treated
as an array with elements that are exactly ``element_size`` bytes, and the
assignment to that array uses a sequence of :ref:`unordered atomic <ordering>`
store operations that are a positive integer multiple of the ``element_size``
in size.

Arguments:
""""""""""

The first three arguments are the same as they are in the
:ref:`@llvm.memset <int_memset>` intrinsic, with the added constraint that
``len`` is required to be a positive integer multiple of the ``element_size``.
If ``len`` is not a positive integer multiple of ``element_size``, then the
behaviour of the intrinsic is undefined.

``element_size`` must be a compile-time constant positive power of two no
greater than a target-specific atomic access size limit.

The ``dest`` input pointer must have the ``align`` parameter attribute
specified. It must be a power of two no less than the ``element_size``. The
caller guarantees that the destination pointer is aligned to that boundary.

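As a minimal sketch of these constraints, the following call zeroes 32 bytes as
four 8-byte elements. The function name ``@zero_four_elements`` and the value
names are hypothetical; the intrinsic is the ``i32`` variant from the Syntax
section, written in the same typed-pointer notation.

.. code-block:: llvm

      declare void @llvm.memset.element.unordered.atomic.p0i8.i32(i8*, i8, i32, i32)

      ; Hypothetical caller, for illustration only.
      define void @zero_four_elements(i8* %dest) {
        ; len (32) is a positive multiple of element_size (8), and the
        ; destination carries an 'align' attribute no smaller than
        ; element_size.
        call void @llvm.memset.element.unordered.atomic.p0i8.i32(
            i8* align 8 %dest, i8 0, i32 32, i32 8)
        ret void
      }
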
Semantics:
""""""""""

The '``llvm.memset.element.unordered.atomic.*``' intrinsic sets the ``len``
bytes of memory starting at the destination location to the given ``value``.
The memory is set with a sequence of store operations where each access is
guaranteed to be a multiple of ``element_size`` bytes wide and aligned at an
``element_size`` boundary.

The order of the assignment is unspecified. Only one write is issued to the
destination buffer per element. It is well defined to have concurrent reads and
writes to the destination provided those reads and writes are unordered atomic
when specified.

This intrinsic does not provide any additional ordering guarantees over those
provided by a set of unordered stores to the destination.

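These semantics can be pictured with the hand-written expansion below, which
assumes a ``value`` of 42, a fixed ``len`` of 8 bytes, and an ``element_size``
of 4. It is only an illustrative sketch (the function name is hypothetical) and
is not necessarily what the optimizer or a runtime library produces. Because
every byte is set to ``value``, each 4-byte element stores that byte repeated
four times.

.. code-block:: llvm

      ; Hypothetical expansion of an 8-byte set with element_size 4.
      define void @set_two_elements(i32* %dest) {
        %dest1 = getelementptr inbounds i32, i32* %dest, i64 1
        ; 707406378 is 0x2A2A2A2A, the byte 42 (0x2A) repeated four times.
        ; One unordered atomic store is issued per destination element.
        store atomic i32 707406378, i32* %dest unordered, align 4
        store atomic i32 707406378, i32* %dest1 unordered, align 4
        ret void
      }
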
Lowering:
"""""""""

In the most general case, a call to the
'``llvm.memset.element.unordered.atomic.*``' intrinsic is lowered to a call to
the symbol ``__llvm_memset_element_unordered_atomic_*``, where '*' is replaced
with the actual element size.

The optimizer is allowed to inline the memory assignment when it's profitable to do so.