==============================
User Guide for AMDGPU Back-end
==============================

Introduction
============

The AMDGPU back-end provides ISA code generation for AMD GPUs, starting with
the R600 family up until the current Volcanic Islands (GCN Gen 3).


Assembler
=========

The assembler is currently considered experimental.

For syntax examples, look in test/MC/AMDGPU.

Below are some of the currently supported features (modulo bugs). These
all apply to the Southern Islands ISA; Sea Islands and Volcanic Islands
are also supported but may be missing some instructions and have more bugs.

DS Instructions
---------------
All DS instructions are supported.

FLAT Instructions
-----------------
These instructions are only present in the Sea Islands and Volcanic Islands
instruction sets. All FLAT instructions are supported for these architectures.

MUBUF Instructions
------------------
All non-atomic MUBUF instructions are supported.

SMRD Instructions
-----------------
Only the s_load_dword* SMRD instructions are supported.
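
As a rough illustration of the supported family, the sketch below shows a few
s_load_dword* forms. The register numbers and offsets are arbitrary choices
made for this example; test/MC/AMDGPU remains the reference for exact syntax.

.. code-block:: nasm

  // Load one dword into s1 from the address in s[2:3], immediate offset 1
  s_load_dword s1, s[2:3], 1

  // Same load, but with the offset taken from register s4
  s_load_dword s1, s[2:3], s4

  // Wider variants load into a matching register range
  s_load_dwordx2 s[2:3], s[2:3], 1
  s_load_dwordx4 s[4:7], s[2:3], 1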

SOP1 Instructions
-----------------
All SOP1 instructions are supported.

SOP2 Instructions
-----------------
All SOP2 instructions are supported.

SOPC Instructions
-----------------
All SOPC instructions are supported.
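
For orientation, here is one instruction from each of the SOP1, SOP2, and SOPC
encodings. This is only a sketch: the instructions and registers were picked
for illustration and are not a statement about which forms are best tested.

.. code-block:: nasm

  // SOP1: one source operand
  s_mov_b32 s1, s2

  // SOP2: two source operands
  s_add_u32 s1, s2, s3

  // SOPC: compares two source operands, result goes to SCC
  s_cmp_eq_i32 s1, s2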

SOPP Instructions
-----------------

Unless otherwise mentioned, all SOPP instructions that have one or more
operands accept integer operands only. No verification is performed
on the operands, so it is up to the programmer to be familiar with the
range of acceptable values.

s_waitcnt
^^^^^^^^^

s_waitcnt accepts named arguments to specify which memory counter(s) to
wait for.

.. code-block:: nasm

  // Wait for all counters to be 0
  s_waitcnt 0

  // Equivalent to s_waitcnt 0. Counter names can also be delimited by
  // '&' or ','.
  s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)

  // Wait for vmcnt counter to be 1.
  s_waitcnt vmcnt(1)
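
The alternative delimiters mentioned in the comment above can be sketched as
follows, assuming they are interchangeable with whitespace separation:

.. code-block:: nasm

  // All three counters at 0, with '&' between the counter names
  s_waitcnt vmcnt(0) & expcnt(0) & lgkmcnt(0)

  // The same wait, with ',' between the counter names
  s_waitcnt vmcnt(0), expcnt(0), lgkmcnt(0)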

VOP1, VOP2, VOP3, VOPC Instructions
-----------------------------------

All 32-bit and 64-bit encodings should work.

The assembler will automatically detect which encoding size to use for
VOP1, VOP2, and VOPC instructions based on the operands. If you want to force
a specific encoding size, you can add an _e32 (for 32-bit encoding) or
_e64 (for 64-bit encoding) suffix to the instruction. Most, but not all,
instructions support an explicit suffix. These are all valid assembly
strings:

.. code-block:: nasm

  v_mul_i32_i24 v1, v2, v3
  v_mul_i32_i24_e32 v1, v2, v3
  v_mul_i32_i24_e64 v1, v2, v3

Assembler Directives
--------------------

.hsa_code_object_version major, minor
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

*major* and *minor* are integers that specify the version of the HSA code
object that will be generated by the assembler. This value will be stored
in an entry of the .note section.

.hsa_code_object_isa [major, minor, stepping, vendor, arch]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

*major*, *minor*, and *stepping* are all integers that describe the instruction
set architecture (ISA) version of the assembly program.

*vendor* and *arch* are quoted strings. *vendor* should always be equal to
"AMD" and *arch* should always be equal to "AMDGPU".

If no arguments are specified, then the assembler will derive the ISA version,
*vendor*, and *arch* from the value of the -mcpu option that is passed to the
assembler.

ISA version, *vendor*, and *arch* will all be stored in a single entry of the
.note section.
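
A minimal sketch of how the two directives might appear at the top of an
assembly file. The version numbers (1,0 and 7,0,0) are assumed values chosen
only for illustration; substitute the ones that match your target.

.. code-block:: nasm

  // Code object version: major 1, minor 0 (illustrative values)
  .hsa_code_object_version 1,0

  // Explicit ISA description: major, minor, stepping, vendor, arch
  .hsa_code_object_isa 7,0,0,"AMD","AMDGPU"

In place of the explicit form, the second directive can be written with no
arguments, in which case the values are derived from the -mcpu option:

.. code-block:: nasm

  .hsa_code_object_isa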