==============================
User Guide for AMDGPU Back-end
==============================

Introduction
============

The AMDGPU back-end provides ISA code generation for AMD GPUs, from the R600
family up to the current Volcanic Islands (GCN Gen 3) family.


Assembler
=========

The assembler is currently considered experimental.

For syntax examples, look in test/MC/AMDGPU.

Below are some of the currently supported features (modulo bugs). These
all apply to the Southern Islands ISA. Sea Islands and Volcanic Islands
are also supported, but may be missing some instructions and have more bugs.

DS Instructions
---------------
All DS instructions are supported.

FLAT Instructions
-----------------
These instructions are only present in the Sea Islands and Volcanic Islands
instruction sets. All FLAT instructions are supported for these architectures.

MUBUF Instructions
------------------
All non-atomic MUBUF instructions are supported.

SMRD Instructions
-----------------
Only the s_load_dword* SMRD instructions are supported.
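
For illustration, the lines below use a few of the supported s_load_dword*
forms. They follow the style of the tests in test/MC/AMDGPU; the register
numbers and offsets are arbitrary example values.

.. code-block:: nasm

  // Load one dword from the address in s[2:3] plus an immediate offset.
  s_load_dword s1, s[2:3], 1

  // Same load, taking the offset from an SGPR instead.
  s_load_dword s1, s[2:3], s4

  // Wider variants write a correspondingly wider SGPR range.
  s_load_dwordx2 s[2:3], s[2:3], 1
  s_load_dwordx4 s[4:7], s[2:3], 1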

SOP1 Instructions
-----------------
All SOP1 instructions are supported.

SOP2 Instructions
-----------------
All SOP2 instructions are supported.

SOPC Instructions
-----------------
All SOPC instructions are supported.

SOPP Instructions
-----------------

Unless otherwise mentioned, all SOPP instructions that have one or more
operands accept integer operands only. No verification is performed
on the operands, so it is up to the programmer to be familiar with the
range of acceptable values.
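
As a minimal sketch of the integer-operand rule, the lines below pass plain
integers to a few SOPP instructions. The operand values are arbitrary and only
illustrate the syntax. Since the assembler does not check them, they are
assumed here to be in range for the target.

.. code-block:: nasm

  // One integer operand each. The assembler does not range-check these.
  s_nop 1
  s_sleep 10
  s_branch 2

  // SOPP instructions without operands are written bare.
  s_endpgm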

s_waitcnt
^^^^^^^^^

s_waitcnt accepts named arguments to specify which memory counter(s) to
wait for.

.. code-block:: nasm

  // Wait for all counters to be 0
  s_waitcnt 0

  // Equivalent to s_waitcnt 0. Counter names can also be delimited by
  // '&' or ','.
  s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)

  // Wait for vmcnt counter to be 1.
  s_waitcnt vmcnt(1)
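
The alternative delimiters can be written out as follows. This is a sketch
based only on the description in this section, with the counter names spelled
vmcnt, expcnt, and lgkmcnt. The counter values are arbitrary.

.. code-block:: nasm

  // Counters separated by '&'.
  s_waitcnt vmcnt(0) & expcnt(0) & lgkmcnt(0)

  // Counters separated by ','.
  s_waitcnt vmcnt(0), expcnt(0), lgkmcnt(0)

  // A single named counter.
  s_waitcnt expcnt(2)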

VOP1, VOP2, VOP3, VOPC Instructions
-----------------------------------

All 32-bit and 64-bit encodings should work.

The assembler will automatically detect which encoding size to use for
VOP1, VOP2, and VOPC instructions based on the operands. If you want to force
a specific encoding size, you can add an _e32 (for 32-bit encoding) or
_e64 (for 64-bit encoding) suffix to the instruction. Most, but not all,
instructions support an explicit suffix. These are all valid assembly
strings:

.. code-block:: nasm

  v_mul_i32_i24 v1, v2, v3
  v_mul_i32_i24_e32 v1, v2, v3
  v_mul_i32_i24_e64 v1, v2, v3
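
To illustrate operand-based selection, here is a hedged sketch using a VOPC
compare. It assumes that the 32-bit VOPC encoding always writes its result to
vcc, so a compare that names an SGPR pair as its destination can only be
represented in the 64-bit encoding. The register choices are arbitrary.

.. code-block:: nasm

  // Destination is vcc, so the 32-bit encoding is sufficient.
  v_cmp_lt_f32 vcc, v4, v6

  // Assumption: an explicit SGPR-pair destination does not fit the 32-bit
  // encoding, so the assembler is expected to select the 64-bit one.
  v_cmp_lt_f32 s[2:3], v4, v6

  // The suffix forces the 64-bit encoding even with a vcc destination.
  v_cmp_lt_f32_e64 vcc, v4, v6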

Assembler Directives
--------------------

.hsa_code_object_version major, minor
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

*major* and *minor* are integers that specify the version of the HSA code
object that will be generated by the assembler. This value will be stored
in an entry of the .note section.
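
A minimal sketch of the directive follows. The version numbers 1 and 0 are
example values only, not a recommendation for any particular target.

.. code-block:: nasm

  // Request HSA code object version 1.0 (major = 1, minor = 0).
  .hsa_code_object_version 1,0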

.hsa_code_object_isa [major, minor, stepping, vendor, arch]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

*major*, *minor*, and *stepping* are all integers that describe the instruction
set architecture (ISA) version of the assembly program.

*vendor* and *arch* are quoted strings. *vendor* should always be equal to
"AMD" and *arch* should always be equal to "AMDGPU".

If no arguments are specified, then the assembler will derive the ISA version,
*vendor*, and *arch* from the value of the -mcpu option that is passed to the
assembler.

ISA version, *vendor*, and *arch* will all be stored in a single entry of the
.note section.
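
The two forms of the directive might look like the sketch below. The ISA
version 7.0.0 is an example value chosen for illustration. The two lines are
alternatives, and a given program would normally use only one of them.

.. code-block:: nasm

  // Explicit form: ISA version 7.0.0 with the fixed vendor and arch strings.
  .hsa_code_object_isa 7,0,0,"AMD","AMDGPU"

  // Implicit form: ISA version, vendor, and arch are derived from -mcpu.
  .hsa_code_object_isa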