=====================================
Syntax of AMDGPU Instruction Operands
=====================================

.. contents::
   :local:

Conventions
===========

The following notation is used throughout this document:

    =================== ============================================================
    Notation            Description
    =================== ============================================================
    {0..N}              Any integer value in the range from 0 to N (inclusive).
    <x>                 The syntax and meaning of *x* are explained elsewhere.
    =================== ============================================================

.. _amdgpu_syn_operands:

Operands
========

.. _amdgpu_synid_v:

v
-

Vector registers. There are 256 32-bit vector registers.

A sequence of *vector* registers may be used to operate on more than 32 bits of data.

The assembler currently supports sequences of 1, 2, 3, 4, 8 and 16 *vector* registers.

    =================================================== ======================================================================
    Syntax                                              Description
    =================================================== ======================================================================
    **v**\ <N>                                          A single 32-bit *vector* register.

                                                        *N* must be a decimal integer number.
    **v[**\ <N>\ **]**                                  A single 32-bit *vector* register.

                                                        *N* may be specified as an
                                                        :ref:`integer number<amdgpu_synid_integer_number>`
                                                        or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
    **v[**\ <N>:<K>\ **]**                              A sequence of (\ *K-N+1*\ ) *vector* registers.

                                                        *N* and *K* may be specified as
                                                        :ref:`integer numbers<amdgpu_synid_integer_number>`
                                                        or :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
    **[v**\ <N>, \ **v**\ <N+1>, ... **v**\ <K>\ **]**  A sequence of (\ *K-N+1*\ ) *vector* registers.

                                                        Register indices must be specified as decimal integer numbers.
    =================================================== ======================================================================

Note. *N* and *K* must satisfy the following conditions:

* *N* <= *K*.
* 0 <= *N* <= 255.
* 0 <= *K* <= 255.
* *K-N+1* must be equal to 1, 2, 3, 4, 8 or 16.

Examples:

.. code-block:: nasm

  v255
  v[0]
  v[0:1]
  v[1:1]
  v[0:3]
  v[2*2]
  v[1-1:2-1]
  [v252]
  [v252,v253,v254,v255]

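The conditions on *N* and *K* listed above can be checked mechanically.
The following Python sketch is illustrative only and is not part of the assembler;
the helper name ``is_valid_vgpr_range`` is hypothetical, and *N* and *K* are assumed
to be already evaluated to integers.

.. code-block:: python

    # Illustrative sketch only: is_valid_vgpr_range is a hypothetical helper,
    # not an API of the AMDGPU assembler.
    VGPR_COUNT = 256                      # 256 vector registers: v0..v255
    VGPR_SEQ_SIZES = {1, 2, 3, 4, 8, 16}  # supported sequence lengths


    def is_valid_vgpr_range(n, k):
        """Return True if v[n:k] satisfies the conditions listed above."""
        if not (0 <= n <= k < VGPR_COUNT):
            return False
        # K-N+1 is the number of registers in the sequence.
        return (k - n + 1) in VGPR_SEQ_SIZES
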
.. _amdgpu_synid_s:

s
-

Scalar 32-bit registers. The number of available *scalar* registers depends on the GPU:

    ======= ============================
    GPU     Number of *scalar* registers
    ======= ============================
    GFX7    104
    GFX8    102
    GFX9    102
    ======= ============================

A sequence of *scalar* registers may be used to operate on more than 32 bits of data.
The assembler currently supports sequences of 1, 2, 4, 8 and 16 *scalar* registers.

Pairs of *scalar* registers must be even-aligned (the first register must be even).
Sequences of 4 or more *scalar* registers must be quad-aligned
(the index of the first register must be a multiple of 4).

    =================================================== ======================================================================
    Syntax                                              Description
    =================================================== ======================================================================
    **s**\ <N>                                          A single 32-bit *scalar* register.

                                                        *N* must be a decimal integer number.
    **s[**\ <N>\ **]**                                  A single 32-bit *scalar* register.

                                                        *N* may be specified as an
                                                        :ref:`integer number<amdgpu_synid_integer_number>`
                                                        or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
    **s[**\ <N>:<K>\ **]**                              A sequence of (\ *K-N+1*\ ) *scalar* registers.

                                                        *N* and *K* may be specified as
                                                        :ref:`integer numbers<amdgpu_synid_integer_number>`
                                                        or :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
    **[s**\ <N>, \ **s**\ <N+1>, ... **s**\ <K>\ **]**  A sequence of (\ *K-N+1*\ ) *scalar* registers.

                                                        Register indices must be specified as decimal integer numbers.
    =================================================== ======================================================================

Note. *N* and *K* must satisfy the following conditions:

* *N* must be properly aligned based on sequence size.
* *N* <= *K*.
* 0 <= *N* < *SMAX*\ , where *SMAX* is the number of available *scalar* registers.
* 0 <= *K* < *SMAX*\ , where *SMAX* is the number of available *scalar* registers.
* *K-N+1* must be equal to 1, 2, 4, 8 or 16.

Examples:

.. code-block:: nasm

  s0
  s[0]
  s[0:1]
  s[1:1]
  s[0:3]
  s[2*2]
  s[1-1:2-1]
  [s4]
  [s4,s5,s6,s7]

Examples of *scalar* registers with an invalid alignment:

.. code-block:: nasm

  s[1:2]
  s[2:5]

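The alignment rule is what distinguishes the valid examples from the invalid ones above.
A minimal Python sketch of the checks follows; the helper names are hypothetical and
``smax`` stands for *SMAX*, which the caller is assumed to take from the table of
register counts above.

.. code-block:: python

    # Illustrative sketch only: these helpers are hypothetical and are not
    # part of the AMDGPU assembler.
    SGPR_SEQ_SIZES = {1, 2, 4, 8, 16}  # supported sequence lengths


    def required_alignment(size):
        """Alignment of the first register for a sequence of `size` registers."""
        if size == 1:
            return 1   # single registers need no alignment
        if size == 2:
            return 2   # pairs must be even-aligned
        return 4       # sequences of 4 or more must be quad-aligned


    def is_valid_sgpr_range(n, k, smax):
        """Return True if s[n:k] satisfies the conditions listed above."""
        if not (0 <= n <= k < smax):
            return False
        size = k - n + 1
        if size not in SGPR_SEQ_SIZES:
            return False
        return n % required_alignment(size) == 0
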
.. _amdgpu_synid_trap:

trap
----

A set of trap handler registers:

* :ref:`ttmp<amdgpu_synid_ttmp>`
* :ref:`tba<amdgpu_synid_tba>`
* :ref:`tma<amdgpu_synid_tma>`

.. _amdgpu_synid_ttmp:

ttmp
----

Trap handler temporary scalar registers, 32 bits wide.
The number of available *ttmp* registers depends on the GPU:

    ======= ============================
    GPU     Number of *ttmp* registers
    ======= ============================
    GFX7    12
    GFX8    12
    GFX9    16
    ======= ============================

A sequence of *ttmp* registers may be used to operate on more than 32 bits of data.
The assembler currently supports sequences of 1, 2, 4, 8 and 16 *ttmp* registers.

Pairs of *ttmp* registers must be even-aligned (the first register must be even).
Sequences of 4 or more *ttmp* registers must be quad-aligned
(the index of the first register must be a multiple of 4).

    ============================================================ ======================================================================
    Syntax                                                       Description
    ============================================================ ======================================================================
    **ttmp**\ <N>                                                A single 32-bit *ttmp* register.

                                                                 *N* must be a decimal integer number.
    **ttmp[**\ <N>\ **]**                                        A single 32-bit *ttmp* register.

                                                                 *N* may be specified as an
                                                                 :ref:`integer number<amdgpu_synid_integer_number>`
                                                                 or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
    **ttmp[**\ <N>:<K>\ **]**                                    A sequence of (\ *K-N+1*\ ) *ttmp* registers.

                                                                 *N* and *K* may be specified as
                                                                 :ref:`integer numbers<amdgpu_synid_integer_number>`
                                                                 or :ref:`absolute expressions<amdgpu_synid_absolute_expression>`.
    **[ttmp**\ <N>, \ **ttmp**\ <N+1>, ... **ttmp**\ <K>\ **]**  A sequence of (\ *K-N+1*\ ) *ttmp* registers.

                                                                 Register indices must be specified as decimal integer numbers.
    ============================================================ ======================================================================

Note. *N* and *K* must satisfy the following conditions:

* *N* must be properly aligned based on sequence size.
* *N* <= *K*.
* 0 <= *N* < *TMAX*, where *TMAX* is the number of available *ttmp* registers.
* 0 <= *K* < *TMAX*, where *TMAX* is the number of available *ttmp* registers.
* *K-N+1* must be equal to 1, 2, 4, 8 or 16.

Examples:

.. code-block:: nasm

  ttmp0
  ttmp[0]
  ttmp[0:1]
  ttmp[1:1]
  ttmp[0:3]
  ttmp[2*2]
  ttmp[1-1:2-1]
  [ttmp4]
  [ttmp4,ttmp5,ttmp6,ttmp7]

Examples of *ttmp* registers with an invalid alignment:

.. code-block:: nasm

  ttmp[1:2]
  ttmp[2:5]

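As a rough illustration of how the syntax and the per-GPU limits combine, the Python
sketch below recognizes a subset of the *ttmp* forms and applies the conditions above.
It is a simplification, not the assembler's parser: the name ``parse_ttmp`` is
hypothetical, only decimal indices are handled (absolute expressions such as
``ttmp[2*2]`` and the list form are deliberately not supported), and the register
counts are copied from the table above.

.. code-block:: python

    import re

    # Illustrative sketch only: parse_ttmp is a hypothetical helper.
    TTMP_COUNT = {"GFX7": 12, "GFX8": 12, "GFX9": 16}  # TMAX per GPU

    # Matches ttmpN, ttmp[N] and ttmp[N:K] with decimal indices only.
    _TTMP_RE = re.compile(r"^ttmp(?:(\d+)|\[(\d+)(?::(\d+))?\])$")


    def parse_ttmp(text, gpu):
        """Return (n, k) for a valid ttmp operand, or None otherwise."""
        m = _TTMP_RE.match(text)
        if m is None:
            return None
        if m.group(1) is not None:
            n = k = int(m.group(1))
        else:
            n = int(m.group(2))
            k = int(m.group(3)) if m.group(3) is not None else n
        size = k - n + 1
        if n > k or k >= TTMP_COUNT[gpu] or size not in (1, 2, 4, 8, 16):
            return None
        # Pairs are even-aligned; sequences of 4 or more are quad-aligned.
        align = 1 if size == 1 else 2 if size == 2 else 4
        return (n, k) if n % align == 0 else None
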
.. _amdgpu_synid_tba:

tba
---

Trap base address, 64 bits wide. Holds the pointer to the current trap handler program.

    ==================== ====================================================================== =============
    Syntax               Description                                                            Availability
    ==================== ====================================================================== =============
    tba                  64-bit *trap base address* register.                                   GFX7, GFX8
    [tba]                64-bit *trap base address* register (an alternative syntax).           GFX7, GFX8
    [tba_lo,tba_hi]      64-bit *trap base address* register (an alternative syntax).           GFX7, GFX8
    ==================== ====================================================================== =============

High and low 32 bits of *trap base address* may be accessed as separate registers:

    ==================== ====================================================================== =============
    Syntax               Description                                                            Availability
    ==================== ====================================================================== =============
    tba_lo               Low 32 bits of *trap base address* register.                           GFX7, GFX8
    tba_hi               High 32 bits of *trap base address* register.                          GFX7, GFX8
    [tba_lo]             Low 32 bits of *trap base address* register (an alternative syntax).   GFX7, GFX8
    [tba_hi]             High 32 bits of *trap base address* register (an alternative syntax).  GFX7, GFX8
    ==================== ====================================================================== =============

Note that *tba*, *tba_lo* and *tba_hi* are not accessible as assembler registers in GFX9,
but *tba* can be read and written with the help of *s_get_reg* and *s_set_reg* instructions.

.. _amdgpu_synid_tma:

tma
---

Trap memory address, 64 bits wide.

    ==================== ======================================================================== =============
    Syntax               Description                                                              Availability
    ==================== ======================================================================== =============
    tma                  64-bit *trap memory address* register.                                   GFX7, GFX8
    [tma]                64-bit *trap memory address* register (an alternative syntax).           GFX7, GFX8
    [tma_lo,tma_hi]      64-bit *trap memory address* register (an alternative syntax).           GFX7, GFX8
    ==================== ======================================================================== =============

High and low 32 bits of *trap memory address* may be accessed as separate registers:

    ==================== ======================================================================== =============
    Syntax               Description                                                              Availability
    ==================== ======================================================================== =============
    tma_lo               Low 32 bits of *trap memory address* register.                           GFX7, GFX8
    tma_hi               High 32 bits of *trap memory address* register.                          GFX7, GFX8
    [tma_lo]             Low 32 bits of *trap memory address* register (an alternative syntax).   GFX7, GFX8
    [tma_hi]             High 32 bits of *trap memory address* register (an alternative syntax).  GFX7, GFX8
    ==================== ======================================================================== =============

Note that *tma*, *tma_lo* and *tma_hi* are not accessible as assembler registers in GFX9,
but *tma* can be read and written with the help of *s_get_reg* and *s_set_reg* instructions.

.. _amdgpu_synid_flat_scratch:

flat_scratch
------------

Flat scratch address, 64 bits wide. Holds the base address of scratch memory.

    ================================== ======================================================================
    Syntax                             Description
    ================================== ======================================================================
    flat_scratch                       64-bit *flat scratch* address register.
    [flat_scratch]                     64-bit *flat scratch* address register (an alternative syntax).
    [flat_scratch_lo,flat_scratch_hi]  64-bit *flat scratch* address register (an alternative syntax).
    ================================== ======================================================================

High and low 32 bits of *flat scratch* address may be accessed as separate registers:

    =================== ================================================================================
    Syntax              Description
    =================== ================================================================================
    flat_scratch_lo     Low 32 bits of *flat scratch* address register.
    flat_scratch_hi     High 32 bits of *flat scratch* address register.
    [flat_scratch_lo]   Low 32 bits of *flat scratch* address register (an alternative syntax).
    [flat_scratch_hi]   High 32 bits of *flat scratch* address register (an alternative syntax).
    =================== ================================================================================

.. _amdgpu_synid_xnack:

xnack
-----

Xnack mask, 64 bits wide. Holds a 64-bit mask of which threads
received an *XNACK* due to a vector memory operation.

.. WARNING:: GFX7 does not support the *xnack* feature. Not all GFX8 and GFX9 :ref:`processors<amdgpu-processors>` support the *xnack* feature.

\

    ============================== ============================================================
    Syntax                         Description
    ============================== ============================================================
    xnack_mask                     64-bit *xnack mask* register.
    [xnack_mask]                   64-bit *xnack mask* register (an alternative syntax).
    [xnack_mask_lo,xnack_mask_hi]  64-bit *xnack mask* register (an alternative syntax).
    ============================== ============================================================

High and low 32 bits of *xnack mask* may be accessed as separate registers:

    =================== ======================================================================
    Syntax              Description
    =================== ======================================================================
    xnack_mask_lo       Low 32 bits of *xnack mask* register.
    xnack_mask_hi       High 32 bits of *xnack mask* register.
    [xnack_mask_lo]     Low 32 bits of *xnack mask* register (an alternative syntax).
    [xnack_mask_hi]     High 32 bits of *xnack mask* register (an alternative syntax).
    =================== ======================================================================

.. _amdgpu_synid_vcc:

vcc
---

Vector condition code, 64 bits wide. A bit mask with one bit per thread;
it holds the result of a vector compare operation.

    =================== ================================================================================
    Syntax              Description
    =================== ================================================================================
    vcc                 64-bit *vector condition code* register.
    [vcc]               64-bit *vector condition code* register (an alternative syntax).
    [vcc_lo,vcc_hi]     64-bit *vector condition code* register (an alternative syntax).
    =================== ================================================================================

High and low 32 bits of *vector condition code* may be accessed as separate registers:

    =================== ================================================================================
    Syntax              Description
    =================== ================================================================================
    vcc_lo              Low 32 bits of *vector condition code* register.
    vcc_hi              High 32 bits of *vector condition code* register.
    [vcc_lo]            Low 32 bits of *vector condition code* register (an alternative syntax).
    [vcc_hi]            High 32 bits of *vector condition code* register (an alternative syntax).
    =================== ================================================================================

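The relationship between a 64-bit register such as *vcc* and its *vcc_lo* and *vcc_hi*
halves is plain bit slicing. The Python sketch below only illustrates that arithmetic;
the helper names ``split64`` and ``join64`` are hypothetical.

.. code-block:: python

    # Illustrative sketch only: split64 and join64 are hypothetical helpers.
    MASK32 = 0xFFFFFFFF


    def split64(value):
        """Split a 64-bit value into its (lo, hi) 32-bit halves."""
        return value & MASK32, (value >> 32) & MASK32


    def join64(lo, hi):
        """Rebuild the 64-bit value from its lo and hi halves."""
        return ((hi & MASK32) << 32) | (lo & MASK32)
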
.. _amdgpu_synid_m0:

m0
--

A 32-bit memory register. It has various uses,
including register indexing and bounds checking.

    =================== ====================================================
    Syntax              Description
    =================== ====================================================
    m0                  A 32-bit *memory* register.
    [m0]                A 32-bit *memory* register (an alternative syntax).
    =================== ====================================================

.. _amdgpu_synid_exec:

exec
----

Execute mask, 64 bits wide. A bit mask with one bit per thread,
which is applied to vector instructions and controls which threads execute
and which ignore the instruction.

    =================== ======================================================================
    Syntax              Description
    =================== ======================================================================
    exec                64-bit *execute mask* register.
    [exec]              64-bit *execute mask* register (an alternative syntax).
    [exec_lo,exec_hi]   64-bit *execute mask* register (an alternative syntax).
    =================== ======================================================================

High and low 32 bits of *execute mask* may be accessed as separate registers:

    =================== ======================================================================
    Syntax              Description
    =================== ======================================================================
    exec_lo             Low 32 bits of *execute mask* register.
    exec_hi             High 32 bits of *execute mask* register.
    [exec_lo]           Low 32 bits of *execute mask* register (an alternative syntax).
    [exec_hi]           High 32 bits of *execute mask* register (an alternative syntax).
    =================== ======================================================================

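To make the "one bit per thread" description concrete, the Python sketch below lists
the threads selected by a given mask value. It assumes that bit *i* of the mask
corresponds to thread *i*, which this document does not state explicitly, and the
helper name ``active_threads`` is hypothetical.

.. code-block:: python

    # Illustrative sketch only: active_threads is a hypothetical helper, and
    # the mapping of bit i to thread i is an assumption.
    def active_threads(exec_mask):
        """Return the indices of the threads whose bit is set in the mask."""
        return [i for i in range(64) if (exec_mask >> i) & 1]
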
.. _amdgpu_synid_vccz:

vccz
----

A single-bit flag indicating that :ref:`vcc<amdgpu_synid_vcc>` is all zeros.

.. WARNING:: This operand is not currently supported by the AMDGPU assembler.

.. _amdgpu_synid_execz:

execz
-----

A single-bit flag indicating that :ref:`exec<amdgpu_synid_exec>` is all zeros.

.. WARNING:: This operand is not currently supported by the AMDGPU assembler.

.. _amdgpu_synid_scc:

scc
---

A single-bit flag indicating the result of a scalar compare operation.

.. WARNING:: This operand is not currently supported by the AMDGPU assembler.

.. _amdgpu_synid_ldsdirect:

lds_direct
----------

A special operand which supplies a 32-bit value
fetched from *LDS* memory using :ref:`m0<amdgpu_synid_m0>` as an address.

.. WARNING:: This operand is not currently supported by the AMDGPU assembler.

.. _amdgpu_synid_constant:

constant
--------

A set of integer and floating-point *inline constants*:

* :ref:`iconst<amdgpu_synid_iconst>`
* :ref:`fconst<amdgpu_synid_fconst>`

These operands are encoded as part of the instruction.

If a number may be encoded as either
a :ref:`literal<amdgpu_synid_literal>` or
an :ref:`inline constant<amdgpu_synid_constant>`,
the assembler selects the latter encoding because it is more efficient.

.. _amdgpu_synid_iconst:

iconst
------

An :ref:`integer number<amdgpu_synid_integer_number>`
encoded as an *inline constant*.

Only a small fraction of integer numbers may be encoded as *inline constants*.
They are enumerated in the table below.
Other integer numbers have to be encoded as :ref:`literals<amdgpu_synid_literal>`.

Integer *inline constants* are converted to
:ref:`expected operand type<amdgpu_syn_instruction_type>`
as described :ref:`here<amdgpu_synid_int_const_conv>`.

    =================== ========================================
    Value               Note
    =================== ========================================
    {0..64}             Positive integer inline constants.
    {-16..-1}           Negative integer inline constants.
    =================== ========================================

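The two ranges in the table form one contiguous interval, so the test for an integer
inline constant reduces to a single comparison. A minimal Python sketch, with the
hypothetical helper name ``is_int_inline_constant``:

.. code-block:: python

    # Illustrative sketch only: is_int_inline_constant is a hypothetical helper.
    def is_int_inline_constant(value):
        """Return True if the integer is in {0..64} or {-16..-1}."""
        return -16 <= value <= 64
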
.. WARNING:: GFX7 does not support inline constants for *f16* operands.

There are also symbolic inline constants which provide read-only access to hardware registers.

.. WARNING:: These inline constants are not currently supported by the AMDGPU assembler.

\

    ======================== ================================================= =============
    Syntax                   Note                                              Availability
    ======================== ================================================= =============
    shared_base              Base address of shared memory region.             GFX9
    shared_limit             Address of the end of shared memory region.       GFX9
    private_base             Base address of private memory region.            GFX9
    private_limit            Address of the end of private memory region.      GFX9
    pops_exiting_wave_id     A dedicated counter for POPS.                     GFX9
    ======================== ================================================= =============

.. _amdgpu_synid_fconst:

fconst
------

A :ref:`floating-point number<amdgpu_synid_floating-point_number>`
encoded as an *inline constant*.

Only a small fraction of floating-point numbers may be encoded as *inline constants*.
They are enumerated in the table below.
Other floating-point numbers have to be encoded as :ref:`literals<amdgpu_synid_literal>`.

Floating-point *inline constants* are converted to
:ref:`expected operand type<amdgpu_syn_instruction_type>`
as described :ref:`here<amdgpu_synid_fp_const_conv>`.

    ================================== ====================================================== =============
    Value                              Note                                                   Availability
    ================================== ====================================================== =============
    0.0                                The same as integer constant 0.                        All GPUs
    0.5                                Floating-point constant 0.5.                           All GPUs
    1.0                                Floating-point constant 1.0.                           All GPUs
    2.0                                Floating-point constant 2.0.                           All GPUs
    4.0                                Floating-point constant 4.0.                           All GPUs
    -0.5                               Floating-point constant -0.5.                          All GPUs
    -1.0                               Floating-point constant -1.0.                          All GPUs
    -2.0                               Floating-point constant -2.0.                          All GPUs
    -4.0                               Floating-point constant -4.0.                          All GPUs
    0.1592                             1.0/(2.0*pi). Use only for 16-bit operands.            GFX8, GFX9
    0.15915494                         1.0/(2.0*pi). Use only for 16- and 32-bit operands.    GFX8, GFX9
    0.159154943091895317852646485335   1.0/(2.0*pi).                                          GFX8, GFX9
    ================================== ====================================================== =============

.. WARNING:: GFX7 does not support inline constants for *f16* operands.

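The table can be read as a lookup keyed on the value and the GPU. The Python sketch
below is a simplification under several assumptions: the helper name
``is_fp_inline_constant`` is hypothetical, the input is assumed to be the decimal
spelling of a floating-point number, the 1.0/(2.0*pi) constants are matched only by
the exact spellings shown in the table, and the operand-size restrictions from the
*Note* column are not modelled.

.. code-block:: python

    # Illustrative sketch only: is_fp_inline_constant is a hypothetical helper.
    FP_INLINE_ALL_GPUS = {"0.0", "0.5", "1.0", "2.0", "4.0",
                          "-0.5", "-1.0", "-2.0", "-4.0"}
    INV_2PI_SPELLINGS = {"0.1592", "0.15915494",
                         "0.159154943091895317852646485335"}
    INV_2PI_GPUS = {"GFX8", "GFX9"}


    def is_fp_inline_constant(text, gpu):
        """Return True if `text` names a floating-point inline constant on `gpu`."""
        if text in INV_2PI_SPELLINGS:
            return gpu in INV_2PI_GPUS
        try:
            # Normalize spellings such as "1.00" to "1.0" before the lookup;
            # repr() keeps the sign of zero, so "-0.0" is not matched.
            return repr(float(text)) in FP_INLINE_ALL_GPUS
        except ValueError:
            return False
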
.. _amdgpu_synid_literal:

literal
-------

A literal is a 64-bit value which, after conversion to the expected operand type,
is encoded as a separate 32-bit dword in the instruction stream.

If a number may be encoded as either
a :ref:`literal<amdgpu_synid_literal>` or
an :ref:`inline constant<amdgpu_synid_constant>`,
the assembler selects the latter encoding because it is more efficient.

Literals may be specified as :ref:`integer numbers<amdgpu_synid_integer_number>`,
:ref:`floating-point numbers<amdgpu_synid_floating-point_number>` or
:ref:`expressions<amdgpu_synid_expression>`
(expressions are currently supported for 32-bit operands only).

A 64-bit literal value is converted by the assembler
to an :ref:`expected operand type<amdgpu_syn_instruction_type>`
as described :ref:`here<amdgpu_synid_lit_conv>`.

An instruction may use only one literal, but several operands may refer to the same literal.

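The one-literal rule can be illustrated for integer operands. The Python sketch below
is a simplification, not the assembler's logic: the helper names are hypothetical,
only integer operand values are considered, and any integer in the inline constant
range {-16..64} is assumed to be encoded inline rather than as a literal.

.. code-block:: python

    # Illustrative sketch only: these helpers are hypothetical and handle
    # integer operand values exclusively.
    def distinct_literals(operands):
        """Return the set of operand values that need a literal encoding."""
        return {value for value in operands if not (-16 <= value <= 64)}


    def satisfies_literal_rule(operands):
        """Return True if the operands need at most one distinct literal."""
        return len(distinct_literals(operands)) <= 1
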
.. _amdgpu_synid_uimm8:

uimm8
-----

An 8-bit unsigned :ref:`integer number<amdgpu_synid_integer_number>`.
The value is encoded as part of the opcode, so it is free to use.

.. _amdgpu_synid_uimm32:

uimm32
------

A 32-bit unsigned :ref:`integer number<amdgpu_synid_integer_number>`.
The value is stored as a separate 32-bit dword in the instruction stream.

.. _amdgpu_synid_uimm20:

uimm20
------

A 20-bit unsigned :ref:`integer number<amdgpu_synid_integer_number>`.

Dmitry Preobrazhensky47eb6362018-12-17 17:38:11 +0000593.. _amdgpu_synid_uimm21:
Dmitry Preobrazhenskyc6d31e62018-03-12 15:55:08 +0000594
Dmitry Preobrazhensky47eb6362018-12-17 17:38:11 +0000595uimm21
596------
Dmitry Preobrazhenskyc6d31e62018-03-12 15:55:08 +0000597
Dmitry Preobrazhensky47eb6362018-12-17 17:38:11 +0000598A 21-bit positive :ref:`integer number<amdgpu_synid_integer_number>`.
Dmitry Preobrazhenskyc6d31e62018-03-12 15:55:08 +0000599
Dmitry Preobrazhensky47eb6362018-12-17 17:38:11 +0000600.. WARNING:: Assembler currently supports 20-bit offsets only. Use :ref:`uimm20<amdgpu_synid_uimm20>` as a replacement.
Dmitry Preobrazhenskyc6d31e62018-03-12 15:55:08 +0000601
Dmitry Preobrazhensky47eb6362018-12-17 17:38:11 +0000602.. _amdgpu_synid_simm21:
Dmitry Preobrazhenskyc6d31e62018-03-12 15:55:08 +0000603
Dmitry Preobrazhensky47eb6362018-12-17 17:38:11 +0000604simm21
605------
Dmitry Preobrazhenskyc6d31e62018-03-12 15:55:08 +0000606
Dmitry Preobrazhensky47eb6362018-12-17 17:38:11 +0000607A 21-bit :ref:`integer number<amdgpu_synid_integer_number>`.
Dmitry Preobrazhenskyc6d31e62018-03-12 15:55:08 +0000608
Dmitry Preobrazhensky47eb6362018-12-17 17:38:11 +0000609.. WARNING:: Assembler currently supports 20-bit unsigned offsets only .Use :ref:`uimm20<amdgpu_synid_uimm20>` as a replacement.

.. _amdgpu_synid_off:

off
---

A special entity which indicates that the value of this operand is not used.

    ================================== ===================================================
    Syntax                             Description
    ================================== ===================================================
    off                                Indicates an unused operand.
    ================================== ===================================================


.. _amdgpu_synid_number:

Numbers
=======

.. _amdgpu_synid_integer_number:

Integer Numbers
---------------

Integer numbers are 64 bits wide.
They may be specified in binary, octal, hexadecimal and decimal formats:

    ============== ====================================
    Format         Syntax
    ============== ====================================
    Decimal        [-]?[1-9][0-9]*
    Binary         [-]?0b[01]+
    Octal          [-]?0[0-7]+
    Hexadecimal    [-]?0x[0-9a-fA-F]+
    \              [-]?[0x]?[0-9][0-9a-fA-F]*[hH]
    ============== ====================================

Examples:

.. code-block:: nasm

    -1234
    0b1010
    010
    0xff
    0ffh
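
The formats in the table can be exercised with a short Python sketch.
This is only an illustration of the syntax, not the assembler's parser;
the function name ``parse_integer`` is hypothetical, and the optional prefix
of the ``h``-suffixed form is assumed to mean ``0x``.

.. code-block:: python

    import re

    # Patterns transcribed from the table above. The "h"-suffixed form is
    # tried first so that inputs such as "0b1h" are read as hexadecimal.
    # ASSUMPTION: the optional prefix of the suffixed form is read as "0x".
    _FORMATS = [
        (re.compile(r"(-?)(?:0x)?([0-9][0-9a-fA-F]*)[hH]"), 16),
        (re.compile(r"(-?)0x([0-9a-fA-F]+)"), 16),
        (re.compile(r"(-?)0b([01]+)"), 2),
        (re.compile(r"(-?)0([0-7]+)"), 8),
        (re.compile(r"(-?)([1-9][0-9]*)"), 10),
    ]


    def parse_integer(text):
        """Return the value of an integer written in one of the table's formats."""
        for pattern, base in _FORMATS:
            match = pattern.fullmatch(text)
            if match:
                sign, digits = match.groups()
                value = int(digits, base)
                return -value if sign else value
        raise ValueError(f"not a recognized integer format: {text!r}")


    for example in ["-1234", "0b1010", "010", "0xff", "0ffh"]:
        print(example, "=", parse_integer(example))

With the examples listed above, the sketch yields -1234, 10, 8, 255 and 255.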

.. _amdgpu_synid_floating-point_number:

Floating-Point Numbers
----------------------

All floating-point numbers are handled as double (64 bits wide).

Floating-point numbers may be specified in hexadecimal and decimal formats:

    ============== ======================================================== ========================================================
    Format         Syntax                                                   Note
    ============== ======================================================== ========================================================
    Decimal        [-]?[0-9]*[.]?[0-9]*([eE][+-]?[0-9]*)?                   Must include either a decimal separator or an exponent.
    Hexadecimal    [-]?0x[0-9a-fA-F]*(.[0-9a-fA-F]*)?[pP][+-]?[0-9a-fA-F]+
    ============== ======================================================== ========================================================

Examples:

.. code-block:: nasm

    -1.234
    234e2
    -0x1afp-10
    0x.1afp10
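
The values denoted by these examples can be checked with a small Python sketch.
It relies on Python's own parsers (``float`` and ``float.fromhex``), which are
assumed here to agree with the assembler for these particular inputs; note that
``float.fromhex`` expects decimal exponent digits.

.. code-block:: python

    # Decimal formats.
    a = float("-1.234")
    b = float("234e2")               # 234 * 10**2

    # Hexadecimal formats: hexadecimal mantissa scaled by a power of two.
    c = float.fromhex("-0x1afp-10")  # -(0x1af) * 2**-10
    d = 0x1AF / 16**3 * 2**10        # 0x.1afp10 written out as arithmetic

    print(a, b, c, d)

Here ``b`` is 23400.0, ``c`` is -0.4208984375 and ``d`` is 107.75.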

.. _amdgpu_synid_expression:

Expressions
===========

An expression specifies an address or a numeric value.
There are two kinds of expressions:

* :ref:`Absolute<amdgpu_synid_absolute_expression>`.
* :ref:`Relocatable<amdgpu_synid_relocatable_expression>`.

.. _amdgpu_synid_absolute_expression:

Absolute Expressions
--------------------

The value of an absolute expression remains the same after program relocation.
Absolute expressions must not include unassigned or relocatable values
such as labels.

Examples:

.. code-block:: nasm

    x = -1
    y = x + 10

.. _amdgpu_synid_relocatable_expression:

Relocatable Expressions
-----------------------

The value of a relocatable expression depends on program relocation.

Note that use of relocatable expressions is limited to branch targets
and 32-bit :ref:`literals<amdgpu_synid_literal>`.

Additional information about relocation may be found :ref:`here<amdgpu-relocation-records>`.

Examples:

.. code-block:: nasm

    y = x + 10 // x is not yet defined. Undefined symbols are assumed to be PC-relative.
    z = .

Expression Data Type
--------------------

Expressions and operands of expressions are interpreted as 64-bit integers.

Expressions may include 64-bit :ref:`floating-point numbers<amdgpu_synid_floating-point_number>` (double).
However, these operands are also handled as 64-bit integers
using the binary representation of the specified floating-point numbers.
No conversion from floating-point to integer is performed.

Examples:

.. code-block:: nasm

    x = 0.1    // x is assigned an integer 4591870180066957722 which is a binary representation of 0.1.
    y = x + x  // y is a sum of two integer values; it is not equal to 0.2!
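
The arithmetic behind this example can be reproduced with Python's ``struct``
module. This is a sketch of the reinterpretation only; the helper name
``double_bits`` is hypothetical.

.. code-block:: python

    import struct


    def double_bits(value):
        """Return the IEEE 754 binary64 bit pattern of value as a signed 64-bit integer."""
        return struct.unpack("<q", struct.pack("<d", value))[0]


    x = double_bits(0.1)          # the integer that the expression actually holds
    y = x + x                     # integer addition of two bit patterns

    print(x)                      # 4591870180066957722
    print(y == double_bits(0.2))  # False

The sum of the two bit patterns is an unrelated integer, which is why ``y`` does not represent 0.2.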

Syntax
------

Expressions are composed of
:ref:`symbols<amdgpu_synid_symbol>`,
:ref:`integer numbers<amdgpu_synid_integer_number>`,
:ref:`floating-point numbers<amdgpu_synid_floating-point_number>`,
:ref:`binary operators<amdgpu_synid_expression_bin_op>`,
:ref:`unary operators<amdgpu_synid_expression_un_op>` and subexpressions.

Expressions may also use "." which is a reference to the current PC (program counter).

The syntax of expressions is shown below::

    expr ::= expr binop expr | primaryexpr ;

    primaryexpr ::= '(' expr ')' | symbol | number | '.' | unop primaryexpr ;

    binop ::= '&&'
            | '||'
            | '|'
            | '^'
            | '&'
            | '!'
            | '=='
            | '!='
            | '<>'
            | '<'
            | '<='
            | '>'
            | '>='
            | '<<'
            | '>>'
            | '+'
            | '-'
            | '*'
            | '/'
            | '%' ;

    unop ::= '~'
           | '+'
           | '-'
           | '!' ;

.. _amdgpu_synid_expression_bin_op:

Binary Operators
----------------

Binary operators are described in the following table.
They operate on and produce 64-bit integers.
Operators with higher priority are performed first.

    ========== ========= ===============================================
    Operator   Priority  Meaning
    ========== ========= ===============================================
    \*         5         Integer multiplication.
    /          5         Integer division.
    %          5         Integer signed remainder.
    \+         4         Integer addition.
    \-         4         Integer subtraction.
    <<         3         Integer shift left.
    >>         3         Logical shift right.
    ==         2         Equality comparison.
    !=         2         Inequality comparison.
    <>         2         Inequality comparison.
    <          2         Signed less than comparison.
    <=         2         Signed less than or equal comparison.
    >          2         Signed greater than comparison.
    >=         2         Signed greater than or equal comparison.
    \|         1         Bitwise or.
    ^          1         Bitwise xor.
    &          1         Bitwise and.
    &&         0         Logical and.
    ||         0         Logical or.
    ========== ========= ===============================================
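
The effect of these priorities can be illustrated with a small
precedence-climbing evaluator in Python. It is a sketch, not the assembler's
parser: it covers only a subset of the operators with decimal operands, the
names ``PRIORITY`` and ``evaluate`` are hypothetical, and operators of equal
priority are assumed to associate to the left.

.. code-block:: python

    import re

    # Priorities copied from the table above; a higher number binds tighter.
    PRIORITY = {"*": 5, "+": 4, "-": 4, "<<": 3, ">>": 3, "|": 1, "^": 1, "&": 1}
    MASK64 = (1 << 64) - 1


    def _wrap(value):
        """Wrap a Python integer to a signed 64-bit value."""
        value &= MASK64
        return value - (1 << 64) if value >> 63 else value


    def _apply(op, lhs, rhs):
        if op == "*":
            return _wrap(lhs * rhs)
        if op == "+":
            return _wrap(lhs + rhs)
        if op == "-":
            return _wrap(lhs - rhs)
        if op == "<<":
            return _wrap(lhs << rhs)
        if op == ">>":
            return _wrap((lhs & MASK64) >> rhs)  # logical shift: zero-fill
        if op == "|":
            return lhs | rhs
        if op == "^":
            return lhs ^ rhs
        return lhs & rhs  # op == "&"


    def evaluate(text):
        """Evaluate decimal operands joined by the binary operators in PRIORITY."""
        tokens = re.findall(r"\d+|<<|>>|[-+*|^&()]", text)
        pos = 0

        def primary():
            nonlocal pos
            token = tokens[pos]
            pos += 1
            if token == "(":
                value = expression(0)
                pos += 1  # consume ')'
                return value
            return int(token)

        def expression(min_priority):
            nonlocal pos
            lhs = primary()
            while pos < len(tokens) and PRIORITY.get(tokens[pos], -1) >= min_priority:
                op = tokens[pos]
                pos += 1
                # ASSUMPTION: equal-priority operators are left-associative.
                rhs = expression(PRIORITY[op] + 1)
                lhs = _apply(op, lhs, rhs)
            return lhs

        return expression(0)


    print(evaluate("1 << 2 + 1"))   # 8: '+' (priority 4) is applied before '<<' (3)
    print(evaluate("6 | 1 + 1"))    # 6: '+' (priority 4) is applied before '|' (1)
    print(evaluate("0 - 1 >> 60"))  # 15: '>>' is a logical shift of a 64-bit value

Note that addition binds tighter than the shift operators, so ``1 << 2 + 1`` is read as ``1 << (2 + 1)``.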

.. _amdgpu_synid_expression_un_op:

Unary Operators
---------------

Unary operators are described in the following table.
They operate on and produce 64-bit integers.

    ========== ===============================================
    Operator   Meaning
    ========== ===============================================
    !          Logical negation.
    ~          Bitwise negation.
    \+         Integer unary plus.
    \-         Integer unary minus.
    ========== ===============================================

.. _amdgpu_synid_symbol:

Symbols
-------

A symbol is a named 64-bit value, representing a relocatable
address or an absolute (non-relocatable) number.

Symbol names have the following syntax:
    ``[a-zA-Z_.][a-zA-Z0-9_$.@]*``
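
The pattern can be tried out directly as a Python regular expression.
This is only a sketch of the naming rule; the helper ``is_symbol_name`` and
the sample names are hypothetical.

.. code-block:: python

    import re

    # Pattern copied from the syntax above.
    SYMBOL_NAME = re.compile(r"[a-zA-Z_.][a-zA-Z0-9_$.@]*")


    def is_symbol_name(text):
        """Return True if text is a well-formed symbol name."""
        return SYMBOL_NAME.fullmatch(text) is not None


    print(is_symbol_name(".L_loop$1"))  # True: '.' and '_' may start a name
    print(is_symbol_name("1abc"))       # False: a digit cannot start a name
    print(is_symbol_name("$tmp"))       # False: '$' is allowed only after the first character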

The table below provides several examples of syntax used for symbol definition.

    ================ ==========================================================
    Syntax           Meaning
    ================ ==========================================================
    .globl <S>       Declares a global symbol S without assigning it a value.
    .set <S>, <E>    Assigns the value of an expression E to a symbol S.
    <S> = <E>        Assigns the value of an expression E to a symbol S.
    <S>:             Declares a label S and assigns it the current PC value.
    ================ ==========================================================

A symbol may be used before it is declared or assigned;
unassigned symbols are assumed to be PC-relative.

Additional information about symbols may be found :ref:`here<amdgpu-symbols>`.

.. _amdgpu_synid_conv:

Conversions
===========

This section describes what happens when a 64-bit
:ref:`integer number<amdgpu_synid_integer_number>`, a
:ref:`floating-point number<amdgpu_synid_floating-point_number>` or a
:ref:`symbol<amdgpu_synid_symbol>`
is used for an operand which has a different type or size.

Depending on the operand kind, this conversion is performed by either the assembler or AMDGPU H/W:

* Values encoded as :ref:`inline constants<amdgpu_synid_constant>` are handled by H/W.
* Values encoded as :ref:`literals<amdgpu_synid_literal>` are converted by the assembler.

.. _amdgpu_synid_const_conv:

Inline Constants
----------------

.. _amdgpu_synid_int_const_conv:

Integer Inline Constants
~~~~~~~~~~~~~~~~~~~~~~~~

Integer :ref:`inline constants<amdgpu_synid_constant>`
may be thought of as 64-bit
:ref:`integer numbers<amdgpu_synid_integer_number>`;
when used as operands they are truncated to the size of
:ref:`expected operand type<amdgpu_syn_instruction_type>`.
No data type conversions are performed.

Examples:

.. code-block:: nasm

    // GFX9

    v_add_u16 v0, -1, 0    // v0 = 0xFFFF
    v_add_f16 v0, -1, 0    // v0 = 0xFFFF (NaN)

    v_add_u32 v0, -1, 0    // v0 = 0xFFFFFFFF
    v_add_f32 v0, -1, 0    // v0 = 0xFFFFFFFF (NaN)
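
The truncation in these examples can be modelled in Python as a plain bit mask.
This is a sketch of the arithmetic only, not of the hardware; the helper
``truncate`` is hypothetical.

.. code-block:: python

    import math
    import struct


    def truncate(value, bits):
        """Keep the low bits of a 64-bit integer; no type conversion is done."""
        return value & ((1 << bits) - 1)


    low16 = truncate(-1, 16)
    low32 = truncate(-1, 32)
    print(hex(low16), hex(low32))  # 0xffff 0xffffffff

    # Reinterpreting the same bits as floating-point values yields NaN.
    as_f16 = struct.unpack("<e", struct.pack("<H", low16))[0]
    as_f32 = struct.unpack("<f", struct.pack("<I", low32))[0]
    print(math.isnan(as_f16), math.isnan(as_f32))  # True True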

.. _amdgpu_synid_fp_const_conv:

Floating-Point Inline Constants
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Floating-point :ref:`inline constants<amdgpu_synid_constant>`
may be thought of as 64-bit
:ref:`floating-point numbers<amdgpu_synid_floating-point_number>`;
when used as operands they are converted to a floating-point number of
:ref:`expected operand size<amdgpu_syn_instruction_type>`.

Examples:

.. code-block:: nasm

    // GFX9

    v_add_f16 v0, 1.0, 0    // v0 = 0x3C00 (1.0)
    v_add_u16 v0, 1.0, 0    // v0 = 0x3C00

    v_add_f32 v0, 1.0, 0    // v0 = 0x3F800000 (1.0)
    v_add_u32 v0, 1.0, 0    // v0 = 0x3F800000
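
The bit patterns quoted in these examples follow from the IEEE 754 half and
single precision encodings of 1.0. A Python sketch using the ``struct`` module
(the helper names are hypothetical):

.. code-block:: python

    import struct


    def f16_bits(value):
        """Convert value to half precision and return its 16-bit encoding."""
        return struct.unpack("<H", struct.pack("<e", value))[0]


    def f32_bits(value):
        """Convert value to single precision and return its 32-bit encoding."""
        return struct.unpack("<I", struct.pack("<f", value))[0]


    print(hex(f16_bits(1.0)))  # 0x3c00
    print(hex(f32_bits(1.0)))  # 0x3f800000

The same bits are produced regardless of whether the instruction interprets the operand as an integer or as a floating-point value.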


.. _amdgpu_synid_lit_conv:

Literals
--------

.. _amdgpu_synid_int_lit_conv:

Integer Literals
~~~~~~~~~~~~~~~~

Integer :ref:`literals<amdgpu_synid_literal>`
are specified as 64-bit :ref:`integer numbers<amdgpu_synid_integer_number>`.

When used as operands they are converted to
:ref:`expected operand type<amdgpu_syn_instruction_type>` as described below.

    ============== ============== =============== ====================================================================
    Expected type  Condition      Result          Note
    ============== ============== =============== ====================================================================
    i16, u16, b16  cond(num, 16)  num.u16         Truncate to 16 bits.
    i32, u32, b32  cond(num, 32)  num.u32         Truncate to 32 bits.
    i64            cond(num, 32)  {-1, num.i32}   Truncate to 32 bits and then sign-extend the result to 64 bits.
    u64, b64       cond(num, 32)  { 0, num.u32}   Truncate to 32 bits and then zero-extend the result to 64 bits.
    f16            cond(num, 16)  num.u16         Use low 16 bits as an f16 value.
    f32            cond(num, 32)  num.u32         Use low 32 bits as an f32 value.
    f64            cond(num, 32)  {num.u32, 0}    Use low 32 bits of the number as high 32 bits
                                                  of the result; low 32 bits of the result are zeroed.
    ============== ============== =============== ====================================================================

The condition *cond(X,S)* indicates if a 64-bit number *X*
can be converted to a smaller size *S* by truncation of upper bits.
There are two cases when the conversion is possible:

* The truncated bits are all 0.
* The truncated bits are all 1 and the value after truncation has its most significant bit (MSB) set.

Examples of valid literals:

.. code-block:: nasm

    // GFX9

    v_add_u16 v0, 0xff00, v0               // value after conversion: 0xff00
    v_add_u16 v0, 0xffffffffffffff00, v0   // value after conversion: 0xff00
    v_add_u16 v0, -256, v0                 // value after conversion: 0xff00

    s_bfe_i64 s[0:1], 0xffefffff, s3       // value after conversion: 0xffffffffffefffff
    s_bfe_u64 s[0:1], 0xffefffff, s3       // value after conversion: 0x00000000ffefffff
    v_ceil_f64_e32 v[0:1], 0xffefffff      // value after conversion: 0xffefffff00000000 (-1.7976922776554302e308)

Examples of invalid literals:

.. code-block:: nasm

    // GFX9

    v_add_u16 v0, 0x1ff00, v0              // conversion is not possible as truncated bits are not all 0 or 1
    v_add_u16 v0, 0xffffffffffff00ff, v0   // conversion is not possible as truncated bits do not match MSB of the result
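
The condition and the conversions described in this section can be modelled
with a few lines of Python. This is a sketch of the rules as stated in the
table, not the assembler's code; the function names ``cond`` and
``convert_integer_literal`` are chosen here for illustration.

.. code-block:: python

    MASK64 = (1 << 64) - 1


    def cond(num, size):
        """Return True if the 64-bit num can be truncated to size bits."""
        bits = num & MASK64
        upper = bits >> size             # the bits that truncation drops
        msb = (bits >> (size - 1)) & 1   # MSB of the value after truncation
        all_ones = (1 << (64 - size)) - 1
        return upper == 0 or (upper == all_ones and msb == 1)


    def convert_integer_literal(num, expected_type):
        """Model the 'Result' column of the table for an expected operand type."""
        size = 16 if expected_type.endswith("16") else 32
        if not cond(num, size):
            raise ValueError("literal does not fit the expected operand type")
        low = num & ((1 << size) - 1)
        if expected_type == "i64":
            # Sign-extend the low 32 bits to 64 bits.
            return low | (0xFFFFFFFF00000000 if low >> 31 else 0)
        if expected_type == "f64":
            # The low 32 bits become the high dword; the low dword is zeroed.
            return low << 32
        return low  # plain truncation (zero-extended for u64 and b64)


    print(hex(convert_integer_literal(-256, "u16")))        # 0xff00
    print(hex(convert_integer_literal(0xffefffff, "i64")))  # 0xffffffffffefffff
    print(hex(convert_integer_literal(0xffefffff, "f64")))  # 0xffefffff00000000
    print(cond(0x1ff00, 16), cond(0xffffffffffff00ff, 16))  # False False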

.. _amdgpu_synid_fp_lit_conv:

Floating-Point Literals
~~~~~~~~~~~~~~~~~~~~~~~

Floating-point :ref:`literals<amdgpu_synid_literal>` are specified as 64-bit
:ref:`floating-point numbers<amdgpu_synid_floating-point_number>`.

When used as operands they are converted to
:ref:`expected operand type<amdgpu_syn_instruction_type>` as described below.

    ============== ============== ================= =================================================================
    Expected type  Condition      Result            Note
    ============== ============== ================= =================================================================
    i16, u16, b16  cond(num, 16)  f16(num)          Convert to f16 and use bits of the result as an integer value.
    i32, u32, b32  cond(num, 32)  f32(num)          Convert to f32 and use bits of the result as an integer value.
    i64, u64, b64  false          \-                Conversion disabled because of unclear semantics.
    f16            cond(num, 16)  f16(num)          Convert to f16.
    f32            cond(num, 32)  f32(num)          Convert to f32.
    f64            true           {num.u32.hi, 0}   Use high 32 bits of the number as high 32 bits of the result;
                                                    zero-fill low 32 bits of the result.

                                                    Note that the result may differ from the original number.
    ============== ============== ================= =================================================================

The condition *cond(X,S)* indicates if an f64 number *X* can be converted
to a smaller *S*-bit floating-point type without overflow or underflow.
Loss of precision is allowed.

Examples of valid literals:

.. code-block:: nasm

    // GFX9

    v_add_f16 v1, 65500.0, v2
    v_add_f32 v1, 65600.0, v2

    // value before conversion: 0x7fefffffffffffff (1.7976931348623157e308)
    v_ceil_f64 v[0:1], 1.7976931348623157e308  // value after conversion: 0x7fefffff00000000 (1.7976922776554302e308)

Examples of invalid literals:

.. code-block:: nasm

    // GFX9

    v_add_f16 v1, 65600.0, v2    // cannot be converted to f16 because of overflow
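
The floating-point rows of the table can be approximated in Python with the
``struct`` module. This is a sketch under stated assumptions, not the
assembler's implementation: the function name ``convert_fp_literal`` is
hypothetical, only overflow is detected, and the underflow part of
*cond(X,S)* is not modelled.

.. code-block:: python

    import struct


    def convert_fp_literal(num, expected_type):
        """Return the encoded bits of num for an f16, f32 or f64 operand."""
        if expected_type == "f64":
            bits = struct.unpack("<Q", struct.pack("<d", num))[0]
            return bits & 0xFFFFFFFF00000000  # keep the high dword only
        fmt, int_fmt = {"f16": ("<e", "<H"), "f32": ("<f", "<I")}[expected_type]
        try:
            packed = struct.pack(fmt, num)    # rounds; raises on overflow
        except OverflowError:
            raise ValueError("literal overflows the expected operand type") from None
        return struct.unpack(int_fmt, packed)[0]


    print(hex(convert_fp_literal(65500.0, "f16")))                 # 0x7bff
    print(hex(convert_fp_literal(65600.0, "f32")))                 # 0x47802000
    print(hex(convert_fp_literal(1.7976931348623157e308, "f64")))  # 0x7fefffff00000000

Calling ``convert_fp_literal(65600.0, "f16")`` raises ``ValueError``, which mirrors the invalid literal shown above.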

.. _amdgpu_synid_exp_conv:

Expressions
~~~~~~~~~~~

Expressions operate with and result in 64-bit integers.

When used as operands they are truncated to
:ref:`expected operand size<amdgpu_syn_instruction_type>`.
No data type conversions are performed.

Examples:

.. code-block:: nasm

    // GFX9

    x = 0.1
    v_sqrt_f32 v0, x           // v0 = [low 32 bits of 0.1 (double)]
    v_sqrt_f32 v0, (0.1 + 0)   // the same as above
    v_sqrt_f32 v0, 0.1         // v0 = [0.1 (double) converted to float]
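
The difference between the truncated expression and the converted literal can
be made concrete with a short Python sketch (the variable names are
hypothetical):

.. code-block:: python

    import struct

    double_bits = struct.unpack("<Q", struct.pack("<d", 0.1))[0]

    # Expression operand: the 64-bit pattern is truncated, not converted.
    as_expression = double_bits & 0xFFFFFFFF

    # Literal operand: the double is converted to a 32-bit float.
    as_literal = struct.unpack("<I", struct.pack("<f", 0.1))[0]

    print(hex(as_expression))  # 0x9999999a
    print(hex(as_literal))     # 0x3dcccccd

Only the second pattern encodes a value close to 0.1 when read as an f32.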