.. Source: llvm/docs/AMDGPUOperandSyntax.rst (toolchain/llvm-project),
   blob 4f3536eed40d05b6475e7ba146fc8b4458b5c2ec.

.. Blame: Dmitry Preobrazhensky, c6d31e6, 2018-03-12 15:55:08 +0000.

=================================================
Syntax of AMDGPU Assembler Operands and Modifiers
=================================================

.. contents::
   :local:

Conventions
===========

The following conventions are used in syntax descriptions:

    =================== =============================================================
    Notation            Description
    =================== =============================================================
    {0..N}              Any integer value in the range from 0 to N (inclusive).
                        Unless stated otherwise, this value may be specified as
                        either a literal or an LLVM expression.
    <x>                 Syntax and meaning of <x> are explained elsewhere.
    =================== =============================================================

.. _amdgpu_syn_operands:

Operands
========

TBD

.. _amdgpu_syn_modifiers:

Modifiers
=========

DS Modifiers
------------

.. _amdgpu_synid_ds_offset8:

ds_offset8
~~~~~~~~~~

Specifies an immediate unsigned 8-bit offset, in bytes. The default value is 0.

Used with DS instructions which have 2 addresses.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    offset:{0..0xFF}                         Specifies an 8-bit offset.
    ======================================== ================================================

.. _amdgpu_synid_ds_offset16:

ds_offset16
~~~~~~~~~~~

Specifies an immediate unsigned 16-bit offset, in bytes. The default value is 0.

Used with DS instructions which have 1 address.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    offset:{0..0xFFFF}                       Specifies a 16-bit offset.
    ======================================== ================================================

.. _amdgpu_synid_sw_offset16:

sw_offset16
~~~~~~~~~~~

This is a special modifier which may be used with the ds_swizzle_b32 instruction only.
Specifies a swizzle pattern in numeric or symbolic form. The default value is 0.

See AMD documentation for more information.

    ======================================================= ===================================================
    Syntax                                                  Description
    ======================================================= ===================================================
    offset:{0..0xFFFF}                                      Specifies a 16-bit swizzle pattern
                                                            in a numeric form.
    offset:swizzle(QUAD_PERM,{0..3},{0..3},{0..3},{0..3})   Specifies a quad permute mode pattern; each
                                                            number is a lane id.
    offset:swizzle(BITMASK_PERM, "<mask>")                  Specifies a bitmask permute mode pattern
                                                            which converts a 5-bit lane id to another
                                                            lane id with which the lane interacts.

                                                            <mask> is a 5 character sequence which
                                                            specifies how to transform the bits of the
                                                            lane id. The following characters are allowed:

                                                            * "0" - set bit to 0.

                                                            * "1" - set bit to 1.

                                                            * "p" - preserve bit.

                                                            * "i" - invert bit.

    offset:swizzle(BROADCAST,{2..32},{0..N})                Specifies a broadcast mode.
                                                            Broadcasts the value of any particular lane to
                                                            all lanes in its group.

                                                            The first numeric parameter is a group
                                                            size and must be equal to 2, 4, 8, 16 or 32.

                                                            The second numeric parameter is an index of the
                                                            lane being broadcast. The index must be less
                                                            than the group size.
    offset:swizzle(SWAP,{1..16})                            Specifies a swap mode.
                                                            Swaps the neighboring groups of
                                                            1, 2, 4, 8 or 16 lanes.
    offset:swizzle(REVERSE,{2..32})                         Specifies a reverse mode. Reverses
                                                            the lanes for groups of 2, 4, 8, 16 or 32 lanes.
    ======================================================= ===================================================

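The effect of a BITMASK_PERM mask on a lane id can be modelled with a few lines of
Python. This is only an illustrative sketch, not assembler code: the function name
apply_bitmask_perm is hypothetical, and the sketch assumes that the leftmost mask
character controls the most significant bit (bit 4) of the lane id.

```python
def apply_bitmask_perm(lane_id, mask):
    """Transform a 5-bit lane id according to a BITMASK_PERM mask string."""
    if len(mask) != 5 or any(c not in "01pi" for c in mask):
        raise ValueError("mask must be 5 characters drawn from '0', '1', 'p', 'i'")
    if not 0 <= lane_id < 32:
        raise ValueError("lane id must fit in 5 bits")
    result = 0
    for position, action in enumerate(mask):
        # Assumption: the leftmost character controls bit 4, the rightmost bit 0.
        bit_index = 4 - position
        bit = (lane_id >> bit_index) & 1
        if action == "0":
            bit = 0
        elif action == "1":
            bit = 1
        elif action == "i":
            bit ^= 1
        # "p" preserves the bit, so nothing to do.
        result |= bit << bit_index
    return result


# "pppii" inverts the two low bits, which reverses lanes within groups of 4.
mapping = [apply_bitmask_perm(lane, "pppii") for lane in range(8)]
print(mapping)
```

Under the stated bit-order assumption, the mask "pppii" gives the same lane mapping
as a reverse over groups of 4 lanes.
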
.. _amdgpu_synid_gds:

gds
~~~

Specifies whether to use GDS or LDS memory (LDS is the default).

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    gds                                      Use GDS memory.
    ======================================== ================================================

EXP Modifiers
-------------

.. _amdgpu_synid_done:

done
~~~~

Specifies if this is the last export from the shader to the target. By default, the current
instruction does not finish an export sequence.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    done                                     Indicates the last export operation.
    ======================================== ================================================

.. _amdgpu_synid_compr:

compr
~~~~~

Indicates if the data are compressed (not compressed by default).

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    compr                                    Data are compressed.
    ======================================== ================================================

.. _amdgpu_synid_vm:

vm
~~

Specifies valid mask flag state (off by default).

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    vm                                       Set valid mask flag.
    ======================================== ================================================

FLAT Modifiers
--------------

.. _amdgpu_synid_flat_offset12:

flat_offset12
~~~~~~~~~~~~~

Specifies an immediate unsigned 12-bit offset, in bytes. The default value is 0.

Cannot be used with global/scratch opcodes. GFX9 only.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    offset:{0..4095}                         Specifies a 12-bit unsigned offset.
    ======================================== ================================================

.. _amdgpu_synid_flat_offset13:

flat_offset13
~~~~~~~~~~~~~

Specifies an immediate signed 13-bit offset, in bytes. The default value is 0.

Can be used with global/scratch opcodes only. GFX9 only.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    offset:{-4096..+4095}                    Specifies a 13-bit signed offset.
    ======================================== ================================================

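The two offset ranges above follow directly from the field width and signedness.
The Python sketch below derives them and checks a candidate value; the helper names
offset_range and check_flat_offset are hypothetical and are not part of the assembler.

```python
from typing import Tuple


def offset_range(bits: int, signed: bool) -> Tuple[int, int]:
    """Return the inclusive (low, high) range of an immediate of the given width."""
    if signed:
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1


def check_flat_offset(value: int, is_global_or_scratch: bool) -> bool:
    """Check an offset against flat_offset13 (global/scratch) or flat_offset12."""
    if is_global_or_scratch:
        low, high = offset_range(13, signed=True)
    else:
        low, high = offset_range(12, signed=False)
    return low <= value <= high


print(offset_range(12, signed=False))
print(offset_range(13, signed=True))
```
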
glc
~~~

See a description :ref:`here<amdgpu_synid_glc>`.

slc
~~~

See a description :ref:`here<amdgpu_synid_slc>`.

tfe
~~~

See a description :ref:`here<amdgpu_synid_tfe>`.

nv
~~

See a description :ref:`here<amdgpu_synid_nv>`.

MIMG Modifiers
--------------

.. _amdgpu_synid_dmask:

dmask
~~~~~

Specifies which channels (image components) are used by the operation. By default, no channels
are used.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    dmask:{0..15}                            Each bit corresponds to one of 4 image
                                             components (RGBA). If the specified bit value
                                             is 0, the component is not used, value 1 means
                                             that the component is used.
    ======================================== ================================================

This modifier has some limitations depending on instruction kind:

    ======================================== ================================================
    Instruction Kind                         Valid dmask Values
    ======================================== ================================================
    32-bit atomic cmpswap                    0x3
    other 32-bit atomic instructions         0x1
    64-bit atomic cmpswap                    0xF
    other 64-bit atomic instructions         0x3
    GATHER4                                  0x1, 0x2, 0x4, 0x8
    Other instructions                       any value
    ======================================== ================================================

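As an illustration of the two tables above, the following Python sketch decodes a
dmask value into component names and checks it against the per-instruction limits.
The bit order (bit 0 is R, bit 3 is A) and the instruction kind keys are assumptions
made for this example only.

```python
# Assumption: bit 0 selects R, bit 1 selects G, bit 2 selects B, bit 3 selects A.
COMPONENTS = "RGBA"

# Valid dmask values per instruction kind; the keys are hypothetical labels.
VALID_DMASK = {
    "atomic_cmpswap_32": {0x3},
    "atomic_32": {0x1},
    "atomic_cmpswap_64": {0xF},
    "atomic_64": {0x3},
    "gather4": {0x1, 0x2, 0x4, 0x8},
}


def dmask_components(dmask):
    """Return the names of the components enabled by a dmask value."""
    if not 0 <= dmask <= 15:
        raise ValueError("dmask must be in the range 0..15")
    return "".join(name for bit, name in enumerate(COMPONENTS) if dmask & (1 << bit))


def dmask_is_valid(kind, dmask):
    """Check a dmask value against the limits for the given instruction kind."""
    allowed = VALID_DMASK.get(kind)
    if allowed is None:
        # Other instructions accept any value in range.
        return 0 <= dmask <= 15
    return dmask in allowed


print(dmask_components(0x5))
print(dmask_is_valid("gather4", 0x3))
```
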
.. _amdgpu_synid_unorm:

unorm
~~~~~

Specifies whether address is normalized or not (normalized by default).

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    unorm                                    Force address to be un-normalized.
    ======================================== ================================================

glc
~~~

See a description :ref:`here<amdgpu_synid_glc>`.

slc
~~~

See a description :ref:`here<amdgpu_synid_slc>`.

.. _amdgpu_synid_r128:

r128
~~~~

Specifies texture resource size. The default size is 256 bits.

GFX7 and GFX8 only.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    r128                                     Specifies a 128-bit texture resource size.
    ======================================== ================================================

tfe
~~~

See a description :ref:`here<amdgpu_synid_tfe>`.

.. _amdgpu_synid_lwe:

lwe
~~~

Specifies LOD warning status (LOD warning is disabled by default).

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    lwe                                      Enables LOD warning.
    ======================================== ================================================

.. _amdgpu_synid_da:

da
~~

Specifies if an array index must be sent to TA. By default, the array index is not sent.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    da                                       Send an array index to TA.
    ======================================== ================================================

.. _amdgpu_synid_d16:

d16
~~~

Specifies data size: 16 or 32 bits (32 bits by default). Not supported by GFX7.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    d16                                      Enables 16-bit data mode.

                                             On loads, convert data in memory to 16-bit
                                             format before storing it in VGPRs.

                                             For stores, convert 16-bit data in VGPRs to
                                             32 bits before going to memory.

                                             Note that 16-bit data are stored in VGPRs
                                             unpacked in GFX8.0. In GFX8.1 and GFX9 16-bit
                                             data are packed.
    ======================================== ================================================

.. _amdgpu_synid_a16:

a16
~~~

Specifies size of image address components: 16 or 32 bits (32 bits by default). GFX9 only.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    a16                                      Enables 16-bit image address components.
    ======================================== ================================================

Miscellaneous Modifiers
-----------------------

.. _amdgpu_synid_glc:

glc
~~~

This modifier has a different meaning for loads, stores, and atomic operations.
The default value is off (0).

See AMD documentation for details.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    glc                                      Set glc bit to 1.
    ======================================== ================================================

.. _amdgpu_synid_slc:

slc
~~~

Specifies cache policy. The default value is off (0).

See AMD documentation for details.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    slc                                      Set slc bit to 1.
    ======================================== ================================================

.. _amdgpu_synid_tfe:

tfe
~~~

Controls access to partially resident textures. The default value is off (0).

See AMD documentation for details.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    tfe                                      Set tfe bit to 1.
    ======================================== ================================================

.. _amdgpu_synid_nv:

nv
~~

Specifies whether the instruction operates on non-volatile memory. By default, memory is volatile.

GFX9 only.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    nv                                       Indicates that instruction operates on
                                             non-volatile memory.
    ======================================== ================================================

MUBUF/MTBUF Modifiers
---------------------

.. _amdgpu_synid_idxen:

idxen
~~~~~

Specifies whether address components include an index. By default, no components are used.

Can be used together with :ref:`offen<amdgpu_synid_offen>`.

Cannot be used with :ref:`addr64<amdgpu_synid_addr64>`.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    idxen                                    Address components include an index.
    ======================================== ================================================

.. _amdgpu_synid_offen:

offen
~~~~~

Specifies whether address components include an offset. By default, no components are used.

Can be used together with :ref:`idxen<amdgpu_synid_idxen>`.

Cannot be used with :ref:`addr64<amdgpu_synid_addr64>`.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    offen                                    Address components include an offset.
    ======================================== ================================================

.. _amdgpu_synid_addr64:

addr64
~~~~~~

Specifies whether a 64-bit address is used. By default, no address is used.

GFX7 only. Cannot be used with :ref:`offen<amdgpu_synid_offen>` and
:ref:`idxen<amdgpu_synid_idxen>` modifiers.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    addr64                                   A 64-bit address is used.
    ======================================== ================================================

.. _amdgpu_synid_buf_offset12:

buf_offset12
~~~~~~~~~~~~

Specifies an immediate unsigned 12-bit offset, in bytes. The default value is 0.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    offset:{0..0xFFF}                        Specifies a 12-bit unsigned offset.
    ======================================== ================================================

glc
~~~

See a description :ref:`here<amdgpu_synid_glc>`.

slc
~~~

See a description :ref:`here<amdgpu_synid_slc>`.

.. _amdgpu_synid_lds:

lds
~~~

Specifies where to store the result: VGPRs or LDS (VGPRs by default).

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    lds                                      Store result in LDS.
    ======================================== ================================================

tfe
~~~

See a description :ref:`here<amdgpu_synid_tfe>`.

.. _amdgpu_synid_dfmt:

dfmt
~~~~

TBD

.. _amdgpu_synid_nfmt:

nfmt
~~~~

TBD

SMRD/SMEM Modifiers
-------------------

glc
~~~

See a description :ref:`here<amdgpu_synid_glc>`.

nv
~~

See a description :ref:`here<amdgpu_synid_nv>`.

VINTRP Modifiers
----------------

.. _amdgpu_synid_high:

high
~~~~

Specifies which half of the LDS word to use. Low half of LDS word is used by default.
GFX9 only.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    high                                     Use high half of LDS word.
    ======================================== ================================================

VOP1/VOP2 DPP Modifiers
-----------------------

GFX8 and GFX9 only.

.. _amdgpu_synid_dpp_ctrl:

dpp_ctrl
~~~~~~~~

Specifies how data are shared between threads. This is a mandatory modifier.
There is no default value.

Note. The lanes of a wavefront are organized in four banks and four rows.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    quad_perm:[{0..3},{0..3},{0..3},{0..3}]  Full permute of 4 threads.
    row_mirror                               Mirror threads within row.
    row_half_mirror                          Mirror threads within 1/2 row (8 threads).
    row_bcast:15                             Broadcast 15th thread of each row to next row.
    row_bcast:31                             Broadcast thread 31 to rows 2 and 3.
    wave_shl:1                               Wavefront left shift by 1 thread.
    wave_rol:1                               Wavefront left rotate by 1 thread.
    wave_shr:1                               Wavefront right shift by 1 thread.
    wave_ror:1                               Wavefront right rotate by 1 thread.
    row_shl:{1..15}                          Row shift left by 1-15 threads.
    row_shr:{1..15}                          Row shift right by 1-15 threads.
    row_ror:{1..15}                          Row rotate right by 1-15 threads.
    ======================================== ================================================

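A quad_perm pattern can be pictured as a small gather within each group of 4 threads.
The Python sketch below models that behaviour on a plain list; it assumes that the
i-th selector names the thread within the quad from which thread i reads, and the
function name quad_perm is used here only for illustration.

```python
def quad_perm(values, selectors):
    """Permute values within each group of 4 threads according to selectors."""
    if len(selectors) != 4 or any(not 0 <= s <= 3 for s in selectors):
        raise ValueError("quad_perm needs four selectors in the range 0..3")
    if len(values) % 4 != 0:
        raise ValueError("number of threads must be a multiple of 4")
    result = []
    for lane in range(len(values)):
        quad_base = lane - lane % 4
        # Assumption: thread i of a quad reads from thread selectors[i] of that quad.
        result.append(values[quad_base + selectors[lane % 4]])
    return result


reversed_quads = quad_perm(list(range(8)), [3, 2, 1, 0])
broadcast_first = quad_perm(list(range(8)), [0, 0, 0, 0])
print(reversed_quads, broadcast_first)
```
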
.. _amdgpu_synid_row_mask:

row_mask
~~~~~~~~

Controls which rows are enabled for data sharing. By default, all rows are enabled.

Note. The lanes of a wavefront are organized in four banks and four rows.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    row_mask:{0..15}                         Each of 4 bits in the mask controls one
                                             row (0 - disabled, 1 - enabled).
    ======================================== ================================================

.. _amdgpu_synid_bank_mask:

bank_mask
~~~~~~~~~

Controls which banks are enabled for data sharing. By default, all banks are enabled.

Note. The lanes of a wavefront are organized in four banks and four rows.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    bank_mask:{0..15}                        Each of 4 bits in the mask controls one
                                             bank (0 - disabled, 1 - enabled).
    ======================================== ================================================

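Both row_mask and bank_mask are 4-bit masks, so they can be decoded the same way.
The Python sketch below lists the enabled rows or banks for a mask value; it assumes
that bit i of the mask controls row (or bank) i, and enabled_indices is a hypothetical
helper name.

```python
def enabled_indices(mask):
    """Return the indices of the rows or banks enabled by a 4-bit mask."""
    if not 0 <= mask <= 15:
        raise ValueError("mask must be in the range 0..15")
    # Assumption: bit i of the mask controls row (or bank) i.
    return [index for index in range(4) if mask & (1 << index)]


print(enabled_indices(0xF))  # the default: everything enabled
print(enabled_indices(0x5))
```
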
.. _amdgpu_synid_bound_ctrl:

bound_ctrl
~~~~~~~~~~

Controls data sharing when accessing an invalid lane. By default, data sharing with
invalid lanes is disabled.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    bound_ctrl:0                             Enables data sharing with invalid lanes.
                                             Accessing data from an invalid lane will
                                             return zero.
    ======================================== ================================================

VOP1/VOP2/VOPC SDWA Modifiers
-----------------------------

GFX8 and GFX9 only.

clamp
~~~~~

See a description :ref:`here<amdgpu_synid_clamp>`.

omod
~~~~

See a description :ref:`here<amdgpu_synid_omod>`.

GFX9 only.

.. _amdgpu_synid_dst_sel:

dst_sel
~~~~~~~

Selects which bits in the destination are affected. By default, all bits are affected.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    dst_sel:DWORD                            Use bits 31:0.
    dst_sel:BYTE_0                           Use bits 7:0.
    dst_sel:BYTE_1                           Use bits 15:8.
    dst_sel:BYTE_2                           Use bits 23:16.
    dst_sel:BYTE_3                           Use bits 31:24.
    dst_sel:WORD_0                           Use bits 15:0.
    dst_sel:WORD_1                           Use bits 31:16.
    ======================================== ================================================

.. _amdgpu_synid_dst_unused:

dst_unused
~~~~~~~~~~

Controls what to do with the bits in the destination which are not selected
by :ref:`dst_sel<amdgpu_synid_dst_sel>`.
By default, unused bits are preserved.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    dst_unused:UNUSED_PAD                    Pad with zeros.
    dst_unused:UNUSED_SEXT                   Sign-extend upper bits, zero lower bits.
    dst_unused:UNUSED_PRESERVE               Preserve bits.
    ======================================== ================================================

.. _amdgpu_synid_src0_sel:

src0_sel
~~~~~~~~

Controls which bits in the src0 are used. By default, all bits are used.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    src0_sel:DWORD                           Use bits 31:0.
    src0_sel:BYTE_0                          Use bits 7:0.
    src0_sel:BYTE_1                          Use bits 15:8.
    src0_sel:BYTE_2                          Use bits 23:16.
    src0_sel:BYTE_3                          Use bits 31:24.
    src0_sel:WORD_0                          Use bits 15:0.
    src0_sel:WORD_1                          Use bits 31:16.
    ======================================== ================================================

.. _amdgpu_synid_src1_sel:

src1_sel
~~~~~~~~

Controls which bits in the src1 are used. By default, all bits are used.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    src1_sel:DWORD                           Use bits 31:0.
    src1_sel:BYTE_0                          Use bits 7:0.
    src1_sel:BYTE_1                          Use bits 15:8.
    src1_sel:BYTE_2                          Use bits 23:16.
    src1_sel:BYTE_3                          Use bits 31:24.
    src1_sel:WORD_0                          Use bits 15:0.
    src1_sel:WORD_1                          Use bits 31:16.
    ======================================== ================================================

VOP1/VOP2/VOPC SDWA Operand Modifiers
-------------------------------------

Operand modifiers are not used separately. They are applied to source operands.

GFX8 and GFX9 only.

abs
~~~

See a description :ref:`here<amdgpu_synid_abs>`.

neg
~~~

See a description :ref:`here<amdgpu_synid_neg>`.

.. _amdgpu_synid_sext:

sext
~~~~

Sign-extends value of a (sub-dword) operand to fill all 32 bits.
Has no effect for 32-bit operands.

Valid for integer operands only.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    sext(<operand>)                          Sign-extend operand value.
    ======================================== ================================================

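The interaction between a source selector such as src0_sel and sext can be sketched
in Python on a 32-bit register value. This is a model of the bit ranges listed in the
SDWA tables, not of the hardware; select_bits and SELECTIONS are hypothetical names.

```python
# Bit ranges (high, low) taken from the selector tables.
SELECTIONS = {
    "DWORD": (31, 0),
    "BYTE_0": (7, 0),
    "BYTE_1": (15, 8),
    "BYTE_2": (23, 16),
    "BYTE_3": (31, 24),
    "WORD_0": (15, 0),
    "WORD_1": (31, 16),
}


def select_bits(value, sel, sext=False):
    """Extract the selected bits of a 32-bit value, optionally sign-extending."""
    high, low = SELECTIONS[sel]
    width = high - low + 1
    field = (value >> low) & ((1 << width) - 1)
    if sext and field & (1 << (width - 1)):
        # Fill the bits above the field with ones; a no-op when width is 32.
        field |= 0xFFFFFFFF & ~((1 << width) - 1)
    return field


print(hex(select_bits(0x12345678, "BYTE_1")))
print(hex(select_bits(0x000080FF, "BYTE_1", sext=True)))
```

With DWORD selected, the sign-extension step leaves the value unchanged, which matches
the statement that sext has no effect for 32-bit operands.
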
VOP3 Modifiers
--------------

.. _amdgpu_synid_vop3_op_sel:

vop3_op_sel
~~~~~~~~~~~

Selects the low [15:0] or high [31:16] operand bits for source and destination operands.
By default, low bits are used for all operands.

The number of values specified with the op_sel modifier must match the number of instruction
operands (both source and destination). First value controls src0, second value controls src1
and so on, except that the last value controls destination.
The value 0 selects the low bits, while 1 selects the high bits.

Note. op_sel modifier affects 16-bit operands only. For 32-bit operands the value specified
by op_sel must be 0.

GFX9 only.

    ======================================== ============================================================
    Syntax                                   Description
    ======================================== ============================================================
    op_sel:[{0..1},{0..1}]                   Select operand bits for instructions with 1 source operand.
    op_sel:[{0..1},{0..1},{0..1}]            Select operand bits for instructions with 2 source operands.
    op_sel:[{0..1},{0..1},{0..1},{0..1}]     Select operand bits for instructions with 3 source operands.
    ======================================== ============================================================

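The ordering rule (sources first, destination last) is easy to get wrong when writing
op_sel by hand. The Python sketch below builds the modifier text from per-operand
choices; format_op_sel is a hypothetical helper, not an LLVM API.

```python
def format_op_sel(src_high, dst_high):
    """Build an op_sel modifier string.

    src_high holds one flag per source operand (True selects bits 31:16),
    dst_high is the flag for the destination, which always comes last.
    """
    if not 1 <= len(src_high) <= 3:
        raise ValueError("expected 1 to 3 source operands")
    bits = [int(bool(flag)) for flag in src_high] + [int(bool(dst_high))]
    return "op_sel:[" + ",".join(str(bit) for bit in bits) + "]"


# Two sources: src0 uses the high half, src1 the low half, result goes to the high half.
print(format_op_sel([True, False], True))
```
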
				802	.. _amdgpu_synid_clamp:
				803
				804	clamp
				805	~~~~~
				806
				807	Clamp meaning depends on instruction.
				808
				809	For v_cmp instructions, clamp modifier indicates that the compare signals
				810	if a floating point exception occurs. By default, signaling is disabled.
				811	Not supported by GFX7.
				812
				813	For integer operations, clamp modifier indicates that the result must be clamped
				814	to the largest and smallest representable value. By default, there is no clamping.
				815	Integer clamping is not supported by GFX7.
				816
				817	For floating point operations, clamp modifier indicates that the result must be clamped
				818	to the range [0.0, 1.0]. By default, there is no clamping.
				819
				820	Note. Clamp modifier is applied after :ref:`output modifiers<amdgpu_synid_omod>` (if any).
				821
				822	======================================== ================================================
				823	Syntax Description
				824	======================================== ================================================
				825	clamp Enables clamping (or signaling).
				826	======================================== ================================================
				827
				828	.. _amdgpu_synid_omod:
				829
				830	omod
				831	~~~~
				832
				833	Specifies if an output modifier must be applied to the result.
				834	By default, no output modifiers are applied.
				835
				836	Note. Output modifiers are applied before :ref:`clamping<amdgpu_synid_clamp>` (if any).
				837
				838	Output modifiers are valid for f32 and f64 floating point results only.
				839	They must not be used with f16.
				840
				841	Note. v_cvt_f16_f32 is an exception. This instruction produces an f16 result
				842	but accepts output modifiers.
				843
				844	======================================== ================================================
				845	Syntax                                   Description
				846	======================================== ================================================
				847	mul:2                                    Multiply the result by 2.
				848	mul:4                                    Multiply the result by 4.
				849	div:2                                    Multiply the result by 0.5.
				850	======================================== ================================================
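
The ordering of output modifiers and floating point clamping can be sketched in Python as follows. This is only a model of the documented order (output modifier first, then clamp to [0.0, 1.0]); the function name ``apply_output_modifiers`` is hypothetical, and special values such as NaN are not modeled.

```python
def apply_output_modifiers(result, omod=None, clamp=False):
    """Apply an output modifier, then (optionally) clamp to [0.0, 1.0].

    `omod` is one of None, "mul:2", "mul:4", "div:2" (the syntax above).
    Hypothetical model; NaN and denormal handling are not represented.
    """
    factors = {None: 1.0, "mul:2": 2.0, "mul:4": 4.0, "div:2": 0.5}
    value = result * factors[omod]         # output modifier is applied first
    if clamp:
        value = min(max(value, 0.0), 1.0)  # clamp is applied afterwards
    return value


print(apply_output_modifiers(0.75, omod="mul:2", clamp=True))  # 1.0
print(apply_output_modifiers(0.75, omod="div:2", clamp=True))  # 0.375
```

Because the multiplication happens before the clamp, ``mul:2`` combined with ``clamp`` saturates 0.75 to 1.0 rather than producing 1.5.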
				851
				852	VOP3 Operand Modifiers
				853	----------------------
				854
				855	Operand modifiers are not used separately. They are applied to source operands.
				856
				857	.. _amdgpu_synid_abs:
				858
				859	abs
				860	~~~
				861
				862	Computes the absolute value of its operand. Applied before :ref:`neg<amdgpu_synid_neg>` (if any).
				863	Valid for floating point operands only.
				864
				865	======================================== ================================================
				866	Syntax                                   Description
				867	======================================== ================================================
				868	abs(<operand>)                           Get absolute value of operand.
				869	\|<operand>|                             The same as above.
				870	======================================== ================================================
				871
				872	.. _amdgpu_synid_neg:
				873
				874	neg
				875	~~~
				876
				877	Computes the negated value of its operand. Applied after :ref:`abs<amdgpu_synid_abs>` (if any).
				878	Valid for floating point operands only.
				879
				880	======================================== ================================================
				881	Syntax                                   Description
				882	======================================== ================================================
				883	neg(<operand>)                           Get negative value of operand.
				884	-<operand>                               The same as above.
				885	======================================== ================================================
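
The order in which abs and neg are applied matters when both are present. A minimal Python sketch of that order is shown below; the function name ``apply_source_modifiers`` and its flag parameters are hypothetical and only serve to illustrate the text.

```python
def apply_source_modifiers(x, use_abs=False, use_neg=False):
    """Model the documented order: abs is applied first, then neg."""
    if use_abs:
        x = abs(x)
    if use_neg:
        x = -x
    return x


# With both modifiers the result is always non-positive: -|x|.
print(apply_source_modifiers(3.0, use_abs=True, use_neg=True))   # -3.0
print(apply_source_modifiers(-3.0, use_abs=True, use_neg=True))  # -3.0
```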
				886
				887	VOP3P Modifiers
				888	---------------
				889
				890	This section describes modifiers of regular VOP3P instructions.
				891	v_mad_mix modifiers are described :ref:`in a separate section<amdgpu_synid_mad_mix>`.
				892
				893	GFX9 only.
				894
				895	.. _amdgpu_synid_op_sel:
				896
				897	op_sel
				898	~~~~~~
				899
				900	Selects the low [15:0] or high [31:16] operand bits as input to the operation
				901	which results in the lower-half of the destination.
				902	By default, low bits are used for all operands.
				903
				904	The number of values specified with the op_sel modifier must match the number of source
				905	operands. First value controls src0, second value controls src1 and so on.
				906	The value 0 selects the low bits, while 1 selects the high bits.
				907
				908	======================================== =============================================================
				909	Syntax                                   Description
				910	======================================== =============================================================
				911	op_sel:[{0..1}]                          Select operand bits for instructions with 1 source operand.
				912	op_sel:[{0..1},{0..1}]                   Select operand bits for instructions with 2 source operands.
				913	op_sel:[{0..1},{0..1},{0..1}]            Select operand bits for instructions with 3 source operands.
				914	======================================== =============================================================
				915
				916	.. _amdgpu_synid_op_sel_hi:
				917
				918	op_sel_hi
				919	~~~~~~~~~
				920
				921	Selects the low [15:0] or high [31:16] operand bits as input to the operation
				922	which results in the upper-half of the destination.
				923	By default, high bits are used for all operands.
				924
				925	The number of values specified with the op_sel_hi modifier must match the number of source
				926	operands. First value controls src0, second value controls src1 and so on.
				927	The value 0 selects the low bits, while 1 selects the high bits.
				928
				929	======================================== =============================================================
				930	Syntax                                   Description
				931	======================================== =============================================================
				932	op_sel_hi:[{0..1}]                       Select operand bits for instructions with 1 source operand.
				933	op_sel_hi:[{0..1},{0..1}]                Select operand bits for instructions with 2 source operands.
				934	op_sel_hi:[{0..1},{0..1},{0..1}]         Select operand bits for instructions with 3 source operands.
				935	======================================== =============================================================
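
The interaction of op_sel and op_sel_hi can be sketched with a small Python model of a packed operation. This is an illustration under stated assumptions: ``packed_op`` and ``select_half`` are hypothetical names, registers are modeled as 32-bit integers, and the operation used in the example is a plain 16-bit unsigned addition chosen for simplicity.

```python
def select_half(reg, sel):
    """Return bits [31:16] of `reg` if sel is 1, else bits [15:0]."""
    return (reg >> 16) & 0xFFFF if sel else reg & 0xFFFF


def packed_op(op, srcs, op_sel=None, op_sel_hi=None):
    """Hypothetical model of a packed 16-bit operation.

    op_sel picks the inputs for the lower half of the destination
    (default: low bits); op_sel_hi picks the inputs for the upper
    half (default: high bits).
    """
    n = len(srcs)
    op_sel = [0] * n if op_sel is None else op_sel
    op_sel_hi = [1] * n if op_sel_hi is None else op_sel_hi
    if len(op_sel) != n or len(op_sel_hi) != n:
        raise ValueError("one selector value is required per source operand")
    lo = op(*[select_half(r, s) for r, s in zip(srcs, op_sel)]) & 0xFFFF
    hi = op(*[select_half(r, s) for r, s in zip(srcs, op_sel_hi)]) & 0xFFFF
    return (hi << 16) | lo


def add16(a, b):
    return a + b


src0, src1 = 0x00020001, 0x00040003
print(hex(packed_op(add16, [src0, src1])))                                   # 0x60004
print(hex(packed_op(add16, [src0, src1], op_sel=[1, 0], op_sel_hi=[0, 1])))  # 0x50005
```

With the defaults, each half of the destination is computed from the matching halves of the sources; the second call swaps which half of src0 feeds each result.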
				936
				937	.. _amdgpu_synid_neg_lo:
				938
				939	neg_lo
				940	~~~~~~
				941
				942	Specifies whether to change sign of operand values selected by
				943	:ref:`op_sel<amdgpu_synid_op_sel>`. These values are then used
				944	as input to the operation which results in the lower-half of the destination.
				945
				946	The number of values specified with this modifier must match the number of source
				947	operands. First value controls src0, second value controls src1 and so on.
				948
				949	The value 0 indicates that the corresponding operand value is used unmodified;
				950	the value 1 indicates that the negated value of the operand must be used.
				951
				952	By default, operand values are used unmodified.
				953
				954	This modifier is valid for floating point operands only.
				955
				956	======================================== ==================================================================
				957	Syntax                                   Description
				958	======================================== ==================================================================
				959	neg_lo:[{0..1}]                          Select affected operands for instructions with 1 source operand.
				960	neg_lo:[{0..1},{0..1}]                   Select affected operands for instructions with 2 source operands.
				961	neg_lo:[{0..1},{0..1},{0..1}]            Select affected operands for instructions with 3 source operands.
				962	======================================== ==================================================================
				963
				964	.. _amdgpu_synid_neg_hi:
				965
				966	neg_hi
				967	~~~~~~
				968
				969	Specifies whether to change sign of operand values selected by
				970	:ref:`op_sel_hi<amdgpu_synid_op_sel_hi>`. These values are then used
				971	as input to the operation which results in the upper-half of the destination.
				972
				973	The number of values specified with this modifier must match the number of source
				974	operands. First value controls src0, second value controls src1 and so on.
				975
				976	The value 0 indicates that the corresponding operand value is used unmodified;
				977	the value 1 indicates that the negated value of the operand must be used.
				978
				979	By default, operand values are used unmodified.
				980
				981	This modifier is valid for floating point operands only.
				982
				983	======================================== ==================================================================
				984	Syntax                                   Description
				985	======================================== ==================================================================
				986	neg_hi:[{0..1}]                          Select affected operands for instructions with 1 source operand.
				987	neg_hi:[{0..1},{0..1}]                   Select affected operands for instructions with 2 source operands.
				988	neg_hi:[{0..1},{0..1},{0..1}]            Select affected operands for instructions with 3 source operands.
				989	======================================== ==================================================================
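
Both neg_lo and neg_hi follow the same per-operand pattern: one value per source operand, applied to operand values that have already been selected. A minimal Python sketch of that pattern is given below; ``apply_neg`` is a hypothetical helper, and the input list stands for the values already picked by op_sel (for neg_lo) or op_sel_hi (for neg_hi).

```python
def apply_neg(selected, neg=None):
    """Negate the selected operand values whose modifier value is 1.

    `selected` holds one already-selected value per source operand.
    By default (neg is None) all values are used unmodified.
    """
    if neg is None:
        neg = [0] * len(selected)
    if len(neg) != len(selected):
        raise ValueError("one modifier value is required per source operand")
    return [-v if n else v for v, n in zip(selected, neg)]


print(apply_neg([1.5, -2.0], [1, 0]))  # [-1.5, -2.0]
print(apply_neg([1.5, -2.0]))          # [1.5, -2.0]
```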
				990
				991	clamp
				992	~~~~~
				993
				994	See a description :ref:`here<amdgpu_synid_clamp>`.
				995
				996	.. _amdgpu_synid_mad_mix:
				997
				998	VOP3P V_MAD_MIX Modifiers
				999	-------------------------
				1000
				1001	These instructions use VOP3P format but have different modifiers.
				1002
				1003	GFX9 only.
				1004
Dmitry Preobrazhensky	c80b165	2018-07-27 14:17:15 +0000	[diff] [blame^]	1005	.. _amdgpu_synid_mad_mix_op_sel:
Dmitry Preobrazhensky	c6d31e6	2018-03-12 15:55:08 +0000	[diff] [blame]	1006
Dmitry Preobrazhensky	c80b165	2018-07-27 14:17:15 +0000	[diff] [blame^]	1007	mad_mix_op_sel
				1008	~~~~~~~~~~~~~~
Dmitry Preobrazhensky	c6d31e6	2018-03-12 15:55:08 +0000	[diff] [blame]	1009
				1010	This modifier is meaningful only for 16-bit source operands, as indicated by
Dmitry Preobrazhensky	c80b165	2018-07-27 14:17:15 +0000	[diff] [blame^]	1011	:ref:`mad_mix_op_sel_hi<amdgpu_synid_mad_mix_op_sel_hi>`.
Dmitry Preobrazhensky	c6d31e6	2018-03-12 15:55:08 +0000	[diff] [blame]	1012	It selects either the low [15:0] or high [31:16] operand bits
				1013	as input to the operation.
				1014
				1015	The value 0 selects the low 16 bits, while 1 selects the high 16 bits.
				1016	By default, low bits are used for all operands.
				1017
				1018	======================================== ================================================
				1019	Syntax                                   Description
				1020	======================================== ================================================
Dmitry Preobrazhensky	c80b165	2018-07-27 14:17:15 +0000	[diff] [blame^]	1021	op_sel:[{0..1},{0..1},{0..1}]            Select location of each 16-bit source operand.
				1022	======================================== ================================================
				1023
				1024	.. _amdgpu_synid_mad_mix_op_sel_hi:
				1025
				1026	mad_mix_op_sel_hi
				1027	~~~~~~~~~~~~~~~~~
				1028
				1029	Selects the size of source operands: either 32 bits or 16 bits.
				1030	By default, 32 bits are used for all source operands.
				1031
				1032	The value 0 indicates 32 bits; the value 1 indicates 16 bits.
				1033	The location of 16 bits in the operand may be specified by
				1034	:ref:`mad_mix_op_sel<amdgpu_synid_mad_mix_op_sel>`.
				1035
				1036	======================================== ================================================
				1037	Syntax                                   Description
				1038	======================================== ================================================
				1039	op_sel_hi:[{0..1},{0..1},{0..1}]         Select size of each source operand.
Dmitry Preobrazhensky	c6d31e6	2018-03-12 15:55:08 +0000	[diff] [blame]	1040	======================================== ================================================
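
How the two v_mad_mix modifiers combine for a single source operand can be sketched in Python. This is a model built from the descriptions above, not a statement about the hardware: ``mad_mix_source`` and the bit-conversion helpers are hypothetical names, the register is modeled as a 32-bit integer, and the operand is assumed to be interpreted as an IEEE f32 or f16 value.

```python
import struct


def f32_from_bits(bits):
    """Reinterpret a 32-bit pattern as an IEEE single-precision value."""
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFFFFFF))[0]


def f16_from_bits(bits):
    """Reinterpret a 16-bit pattern as an IEEE half-precision value."""
    return struct.unpack("<e", struct.pack("<H", bits & 0xFFFF))[0]


def mad_mix_source(reg, op_sel=0, op_sel_hi=0):
    """Hypothetical model of reading one v_mad_mix source operand.

    op_sel_hi = 0: use all 32 bits (op_sel has no meaning).
    op_sel_hi = 1: use 16 bits; op_sel picks low (0) or high (1) half.
    """
    if op_sel_hi == 0:
        return f32_from_bits(reg)
    half = (reg >> 16) if op_sel else reg
    return f16_from_bits(half)


reg = 0x40003C00  # high half 0x4000 is 2.0 in f16, low half 0x3C00 is 1.0
print(mad_mix_source(reg, op_sel=0, op_sel_hi=1))  # 1.0
print(mad_mix_source(reg, op_sel=1, op_sel_hi=1))  # 2.0
print(mad_mix_source(0x3F800000))                  # 1.0 (read as f32 by default)
```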
				1041
				1042	abs
				1043	~~~
				1044
				1045	See a description :ref:`here<amdgpu_synid_abs>`.
				1046
				1047	neg
				1048	~~~
				1049
				1050	See a description :ref:`here<amdgpu_synid_neg>`.
				1051
				1052	clamp
				1053	~~~~~
				1054
				1055	See a description :ref:`here<amdgpu_synid_clamp>`.