.. _amdgpu-dwarf-extensions-for-heterogeneous-debugging:

********************************************
DWARF Extensions For Heterogeneous Debugging
********************************************

.. contents::
   :local:

.. warning::

   This document describes provisional extensions to DWARF Version 5
   [:ref:`DWARF <amdgpu-dwarf-DWARF>`] to support heterogeneous debugging. It is
   not currently fully implemented and is subject to change.

.. _amdgpu-dwarf-introduction:

Introduction
============

AMD [:ref:`AMD <amdgpu-dwarf-AMD>`] has been working on supporting heterogeneous
computing through the AMD Radeon Open Compute Platform (ROCm) [:ref:`AMD-ROCm
<amdgpu-dwarf-AMD-ROCm>`]. A heterogeneous computing program can be written in a
high level language such as C++ or Fortran with OpenMP pragmas, OpenCL, or HIP
(a portable C++ programming environment for heterogeneous computing [:ref:`HIP
<amdgpu-dwarf-HIP>`]). A heterogeneous compiler and runtime allow a program to
execute on multiple devices within the same native process. Devices could
include CPUs, GPUs, DSPs, FPGAs, or other special purpose accelerators.
Currently HIP programs execute on systems with CPUs and GPUs.

ROCm is fully open sourced and includes contributions to open source projects
such as LLVM for compilation [:ref:`LLVM <amdgpu-dwarf-LLVM>`] and GDB for
debugging [:ref:`GDB <amdgpu-dwarf-GDB>`], as well as collaboration with other
third party projects such as the GCC compiler [:ref:`GCC <amdgpu-dwarf-GCC>`]
and the Perforce TotalView HPC debugger [:ref:`Perforce-TotalView
<amdgpu-dwarf-Perforce-TotalView>`].

To support debugging heterogeneous programs, several features that are not
provided by current DWARF Version 5 [:ref:`DWARF <amdgpu-dwarf-DWARF>`] have
been identified. This document contains a collection of extensions that provide
those features.

The :ref:`amdgpu-dwarf-motivation` section describes the issues that are being
addressed for heterogeneous computing. That is followed by the
:ref:`amdgpu-dwarf-changes-relative-to-dwarf-version-5` section containing the
textual changes for the extensions relative to the DWARF Version 5 standard.
Then there is an :ref:`amdgpu-dwarf-examples` section that links to the AMD GPU
specific usage of the extensions that includes an example. Finally, there is a
:ref:`amdgpu-dwarf-references` section. There are a number of notes included
that raise open questions, or provide alternative approaches considered. The
extensions seek to be general in nature and backwards compatible with DWARF
Version 5. The goal is to meet the needs of any heterogeneous system and not be
vendor or architecture specific.

A fundamental aspect of the extensions is that they allow DWARF expression
location descriptions to be stack elements. The extensions are based on DWARF
Version 5 and maintain compatibility with DWARF Version 5. After attempting
several alternatives, the current thinking is that such extensions to DWARF
Version 5 are the simplest and cleanest way to support debugging optimized GPU
code. They also appear to be generally useful and may be able to address other
reported DWARF issues, as well as being helpful in providing better optimization
support for non-GPU code.

General feedback on these extensions is sought, together with suggestions on how
to clarify, simplify, or organize them. If there is general interest then some
or all of these extensions could be submitted as future DWARF proposals.

We are in the process of modifying LLVM and GDB to support these extensions,
which is providing experience and insights. We plan to upstream the changes to
those projects for any final form of the extensions.

The author very much appreciates the input provided so far by many others which
has been incorporated into this current version.

.. _amdgpu-dwarf-motivation:

Motivation
==========

This document presents a set of backwards compatible extensions to DWARF Version
5 [:ref:`DWARF <amdgpu-dwarf-DWARF>`] to support heterogeneous debugging.

The remainder of this section provides motivation for each extension in
terms of heterogeneous debugging on commercially available AMD GPU hardware
(AMDGPU). The goal is to add support to the AMD [:ref:`AMD <amdgpu-dwarf-AMD>`]
open source Radeon Open Compute Platform (ROCm) [:ref:`AMD-ROCm
<amdgpu-dwarf-AMD-ROCm>`] which is an implementation of the industry standard
for heterogeneous computing devices defined by the Heterogeneous System
Architecture (HSA) Foundation [:ref:`HSA <amdgpu-dwarf-HSA>`]. ROCm includes the
LLVM compiler [:ref:`LLVM <amdgpu-dwarf-LLVM>`] with upstreamed support for
AMDGPU [:ref:`AMDGPU-LLVM <amdgpu-dwarf-AMDGPU-LLVM>`]. The goal is to also add
the GDB debugger [:ref:`GDB <amdgpu-dwarf-GDB>`] with upstreamed support for
AMDGPU [:ref:`AMD-ROCgdb <amdgpu-dwarf-AMD-ROCgdb>`]. In addition, the goal is
to work with third parties to enable support for AMDGPU debugging in the GCC
compiler [:ref:`GCC <amdgpu-dwarf-GCC>`] and the Perforce TotalView HPC debugger
[:ref:`Perforce-TotalView <amdgpu-dwarf-Perforce-TotalView>`].

However, the extensions are intended to be vendor and architecture neutral. They
are believed to apply to other heterogeneous hardware devices including GPUs,
DSPs, FPGAs, and other specialized hardware. These collectively have
characteristics and requirements similar to those of AMDGPU devices. Some of the
extensions can also apply to traditional CPU hardware that supports large vector
registers. Compilers can map source languages and extensions that describe large
scale parallel execution onto the lanes of the vector registers. This is common
in programming languages used in ML and HPC. The extensions also include
improved support for optimized code on any architecture. Some of the
generalizations may also benefit other issues that have been raised.

The extensions have evolved through collaboration with many individuals and
active prototyping within the GDB debugger and LLVM compiler. Input has also
been very much appreciated from the developers working on the Perforce TotalView
HPC Debugger and GCC compiler.

The AMDGPU has several features that require additional DWARF functionality in
order to support optimized code.

AMDGPU optimized code may spill vector registers to non-global address space
memory, and this spilling may be done only for lanes that are active on entry
to the subprogram. To support this, a location description that can be created
as a masked select is required. See ``DW_OP_LLVM_select_bit_piece``.
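
The masked select idea above can be sketched as a conceptual model. This is not
the normative definition of ``DW_OP_LLVM_select_bit_piece``. The function name,
the 4-lane width, and the string placeholders for locations are all assumptions
made for illustration.

```python
def masked_select(mask, when_clear, when_set, lanes):
    """Pick a per-lane location: when_set[lane] if the lane's mask bit is 1,
    otherwise when_clear[lane]. All names here are hypothetical."""
    return [
        when_set[lane] if (mask >> lane) & 1 else when_clear[lane]
        for lane in range(lanes)
    ]


# Assumed scenario: lanes 0 and 2 were active on entry, so only their part of
# the vector register was spilled to memory; lanes 1 and 3 stay in the register.
register_lanes = ["vreg[0]", "vreg[1]", "vreg[2]", "vreg[3]"]
spill_lanes = ["mem[0]", "mem[1]", "mem[2]", "mem[3]"]
selected = masked_select(0b0101, register_lanes, spill_lanes, lanes=4)
print(selected)
```

The result is a composite made of pieces from two different location
descriptions. This is why the operation needs more than one location
description as input.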

Since the active lane mask may be held in a register, a way to get the value
of a register on entry to a subprogram is required. To support this, an
operation that returns the caller value of a register as specified by the Call
Frame Information (CFI) is required. See ``DW_OP_LLVM_call_frame_entry_reg``
and :ref:`amdgpu-dwarf-call-frame-information`.

Current DWARF uses an empty expression to indicate an undefined location
description. Since the masked select composite location description operation
takes more than one location description, it is necessary to have an explicit
way to specify an undefined location description. Otherwise it is not possible
to specify that a particular one of the input location descriptions is
undefined. See ``DW_OP_LLVM_undefined``.

CFI describes restoring callee saved registers that are spilled. Currently CFI
only allows a location description that is a register, memory address, or
implicit location description. AMDGPU optimized code may spill scalar
registers into portions of vector registers. This requires extending CFI to
allow any location description. See
:ref:`amdgpu-dwarf-call-frame-information`.

The vector registers of the AMDGPU are represented as their full wavefront
size, meaning the wavefront size times the dword size. This reflects the
actual hardware and allows the compiler to generate DWARF for languages that
map a thread to the complete wavefront. It also allows more efficient DWARF to
be generated to describe the CFI as only a single expression is required for
the whole vector register, rather than a separate expression for each lane's
dword of the vector register. It also allows the compiler to produce DWARF
that indexes the vector register if it spills scalar registers into portions
of a vector register.

Since DWARF stack value entries have a base type and AMDGPU registers are a
vector of dwords, the ability to specify that a base type is a vector is
required. See ``DW_AT_LLVM_vector_size``.

If the source language is mapped onto the AMDGPU wavefronts in a SIMT manner,
then the variable DWARF location expressions must compute the location for a
single lane of the wavefront. Therefore, a DWARF operation is required to denote
the current lane, much like ``DW_OP_push_object_address`` denotes the current
object. The ``DW_OP_*piece`` operations only allow literal indices. Therefore, a
way to use a computed offset of an arbitrary location description (such as a
vector register) is required. See ``DW_OP_LLVM_push_lane``,
``DW_OP_LLVM_offset``, ``DW_OP_LLVM_offset_uconst``, and
``DW_OP_LLVM_bit_offset``.
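
As a rough illustration of why a computed offset is needed, the sketch below
locates one lane's dword inside a vector register by multiplying the current
lane by the dword size. The 4-byte dword, the little-endian byte order, the
4-lane register, and all function names are assumptions for this example only.

```python
DWORD_BYTES = 4  # assumption: each lane holds one 32-bit dword


def lane_byte_offset(lane, dword_bytes=DWORD_BYTES):
    """Byte offset of a lane's dword within the whole vector register."""
    return lane * dword_bytes


def read_lane(vector_register, lane):
    """Read one lane's dword from a register modelled as a bytes object."""
    start = lane_byte_offset(lane)
    if start + DWORD_BYTES > len(vector_register):
        raise IndexError("lane lies outside the register storage")
    return int.from_bytes(vector_register[start:start + DWORD_BYTES], "little")


# Hypothetical 4-lane register holding the values 10, 20, 30, 40.
register = b"".join(v.to_bytes(DWORD_BYTES, "little") for v in (10, 20, 30, 40))
print(read_lane(register, 2))
```

The lane number is only known when the expression is evaluated, so the offset
cannot be written as a literal operand of a ``DW_OP_*piece`` operation.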

If the source language is mapped onto the AMDGPU wavefronts in a SIMT manner,
the compiler can use the AMDGPU execution mask register to control which lanes
are active. To describe the conceptual location of non-active lanes a DWARF
expression is needed that can compute a per lane PC. For efficiency, this is
done for the wavefront as a whole. This expression benefits from having a masked
select composite location description operation. This requires an attribute
for source location of each lane. The AMDGPU may update the execution mask for
whole wavefront operations and so needs an attribute that computes the current
active lane mask. See ``DW_OP_LLVM_select_bit_piece``, ``DW_OP_LLVM_extend``,
``DW_AT_LLVM_lane_pc``, and ``DW_AT_LLVM_active_lane``.

AMDGPU needs to be able to describe addresses that are in different kinds of
memory. Optimized code may need to describe a variable that resides in pieces
that are in different kinds of storage which may include parts of registers,
memory that is in a mixture of memory kinds, implicit values, or be undefined.
DWARF has the concept of segment addresses. However, the segment cannot be
specified within a DWARF expression, which is only able to specify the offset
portion of a segment address. The segment index is only provided by the entity
that specifies the DWARF expression. Therefore, the segment index is a
property that can only be put on complete objects, such as a variable. That
makes it only suitable for describing an entity (such as variable or
subprogram code) that is in a single kind of memory. Therefore, AMDGPU uses
the DWARF concept of address spaces. For example, a variable may be allocated
in a register that is partially spilled to the call stack which is in the
private address space, and partially spilled to the local address space.

DWARF uses the concept of an address in many expression operations but does not
define how it relates to address spaces. For example,
``DW_OP_push_object_address`` pushes the address of an object. Other contexts
implicitly push an address on the stack before evaluating an expression. For
example, the ``DW_AT_use_location`` attribute of the
``DW_TAG_ptr_to_member_type``. The expression that uses the address needs to
do so in a general way and not depend on the address space of the address. For
example, a pointer to member value may want to be applied to an object that may
reside in any address space.

The number of registers and the cost of memory operations are much higher for
AMDGPU than for a typical CPU. The compiler attempts to optimize whole variables
and arrays into registers. Currently DWARF only allows
``DW_OP_push_object_address`` and related operations to work with a global
memory location. To support AMDGPU optimized code it is required to generalize
DWARF to allow any location description to be used. This allows registers, or
composite location descriptions that may be a mixture of memory, registers, or
even implicit values.

DWARF Version 5 does not allow location descriptions to be entries on the
DWARF stack. They can only be the final result of the evaluation of a DWARF
expression. However, by allowing a location description to be a first-class
entry on the DWARF stack it becomes possible to compose expressions containing
both values and location descriptions naturally. It allows objects to be
located in any kind of memory address space, in registers, be implicit values,
be undefined, or a composite of any of these. By extending DWARF carefully,
all existing DWARF expressions can retain their current semantic meaning.
DWARF has implicit conversions that convert from a value that represents an
address in the default address space to a memory location description. This
can be extended to allow a default address space memory location description
to be implicitly converted back to its address value. This allows all DWARF
Version 5 expressions to retain their same meaning, while adding the ability
to explicitly create memory location descriptions in non-default address
spaces and generalizing the power of composite location descriptions to any
kind of location description. See :ref:`amdgpu-dwarf-operation-expressions`.
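
The two implicit conversions described above can be modelled with a small
sketch. It is a simplified illustration, not how any particular consumer is
implemented. The ``MemoryLocation`` class, the helper names, and the use of ``0``
to number the default address space are assumptions.

```python
from dataclasses import dataclass

DEFAULT_ADDRESS_SPACE = 0  # assumption: numbering chosen for this sketch


@dataclass(frozen=True)
class MemoryLocation:
    """Hypothetical stack entry: a memory location description."""
    address_space: int
    address: int


def as_location(entry):
    """Value to location: an integer is treated as a default space address."""
    if isinstance(entry, MemoryLocation):
        return entry
    return MemoryLocation(DEFAULT_ADDRESS_SPACE, entry)


def as_value(entry):
    """Location to value: only allowed for the default address space."""
    if isinstance(entry, int):
        return entry
    if entry.address_space != DEFAULT_ADDRESS_SPACE:
        raise ValueError("no implicit conversion for a non-default address space")
    return entry.address


# A DWARF Version 5 style address survives the round trip unchanged.
print(as_value(as_location(0x1000)))
```

Restricting the conversion back to a value to the default address space is what
lets existing expressions keep their meaning while locations in other address
spaces remain first-class stack entries.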

To allow composition of composite location descriptions, an explicit operation
that indicates the end of the definition of a composite location description
is required. This can be implied if the end of a DWARF expression is reached,
allowing current DWARF expressions to remain legal. See
``DW_OP_LLVM_piece_end``.

The ``DW_OP_plus`` and ``DW_OP_minus`` can be defined to operate on a memory
location description in the default target architecture specific address space
and a generic type value to produce an updated memory location description. This
allows them to continue to be used to offset an address. To generalize
offsetting to any location description, including location descriptions that
describe when bytes are in registers, are implicit, or a composite of these, the
``DW_OP_LLVM_offset``, ``DW_OP_LLVM_offset_uconst``, and
``DW_OP_LLVM_bit_offset`` offset operations are added. Unlike ``DW_OP_plus``,
``DW_OP_plus_uconst``, and ``DW_OP_minus`` arithmetic operations, these do not
define that integer overflow causes wrap-around. The offset operations can
operate on location storage of any size. For example, implicit location storage
could be any number of bits in size. It is simpler to define offsets that exceed
the size of the location storage as being an evaluation error, than having to
force an implementation to support potentially infinite precision offsets to
allow it to correctly track a series of positive and negative offsets that may
transiently overflow or underflow, but end up in range. This is simple for the
arithmetic operations as they are defined in terms of two's complement
arithmetic on a base type of a fixed size.
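
The contrast between the two behaviours can be sketched as follows. The 64-bit
generic type width, the ``EvaluationError`` class, and the function names are
assumptions. Whether an offset exactly equal to the storage size is accepted is
also an assumption here, since the paragraph above only says that offsets that
exceed the size are errors.

```python
GENERIC_BITS = 64  # assumption: width of the generic type in this sketch


class EvaluationError(Exception):
    """Hypothetical error raised when an expression cannot be evaluated."""


def arithmetic_plus(a, b, bits=GENERIC_BITS):
    """Fixed-size two's complement addition: overflow wraps around."""
    return (a + b) % (1 << bits)


def offset_location(storage_bits, bit_offset, delta_bits):
    """Offset within location storage: leaving the storage is an error."""
    new_offset = bit_offset + delta_bits
    if new_offset < 0 or new_offset > storage_bits:
        raise EvaluationError("offset falls outside the location storage")
    return new_offset


# Arithmetic wraps silently at the type width.
print(arithmetic_plus((1 << GENERIC_BITS) - 1, 1))
# Offsetting stays within a 32-bit piece of storage.
print(offset_location(storage_bits=32, bit_offset=24, delta_bits=8))
```

With this rule a consumer only has to check each step against the storage size
and never has to carry an out-of-range intermediate offset forward.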

Having the offset operations allows ``DW_OP_push_object_address`` to push a
location description that may be in a register, or be an implicit value, and the
DWARF expression of ``DW_TAG_ptr_to_member_type`` can contain them to offset
within it. ``DW_OP_LLVM_bit_offset`` generalizes DWARF to work with bit fields
which is not possible in DWARF Version 5.

The DWARF ``DW_OP_xderef*`` operations allow a value to be converted into an
address of a specified address space which is then read. But they provide no
way to create a memory location description for an address in the non-default
address space. For example, AMDGPU variables can be allocated in the local
address space at a fixed address. It is required to have an operation to
create an address in a specific address space that can be used to define the
location description of the variable. Defining this operation to produce a
location description allows the size of addresses in an address space to be
larger than the generic type. See ``DW_OP_LLVM_form_aspace_address``.

If the ``DW_OP_LLVM_form_aspace_address`` operation had to produce a value
that can be implicitly converted to a memory location description, then it
would be limited to the size of the generic type which matches the size of the
default address space. Its value would be undefined and likely not match any
value in the actual program. By making the result a location description, it
allows a consumer great freedom in how it implements it. The implicit
conversion back to a value can be limited only to the default address space to
maintain compatibility with DWARF Version 5. For other address spaces the
producer can use the new operations that explicitly specify the address space.

``DW_OP_breg*`` treats the register as containing an address in the default
address space. It is required to be able to specify the address space of the
register value. See ``DW_OP_LLVM_aspace_bregx``.

Similarly, ``DW_OP_implicit_pointer`` treats its implicit pointer value as
being in the default address space. It is required to be able to specify the
address space of the pointer value. See
``DW_OP_LLVM_aspace_implicit_pointer``.

Almost all uses of addresses in DWARF are limited to defining location
descriptions, or to be dereferenced to read memory. The exception is
``DW_CFA_val_offset`` which uses the address to set the value of a register.
By defining the CFA DWARF expression as being a memory location description,
it can maintain what address space it is, and that can be used to convert the
offset address back to an address in that address space. See
:ref:`amdgpu-dwarf-call-frame-information`.

This approach allows all existing DWARF to have the identical semantics. It
allows the compiler to explicitly specify the address space it is using. For
example, a compiler could choose to access private memory in a swizzled manner
when mapping a source language to a wavefront in a SIMT manner, or to access
it in an unswizzled manner if mapping the same language with the wavefront
being the thread. It also allows the compiler to mix the address space it uses
to access private memory. For example, for SIMT it can still spill entire
vector registers in an unswizzled manner, while using a swizzled private
memory for SIMT variable access. This approach allows memory location
descriptions for different address spaces to be combined using the regular
``DW_OP_*piece`` operations.

Location descriptions are an abstraction of storage. They give freedom to the
consumer on how to implement them. They allow the address space to encode lane
information so they can be used to read memory with only the memory
description and no extra arguments. The same set of operations can operate on
locations independent of their kind of storage. The ``DW_OP_deref*`` therefore
can be used on any storage kind. ``DW_OP_xderef*`` is unnecessary, except as a
more compact way to form a non-default address space address and then
dereference it.

In DWARF Version 5 a location description is defined as a single location
description or a location list. A location list is defined as either
effectively an undefined location description or as one or more single
location descriptions to describe an object with multiple places. The
``DW_OP_push_object_address`` and ``DW_OP_call*`` operations can put a
location description on the stack. Furthermore, debugging information entry
attributes such as ``DW_AT_data_member_location``, ``DW_AT_use_location``, and
``DW_AT_vtable_elem_location`` are defined as pushing a location description
on the expression stack before evaluating the expression. However, DWARF
Version 5 only allows the stack to contain values and so only a single memory
address can be on the stack which makes these incapable of handling location
descriptions with multiple places, or places other than memory. Since these
extensions allow the stack to contain location descriptions, the operations are
generalized to support location descriptions that can have multiple places.
This is backwards compatible with DWARF Version 5 and allows objects with
multiple places to be supported. For example, the expression that describes
how to access the field of an object can be evaluated with a location
description that has multiple places and will result in a location description
with multiple places as expected. With this change, the separate DWARF Version
5 sections that described DWARF expressions and location lists have been
unified into a single section that describes DWARF expressions in general.
This unification seems to be a natural consequence and a necessity of allowing
location descriptions to be part of the evaluation stack.

For those familiar with the definition of location descriptions in DWARF Version
5, the definitions in these extensions are presented differently, but do
in fact define the same concept with the same fundamental semantics. However,
they do so in a way that allows the concept to extend to support address
spaces, bit addressing, the ability for composite location descriptions to be
composed of any kind of location description, and the ability to support
objects located at multiple places. Collectively these changes expand the set
of processors that can be supported and improve support for optimized code.

Several approaches were considered, and the one presented appears to be the
cleanest and offers the greatest improvement of DWARF's ability to support
optimized code. From examining the GDB debugger and LLVM compiler, it appears to
require only modest changes as they both already have to support general use of
location descriptions. It is anticipated that this will also be the case for
other debuggers and compilers.

As an experiment, GDB was modified to evaluate DWARF Version 5 expressions
with location descriptions as stack entries and implicit conversions. All GDB
tests have passed, except one that turned out to be an invalid test by DWARF
Version 5 rules. The code in GDB actually became simpler as all evaluation was
on the stack and there was no longer a need to maintain a separate structure
for the location description result. This gives confidence in the backwards
compatibility.

Since the AMDGPU supports languages such as OpenCL [:ref:`OpenCL
<amdgpu-dwarf-OpenCL>`], there is a need to define source language address
classes so they can be used in a consistent way by consumers. It would also be
desirable to add support for using them in defining language types rather than
the current target architecture specific address spaces. See
:ref:`amdgpu-dwarf-segment_addresses`.

A ``DW_AT_LLVM_augmentation`` attribute is added to a compilation unit
debugging information entry to indicate that there is additional target
architecture specific information in the debugging information entries of that
compilation unit. This allows a consumer to know what extensions are present
in the debugging information entries as is possible with the augmentation
string of other sections. The format that should be used for the augmentation
string in the lookup by name table and CFI Common Information Entry is also
recommended to allow a consumer to parse the string when it contains
information from multiple vendors.

The AMDGPU supports programming languages that include online compilation
where the source text may be created at runtime. Therefore, a way to embed the
source text in the debug information is required. For example, the OpenCL
language runtime supports online compilation. See
:ref:`amdgpu-dwarf-line-number-information`.

Support to allow MD5 checksums to be optionally present in the line table is
added. This allows linking together compilation units where some have MD5
checksums and some do not. In DWARF Version 5 the file timestamp and file size
can be optional, but if the MD5 checksum is present it must be valid for all
files. See :ref:`amdgpu-dwarf-line-number-information`.

Support is added for the HIP programming language [:ref:`HIP
<amdgpu-dwarf-HIP>`] which is supported by the AMDGPU. See
:ref:`amdgpu-dwarf-language-names`.

The following sections provide the definitions for the additional operations,
as well as clarifying how existing expression operations, CFI operations, and
attributes behave with respect to generalized location descriptions that
support address spaces and location descriptions that support multiple places.
They have been defined to be backwards compatible with DWARF Version 5.
The definitions are intended to fully define well-formed DWARF in a consistent
style based on the DWARF Version 5 specification. Non-normative text is shown
in italics.

The names for the new operations, attributes, and constants include
"\ ``LLVM``\ " and are encoded with vendor specific codes so these extensions
can be implemented as an LLVM vendor extension to DWARF Version 5. If accepted
these names would not include the "\ ``LLVM``\ " and would not use encodings in
the vendor range.

The extensions are described in
:ref:`amdgpu-dwarf-changes-relative-to-dwarf-version-5` and are
organized to follow the section ordering of DWARF Version 5. They include notes
to indicate the corresponding DWARF Version 5 sections to which they pertain.
Other notes describe additional changes that may be worth considering, and
raise questions.
				419
Tony	e24f5f3	2020-07-03 22:31:53 +0000	[diff] [blame]	420	.. _amdgpu-dwarf-changes-relative-to-dwarf-version-5:
Tony	756ba35	2020-04-20 16:55:34 -0400	[diff] [blame]	421
Tony	e24f5f3	2020-07-03 22:31:53 +0000	[diff] [blame]	422	Changes Relative to DWARF Version 5
				423	===================================
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	424
				425	General Description
				426	-------------------
				427
				428	Attribute Types
				429	~~~~~~~~~~~~~~~
				430
				431	.. note::
				432
				433	This augments DWARF Version 5 section 2.2 and Table 2.2.
				434
				435	The following table provides the additional attributes. See
				436	:ref:`amdgpu-dwarf-debugging-information-entry-attributes`.
				437
				438	.. table:: Attribute names
				439	:name: amdgpu-dwarf-attribute-names-table
				440
				441	=========================== ====================================
				442	Attribute Usage
				443	=========================== ====================================
				444	``DW_AT_LLVM_active_lane`` SIMD or SIMT active lanes
				445	``DW_AT_LLVM_augmentation`` Compilation unit augmentation string
				446	``DW_AT_LLVM_lane_pc`` SIMD or SIMT lane program location
				447	``DW_AT_LLVM_lanes`` SIMD or SIMT thread lane count
				448	``DW_AT_LLVM_vector_size`` Base type vector size
				449	=========================== ====================================
				450
				451	.. _amdgpu-dwarf-expressions:
				452
				453	DWARF Expressions
				454	~~~~~~~~~~~~~~~~~
				455
				456	.. note::
				457
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	458	This section, and its nested sections, replaces DWARF Version 5 section 2.5
Tony	e24f5f3	2020-07-03 22:31:53 +0000	[diff] [blame]	459	and section 2.6. It defines the new DWARF expression operation extensions and
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	460	clarifies the extensions to existing DWARF Version 5 operations. It is based
				461	on the text of the existing DWARF Version 5 standard.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	462
				463	DWARF expressions describe how to compute a value or specify a location.
				464
				465	*The evaluation of a DWARF expression can provide the location of an object, the
				466	value of an array bound, the length of a dynamic string, the desired value
				467	itself, and so on.*
				468
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	469	If the evaluation of a DWARF expression does not encounter an error, then it can
				470	either result in a value (see :ref:`amdgpu-dwarf-expression-value`) or a
				471	location description (see :ref:`amdgpu-dwarf-location-description`). When a
				472	DWARF expression is evaluated, it may be specified whether a value or location
				473	description is required as the result kind.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	474
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	475	If a result kind is specified, and the result of the evaluation does not match
				476	the specified result kind, then the implicit conversions described in
				477	:ref:`amdgpu-dwarf-memory-location-description-operations` are performed if
				478	valid. Otherwise, the DWARF expression is ill-formed.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	479
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	480	If the evaluation of a DWARF expression encounters an evaluation error, then the
				481	result is an evaluation error.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	482
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	483	.. note::
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	484
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	485	Decided to define the concept of an evaluation error. An alternative is to
				486	introduce an undefined value base type in a similar way to location
				487	descriptions having an undefined location description. Then operations that
				488	encounter an evaluation error can return the undefined location description or
				489	value with an undefined base type.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	490
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	491	All operations that act on values would return an undefined entity if given an
				492	undefined value. The expression would then always evaluate to completion, and
				493	can be tested to determine if it is an undefined entity.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	494
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	495	However, this would add considerable additional complexity and does not match
				496	GDB, which throws an exception when these evaluation errors occur.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	497
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	498	If a DWARF expression is ill-formed, then the result is undefined.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	499
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	500	The following sections detail the rules for when a DWARF expression is
				501	ill-formed or results in an evaluation error.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	502
				503	A DWARF expression can either be encoded as an operation expression (see
				504	:ref:`amdgpu-dwarf-operation-expressions`), or as a location list expression
				505	(see :ref:`amdgpu-dwarf-location-list-expressions`).
				506
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	507	.. _amdgpu-dwarf-expression-evaluation-context:
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	508
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	509	DWARF Expression Evaluation Context
				510	+++++++++++++++++++++++++++++++++++
				511
				512	A DWARF expression is evaluated in a context that can include a number of
				513	context elements. If multiple context elements are specified then they must be
				514	self-consistent or the result of the evaluation is undefined. The context
				515	elements that can be specified are:
				516
				517	A current result kind
				518
				519	The kind of result required by the DWARF expression evaluation. If specified
				520	it can be a location description or a value.
				521
				522	A current thread
				523
				524	The target architecture thread identifier of the source program thread of
				525	execution for which a user presented expression is currently being evaluated.
				526
				527	It is required for operations that are related to target architecture threads.
				528
				529	For example, the ``DW_OP_form_tls_address`` operation and
				530	``DW_OP_LLVM_form_aspace_address`` *operation when given an address space that
				531	is thread specific.*
				532
				533	A current lane
				534
				535	The target architecture lane identifier of the source program thread of
				536	execution for which a user presented expression is currently being evaluated.
				537	This applies to languages that are implemented using a SIMD or SIMT execution
				538	model.
				539
				540	It is required for operations that are related to target architecture lanes.
				541
				542	For example, the ``DW_OP_LLVM_push_lane`` operation and
				543	``DW_OP_LLVM_form_aspace_address`` *operation when given an address space that
				544	is lane specific.*
				545
				546	If specified, it must be consistent with any specified current thread and
				547	current target architecture. It is consistent with a thread if it identifies a
				548	lane of the thread. It is consistent with a target architecture if it is a
				549	valid lane identifier of the target architecture. Otherwise the result is
				550	undefined.
				551
				552	A current call frame
				553
				554	The target architecture call frame identifier. It identifies a call frame that
				555	corresponds to an active invocation of a subprogram in the current thread. It
				556	is identified by its address on the call stack. The address is referred to as
				557	the Canonical Frame Address (CFA). The call frame information is used to
				558	determine the CFA for the call frames of the current thread's call stack (see
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	559	:ref:`amdgpu-dwarf-call-frame-information`).
				560
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	561	It is required for operations that specify target architecture registers to
				562	support virtual unwinding of the call stack.
				563
				564	For example, the ``DW_OP_reg`` operations.
				565
				566	If specified, it must be an active call frame in the current thread. If the
				567	current lane is specified, then that lane must have been active on entry to
				568	the call frame (see the ``DW_AT_LLVM_lane_pc`` attribute). Otherwise the
				569	result is undefined.
				570
				571	If it is the currently executing call frame, then it is termed the top call
				572	frame.
				573
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	574	A current program location
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	575
				576	The target architecture program location corresponding to the current call
				577	frame of the current thread.
				578
				579	The program location of the top call frame is the target architecture program
				580	counter for the current thread. The call frame information is used to obtain
				581	the value of the return address register to determine the program location of
				582	the other call frames (see :ref:`amdgpu-dwarf-call-frame-information`).
				583
				584	It is required for the evaluation of location list expressions to select
				585	amongst multiple program location ranges. It is required for operations that
				586	specify target architecture registers to support virtual unwinding of the call
				587	stack (see :ref:`amdgpu-dwarf-call-frame-information`).
				588
				589	If specified:
				590
				591	* If the current lane is not specified:
				592
				593	* If the current call frame is the top call frame, it must be the current
				594	target architecture program location.
				595
				596	* If the current call frame F is not the top call frame, it must be the
				597	program location associated with the call site in the current caller frame
				598	F that invoked the callee frame.
				599
				600	* If the current lane is specified and the architecture program location LPC
				601	computed by the ``DW_AT_LLVM_lane_pc`` attribute for the current lane is not
				602	the undefined location description (indicating the lane was not active on
				603	entry to the call frame), it must be LPC.
				604
				605	* Otherwise the result is undefined.
				606
				607	A current compilation unit
				608
				609	The compilation unit debug information entry that contains the DWARF expression
				610	being evaluated.
				611
				612	It is required for operations that reference debug information associated with
				613	the same compilation unit, including indicating if such references use the
				614	32-bit or 64-bit DWARF format. It can also provide the default address space
				615	address size if no current target architecture is specified.
				616
				617	For example, the ``DW_OP_constx`` and ``DW_OP_addrx`` operations.
				618
				619	*Note that this compilation unit may not be the same as the compilation unit
				620	determined from the loaded code object corresponding to the current program
				621	location. For example, the evaluation of the expression E associated with a
				622	``DW_AT_location`` attribute of the debug information entry operand of the
				623	``DW_OP_call*`` operations is evaluated with the compilation unit that
				624	contains E and not the one that contains the ``DW_OP_call*`` operation
				625	expression.*
				626
				627	A current target architecture
				628
				629	The target architecture.
				630
				631	It is required for operations that specify target architecture specific
				632	entities.
				633
				634	*For example, target architecture specific entities include DWARF register
				635	identifiers, DWARF lane identifiers, DWARF address space identifiers, the
				636	default address space, and the address space address sizes.*
				637
				638	If specified:
				639
				640	* If the current thread is specified, then the current target architecture
				641	must be the same as the target architecture of the current thread.
				642
				643	* If the current compilation unit is specified, then the current target
				644	architecture default address space address size must be the same as the
				645	``address_size`` field in the header of the current compilation unit and any
				646	associated entry in the ``.debug_aranges`` section.
				647
				648	* If the current program location is specified, then the current target
				649	architecture must be the same as the target architecture of any line number
				650	information entry (see :ref:`amdgpu-dwarf-line-number-information`)
				651	corresponding to the current program location.
				652
				653	* If the current program location is specified, then the current target
				654	architecture default address space address size must be the same as the
				655	``address_size`` field in the header of any entry corresponding to the
				656	current program location in the ``.debug_addr``, ``.debug_line``,
				657	``.debug_rnglists``, ``.debug_rnglists.dwo``, ``.debug_loclists``, and
				658	``.debug_loclists.dwo`` sections.
				659
				660	* Otherwise the result is undefined.
				661
				662	A current object
				663
				664	The location description of a program object.
				665
				666	It is required for the ``DW_OP_push_object_address`` operation.
				667
				668	For example, the ``DW_AT_data_location`` *attribute on type debug
				669	information entries specifies the program object corresponding to a
				670	runtime descriptor as the current object when it evaluates its associated
				671	expression.*
				672
				673	The result is undefined if the location descriptor is invalid (see
				674	:ref:`amdgpu-dwarf-location-description`).
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	675
				676	An initial stack
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	677
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	678	This is a list of values or location descriptions that will be pushed on the
				679	operation expression evaluation stack in the order provided before evaluation
				680	of an operation expression starts.
				681
				682	Some debugging information entries have attributes that evaluate their DWARF
				683	expression value with initial stack entries. In all other cases the initial
				684	stack is empty.
				685
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	686	The result is undefined if any location descriptors are invalid (see
				687	:ref:`amdgpu-dwarf-location-description`).
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	688
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	689	If the evaluation requires a context element that is not specified, then the
				690	result of the evaluation is an error.
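
The context elements above can be modeled as a record whose fields are all optional. The following is a minimal sketch, not part of the specification: the names ``EvaluationContext``, ``EvaluationError``, and ``require`` are hypothetical, and ``None`` is assumed to mean "not specified". It illustrates only the rule that requiring an unspecified context element results in an evaluation error; the consistency rules between elements are not modeled.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


class EvaluationError(Exception):
    """Hypothetical error raised when a required context element is missing."""


@dataclass
class EvaluationContext:
    # Every element is optional; None is assumed to mean "not specified".
    result_kind: Optional[str] = None          # "value" or "location"
    thread: Optional[int] = None
    lane: Optional[int] = None
    call_frame: Optional[int] = None           # identified by its CFA
    program_location: Optional[int] = None
    compilation_unit: Optional[Any] = None
    target_architecture: Optional[str] = None
    current_object: Optional[Any] = None
    initial_stack: List[Any] = field(default_factory=list)

    def require(self, element: str) -> Any:
        """Return a context element, or fail if it was not specified."""
        value = getattr(self, element)
        if value is None:
            raise EvaluationError(f"context element not specified: {element}")
        return value


# A context that specifies a thread but no lane.
ctx = EvaluationContext(thread=7, target_architecture="hypothetical-arch")
thread_id = ctx.require("thread")

try:
    ctx.require("lane")
    lane_error = None
except EvaluationError as err:
    lane_error = str(err)
```

An operation that is related to lanes would call ``ctx.require("lane")`` before doing any work, so the failure surfaces as an evaluation error rather than as an undefined result.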
				691
				692	*A DWARF expression for the location description may be able to be evaluated
				693	without a thread, lane, call frame, program location, or architecture context.
				694	For example, the location of a global variable may be able to be evaluated
				695	without such context. If the expression evaluates with an error then it may
				696	indicate the variable has been optimized and so requires more context.*
				697
				698	*The DWARF expressions for call frame information (see
				699	:ref:`amdgpu-dwarf-call-frame-information`) operations are restricted to those
				700	that do not require the compilation unit context to be specified.*
				701
				702	The DWARF is ill-formed if all the ``address_size`` fields in the headers of all
				703	the entries in the ``.debug_info``, ``.debug_addr``, ``.debug_line``,
				704	``.debug_rnglists``, ``.debug_rnglists.dwo``, ``.debug_loclists``, and
				705	``.debug_loclists.dwo`` sections corresponding to any given program location do
				706	not match.
				707
				708	.. _amdgpu-dwarf-expression-value:
				709
				710	DWARF Expression Value
				711	++++++++++++++++++++++
				712
				713	A value has a type and a literal value. It can represent a literal value of any
				714	supported base type of the target architecture. The base type specifies the size
				715	and encoding of the literal value.
				716
				717	.. note::
				718
				719	It may be desirable to add an implicit pointer base type encoding. It would be
				720	used for the type of the value that is produced when the ``DW_OP_deref*``
				721	operation retrieves the full contents of an implicit pointer location storage
				722	created by the ``DW_OP_implicit_pointer`` or
				723	``DW_OP_LLVM_aspace_implicit_pointer`` operations. The literal value would
				724	record the debugging information entry and byte displacement specified by the
				725	associated ``DW_OP_implicit_pointer`` or
				726	``DW_OP_LLVM_aspace_implicit_pointer`` operations.
				727
				728	There is a distinguished base type termed the generic type, which is an integral
				729	type that has the size of an address in the target architecture default address
				730	space and unspecified signedness.
				731
				732	*The generic type is the same as the unspecified type used for stack operations
				733	defined in DWARF Version 4 and before.*
				734
				735	An integral type is a base type that has an encoding of ``DW_ATE_signed``,
				736	``DW_ATE_signed_char``, ``DW_ATE_unsigned``, ``DW_ATE_unsigned_char``,
				737	``DW_ATE_boolean``, or any target architecture defined integral encoding in the
				738	inclusive range ``DW_ATE_lo_user`` to ``DW_ATE_hi_user``.
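
The integral type rule can be expressed as a small predicate. This is an illustrative sketch: the function name and the ``target_integral_encodings`` parameter are hypothetical, and the numeric values are the ``DW_ATE_*`` encodings assigned by DWARF Version 5. It treats ``DW_ATE_address`` as non-integral only because it is not in the list above.

```python
# Base type encoding values as assigned by DWARF Version 5.
DW_ATE_address = 0x01
DW_ATE_boolean = 0x02
DW_ATE_float = 0x04
DW_ATE_signed = 0x05
DW_ATE_signed_char = 0x06
DW_ATE_unsigned = 0x07
DW_ATE_unsigned_char = 0x08
DW_ATE_lo_user = 0x80
DW_ATE_hi_user = 0xFF

_STANDARD_INTEGRAL = frozenset({
    DW_ATE_signed,
    DW_ATE_signed_char,
    DW_ATE_unsigned,
    DW_ATE_unsigned_char,
    DW_ATE_boolean,
})


def is_integral_encoding(encoding, target_integral_encodings=frozenset()):
    """Return True if a base type with this encoding is an integral type.

    target_integral_encodings is a hypothetical, target-supplied set of
    vendor encodings that the target architecture defines as integral.
    """
    if encoding in _STANDARD_INTEGRAL:
        return True
    in_user_range = DW_ATE_lo_user <= encoding <= DW_ATE_hi_user
    return in_user_range and encoding in target_integral_encodings
```

An encoding in the user range is integral only when the target architecture declares it so; being in the range is not sufficient by itself.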
				739
				740	.. note::
				741
				742	It is unclear if ``DW_ATE_address`` is an integral type. GDB does not seem to
				743	consider it as integral.
				744
				745	.. _amdgpu-dwarf-location-description:
				746
				747	DWARF Location Description
				748	++++++++++++++++++++++++++
				749
				750	*Debugging information must provide consumers a way to find the location of
				751	program variables, determine the bounds of dynamic arrays and strings, and
				752	possibly to find the base address of a subprogram’s call frame or the return
				753	address of a subprogram. Furthermore, to meet the needs of recent computer
				754	architectures and optimization techniques, debugging information must be able to
				755	describe the location of an object whose location changes over the object’s
				756	lifetime, and may reside at multiple locations simultaneously during parts of an
				757	object's lifetime.*
				758
				759	Information about the location of program objects is provided by location
				760	descriptions.
				761
				762	Location descriptions can consist of one or more single location descriptions.
				763
				764	A single location description specifies the location storage that holds a
				765	program object and a position within the location storage where the program
				766	object starts. The position within the location storage is expressed as a bit
				767	offset relative to the start of the location storage.
				768
				769	A location storage is a linear stream of bits that can hold values. Each
				770	location storage has a size in bits and can be accessed using a zero-based bit
				771	offset. The ordering of bits within a location storage uses the bit numbering
				772	and direction conventions that are appropriate to the current language on the
				773	target architecture.
				774
				775	There are five kinds of location storage:
				776
				777	memory location storage
				778	Corresponds to the target architecture memory address spaces.
				779
				780	register location storage
				781	Corresponds to the target architecture registers.
				782
				783	implicit location storage
				784	Corresponds to fixed values that can only be read.
				785
				786	undefined location storage
				787	Indicates no value is available and therefore cannot be read or written.
				788
				789	composite location storage
				790	Allows a mixture of these where some bits come from one location storage and
				791	some from another location storage, or from disjoint parts of the same
				792	location storage.
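
The relationship between a location storage and a single location description can be sketched as two small types. All names here are hypothetical. The storage is modeled as a list of bits indexed by zero-based bit offset, which deliberately sidesteps the language and target dependent bit ordering described above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class StorageKind(Enum):
    MEMORY = "memory"
    REGISTER = "register"
    IMPLICIT = "implicit"
    UNDEFINED = "undefined"
    COMPOSITE = "composite"


@dataclass
class LocationStorage:
    kind: StorageKind
    bits: List[int]  # one entry (0 or 1) per bit; index 0 is bit offset 0

    @property
    def size_in_bits(self) -> int:
        return len(self.bits)


@dataclass
class SingleLocationDescription:
    storage: LocationStorage
    bit_offset: int = 0  # where the program object starts in the storage

    def read_bits(self, count: int) -> List[int]:
        """Read count bits starting at this description's bit offset."""
        if self.storage.kind is StorageKind.UNDEFINED:
            raise ValueError("undefined location storage cannot be read")
        end = self.bit_offset + count
        if end > self.storage.size_in_bits:
            raise ValueError("read extends past the end of the location storage")
        return self.storage.bits[self.bit_offset:end]


# An 8-bit register storage, with an object that starts at bit offset 2.
register = LocationStorage(StorageKind.REGISTER, [1, 0, 1, 1, 0, 0, 0, 0])
location = SingleLocationDescription(register, bit_offset=2)
first_three_bits = location.read_bits(3)
```

Keeping the bit offset in the location description, rather than in the storage, lets several location descriptions refer to different positions in the same storage.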
				793
				794	.. note::
				795
				796	It may be better to add an implicit pointer location storage kind used by the
				797	``DW_OP_implicit_pointer`` and ``DW_OP_LLVM_aspace_implicit_pointer``
				798	operations. It would specify the debugging information entry and byte offset
				799	provided by the operations.
				800
				801	*Location descriptions are a language independent representation of addressing
				802	rules. They are created using DWARF operation expressions of arbitrary
				803	complexity. They can be the result of evaluating a debugging information entry
				804	attribute that specifies an operation expression. In this usage they can
				805	describe the location of an object as long as its lifetime is either static or
				806	the same as the lexical block (see DWARF Version 5 section 3.5) that owns it,
				807	and it does not move during its lifetime. They can be the result of evaluating a
				808	debugging information entry attribute that specifies a location list expression.
				809	In this usage they can describe the location of an object that has a limited
				810	lifetime, changes its location during its lifetime, or has multiple locations
				811	over part or all of its lifetime.*
				812
				813	If a location description has more than one single location description, the
				814	DWARF expression is ill-formed if the object value held in each single location
				815	description's position within the associated location storage is not the same
				816	value, except for the parts of the value that are uninitialized.
				817
				818	*A location description that has more than one single location description can
				819	only be created by a location list expression that has overlapping program
				820	location ranges, or certain expression operations that act on a location
				821	description that has more than one single location description. There are no
				822	operation expression operations that can directly create a location description
				823	with more than one single location description.*
				824
				825	*A location description with more than one single location description can be
				826	used to describe objects that reside in more than one piece of storage at the
				827	same time. An object may have more than one location as a result of
				828	optimization. For example, a value that is only read may be promoted from memory
				829	to a register for some region of code, but later code may revert to reading the
				830	value from memory as the register may be used for other purposes. For the code
				831	region where the value is in a register, any change to the object value must be
				832	made in both the register and the memory so both regions of code will read the
				833	updated value.*
				834
				835	*A consumer of a location description with more than one single location
				836	description can read the object's value from any of the single location
				837	descriptions (since they all refer to location storage that has the same value),
				838	but must write any changed value to all the single location descriptions.*
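
The read-from-any, write-to-all rule for consumers can be sketched as follows. This is a simplified illustration with hypothetical names: each single location description is reduced to a ``(storage, byte_offset)`` pair over a ``bytearray``, whereas the specification uses bit offsets and several kinds of storage.

```python
class LocationDescription:
    """A location description made of one or more single location descriptions.

    Each single location description is modeled as a (storage, byte_offset)
    pair, where storage is a bytearray. This is a simplification.
    """

    def __init__(self, single_locations):
        if not single_locations:
            raise ValueError("at least one single location description is required")
        self.single_locations = list(single_locations)

    def read(self, size):
        # All single locations hold the same value, so any one can be read.
        storage, offset = self.single_locations[0]
        return bytes(storage[offset:offset + size])

    def write(self, data):
        # A changed value must be written to every single location.
        for storage, offset in self.single_locations:
            storage[offset:offset + len(data)] = data


# An object that lives in a register and, at the same time, in memory.
register = bytearray(b"\x2a\x00\x00\x00")
memory = bytearray(8)
memory[4:8] = b"\x2a\x00\x00\x00"

location = LocationDescription([(register, 0), (memory, 4)])
location.write(b"\x07\x00\x00\x00")
value_after_write = location.read(4)
```

After the write, both the register copy and the memory copy hold the new value, so code regions that read from either place observe the update.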
				839
				840	The evaluation of an expression may require context elements to create a
				841	location description. If such a location description is accessed, the storage it
				842	denotes is that associated with the context element values specified when the
				843	location description was created, which may differ from the context at the time
				844	it is accessed.
				845
				846	*For example, creating a register location description requires the thread
				847	context: the location storage is for the specified register of that thread.
				848	Creating a memory location description for an address space may require a
				849	thread and a lane context: the location storage is the memory associated with
				850	that thread and lane.*
				851
				852	If any of the context elements required to create a location description change,
				853	the location description becomes invalid and accessing it is undefined.
				854
				855	Examples of context that can invalidate a location description are:
				856
				857	* The thread context is required and execution causes the thread to terminate.
				858	* *The call frame context is required and further execution causes the call
				859	frame to return to the calling frame.*
				860	* *The program location is required and further execution of the thread occurs.
				861	That could change the location list entry or call frame information entry that
				862	applies.*
				863	* An operation uses call frame information:
				864
				865	* Any of the frames used in the virtual call frame unwinding return.
				866	* *The top call frame is used, the program location is used to select the call
				867	frame information entry, and further execution of the thread occurs.*
				868
				869	*A DWARF expression can be used to compute a location description for an object.
				870	A subsequent DWARF expression evaluation can be given the object location
				871	description as the object context or initial stack context to compute a
				872	component of the object. The final result is undefined if the object location
				873	description becomes invalid between the two expression evaluations.*
				874
				875	A change of a thread's program location may not make a location description
				876	invalid, yet may still render it as no longer meaningful. Accessing such a
				877	location description, or using it as the object context or initial stack context
				878	of an expression evaluation, may produce an undefined result.
				879
				880	*For example, a location description may specify a register that no longer holds
				881	the intended program object after a program location change. One way to avoid
				882	such problems is to recompute location descriptions associated with threads when
				883	their program locations change.*
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	884
				885	.. _amdgpu-dwarf-operation-expressions:
				886
				887	DWARF Operation Expressions
				888	+++++++++++++++++++++++++++
				889
				890	An operation expression is comprised of a stream of operations, each consisting
				891	of an opcode followed by zero or more operands. The number of operands is
				892	implied by the opcode.
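
Because the number of operands is implied by the opcode, an operation stream can be decoded with a table lookup. The sketch below is illustrative only. It covers a small subset of operations, using the opcode values assigned by DWARF Version 5, and only operations with no operand or a single 1-byte unsigned operand. The ``decode`` function and ``OPERATIONS`` table are hypothetical names.

```python
# opcode -> (name, operand size in bytes); a small subset for illustration.
OPERATIONS = {
    0x08: ("DW_OP_const1u", 1),
    0x12: ("DW_OP_dup", 0),
    0x13: ("DW_OP_drop", 0),
    0x14: ("DW_OP_over", 0),
    0x15: ("DW_OP_pick", 1),
    0x16: ("DW_OP_swap", 0),
    0x17: ("DW_OP_rot", 0),
    0x96: ("DW_OP_nop", 0),
}


def decode(expression: bytes):
    """Decode a byte block into a list of (name, operand) pairs.

    operand is None for operations that take no operand.
    """
    operations = []
    position = 0
    while position < len(expression):
        opcode = expression[position]
        if opcode not in OPERATIONS:
            raise ValueError(f"opcode 0x{opcode:02x} is not in this sketch's table")
        name, operand_size = OPERATIONS[opcode]
        operand_bytes = expression[position + 1:position + 1 + operand_size]
        if len(operand_bytes) != operand_size:
            raise ValueError(f"truncated operand for {name}")
        operand = operand_bytes[0] if operand_size == 1 else None
        operations.append((name, operand))
        position += 1 + operand_size
    return operations


decoded = decode(bytes([0x08, 0x2A, 0x12, 0x15, 0x01]))
```

A full decoder would also handle multi-byte, LEB128, and block operands, which are omitted here.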
				893
				894	Operations represent a postfix operation on a simple stack machine. Each stack
				895	entry can hold either a value or a location description. Operations can act on
				896	entries on the stack, including adding entries and removing entries. If the kind
				897	of a stack entry does not match the kind required by the operation and is not
				898	implicitly convertible to the required kind (see
				899	:ref:`amdgpu-dwarf-memory-location-description-operations`), then the DWARF
				900	operation expression is ill-formed.
				901
				902	Evaluation of an operation expression starts with an empty stack on which the
				903	entries from the initial stack provided by the context are pushed in the order
				904	provided. Then the operations are evaluated, starting with the first operation
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	905	of the stream. Evaluation continues until either an operation has an evaluation
				906	error, or until one past the last operation of the stream is reached.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	907
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	908	The result of the evaluation is:
				909
				910	* If an operation has an evaluation error, or an operation evaluates an
				911	expression that has an evaluation error, then the result is an evaluation
				912	error.
				913
				914	* If the current result kind specifies a location description, then:
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	915
				916	* If the stack is empty, the result is a location description with one
				917	undefined location description.
				918
				919	*This rule is for backwards compatibility with DWARF Version 5 which has no
				920	explicit operation to create an undefined location description, and uses an
				921	empty operation expression for this purpose.*
				922
				923	* If the top stack entry is a location description, or can be converted
Tony	31fdcf6	2020-07-01 20:21:58 +0000	[diff] [blame]	924	to one (see :ref:`amdgpu-dwarf-memory-location-description-operations`),
				925	then the result is that, possibly converted, location description. Any other
				926	entries on the stack are discarded.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	927
				928	* Otherwise the DWARF expression is ill-formed.
				929
				930	.. note::
				931
				932	Could define this case as returning an implicit location description as
				933	if the ``DW_OP_implicit`` operation is performed.
				934
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	935	* If the current result kind specifies a value, then:
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	936
Tony	31fdcf6	2020-07-01 20:21:58 +0000	[diff] [blame]	937	* If the top stack entry is a value, or can be converted to one (see
				938	:ref:`amdgpu-dwarf-memory-location-description-operations`), then the result
				939	is that, possibly converted, value. Any other entries on the stack are
				940	discarded.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	941
				942	* Otherwise the DWARF expression is ill-formed.
				943
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	944	* If the current result kind is not specified, then:
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	945
				946	* If the stack is empty, the result is a location description with one
				947	undefined location description.
				948
				949	*This rule is for backwards compatibility with DWARF Version 5 which has no
				950	explicit operation to create an undefined location description, and uses an
				951	empty operation expression for this purpose.*
				952
				953	.. note::
				954
				955	This rule is consistent with the rule above for when a location
				956	description is requested. However, GDB appears to report this as an error
				957	and no GDB tests appear to cause an empty stack for this case.
				958
				959	* Otherwise, the top stack entry is returned. Any other entries on the stack
				960	are discarded.
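
The result rules above, other than the evaluation error case, can be sketched as a single function applied to the final stack. The names are hypothetical. The implicit conversions are not defined in this section, so they are represented by an optional ``convert`` callback that is assumed to return ``None`` when no conversion is valid.

```python
from dataclasses import dataclass
from typing import Optional


class IllFormedExpression(Exception):
    """Hypothetical error used to flag an ill-formed DWARF expression."""


@dataclass(frozen=True)
class Value:
    literal: int


@dataclass(frozen=True)
class Location:
    kind: str
    address: Optional[int] = None


UNDEFINED_LOCATION = Location("undefined")


def evaluation_result(stack, result_kind=None, convert=None):
    """Select the result of an evaluation from the final stack.

    stack[-1] is the top of the stack. result_kind is "location", "value",
    or None when no result kind is specified. convert(entry, result_kind)
    is a hypothetical stand-in for the implicit conversions.
    """
    if result_kind not in (None, "location", "value"):
        raise ValueError(f"unknown result kind: {result_kind!r}")
    if result_kind is None:
        # Empty stack yields the undefined location description.
        return stack[-1] if stack else UNDEFINED_LOCATION
    if not stack:
        if result_kind == "location":
            return UNDEFINED_LOCATION
        raise IllFormedExpression("a value is required but the stack is empty")
    top = stack[-1]
    wanted = Location if result_kind == "location" else Value
    if isinstance(top, wanted):
        return top
    converted = convert(top, result_kind) if convert is not None else None
    if converted is None:
        raise IllFormedExpression(f"top stack entry is not a {result_kind}")
    return converted
```

Only the top entry is ever returned; any other entries left on the stack are discarded by the caller simply dropping the stack.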
				961
				962	An operation expression is encoded as a byte block with some form of prefix that
				963	specifies the byte count. It can be used:
				964
				965	* as the value of a debugging information entry attribute that is encoded using
				966	class ``exprloc`` (see DWARF Version 5 section 7.5.5),
				967
				968	* as the operand to certain operation expression operations,
				969
				970	* as the operand to certain call frame information operations (see
				971	:ref:`amdgpu-dwarf-call-frame-information`),
				972
				973	* and in location list entries (see
				974	:ref:`amdgpu-dwarf-location-list-expressions`).
				975
				976	.. _amdgpu-dwarf-stack-operations:
				977
				978	Stack Operations
				979	################
				980
				981	The following operations manipulate the DWARF stack. Operations that index the
				982	stack assume that the top of the stack (most recently added entry) has index 0.
				983	They allow the stack entries to be either a value or location description.
				984
				985	If any stack entry accessed by a stack operation is an incomplete composite
Tony	756ba35	2020-04-20 16:55:34 -0400	[diff] [blame]	986	location description (see
				987	:ref:`amdgpu-dwarf-composite-location-description-operations`), then the DWARF
				988	expression is ill-formed.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	989
				990	.. note::
				991
				992	These operations now support stack entries that are values and location
				993	descriptions.
				994
				995	.. note::
				996
				997	If it is desired to also make them work with incomplete composite location
				998	descriptions, then it would be necessary to define that the composite
				999	location storage specified by the incomplete composite location description
				1000	is also replicated when a copy is pushed. This ensures that each copy of the
				1001	incomplete composite location description can update the composite location
				1002	storage it specifies independently.
				1003
				1004	1. ``DW_OP_dup``
				1005
				1006	``DW_OP_dup`` duplicates the stack entry at the top of the stack.
				1007
				1008	2. ``DW_OP_drop``
				1009
				1010	``DW_OP_drop`` pops the stack entry at the top of the stack and discards it.
				1011
				1012	3. ``DW_OP_pick``
				1013
				1014	``DW_OP_pick`` has a single unsigned 1-byte operand that represents an index
				1015	I. A copy of the stack entry with index I is pushed onto the stack.
				1016
				1017	4. ``DW_OP_over``
				1018
				1019	``DW_OP_over`` pushes a copy of the entry with index 1.
				1020
				1021	This is equivalent to a ``DW_OP_pick 1`` operation.
				1022
				1023	5. ``DW_OP_swap``
				1024
				1025	``DW_OP_swap`` swaps the top two stack entries. The entry at the top of the
				1026	stack becomes the second stack entry, and the second stack entry becomes the
				1027	top of the stack.
				1028
				1029	6. ``DW_OP_rot``
				1030
				1031	``DW_OP_rot`` rotates the first three stack entries. The entry at the top of
				1032	the stack becomes the third stack entry, the second entry becomes the top of
				1033	the stack, and the third entry becomes the second entry.
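
The stack operations above can be sketched as follows. This is only an illustrative model, assuming the DWARF stack is a Python list whose last element is the top of the stack (index 0); the function names are hypothetical, and entries are treated as opaque so they may stand for either values or location descriptions.

```python
def dw_op_dup(stack):
    # Duplicate the entry at the top of the stack.
    stack.append(stack[-1])


def dw_op_drop(stack):
    # Pop the top entry and discard it.
    stack.pop()


def dw_op_pick(stack, index):
    # Index 0 is the top of the stack, which is the end of the list.
    stack.append(stack[-1 - index])


def dw_op_over(stack):
    # Equivalent to DW_OP_pick 1.
    dw_op_pick(stack, 1)


def dw_op_swap(stack):
    # Exchange the top two entries.
    stack[-1], stack[-2] = stack[-2], stack[-1]


def dw_op_rot(stack):
    # Top becomes third, second becomes top, third becomes second.
    top, second, third = stack[-1], stack[-2], stack[-3]
    stack[-1], stack[-2], stack[-3] = second, third, top


stack = ["third", "second", "top"]
dw_op_rot(stack)
```

After the ``dw_op_rot`` call the list is ``["top", "third", "second"]``, that is, the old second entry is now at the top of the stack. The sketch does not check for incomplete composite location descriptions or for stack underflow.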
				1034
				1035	.. _amdgpu-dwarf-control-flow-operations:
				1036
				1037	Control Flow Operations
				1038	#######################
				1039
				1040	The following operations provide simple control of the flow of a DWARF operation
				1041	expression.
				1042
				1043	1. ``DW_OP_nop``
				1044
				1045	``DW_OP_nop`` is a placeholder. It has no effect on the DWARF stack
				1046	entries.
				1047
				1048	2. ``DW_OP_le``, ``DW_OP_ge``, ``DW_OP_eq``, ``DW_OP_lt``, ``DW_OP_gt``,
				1049	``DW_OP_ne``
				1050
				1051	.. note::
				1052
				1053	The same as in DWARF Version 5 section 2.5.1.5.
				1054
				1055	3. ``DW_OP_skip``
				1056
				1057	``DW_OP_skip`` is an unconditional branch. Its single operand is a 2-byte
				1058	signed integer constant. The 2-byte constant is the number of bytes of the
				1059	DWARF expression to skip forward or backward from the current operation,
				1060	beginning after the 2-byte constant.
				1061
				1062	If the updated position is at one past the end of the last operation, then
				1063	the operation expression evaluation is complete.
				1064
				1065	Otherwise, the DWARF expression is ill-formed if the updated operation
				1066	position is not in the range of the first to last operation inclusive, or
				1067	not at the start of an operation.
				1068
				1069	4. ``DW_OP_bra``
				1070
				1071	``DW_OP_bra`` is a conditional branch. Its single operand is a 2-byte signed
				1072	integer constant. This operation pops the top of stack. If the value popped
				1073	is not the constant 0, the 2-byte constant operand is the number of bytes of
				1074	the DWARF operation expression to skip forward or backward from the current
				1075	operation, beginning after the 2-byte constant.
				1076
				1077	If the updated position is at one past the end of the last operation, then
				1078	the operation expression evaluation is complete.
				1079
				1080	Otherwise, the DWARF expression is ill-formed if the updated operation
				1081	position is not in the range of the first to last operation inclusive, or
				1082	not at the start of an operation.
				1083
				1084	5. ``DW_OP_call2``, ``DW_OP_call4``, ``DW_OP_call_ref``
				1085
				1086	``DW_OP_call2``, ``DW_OP_call4``, and ``DW_OP_call_ref`` perform DWARF
				1087	procedure calls during evaluation of a DWARF expression.
				1088
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	1089	``DW_OP_call2`` and ``DW_OP_call4`` have one operand that is, respectively,
				1090	a 2-byte or 4-byte unsigned offset DR that represents the byte offset of a
				1091	debugging information entry D relative to the beginning of the current
				1092	compilation unit.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1093
Tony	756ba35	2020-04-20 16:55:34 -0400	[diff] [blame]	1094	``DW_OP_call_ref`` has one operand that is a 4-byte unsigned value in the
				1095	32-bit DWARF format, or an 8-byte unsigned value in the 64-bit DWARF format,
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	1096	that represents the byte offset DR of a debugging information entry D
				1097	relative to the beginning of the ``.debug_info`` section that contains the
				1098	current compilation unit. D may not be in the current compilation unit.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1099
Tony	756ba35	2020-04-20 16:55:34 -0400	[diff] [blame]	1100	.. note::
				1101
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	1102	DWARF Version 5 states that DR can be an offset in a ``.debug_info``
				1103	section other than the one that contains the current compilation unit. It
				1104	states that relocation of references from one executable or shared object
				1105	file to another must be performed by the consumer. But given that DR is
				1106	defined as an offset in a ``.debug_info`` section this seems impossible.
				1107	If DR was defined as an implementation defined value, then the consumer
				1108	could choose to interpret the value in an implementation defined manner to
				1109	reference a debugging information entry in another executable or shared object.
				1110
				1111	In ELF the ``.debug_info`` section is in a non-\ ``PT_LOAD`` segment so
				1112	standard dynamic relocations cannot be used. But even if they were loaded
				1113	segments and dynamic relocations were used, DR would need to be the
				1114	address of D, not an offset in a ``.debug_info`` section. That would also
				1115	need DR to be the size of a global address. So it would not be possible to
				1116	use the 32-bit DWARF format in a 64-bit global address space. In addition,
				1117	the consumer would need to determine what executable or shared object the
				1118	relocated address was in so it could determine the containing compilation
				1119	unit.
				1120
				1121	GDB only interprets DR as an offset in the ``.debug_info`` section that
				1122	contains the current compilation unit.
				1123
				1124	This comment also applies to ``DW_OP_implicit_pointer`` and
				1125	``DW_OP_LLVM_aspace_implicit_pointer``.
Tony	756ba35	2020-04-20 16:55:34 -0400	[diff] [blame]	1126
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1127	Operand interpretation of ``DW_OP_call2``\ , ``DW_OP_call4``\ , and
				1128	``DW_OP_call_ref`` is exactly like that for ``DW_FORM_ref2``\ ,
				1129	``DW_FORM_ref4``\ , and ``DW_FORM_ref_addr``\ , respectively.
				1130
				1131	The call operation is evaluated by:
				1132
				1133	* If D has a ``DW_AT_location`` attribute that is encoded as an ``exprloc``
				1134	that specifies an operation expression E, then execution of the current
				1135	operation expression continues from the first operation of E. Execution
				1136	continues until one past the last operation of E is reached, at which
				1137	point execution continues with the operation following the call operation.
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	1138	The operations of E are evaluated with the same current context, except
				1139	the current compilation unit is the one that contains D and the stack is the
				1140	same as that being used by the call operation. After the call operation
				1141	has been evaluated, the stack is therefore as it is left by the evaluation
				1142	of the operations of E. Since E is evaluated on the same stack as the call
				1143	operation, E can use, and/or remove entries already on the stack, and can
				1144	add new entries to the stack.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1145
				1146	*Values on the stack at the time of the call may be used as parameters by
				1147	the called expression and values left on the stack by the called expression
				1148	may be used as return values by prior agreement between the calling and
				1149	called expressions.*
				1150
				1151	* If D has a ``DW_AT_location`` attribute that is encoded as a ``loclist`` or
				1152	``loclistsptr``, then the specified location list expression E is
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	1153	evaluated. The evaluation of E uses the current context, except the result
				1154	kind is a location description, the compilation unit is the one that
				1155	contains D, and the initial stack is empty. The location description
				1156	result is pushed on the stack.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1157
				1158	.. note::
				1159
				1160	This rule avoids having to define how to execute a matched location list
				1161	entry operation expression on the same stack as the call when there are
				1162	multiple matches. But it allows the call to obtain the location
				1163	description for a variable or formal parameter which may use a location
				1164	list expression.
				1165
				1166	An alternative is to treat the case when D has a ``DW_AT_location``
				1167	attribute that is encoded as a ``loclist`` or ``loclistsptr``, and the
				1168	specified location list expression E' matches a single location list
				1169	entry with operation expression E, the same as the ``exprloc`` case and
				1170	evaluate on the same stack.
				1171
				1172	But this is not attractive as if the attribute is for a variable that
				1173	happens to end with a non-singleton stack, it will not simply put a
				1174	location description on the stack. Presumably the intent of using
				1175	``DW_OP_call*`` on a variable or formal parameter debugging information
				1176	entry is to push just one location description on the stack. That
				1177	location description may have more than one single location description.
				1178
				1179	The previous rule for ``exprloc`` also has the same problem as normally
				1180	a variable or formal parameter location expression may leave multiple
				1181	entries on the stack and only return the top entry.
				1182
				1183	GDB implements ``DW_OP_call*`` by always executing E on the same stack.
				1184	If the location list has multiple matching entries, it simply picks the
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	1185	first one and ignores the rest. This seems fundamentally at odds with
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1186	the desire to support multiple places for variables.
				1187
				1188	So, it feels like ``DW_OP_call*`` should both support pushing a location
				1189	description on the stack for a variable or formal parameter, and also
				1190	support being able to execute an operation expression on the same stack.
				1191	Being able to specify a different operation expression for different
				1192	program locations seems a desirable feature to retain.
				1193
				1194	A solution to that is to have a distinct ``DW_AT_LLVM_proc`` attribute
				1195	for the ``DW_TAG_dwarf_procedure`` debugging information entry. Then the
				1196	``DW_AT_location`` attribute expression is always executed separately
				1197	and pushes a location description (that may have multiple single
				1198	location descriptions), and the ``DW_AT_LLVM_proc`` attribute expression
				1199	is always executed on the same stack and can leave anything on the
				1200	stack.
				1201
				1202	The ``DW_AT_LLVM_proc`` attribute could have the new classes
				1203	``exprproc``, ``loclistproc``, and ``loclistsptrproc`` to indicate that
				1204	the expression is executed on the same stack. ``exprproc`` is the same
				1205	encoding as ``exprloc``. ``loclistproc`` and ``loclistsptrproc`` are the
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	1206	same encoding as their non-\ ``proc`` counterparts, except the DWARF is
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1207	ill-formed if the location list does not match exactly one location list
				1208	entry and a default entry is required. These forms indicate explicitly
				1209	that the matched single operation expression must be executed on the
				1210	same stack. This is better than ad hoc special rules for ``loclistproc``
				1211	and ``loclistsptrproc`` which are currently clearly defined to always
				1212	return a location description. The producer then explicitly indicates
				1213	the intent through the attribute classes.
				1214
				1215	Such a change would be a breaking change for how GDB implements
				1216	``DW_OP_call*``. However, are the breaking cases actually occurring in
				1217	practice? GDB could implement the current approach for DWARF Version 5,
				1218	and the new semantics for DWARF Version 6, as has been done for some
				1219	other features.
				1220
				1221	Another option is to limit the execution to be on the same stack only to
				1222	the evaluation of an expression E that is the value of a
				1223	``DW_AT_location`` attribute of a ``DW_TAG_dwarf_procedure`` debugging
				1224	information entry. The DWARF would be ill-formed if E is a location list
				1225	expression that does not match exactly one location list entry. In all
				1226	other cases the evaluation of an expression E that is the value of a
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	1227	``DW_AT_location`` attribute would evaluate E with the current context,
				1228	except the result kind is a location description, the compilation unit
				1229	is the one that contains D, and the initial stack is empty. The location
				1230	description result is pushed on the stack.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1231
				1232	* If D has a ``DW_AT_const_value`` attribute with a value V, then it is as
				1233	if a ``DW_OP_implicit_value V`` operation was executed.
				1234
				1235	*This allows a call operation to be used to compute the location
				1236	description for any variable or formal parameter regardless of whether the
				1237	producer has optimized it to a constant. This is consistent with the
				1238	``DW_OP_implicit_pointer`` operation.*
				1239
				1240	.. note::
				1241
				1242	Alternatively, DWARF could deprecate using ``DW_AT_const_value`` for
				1243	``DW_TAG_variable`` and ``DW_TAG_formal_parameter`` debugging information
				1244	entries that are constants and instead use ``DW_AT_location`` with an
				1245	operation expression that results in a location description with one
				1246	implicit location description. Then this rule would not be required.
				1247
				1248	* Otherwise, there is no effect and no changes are made to the stack.
				1249
				1250	.. note::
				1251
				1252	In DWARF Version 5, if D does not have a ``DW_AT_location`` then
				1253	``DW_OP_call*`` is defined to have no effect. It is unclear that this is
				1254	the right definition as a producer should be able to rely on using
				1255	``DW_OP_call*`` to get a location description for any non-\
				1256	``DW_TAG_dwarf_procedure`` debugging information entries. Also, the
				1257	producer should not be creating DWARF with ``DW_OP_call*`` to a
				1258	``DW_TAG_dwarf_procedure`` that does not have a ``DW_AT_location``
				1259	attribute. So, should this case be defined as an ill-formed DWARF
				1260	expression?
				1261
				1262	The ``DW_TAG_dwarf_procedure`` *debugging information entry can be used to
				1263	define DWARF procedures that can be called.*
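
The branch rules for ``DW_OP_skip`` and ``DW_OP_bra`` described above can be sketched as follows. This is a minimal illustration rather than a definitive implementation: the function name ``next_position`` and the ``op_starts`` set (the byte offsets at which operations begin) are hypothetical, and the 2-byte operand is assumed to be little-endian, whereas a real consumer would use the byte order of the object file.

```python
import struct

DW_OP_BRA = 0x28
DW_OP_SKIP = 0x2F


def next_position(expr, op_offset, op_starts, popped=None, byteorder="<"):
    """Return the offset of the next operation, or None when evaluation is complete.

    popped is the value popped by DW_OP_bra; it is ignored for DW_OP_skip.
    byteorder is a struct prefix; little-endian is an assumption of this sketch.
    """
    opcode = expr[op_offset]
    # The displacement is relative to the byte after the 2-byte constant.
    after_operand = op_offset + 3
    (delta,) = struct.unpack_from(byteorder + "h", expr, op_offset + 1)
    if opcode == DW_OP_SKIP or (opcode == DW_OP_BRA and popped != 0):
        target = after_operand + delta
    elif opcode == DW_OP_BRA:
        target = after_operand  # branch not taken: fall through
    else:
        raise ValueError("not a branch operation")
    if target == len(expr):
        return None  # one past the last operation: evaluation is complete
    if target not in op_starts:
        raise ValueError("ill-formed DWARF expression: bad branch target")
    return target


# DW_OP_skip +1; DW_OP_nop; DW_OP_lit0
expr = bytes([0x2F, 0x01, 0x00, 0x96, 0x30])
target = next_position(expr, 0, {0, 3, 4})
```

Here ``target`` is 4, so the ``DW_OP_nop`` at offset 3 is skipped and evaluation continues with ``DW_OP_lit0``.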
				1264
				1265	.. _amdgpu-dwarf-value-operations:
				1266
				1267	Value Operations
				1268	################
				1269
				1270	This section describes the operations that push values on the stack.
				1271
				1272	Each value stack entry has a type and a literal value and can represent a
				1273	literal value of any supported base type of the target architecture. The base
				1274	type specifies the size and encoding of the literal value.
				1275
				1276	Instead of a base type, value stack entries can have a distinguished generic
				1277	type, which is an integral type that has the size of an address in the target
				1278	architecture default address space and unspecified signedness.
				1279
				1280	*The generic type is the same as the unspecified type used for stack operations
				1281	defined in DWARF Version 4 and before.*
				1282
				1283	An integral type is a base type that has an encoding of ``DW_ATE_signed``,
				1284	``DW_ATE_signed_char``, ``DW_ATE_unsigned``, ``DW_ATE_unsigned_char``,
				1285	``DW_ATE_boolean``, or any target architecture defined integral encoding in the
				1286	inclusive range ``DW_ATE_lo_user`` to ``DW_ATE_hi_user``.
				1287
				1288	.. note::
				1289
				1290	Unclear if ``DW_ATE_address`` is an integral type. GDB does not seem to
				1291	consider it as integral.
				1292
				1293	.. _amdgpu-dwarf-literal-operations:
				1294
				1295	Literal Operations
				1296	^^^^^^^^^^^^^^^^^^
				1297
				1298	The following operations all push a literal value onto the DWARF stack.
				1299
				1300	Operations other than ``DW_OP_const_type`` push a value V with the generic type.
				1301	If V is larger than the generic type, then V is truncated to the generic type
				1302	size and the low-order bits used.
				1303
				1304	1. ``DW_OP_lit0``, ``DW_OP_lit1``, ..., ``DW_OP_lit31``
				1305
				1306	``DW_OP_lit<N>`` operations encode an unsigned literal value N from 0
				1307	through 31, inclusive. They push the value N with the generic type.
				1308
				1309	2. ``DW_OP_const1u``, ``DW_OP_const2u``, ``DW_OP_const4u``, ``DW_OP_const8u``
				1310
				1311	``DW_OP_const<N>u`` operations have a single operand that is a 1, 2, 4, or
				1312	8-byte unsigned integer constant U, respectively. They push the value U with
				1313	the generic type.
				1314
				1315	3. ``DW_OP_const1s``, ``DW_OP_const2s``, ``DW_OP_const4s``, ``DW_OP_const8s``
				1316
				1317	``DW_OP_const<N>s`` operations have a single operand that is a 1, 2, 4, or
				1318	8-byte signed integer constant S, respectively. They push the value S with
				1319	the generic type.
				1320
				1321	4. ``DW_OP_constu``
				1322
				1323	``DW_OP_constu`` has a single unsigned LEB128 integer operand N. It pushes
				1324	the value N with the generic type.
				1325
				1326	5. ``DW_OP_consts``
				1327
				1328	``DW_OP_consts`` has a single signed LEB128 integer operand N. It pushes the
				1329	value N with the generic type.
				1330
				1331	6. ``DW_OP_constx``
				1332
				1333	``DW_OP_constx`` has a single unsigned LEB128 integer operand that
				1334	represents a zero-based index into the ``.debug_addr`` section relative to
				1335	the value of the ``DW_AT_addr_base`` attribute of the associated compilation
				1336	unit. The value N in the ``.debug_addr`` section has the size of the generic
				1337	type. It pushes the value N with the generic type.
				1338
				1339	The ``DW_OP_constx`` *operation is provided for constants that require
				1340	link-time relocation but should not be interpreted by the consumer as a
				1341	relocatable address (for example, offsets to thread-local storage).*
				1342
				1343	7. ``DW_OP_const_type``
				1344
				1345	``DW_OP_const_type`` has three operands. The first is an unsigned LEB128
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	1346	integer DR that represents the byte offset of a debugging information entry
				1347	D relative to the beginning of the current compilation unit, that provides
				1348	the type T of the constant value. The second is a 1-byte unsigned integral
				1349	constant S. The third is a block of bytes B, with a length equal to S.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1350
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	1351	TS is the bit size of the type T. The least significant TS bits of B are
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1352	interpreted as a value V of the type T. It pushes the value V with the type
				1353	T.
				1354
				1355	The DWARF is ill-formed if D is not a ``DW_TAG_base_type`` debugging
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	1356	information entry in the current compilation unit, or if TS divided by 8
				1357	(the byte size) and rounded up to a whole number is not equal to S.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1358
				1359	*While the size of the byte block B can be inferred from the type D
				1360	definition, it is encoded explicitly into the operation so that the
				1361	operation can be parsed easily without reference to the* ``.debug_info``
				1362	*section.*
				1363
				1364	8. ``DW_OP_LLVM_push_lane`` New
				1365
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	1366	``DW_OP_LLVM_push_lane`` pushes the target architecture lane identifier of
				1367	the current lane as a value with the generic type.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1368
				1369	*For languages that are implemented using a SIMD or SIMT execution model,
				1370	this is the lane number that corresponds to the source language thread of
				1371	execution upon which the user is focused.*
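
The LEB128 operands of ``DW_OP_constu`` and ``DW_OP_consts``, and the truncation to the generic type described at the start of this section, can be sketched as follows. The function names are hypothetical. The sketch assumes a generic type of 64 bits by default and represents generic values as non-negative bit patterns, which is only one possible choice given that the generic type has unspecified signedness.

```python
def decode_uleb128(data, offset=0):
    """Decode an unsigned LEB128 integer; return (value, next_offset)."""
    result = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not (byte & 0x80):
            return result, offset


def decode_sleb128(data, offset=0):
    """Decode a signed LEB128 integer; return (value, next_offset)."""
    result = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not (byte & 0x80):
            if byte & 0x40:          # sign bit of the final byte is set
                result -= 1 << shift
            return result, offset


def to_generic(value, generic_bits=64):
    """Keep only the low-order bits that fit in the generic type."""
    return value & ((1 << generic_bits) - 1)


# Operand bytes of a DW_OP_consts that encodes -1, pushed as a 32-bit generic value.
operand, _ = decode_sleb128(bytes([0x7F]))
pushed = to_generic(operand, generic_bits=32)
```

With a 32-bit generic type, ``pushed`` is ``0xFFFFFFFF``, the low-order 32 bits of the two's complement representation of -1.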
				1372
				1373	.. _amdgpu-dwarf-arithmetic-logical-operations:
				1374
				1375	Arithmetic and Logical Operations
				1376	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				1377
				1378	.. note::
				1379
				1380	This section is the same as DWARF Version 5 section 2.5.1.4.
				1381
				1382	.. _amdgpu-dwarf-type-conversions-operations:
				1383
				1384	Type Conversion Operations
				1385	^^^^^^^^^^^^^^^^^^^^^^^^^^
				1386
				1387	.. note::
				1388
				1389	This section is the same as DWARF Version 5 section 2.5.1.6.
				1390
				1391	.. _amdgpu-dwarf-general-operations:
				1392
				1393	Special Value Operations
				1394	^^^^^^^^^^^^^^^^^^^^^^^^
				1395
				1396	The following special value operations are currently defined:
				1397
				1398	1. ``DW_OP_regval_type``
				1399
				1400	``DW_OP_regval_type`` has two operands. The first is an unsigned LEB128
				1401	integer that represents a register number R. The second is an unsigned
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	1402	LEB128 integer DR that represents the byte offset of a debugging information
				1403	entry D relative to the beginning of the current compilation unit, that
				1404	provides the type T of the register value.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1405
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	1406	The operation is equivalent to performing ``DW_OP_regx R; DW_OP_deref_type
				1407	DR``.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1408
				1409	.. note::
				1410
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	1411	Should DWARF allow the type T to be a larger size than the size of the
				1412	register R? Restricting a larger bit size avoids any issue of conversion
				1413	as the, possibly truncated, bit contents of the register is simply
				1414	interpreted as a value of T. If a conversion is wanted it can be done
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1415	explicitly using a ``DW_OP_convert`` operation.
				1416
				1417	GDB has a per register hook that allows a target specific conversion on a
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	1418	register by register basis. It defaults to truncation of bigger registers.
				1419	Removing use of the target hook does not cause any test failures in common
				1420	architectures. If the compiler for a target architecture did want some
				1421	form of conversion, including a larger result type, it could always
				1422	explicitly use the ``DW_OP_convert`` operation.
				1423
				1424	If T is a larger type than the register size, then the default GDB
				1425	register hook reads bytes from the next register (or reads out of bounds
				1426	for the last register!). Removing use of the target hook does not cause
				1427	any test failures in common architectures (except an illegal hand written
Tony	e24f5f3	2020-07-03 22:31:53 +0000	[diff] [blame]	1428	assembly test). If a target architecture requires this behavior, these
				1429	extensions allow a composite location description to be used to combine
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	1430	multiple registers.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1431
				1432	2. ``DW_OP_deref``
				1433
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	1434	S is the bit size of the generic type divided by 8 (the byte size) and
				1435	rounded up to a whole number. DR is the offset of a hypothetical debug
				1436	information entry D in the current compilation unit for a base type of the
				1437	generic type.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1438
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	1439	The operation is equivalent to performing ``DW_OP_deref_type S, DR``.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1440
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	1441	3. ``DW_OP_deref_size``
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1442
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	1443	``DW_OP_deref_size`` has a single 1-byte unsigned integral constant that
				1444	represents a byte result size S.
				1445
				1446	TS is the smaller of the generic type bit size and S scaled by 8 (the byte
				1447	size). If TS is smaller than the generic type bit size then T is an unsigned
				1448	integral type of bit size TS, otherwise T is the generic type. DR is the
				1449	offset of a hypothetical debug information entry D in the current
				1450	compilation unit for a base type T.
				1451
				1452	.. note::
				1453
				1454	Truncating the value when S is larger than the generic type matches what
				1455	GDB does. This allows the generic type size to not be an integral byte
				1456	size. It does allow S to be arbitrarily large. Should S be restricted to
				1457	the size of the generic type rounded up to a multiple of 8?
				1458
				1459	The operation is equivalent to performing ``DW_OP_deref_type S, DR``, except
				1460	if T is not the generic type, the value V pushed is zero-extended to the
				1461	generic type bit size and its type changed to the generic type.
				1462
				1463	4. ``DW_OP_deref_type``
				1464
				1465	``DW_OP_deref_type`` has two operands. The first is a 1-byte unsigned
				1466	integral constant S. The second is an unsigned LEB128 integer DR that
				1467	represents the byte offset of a debugging information entry D relative to
				1468	the beginning of the current compilation unit, that provides the type T of
				1469	the result value.
				1470
				1471	TS is the bit size of the type T.
				1472
				1473	*While the size of the pushed value V can be inferred from the type T, it is
				1474	encoded explicitly as the operand S so that the operation can be parsed
				1475	easily without reference to the* ``.debug_info`` *section.*
				1476
				1477	.. note::
				1478
				1479	It is unclear why the operand S is needed. Unlike ``DW_OP_const_type``,
				1480	the size is not needed for parsing. Any evaluation needs to get the base
				1481	type T to push with the value to know its encoding and bit size.
				1482
				1483	It pops one stack entry that must be a location description L.
				1484
				1485	A value V of TS bits is retrieved from the location storage LS specified by
				1486	one of the single location descriptions SL of L.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1487
				1488	*If L, or the location description of any composite location description
				1489	part that is a subcomponent of L, has more than one single location
				1490	description, then any one of them can be selected as they are required to
				1491	all have the same value. For any single location description SL, bits are
				1492	retrieved from the associated storage location starting at the bit offset
				1493	specified by SL. For a composite location description, the retrieved bits
				1494	are the concatenation of the N bits from each composite location part PL,
				1495	where N is limited to the size of PL.*
				1496
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	1497	V is pushed on the stack with the type T.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1498
				1499	.. note::
				1500
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	1501	This definition makes it an evaluation error if L is a register location
				1502	description that has less than TS bits remaining in the register storage.
Tony	e24f5f3	2020-07-03 22:31:53 +0000	[diff] [blame]	1503	Particularly since these extensions extend location descriptions to have
				1504	a bit offset, it would be odd to define this as performing sign extension
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	1505	based on the type, or be target architecture dependent, as the number of
				1506	remaining bits could be any number. This matches the GDB implementation
				1507	for ``DW_OP_deref_type``.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1508
Tony	e24f5f3	2020-07-03 22:31:53 +0000	[diff] [blame]	1509	These extensions define ``DW_OP_breg`` in terms of
				1510	``DW_OP_regval_type``. ``DW_OP_regval_type`` is defined in terms of
				1511	``DW_OP_regx``, which uses a 0 bit offset, and ``DW_OP_deref_type``.
				1512	Therefore, it requires the register size to be greater than or equal to the
				1513	address size of the address space. This matches the GDB implementation for
				1514	``DW_OP_breg``.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1515
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	1516	The DWARF is ill-formed if D is not in the current compilation unit, D is
				1517	not a ``DW_TAG_base_type`` debugging information entry, or if TS divided by
				1518	8 (the byte size) and rounded up to a whole number is not equal to S.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1519
				1520	.. note::
				1521
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1522	This definition allows the base type to be a bit size since there seems no
				1523	reason to restrict it.
				1524
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	1525	It is an evaluation error if any bit of the value is retrieved from the
				1526	undefined location storage or the offset of any bit exceeds the size of the
				1527	location storage LS specified by any single location description SL of L.
				1528
				1529	See :ref:`amdgpu-dwarf-implicit-location-descriptions` for special rules
				1530	concerning implicit location descriptions created by the
				1531	``DW_OP_implicit_pointer`` and ``DW_OP_LLVM_aspace_implicit_pointer``
				1532	operations.
				1533
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1534	5. ``DW_OP_xderef`` Deprecated
				1535
				1536	``DW_OP_xderef`` pops two stack entries. The first must be an integral type
				1537	value that represents an address A. The second must be an integral type
				1538	value that represents a target architecture specific address space
				1539	identifier AS.
				1540
				1541	The operation is equivalent to performing ``DW_OP_swap;
				1542	DW_OP_LLVM_form_aspace_address; DW_OP_deref``. The value V retrieved is left
				1543	on the stack with the generic type.
				1544
				1545	This operation is deprecated as the ``DW_OP_LLVM_form_aspace_address``
Tony	756ba35	2020-04-20 16:55:34 -0400	[diff] [blame]	1546	operation can be used and provides greater expressiveness.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1547
				1548	6. ``DW_OP_xderef_size`` Deprecated
				1549
				1550	``DW_OP_xderef_size`` has a single 1-byte unsigned integral constant that
				1551	represents a byte result size S.
				1552
				1553	It pops two stack entries. The first must be an integral type value that
				1554	represents an address A. The second must be an integral type value that
				1555	represents a target architecture specific address space identifier AS.
				1556
				1557	The operation is equivalent to performing ``DW_OP_swap;
				1558	DW_OP_LLVM_form_aspace_address; DW_OP_deref_size S``. The zero-extended
				1559	value V retrieved is left on the stack with the generic type.
				1560
				1561	This operation is deprecated as the ``DW_OP_LLVM_form_aspace_address``
Tony	756ba35	2020-04-20 16:55:34 -0400	[diff] [blame]	1562	operation can be used and provides greater expressiveness.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1563
				1564	7. ``DW_OP_xderef_type`` Deprecated
				1565
``DW_OP_xderef_type`` has two operands. The first is a 1-byte unsigned
integral constant that represents a byte result size S. The second operand
is an unsigned LEB128 integer DR that represents the byte offset of a
debugging information entry D relative to the beginning of the current
compilation unit, that provides the type T of the result value.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1571
				1572	It pops two stack entries. The first must be an integral type value that
				1573	represents an address A. The second must be an integral type value that
				1574	represents a target architecture specific address space identifier AS.
				1575
The operation is equivalent to performing ``DW_OP_swap;
DW_OP_LLVM_form_aspace_address; DW_OP_deref_type S DR``. The value V
retrieved is left on the stack with the type T.
				1579
				1580	This operation is deprecated as the ``DW_OP_LLVM_form_aspace_address``
Tony	756ba35	2020-04-20 16:55:34 -0400	[diff] [blame]	1581	operation can be used and provides greater expressiveness.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1582
				1583	8. ``DW_OP_entry_value`` Deprecated
				1584
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	1585	``DW_OP_entry_value`` pushes the value of an expression that is evaluated in
				1586	the context of the calling frame.
				1587
				1588	*It may be used to determine the value of arguments on entry to the current
				1589	call frame provided they are not clobbered.*
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1590
It has two operands. The first is an unsigned LEB128 integer S. The second
is a block of bytes, with a length equal to S, interpreted as a DWARF
operation expression E.
				1594
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	1595	E is evaluated with the current context, except the result kind is
				1596	unspecified, the call frame is the one that called the current frame, the
				1597	program location is the call site in the calling frame, the object is
				1598	unspecified, and the initial stack is empty. The calling frame information
				1599	is obtained by virtually unwinding the current call frame using the call
				1600	frame information (see :ref:`amdgpu-dwarf-call-frame-information`).
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1601
If the result of E is a location description L (see
:ref:`amdgpu-dwarf-register-location-descriptions`), and the last operation
executed by E is a ``DW_OP_reg*`` for register R with a target architecture
specific base type of T, then the contents of the register are retrieved as
if a ``DW_OP_deref_type DR`` operation was performed where DR is the offset
of a hypothetical debug information entry in the current compilation unit
for T. The resulting value V is pushed on the stack.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1609
*Using* ``DW_OP_reg*`` *provides a more compact form for the case where the
value was in a register on entry to the subprogram.*
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1612
.. note::
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1614
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	1615	It is unclear how this provides a more compact expression, as
				1616	``DW_OP_regval_type`` could be used which is marginally larger.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1617
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	1618	If the result of E is a value V, then V is pushed on the stack.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1619
				1620	Otherwise, the DWARF expression is ill-formed.
				1621
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1622	The ``DW_OP_entry_value`` *operation is deprecated as its main usage is
				1623	provided by other means. DWARF Version 5 added the*
``DW_TAG_call_site_parameter`` *debugging information entry for call sites
				1625	that has* ``DW_AT_call_value``\ , ``DW_AT_call_data_location``\ , and
				1626	``DW_AT_call_data_value`` *attributes that provide DWARF expressions to
				1627	compute actual parameter values at the time of the call, and requires the
				1628	producer to ensure the expressions are valid to evaluate even when virtually
				1629	unwound. The* ``DW_OP_LLVM_call_frame_entry_reg`` *operation provides access
				1630	to registers in the virtually unwound calling frame.*
				1631
				1632	.. note::
				1633
GDB only implements ``DW_OP_entry_value`` when E is exactly
``DW_OP_reg*`` or ``DW_OP_breg*; DW_OP_deref*``.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1636
				1637	.. _amdgpu-dwarf-location-description-operations:
				1638
				1639	Location Description Operations
				1640	###############################
				1641
				1642	This section describes the operations that push location descriptions on the
				1643	stack.
				1644
				1645	General Location Description Operations
				1646	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				1647
				1648	1. ``DW_OP_LLVM_offset`` New
				1649
				1650	``DW_OP_LLVM_offset`` pops two stack entries. The first must be an integral
				1651	type value that represents a byte displacement B. The second must be a
				1652	location description L.
				1653
				1654	It adds the value of B scaled by 8 (the byte size) to the bit offset of each
				1655	single location description SL of L, and pushes the updated L.
				1656
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	1657	It is an evaluation error if the updated bit offset of any SL is less than 0
				1658	or greater than or equal to the size of the location storage specified by
				1659	SL.
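As an illustration only, the update performed by ``DW_OP_LLVM_offset`` can be sketched in Python. The ``SingleLocation`` class and the ``llvm_offset`` function are hypothetical names introduced here, not part of any consumer implementation. The sketch models a location description L as a list of single location descriptions, each with a storage size in bits and a bit offset, and it does not model the undefined location description, which the ``DW_OP_LLVM_*offset`` operations leave unchanged.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SingleLocation:
    """Hypothetical model of a single location description SL."""
    storage_bits: int  # size of the location storage, in bits
    bit_offset: int    # bit offset of SL within that storage


def llvm_offset(location, byte_displacement):
    """Sketch of DW_OP_LLVM_offset: add B * 8 to the bit offset of each SL."""
    updated = []
    for sl in location:
        new_offset = sl.bit_offset + byte_displacement * 8
        # The updated offset must stay within the location storage.
        if new_offset < 0 or new_offset >= sl.storage_bits:
            raise ValueError("evaluation error: bit offset out of range")
        updated.append(replace(sl, bit_offset=new_offset))
    return updated
```

A ``DW_OP_LLVM_bit_offset`` sketch would differ only in adding B directly rather than B scaled by 8.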
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1660
Tony	756ba35	2020-04-20 16:55:34 -0400	[diff] [blame]	1661	2. ``DW_OP_LLVM_offset_uconst`` New
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1662
Tony	756ba35	2020-04-20 16:55:34 -0400	[diff] [blame]	1663	``DW_OP_LLVM_offset_uconst`` has a single unsigned LEB128 integer operand
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1664	that represents a byte displacement B.
				1665
				1666	The operation is equivalent to performing ``DW_OP_constu B;
				1667	DW_OP_LLVM_offset``.
				1668
				1669	*This operation is supplied specifically to be able to encode more field
				1670	displacements in two bytes than can be done with* ``DW_OP_lit*;
				1671	DW_OP_LLVM_offset``\ .
				1672
Tony	756ba35	2020-04-20 16:55:34 -0400	[diff] [blame]	1673	.. note::
				1674
				1675	Should this be named ``DW_OP_LLVM_offset_uconst`` to match
				1676	``DW_OP_plus_uconst``, or ``DW_OP_LLVM_offset_constu`` to match
				1677	``DW_OP_constu``?
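The size argument can be checked with a small Python sketch of unsigned LEB128 encoding. It assumes, as the "two bytes" remark above implies, that the ``DW_OP_LLVM_offset_uconst`` opcode occupies one byte. The function names are hypothetical. Under that assumption a two-byte encoding covers displacements 0 through 127, whereas ``DW_OP_lit*; DW_OP_LLVM_offset`` covers only 0 through 31 in two bytes.

```python
def encode_uleb128(value):
    """Encode a non-negative integer as unsigned LEB128 bytes."""
    if value < 0:
        raise ValueError("unsigned LEB128 cannot encode negative values")
    out = bytearray()
    while True:
        byte = value & 0x7F          # low 7 bits of the remaining value
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)         # final byte has the high bit clear
            return bytes(out)


def offset_uconst_size(displacement):
    """Encoded size in bytes, assuming a one-byte opcode (an assumption)."""
    return 1 + len(encode_uleb128(displacement))
```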
				1678
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1679	3. ``DW_OP_LLVM_bit_offset`` New
				1680
				1681	``DW_OP_LLVM_bit_offset`` pops two stack entries. The first must be an
				1682	integral type value that represents a bit displacement B. The second must be
				1683	a location description L.
				1684
				1685	It adds the value of B to the bit offset of each single location description
				1686	SL of L, and pushes the updated L.
				1687
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	1688	It is an evaluation error if the updated bit offset of any SL is less than 0
				1689	or greater than or equal to the size of the location storage specified by
				1690	SL.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1691
				1692	4. ``DW_OP_push_object_address``
				1693
				1694	``DW_OP_push_object_address`` pushes the location description L of the
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	1695	current object.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1696
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	1697	*This object may correspond to an independent variable that is part of a
				1698	user presented expression that is being evaluated. The object location
				1699	description may be determined from the variable's own debugging information
				1700	entry or it may be a component of an array, structure, or class whose
				1701	address has been dynamically determined by an earlier step during user
				1702	expression evaluation.*
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1703
				1704	*This operation provides explicit functionality (especially for arrays
				1705	involving descriptions) that is analogous to the implicit push of the base
				1706	location description of a structure prior to evaluation of a
				1707	``DW_AT_data_member_location`` to access a data member of a structure.*
				1708
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	1709	.. note::
				1710
				1711	This operation could be removed and the object location description
				1712	specified as the initial stack as for ``DW_AT_data_member_location``.
				1713
				1714	The only attribute that specifies a current object is
				1715	``DW_AT_data_location`` so the non-normative text seems to overstate how
				1716	this is being used. Or are there other attributes that need to state they
				1717	pass an object?
				1718
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1719	5. ``DW_OP_LLVM_call_frame_entry_reg`` New
				1720
				1721	``DW_OP_LLVM_call_frame_entry_reg`` has a single unsigned LEB128 integer
				1722	operand that represents a target architecture register number R.
				1723
				1724	It pushes a location description L that holds the value of register R on
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	1725	entry to the current subprogram as defined by the call frame information
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1726	(see :ref:`amdgpu-dwarf-call-frame-information`).
				1727
*If there is no call frame information defined, then the default rules for
the target architecture are used. If the register rule is* undefined\ *, then
the undefined location description is pushed. If the register rule is* same
value\ *, then a register location description for R is pushed.*
				1732
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	1733	.. _amdgpu-dwarf-undefined-location-description-operations:
				1734
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1735	Undefined Location Description Operations
				1736	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				1737
				1738	*The undefined location storage represents a piece or all of an object that is
				1739	present in the source but not in the object code (perhaps due to optimization).
				1740	Neither reading nor writing to the undefined location storage is meaningful.*
				1741
				1742	An undefined location description specifies the undefined location storage.
				1743	There is no concept of the size of the undefined location storage, nor of a bit
				1744	offset for an undefined location description. The ``DW_OP_LLVM_*offset``
operations leave an undefined location description unchanged. The
``DW_OP_*piece`` operations can explicitly or implicitly specify an undefined
location description, allowing any size and offset to be specified, and result
in a part with all undefined bits.
				1749
				1750	1. ``DW_OP_LLVM_undefined`` New
				1751
				1752	``DW_OP_LLVM_undefined`` pushes a location description L that comprises one
				1753	undefined location description SL.
				1754
				1755	.. _amdgpu-dwarf-memory-location-description-operations:
				1756
				1757	Memory Location Description Operations
				1758	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				1759
				1760	Each of the target architecture specific address spaces has a corresponding
				1761	memory location storage that denotes the linear addressable memory of that
				1762	address space. The size of each memory location storage corresponds to the range
				1763	of the addresses in the corresponding address space.
				1764
				1765	*It is target architecture defined how address space location storage maps to
				1766	target architecture physical memory. For example, they may be independent
				1767	memory, or more than one location storage may alias the same physical memory
				1768	possibly at different offsets and with different interleaving. The mapping may
				1769	also be dictated by the source language address classes.*
				1770
A memory location description specifies a memory location storage. The bit
offset corresponds to a bit position within a byte of the memory. Bits accessed
using a memory location description access the corresponding target
architecture memory starting at the bit position within the byte specified by
the bit offset.
				1776
				1777	A memory location description that has a bit offset that is a multiple of 8 (the
				1778	byte size) is defined to be a byte address memory location description. It has a
				1779	memory byte address A that is equal to the bit offset divided by 8.
				1780
				1781	A memory location description that does not have a bit offset that is a multiple
				1782	of 8 (the byte size) is defined to be a bit field memory location description.
				1783	It has a bit position B equal to the bit offset modulo 8, and a memory byte
				1784	address A equal to the bit offset minus B that is then divided by 8.
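The two definitions above amount to a division with remainder. A minimal Python sketch, using hypothetical function names, is:

```python
def split_bit_offset(bit_offset):
    """Return (byte address A, bit position B) for a memory bit offset."""
    byte_address, bit_position = divmod(bit_offset, 8)
    return byte_address, bit_position


def is_byte_address(bit_offset):
    """True if the bit offset denotes a byte address memory location."""
    return bit_offset % 8 == 0
```

For example, a bit offset of 35 gives A equal to 4 and B equal to 3, so it denotes a bit field memory location description.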
				1785
				1786	The address space AS of a memory location description is defined to be the
				1787	address space that corresponds to the memory location storage associated with
				1788	the memory location description.
				1789
				1790	A location description that is comprised of one byte address memory location
				1791	description SL is defined to be a memory byte address location description. It
				1792	has a byte address equal to A and an address space equal to AS of the
				1793	corresponding SL.
				1794
				1795	``DW_ASPACE_none`` is defined as the target architecture default address space.
				1796
				1797	If a stack entry is required to be a location description, but it is a value V
				1798	with the generic type, then it is implicitly converted to a location description
				1799	L with one memory location description SL. SL specifies the memory location
				1800	storage that corresponds to the target architecture default address space with a
				1801	bit offset equal to V scaled by 8 (the byte size).
				1802
				1803	.. note::
				1804
If it is desired to allow any integral type value to be implicitly converted
to a memory location description in the target architecture default address
space, the rule could instead be:
				1808
				1809	If a stack entry is required to be a location description, but is a value V
				1810	with an integral type, then it is implicitly converted to a location
description L with one memory location description SL. If the type size of
				1812	V is less than the generic type size, then the value V is zero extended to
				1813	the size of the generic type. The least significant generic type size bits
				1814	are treated as a twos-complement unsigned value to be used as an address A.
				1815	SL specifies memory location storage corresponding to the target
				1816	architecture default address space with a bit offset equal to A scaled by 8
				1817	(the byte size).
				1818
				1819	The implicit conversion could also be defined as target architecture specific.
				1820	For example, GDB checks if V is an integral type. If it is not it gives an
				1821	error. Otherwise, GDB zero-extends V to 64 bits. If the GDB target defines a
				1822	hook function, then it is called. The target specific hook function can modify
				1823	the 64-bit value, possibly sign extending based on the original value type.
				1824	Finally, GDB treats the 64-bit value V as a memory location address.
				1825
				1826	If a stack entry is required to be a location description, but it is an implicit
				1827	pointer value IPV with the target architecture default address space, then it is
				1828	implicitly converted to a location description with one single location
				1829	description specified by IPV. See
				1830	:ref:`amdgpu-dwarf-implicit-location-descriptions`.
				1831
				1832	.. note::
				1833
				1834	Is this rule required for DWARF Version 5 backwards compatibility? If not, it
				1835	can be eliminated, and the producer can use
				1836	``DW_OP_LLVM_form_aspace_address``.
				1837
				1838	If a stack entry is required to be a value, but it is a location description L
				1839	with one memory location description SL in the target architecture default
				1840	address space with a bit offset B that is a multiple of 8, then it is implicitly
				1841	converted to a value equal to B divided by 8 (the byte size) with the generic
				1842	type.
				1843
				1844	1. ``DW_OP_addr``
				1845
				1846	``DW_OP_addr`` has a single byte constant value operand, which has the size
				1847	of the generic type, that represents an address A.
				1848
				1849	It pushes a location description L with one memory location description SL
				1850	on the stack. SL specifies the memory location storage corresponding to the
				1851	target architecture default address space with a bit offset equal to A
				1852	scaled by 8 (the byte size).
				1853
				1854	*If the DWARF is part of a code object, then A may need to be relocated. For
				1855	example, in the ELF code object format, A must be adjusted by the difference
				1856	between the ELF segment virtual address and the virtual address at which the
				1857	segment is loaded.*
				1858
				1859	2. ``DW_OP_addrx``
				1860
				1861	``DW_OP_addrx`` has a single unsigned LEB128 integer operand that represents
				1862	a zero-based index into the ``.debug_addr`` section relative to the value of
				1863	the ``DW_AT_addr_base`` attribute of the associated compilation unit. The
				1864	address value A in the ``.debug_addr`` section has the size of the generic
				1865	type.
				1866
				1867	It pushes a location description L with one memory location description SL
				1868	on the stack. SL specifies the memory location storage corresponding to the
				1869	target architecture default address space with a bit offset equal to A
				1870	scaled by 8 (the byte size).
				1871
				1872	*If the DWARF is part of a code object, then A may need to be relocated. For
				1873	example, in the ELF code object format, A must be adjusted by the difference
				1874	between the ELF segment virtual address and the virtual address at which the
				1875	segment is loaded.*
				1876
				1877	3. ``DW_OP_LLVM_form_aspace_address`` New
				1878
``DW_OP_LLVM_form_aspace_address`` pops the top two stack entries. The first
				1880	must be an integral type value that represents a target architecture
				1881	specific address space identifier AS. The second must be an integral type
				1882	value that represents an address A.
				1883
				1884	The address size S is defined as the address bit size of the target
				1885	architecture specific address space that corresponds to AS.
				1886
				1887	A is adjusted to S bits by zero extending if necessary, and then treating the
				1888	least significant S bits as a twos-complement unsigned value A'.
				1889
				1890	It pushes a location description L with one memory location description SL
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	1891	on the stack. SL specifies the memory location storage LS that corresponds
				1892	to AS with a bit offset equal to A' scaled by 8 (the byte size).
				1893
				1894	If AS is an address space that is specific to context elements, then LS
				1895	corresponds to the location storage associated with the current context.
				1896
				1897	*For example, if AS is for per thread storage then LS is the location
				1898	storage for the current thread. For languages that are implemented using a
				1899	SIMD or SIMT execution model, then if AS is for per lane storage then LS is
				1900	the location storage for the current lane of the current thread. Therefore,
				1901	if L is accessed by an operation, the location storage selected when the
				1902	location description was created is accessed, and not the location storage
				1903	associated with the current context of the access operation.*
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1904
				1905	The DWARF expression is ill-formed if AS is not one of the values defined by
				1906	the target architecture specific ``DW_ASPACE_*`` values.
				1907
				1908	See :ref:`amdgpu-dwarf-implicit-location-descriptions` for special rules
				1909	concerning implicit pointer values produced by dereferencing implicit
				1910	location descriptions created by the ``DW_OP_implicit_pointer`` and
				1911	``DW_OP_LLVM_implicit_aspace_pointer`` operations.
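The address adjustment performed by ``DW_OP_LLVM_form_aspace_address`` can be sketched in Python as follows. The function name, the tuple used to stand in for a memory location description, and the address space table in the usage note are all hypothetical and are not the values of any real target. The sketch omits context-specific storage selection and implicit pointer values.

```python
def form_aspace_address(address, aspace, address_bits):
    """Sketch: return (AS, bit offset) for address A in address space AS.

    address_bits maps each address space identifier to its address size
    S in bits; it is a hypothetical stand-in for target-defined data.
    """
    if aspace not in address_bits:
        raise ValueError("ill-formed: unknown address space identifier")
    size = address_bits[aspace]
    # Keep the least significant S bits and treat them as unsigned (A').
    a_prime = address & ((1 << size) - 1)
    # The bit offset is A' scaled by 8 (the byte size).
    return aspace, a_prime * 8
```

With a table such as ``{0: 64, 3: 32}``, an address of ``0x100000010`` in address space 3 is reduced to ``0x10`` and yields a bit offset of 128.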
				1912
				1913	4. ``DW_OP_form_tls_address``
				1914
				1915	``DW_OP_form_tls_address`` pops one stack entry that must be an integral
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	1916	type value and treats it as a thread-local storage address TA.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1917
				1918	It pushes a location description L with one memory location description SL
				1919	on the stack. SL is the target architecture specific memory location
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	1920	description that corresponds to the thread-local storage address TA.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1921
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	1922	The meaning of the thread-local storage address TA is defined by the
				1923	run-time environment. If the run-time environment supports multiple
				1924	thread-local storage blocks for a single thread, then the block
				1925	corresponding to the executable or shared library containing this DWARF
				1926	expression is used.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1927
				1928	*Some implementations of C, C++, Fortran, and other languages support a
				1929	thread-local storage class. Variables with this storage class have distinct
				1930	values and addresses in distinct threads, much as automatic variables have
				1931	distinct values and addresses in each subprogram invocation. Typically,
				1932	there is a single block of storage containing all thread-local variables
				1933	declared in the main executable, and a separate block for the variables
				1934	declared in each shared library. Each thread-local variable can then be
				1935	accessed in its block using an identifier. This identifier is typically a
				1936	byte offset into the block and pushed onto the DWARF stack by one of the*
``DW_OP_const*`` *operations prior to the* ``DW_OP_form_tls_address``
				1938	*operation. Computing the address of the appropriate block can be complex
				1939	(in some cases, the compiler emits a function call to do it), and difficult
				1940	to describe using ordinary DWARF location descriptions. Instead of forcing
				1941	complex thread-local storage calculations into the DWARF expressions, the*
				1942	``DW_OP_form_tls_address`` *allows the consumer to perform the computation
				1943	based on the target architecture specific run-time environment.*
				1944
				1945	5. ``DW_OP_call_frame_cfa``
				1946
				1947	``DW_OP_call_frame_cfa`` pushes the location description L of the Canonical
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	1948	Frame Address (CFA) of the current subprogram, obtained from the call frame
				1949	information on the stack. See :ref:`amdgpu-dwarf-call-frame-information`.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1950
*Although the value of the* ``DW_AT_frame_base`` *attribute of the debugging
information entry corresponding to the current subprogram can be computed
using a location list expression, in some cases this would require an
extensive location list because the values of the registers used in
computing the CFA change during a subprogram execution. If the call frame
information is present, then it already encodes such changes, and it is
space efficient to reference that using the* ``DW_OP_call_frame_cfa``
*operation.*
				1959
				1960	6. ``DW_OP_fbreg``
				1961
				1962	``DW_OP_fbreg`` has a single signed LEB128 integer operand that represents a
				1963	byte displacement B.
				1964
				1965	The location description L for the frame base of the current subprogram is
obtained from the ``DW_AT_frame_base`` attribute of the debugging information
				1967	entry corresponding to the current subprogram as described in
				1968	:ref:`amdgpu-dwarf-debugging-information-entry-attributes`.
				1969
The location description L is updated as if the ``DW_OP_consts B;
DW_OP_LLVM_offset`` operations were applied (B is signed, so
``DW_OP_LLVM_offset_uconst`` cannot represent a negative displacement). The
updated L is pushed on the stack.
				1972
				1973	7. ``DW_OP_breg0``, ``DW_OP_breg1``, ..., ``DW_OP_breg31``
				1974
				1975	The ``DW_OP_breg<N>`` operations encode the numbers of up to 32 registers,
				1976	numbered from 0 through 31, inclusive. The register number R corresponds to
				1977	the N in the operation name.
				1978
				1979	They have a single signed LEB128 integer operand that represents a byte
				1980	displacement B.
				1981
				1982	The address space identifier AS is defined as the one corresponding to the
				1983	target architecture specific default address space.
				1984
				1985	The address size S is defined as the address bit size of the target
				1986	architecture specific address space corresponding to AS.
				1987
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	1988	The contents of the register specified by R are retrieved as if a
				1989	``DW_OP_regval_type R, DR`` operation was performed where DR is the offset
				1990	of a hypothetical debug information entry in the current compilation unit
				1991	for an unsigned integral base type of size S bits. B is added and the least
				1992	significant S bits are treated as an unsigned value to be used as an address
				1993	A.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	1994
They push a location description L comprising one memory location
description SL on the stack. SL specifies the memory location storage that
corresponds to AS with a bit offset equal to A scaled by 8 (the byte size).
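The address computation for ``DW_OP_breg<N>`` can be sketched in Python. The function name is hypothetical, and the default of 64 address bits is an assumption for illustration rather than a property of any particular target. The register contents are taken as an unsigned value of S bits, the signed displacement B is added, and the result is truncated to S bits.

```python
def breg_bit_offset(register_value, displacement, address_bits=64):
    """Sketch: bit offset of the memory location for register + B."""
    mask = (1 << address_bits) - 1
    # Treat the register contents as an unsigned S-bit value, add the
    # signed displacement, and keep the least significant S bits as A.
    address = ((register_value & mask) + displacement) & mask
    # The bit offset is A scaled by 8 (the byte size).
    return address * 8
```

Because the sum wraps at S bits, a negative displacement larger than the register value produces a high address rather than a negative one.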
				1998
				1999	8. ``DW_OP_bregx``
				2000
				2001	``DW_OP_bregx`` has two operands. The first is an unsigned LEB128 integer
				2002	that represents a register number R. The second is a signed LEB128
				2003	integer that represents a byte displacement B.
				2004
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	2005	The action is the same as for ``DW_OP_breg<N>``, except that R is used as
				2006	the register number and B is used as the byte displacement.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	2007
				2008	9. ``DW_OP_LLVM_aspace_bregx`` New
				2009
				2010	``DW_OP_LLVM_aspace_bregx`` has two operands. The first is an unsigned
				2011	LEB128 integer that represents a register number R. The second is a signed
				2012	LEB128 integer that represents a byte displacement B. It pops one stack
				2013	entry that is required to be an integral type value that represents a target
				2014	architecture specific address space identifier AS.
				2015
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	2016	The action is the same as for ``DW_OP_breg<N>``, except that R is used as
				2017	the register number, B is used as the byte displacement, and AS is used as
				2018	the address space identifier.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	2019
				2020	The DWARF expression is ill-formed if AS is not one of the values defined by
				2021	the target architecture specific ``DW_ASPACE_*`` values.
				2022
				2023	.. note::
				2024
Could also consider adding ``DW_OP_aspace_breg0, DW_OP_aspace_breg1, ...,
DW_OP_aspace_breg31`` which would save encoding size.
				2027
				2028	.. _amdgpu-dwarf-register-location-descriptions:
				2029
				2030	Register Location Description Operations
				2031	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				2032
				2033	There is a register location storage that corresponds to each of the target
				2034	architecture registers. The size of each register location storage corresponds
				2035	to the size of the corresponding target architecture register.
				2036
				2037	A register location description specifies a register location storage. The bit
				2038	offset corresponds to a bit position within the register. Bits accessed using a
				2039	register location description access the corresponding target architecture
				2040	register starting at the specified bit offset.
				2041
				2042	1. ``DW_OP_reg0``, ``DW_OP_reg1``, ..., ``DW_OP_reg31``
				2043
				2044	``DW_OP_reg<N>`` operations encode the numbers of up to 32 registers,
				2045	numbered from 0 through 31, inclusive. The target architecture register
				2046	number R corresponds to the N in the operation name.
				2047
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	2048	The operation is equivalent to performing ``DW_OP_regx R``.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	2049
				2050	2. ``DW_OP_regx``
				2051
				2052	``DW_OP_regx`` has a single unsigned LEB128 integer operand that represents
				2053	a target architecture register number R.
				2054
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	2055	If the current call frame is the top call frame, it pushes a location
				2056	description L that specifies one register location description SL on the
				2057	stack. SL specifies the register location storage that corresponds to R with
				2058	a bit offset of 0 for the current thread.
				2059
				2060	If the current call frame is not the top call frame, call frame information
				2061	(see :ref:`amdgpu-dwarf-call-frame-information`) is used to determine the
				2062	location description that holds the register for the current call frame and
				2063	current program location of the current thread. The resulting location
				2064	description L is pushed.
				2065
				2066	*Note that if call frame information is used, the resulting location
				2067	description may be register, memory, or undefined.*
				2068
				2069	*An implementation may evaluate the call frame information immediately, or
				2070	may defer evaluation until L is accessed by an operation. If evaluation is
Kazu Hirata	a31b389	2020-08-09 19:29:38 -0700	[diff] [blame]	2071	deferred, R and the current context can be recorded in L. When accessed, the
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	2072	recorded context is used to evaluate the call frame information, not the
				2073	current context of the access operation.*
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	2074
*These operations obtain a register location. To fetch the contents of a
register, it is necessary to use* ``DW_OP_regval_type``\ *, use one of the*
``DW_OP_breg*`` *register-based addressing operations, or use* ``DW_OP_deref*``
*on a register location description.*
				2079
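As a rough illustration of the operand encoding described above, the following
Python sketch encodes ``DW_OP_regx R`` as bytes. The helper names ``uleb128``
and ``encode_regx`` are hypothetical and are not part of any DWARF library. The
opcode value ``0x90`` is the one DWARF Version 5 assigns to ``DW_OP_regx``.

.. code-block:: python

  DW_OP_REGX = 0x90  # opcode assigned to DW_OP_regx in DWARF Version 5


  def uleb128(value: int) -> bytes:
      """Encode a non-negative integer as unsigned LEB128."""
      if value < 0:
          raise ValueError("unsigned LEB128 cannot encode negative values")
      out = bytearray()
      while True:
          byte = value & 0x7F      # take the low 7 bits
          value >>= 7
          if value:
              out.append(byte | 0x80)  # set the continuation bit
          else:
              out.append(byte)         # last byte has the high bit clear
              return bytes(out)


  def encode_regx(register_number: int) -> bytes:
      """Encode DW_OP_regx with its single unsigned LEB128 operand R."""
      return bytes([DW_OP_REGX]) + uleb128(register_number)

Register numbers below 128 fit in one operand byte. Larger register numbers
need additional bytes, which is why the operand is variable length.
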
				2080	.. _amdgpu-dwarf-implicit-location-descriptions:
				2081
				2082	Implicit Location Description Operations
				2083	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				2084
Implicit location storage represents a piece or all of an object which has no
actual location in the program but whose contents are nonetheless known, either
as a constant or as a value that can be computed from other locations and
values in the program.

An implicit location description specifies an implicit location storage. The bit
offset corresponds to a bit position within the implicit location storage. Bits
accessed using an implicit location description access the corresponding
implicit storage value starting at the bit offset.
				2093
				2094	1. ``DW_OP_implicit_value``
				2095
				2096	``DW_OP_implicit_value`` has two operands. The first is an unsigned LEB128
				2097	integer that represents a byte size S. The second is a block of bytes with a
				2098	length equal to S treated as a literal value V.
				2099
				2100	An implicit location storage LS is created with the literal value V and a
				2101	size of S.
				2102
				2103	It pushes location description L with one implicit location description SL
				2104	on the stack. SL specifies LS with a bit offset of 0.
				2105
				2106	2. ``DW_OP_stack_value``
				2107
				2108	``DW_OP_stack_value`` pops one stack entry that must be a value V.
				2109
				2110	An implicit location storage LS is created with the literal value V and a
				2111	size equal to V's base type size.
				2112
				2113	It pushes a location description L with one implicit location description SL
				2114	on the stack. SL specifies LS with a bit offset of 0.
				2115
*The* ``DW_OP_stack_value`` *operation specifies that the object does not
exist in memory, but its value is nonetheless known. In this form, the
location description specifies the actual value of the object, rather than
specifying the memory or register storage that holds the value.*
				2120
See :ref:`amdgpu-dwarf-implicit-location-descriptions` for special rules
concerning implicit pointer values produced by dereferencing implicit
location descriptions created by the ``DW_OP_implicit_pointer`` and
``DW_OP_LLVM_aspace_implicit_pointer`` operations.
				2125
				2126	.. note::
				2127
				2128	Since location descriptions are allowed on the stack, the
				2129	``DW_OP_stack_value`` operation no longer terminates the DWARF operation
				2130	expression execution as in DWARF Version 5.
				2131
				2132	3. ``DW_OP_implicit_pointer``
				2133
				2134	*An optimizing compiler may eliminate a pointer, while still retaining the
				2135	value that the pointer addressed.* ``DW_OP_implicit_pointer`` *allows a
				2136	producer to describe this value.*
				2137
				2138	``DW_OP_implicit_pointer`` *specifies an object is a pointer to the target
				2139	architecture default address space that cannot be represented as a real
				2140	pointer, even though the value it would point to can be described. In this
				2141	form, the location description specifies a debugging information entry that
				2142	represents the actual location description of the object to which the
				2143	pointer would point. Thus, a consumer of the debug information would be able
				2144	to access the dereferenced pointer, even when it cannot access the pointer
				2145	itself.*
				2146
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	2147	``DW_OP_implicit_pointer`` has two operands. The first operand is a 4-byte
				2148	unsigned value in the 32-bit DWARF format, or an 8-byte unsigned value in
				2149	the 64-bit DWARF format, that represents the byte offset DR of a debugging
				2150	information entry D relative to the beginning of the ``.debug_info`` section
				2151	that contains the current compilation unit. The second operand is a signed
				2152	LEB128 integer that represents a byte displacement B.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	2153
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	2154	Note that D may not be in the current compilation unit.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	2155
				2156	The first operand interpretation is exactly like that for
				2157	``DW_FORM_ref_addr``\ .
				2158
				2159	The address space identifier AS is defined as the one corresponding to the
				2160	target architecture specific default address space.
				2161
				2162	The address size S is defined as the address bit size of the target
				2163	architecture specific address space corresponding to AS.
				2164
				2165	An implicit location storage LS is created with the debugging information
				2166	entry D, address space AS, and size of S.
				2167
				2168	It pushes a location description L that comprises one implicit location
				2169	description SL on the stack. SL specifies LS with a bit offset of 0.
				2170
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	2171	It is an evaluation error if a ``DW_OP_deref*`` operation pops a location
				2172	description L', and retrieves S bits, such that any retrieved bits come from
				2173	an implicit location storage that is the same as LS, unless both the
				2174	following conditions are met:
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	2175
				2176	1. All retrieved bits come from an implicit location description that
				2177	refers to an implicit location storage that is the same as LS.
				2178
*Note that the bits do not all have to come from the same implicit location
description, as L' may involve composite location descriptions.*
				2181
				2182	2. The bits come from consecutive ascending offsets within their respective
				2183	implicit location storage.
				2184
				2185	These rules are equivalent to retrieving the complete contents of LS.
				2186
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	2187	If both the above conditions are met, then the value V pushed by the
				2188	``DW_OP_deref*`` operation is an implicit pointer value IPV with a target
				2189	architecture specific address space of AS, a debugging information entry of
				2190	D, and a base type of T. If AS is the target architecture default address
				2191	space, then T is the generic type. Otherwise, T is a target architecture
				2192	specific integral type with a bit size equal to S.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	2193
				2194	If IPV is either implicitly converted to a location description (only done
				2195	if AS is the target architecture default address space) or used by
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	2196	``DW_OP_LLVM_form_aspace_address`` (only done if the address space popped by
				2197	``DW_OP_LLVM_form_aspace_address`` is AS), then the resulting location
				2198	description RL is:
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	2199
				2200	* If D has a ``DW_AT_location`` attribute, the DWARF expression E from the
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	2201	``DW_AT_location`` attribute is evaluated with the current context, except
				2202	that the result kind is a location description, the compilation unit is
				2203	the one that contains D, the object is unspecified, and the initial stack
				2204	is empty. RL is the expression result.
				2205
*Note that E is evaluated with the context of the expression accessing
IPV, and not the context of the expression that contained the*
``DW_OP_implicit_pointer`` *or* ``DW_OP_LLVM_aspace_implicit_pointer``
*operation that created L.*
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	2210
				2211	* If D has a ``DW_AT_const_value`` attribute, then an implicit location
				2212	storage RLS is created from the ``DW_AT_const_value`` attribute's value
				2213	with a size matching the size of the ``DW_AT_const_value`` attribute's
				2214	value. RL comprises one implicit location description SRL. SRL specifies
				2215	RLS with a bit offset of 0.
				2216
				2217	.. note::
				2218
				2219	If using ``DW_AT_const_value`` for variables and formal parameters is
				2220	deprecated and instead ``DW_AT_location`` is used with an implicit
				2221	location description, then this rule would not be required.
				2222
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	2223	* Otherwise, it is an evaluation error.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	2224
Tony	756ba35	2020-04-20 16:55:34 -0400	[diff] [blame]	2225	The bit offset of RL is updated as if the ``DW_OP_LLVM_offset_uconst B``
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	2226	operation was applied.
				2227
				2228	If a ``DW_OP_stack_value`` operation pops a value that is the same as IPV,
				2229	then it pushes a location description that is the same as L.
				2230
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	2231	It is an evaluation error if LS or IPV is accessed in any other manner.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	2232
*The restrictions on how an implicit pointer location description created
by* ``DW_OP_implicit_pointer`` *and* ``DW_OP_LLVM_aspace_implicit_pointer``
*can be used are to simplify the DWARF consumer. Similarly, for an implicit
pointer value created by* ``DW_OP_deref*`` *and* ``DW_OP_stack_value``\ *.*
				2237
				2238	4. ``DW_OP_LLVM_aspace_implicit_pointer`` New
				2239
				2240	``DW_OP_LLVM_aspace_implicit_pointer`` has two operands that are the same as
				2241	for ``DW_OP_implicit_pointer``.
				2242
				2243	It pops one stack entry that must be an integral type value that represents
				2244	a target architecture specific address space identifier AS.
				2245
				2246	The location description L that is pushed on the stack is the same as for
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	2247	``DW_OP_implicit_pointer``, except that the address space identifier used is
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	2248	AS.
				2249
				2250	The DWARF expression is ill-formed if AS is not one of the values defined by
				2251	the target architecture specific ``DW_ASPACE_*`` values.
				2252
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	2253	.. note::
				2254
				2255	This definition of ``DW_OP_LLVM_aspace_implicit_pointer`` may change when
				2256	full support for address classes is added as required for languages such
as OpenCL/SYCL.
				2258
*Typically a* ``DW_OP_implicit_pointer`` *or*
``DW_OP_LLVM_aspace_implicit_pointer`` *operation is used in a DWARF expression
E*\ :sub:`1` *of a* ``DW_TAG_variable`` *or* ``DW_TAG_formal_parameter``
*debugging information entry D*\ :sub:`1`\ *'s* ``DW_AT_location`` *attribute.
The debugging information entry referenced by the* ``DW_OP_implicit_pointer``
*or* ``DW_OP_LLVM_aspace_implicit_pointer`` *operations is typically itself a*
``DW_TAG_variable`` *or* ``DW_TAG_formal_parameter`` *debugging information
entry D*\ :sub:`2` *whose* ``DW_AT_location`` *attribute gives a second DWARF
expression E*\ :sub:`2`\ *.*
				2268
*D*\ :sub:`1` *and E*\ :sub:`1` *are describing the location of a pointer type
object. D*\ :sub:`2` *and E*\ :sub:`2` *are describing the location of the
object pointed to by that pointer object.*

*However, D*\ :sub:`2` *may be any debugging information entry that contains a*
``DW_AT_location`` *or* ``DW_AT_const_value`` *attribute (for example,*
``DW_TAG_dwarf_procedure``\ *). By using E*\ :sub:`2`\ *, a consumer can
reconstruct the value of the object when asked to dereference the pointer
described by E*\ :sub:`1` *which contains the* ``DW_OP_implicit_pointer`` *or*
``DW_OP_LLVM_aspace_implicit_pointer`` *operation.*
				2279
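The rules above for turning an implicit pointer value IPV into a location
description can be sketched in Python as follows. This is a simplified model
under stated assumptions, not a consumer implementation: the ``Die`` and
``Location`` classes and the ``resolve_implicit_pointer`` function are
hypothetical, and evaluating the ``DW_AT_location`` expression is replaced by a
precomputed storage value.

.. code-block:: python

  from dataclasses import dataclass
  from typing import Optional


  @dataclass
  class Die:
      """Hypothetical stand-in for the debugging information entry D."""
      # Storage that evaluating DW_AT_location would yield (assumed precomputed).
      location_storage: Optional[bytes] = None
      # Raw bytes of a DW_AT_const_value attribute, if present.
      const_value: Optional[bytes] = None


  @dataclass
  class Location:
      """Hypothetical single location description: storage plus bit offset."""
      storage: bytes
      bit_offset: int


  def resolve_implicit_pointer(die: Die, displacement_bytes: int) -> Location:
      """Return RL for an implicit pointer that refers to die."""
      if die.location_storage is not None:
          # DW_AT_location takes precedence when D has it.
          storage = die.location_storage
      elif die.const_value is not None:
          # Otherwise an implicit storage is created from DW_AT_const_value.
          storage = die.const_value
      else:
          raise ValueError("evaluation error: D has no usable attribute")
      # The byte displacement B updates the bit offset, as an offset
      # operation would.
      return Location(storage, 8 * displacement_bytes)

The order of the checks mirrors the order of the rules: ``DW_AT_location``
first, then ``DW_AT_const_value``, and an evaluation error otherwise.
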
Tony	756ba35	2020-04-20 16:55:34 -0400	[diff] [blame]	2280	.. _amdgpu-dwarf-composite-location-description-operations:
				2281
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	2282	Composite Location Description Operations
				2283	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				2284
				2285	A composite location storage represents an object or value which may be
				2286	contained in part of another location storage or contained in parts of more
				2287	than one location storage.
				2288
Each part has a part location description L and a part bit size S. L can have
one or more single location descriptions SL. If there is more than one SL, then
the part is located in more than one place. The bits of each place of the part
comprise S contiguous bits from the location storage LS specified by SL,
starting at the bit offset specified by SL. All the bits must be within the
size of LS or the DWARF expression is ill-formed.
				2295
A composite location storage can have zero or more parts. The parts are
contiguous such that the zero-based location storage bit index will range over
each part with no gaps between them. Therefore, the size of a composite location
storage is the sum of the sizes of its parts. The DWARF expression is ill-formed
if the size of the composite location storage is larger than the size of the
memory location storage corresponding to the largest target architecture
specific address space.
				2303
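The layout of parts described above can be pictured with a small Python sketch.
The ``Part`` class and the ``composite_size`` and ``find_part`` functions are
hypothetical names used only for illustration; the part location description is
reduced to a text label.

.. code-block:: python

  from dataclasses import dataclass
  from typing import List, Tuple


  @dataclass
  class Part:
      """Hypothetical part: a label for its location description and a size."""
      location: str
      bit_size: int


  def composite_size(parts: List[Part]) -> int:
      """The storage size is the sum of the sizes of the parts."""
      return sum(part.bit_size for part in parts)


  def find_part(parts: List[Part], bit_index: int) -> Tuple[Part, int]:
      """Map a zero-based storage bit index to (part, bit index within part)."""
      if bit_index < 0:
          raise IndexError("bit index must not be negative")
      start = 0
      for part in parts:
          if bit_index < start + part.bit_size:
              return part, bit_index - start
          start += part.bit_size  # parts are contiguous, with no gaps
      raise IndexError("bit index is outside the composite location storage")

For example, with a 32-bit part followed by a 64-bit part, storage bit 40 falls
at bit 8 of the second part.
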
				2304	A composite location description specifies a composite location storage. The bit
				2305	offset corresponds to a bit position within the composite location storage.
				2306
				2307	There are operations that create a composite location storage.
				2308
There are other operations that allow a composite location storage to be
incrementally created. Each part is created by a separate operation. There may
be one or more operations to create the final composite location storage. A
series of such operations describes the parts of the composite location storage
in the order in which the associated part operations are executed.
				2314
				2315	To support incremental creation, a composite location storage can be in an
				2316	incomplete state. When an incremental operation operates on an incomplete
				2317	composite location storage, it adds a new part, otherwise it creates a new
				2318	composite location storage. The ``DW_OP_LLVM_piece_end`` operation explicitly
				2319	makes an incomplete composite location storage complete.
				2320
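The incomplete state described above behaves like a small state machine. The
following Python sketch is one way to picture it, assuming a stack modeled as a
list. The ``Composite`` class and the ``add_part`` and ``piece_end`` functions
are hypothetical. For brevity the part location description is passed as an
argument rather than popped from the stack as the real operations do.

.. code-block:: python

  from dataclasses import dataclass, field
  from typing import List, Tuple


  @dataclass
  class Composite:
      """Hypothetical composite location storage with a completeness flag."""
      parts: List[Tuple[str, int]] = field(default_factory=list)
      complete: bool = False


  def add_part(stack: list, part_location: str, bit_size: int) -> None:
      """Model of an incremental part operation."""
      top = stack[-1] if stack else None
      if isinstance(top, Composite) and not top.complete:
          # An incomplete composite is on top: append a new part to it.
          top.parts.append((part_location, bit_size))
      else:
          # Otherwise start a new, incomplete composite.
          stack.append(Composite(parts=[(part_location, bit_size)]))


  def piece_end(stack: list) -> None:
      """Model of DW_OP_LLVM_piece_end: mark the top composite complete."""
      top = stack[-1] if stack else None
      if not isinstance(top, Composite) or top.complete:
          raise ValueError("ill-formed: top entry is not an incomplete composite")
      top.complete = True

After ``piece_end``, a further ``add_part`` no longer extends the existing
composite and instead starts a new one, which is how an expression can build
several composites in sequence.
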
				2321	A composite location description that specifies a composite location storage
				2322	that is incomplete is termed an incomplete composite location description. A
				2323	composite location description that specifies a composite location storage that
				2324	is complete is termed a complete composite location description.
				2325
				2326	If the top stack entry is a location description that has one incomplete
				2327	composite location description SL after the execution of an operation expression
				2328	has completed, SL is converted to a complete composite location description.
				2329
*Note that this conversion does not happen after the completion of an operation
expression that is evaluated on the same stack by the* ``DW_OP_call*``
*operations. Such executions are not a separate evaluation of an operation
expression, but rather the continued evaluation of the same operation expression
that contains the* ``DW_OP_call*`` *operation.*
				2335
				2336	If a stack entry is required to be a location description L, but L has an
				2337	incomplete composite location description, then the DWARF expression is
				2338	ill-formed. The exception is for the operations involved in incrementally
				2339	creating a composite location description as described below.
				2340
				2341	*Note that a DWARF operation expression may arbitrarily compose composite
				2342	location descriptions from any other location description, including those that
				2343	have multiple single location descriptions, and those that have composite
				2344	location descriptions.*
				2345
				2346	*The incremental composite location description operations are defined to be
				2347	compatible with the definitions in DWARF Version 5.*
				2348
				2349	1. ``DW_OP_piece``
				2350
				2351	``DW_OP_piece`` has a single unsigned LEB128 integer that represents a byte
				2352	size S.
				2353
				2354	The action is based on the context:
				2355
				2356	* If the stack is empty, then a location description L comprised of one
				2357	incomplete composite location description SL is pushed on the stack.
				2358
				2359	An incomplete composite location storage LS is created with a single part
				2360	P. P specifies a location description PL and has a bit size of S scaled by
				2361	8 (the byte size). PL is comprised of one undefined location description
				2362	PSL.
				2363
				2364	SL specifies LS with a bit offset of 0.
				2365
				2366	* Otherwise, if the top stack entry is a location description L comprised of
				2367	one incomplete composite location description SL, then the incomplete
				2368	composite location storage LS that SL specifies is updated to append a new
				2369	part P. P specifies a location description PL and has a bit size of S
				2370	scaled by 8 (the byte size). PL is comprised of one undefined location
				2371	description PSL. L is left on the stack.
				2372
				2373	* Otherwise, if the top stack entry is a location description or can be
				2374	converted to one, then it is popped and treated as a part location
				2375	description PL. Then:
				2376
				2377	* If the top stack entry (after popping PL) is a location description L
				2378	comprised of one incomplete composite location description SL, then the
				2379	incomplete composite location storage LS that SL specifies is updated to
				2380	append a new part P. P specifies the location description PL and has a
				2381	bit size of S scaled by 8 (the byte size). L is left on the stack.
				2382
				2383	* Otherwise, a location description L comprised of one incomplete
				2384	composite location description SL is pushed on the stack.
				2385
				2386	An incomplete composite location storage LS is created with a single
				2387	part P. P specifies the location description PL and has a bit size of S
				2388	scaled by 8 (the byte size).
				2389
				2390	SL specifies LS with a bit offset of 0.
				2391
* Otherwise, the DWARF expression is ill-formed.
				2393
*Many compilers store a single variable in sets of registers or store a
variable partially in memory and partially in registers.* ``DW_OP_piece``
*provides a way of describing where a part of a variable is located.*

*If a non-0 byte displacement is required, the* ``DW_OP_LLVM_offset``
*operation can be used to update the location description before using it as
the part location description of a* ``DW_OP_piece`` *operation.*

*The evaluation rules for the* ``DW_OP_piece`` *operation allow it to be
compatible with the DWARF Version 5 definition.*
				2404
				2405	.. note::
				2406
Since these extensions allow location descriptions to be entries on the
stack, a simpler operation could be defined to create composite location
descriptions. For example, just one operation that specifies how many parts
there are, and pops pairs of stack entries for the part size and location
description. Not only would this be a simpler operation and avoid the
complexities of incomplete composite location descriptions, but it may also
have a smaller encoding in practice. However, the desire for compatibility
with DWARF Version 5 is likely a stronger consideration.
				2415
				2416	2. ``DW_OP_bit_piece``
				2417
				2418	``DW_OP_bit_piece`` has two operands. The first is an unsigned LEB128
				2419	integer that represents the part bit size S. The second is an unsigned
				2420	LEB128 integer that represents a bit displacement B.
				2421
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	2422	The action is the same as for ``DW_OP_piece``, except that any part created
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	2423	has the bit size S, and the location description PL of any created part is
				2424	updated as if the ``DW_OP_constu B; DW_OP_LLVM_bit_offset`` operations were
				2425	applied.
				2426
``DW_OP_bit_piece`` *is used instead of* ``DW_OP_piece`` *when the piece to
be assembled is not byte-sized or is not at the start of the part location
description.*

*If a computed bit displacement is required, the* ``DW_OP_LLVM_bit_offset``
*operation can be used to update the location description before using it as
the part location description of a* ``DW_OP_bit_piece`` *operation.*
				2434
				2435	.. note::
				2436
				2437	The bit offset operand is not needed as ``DW_OP_LLVM_bit_offset`` can be
				2438	used on the part's location description.
				2439
				2440	3. ``DW_OP_LLVM_piece_end`` New
				2441
				2442	If the top stack entry is not a location description L comprised of one
				2443	incomplete composite location description SL, then the DWARF expression is
				2444	ill-formed.
				2445
Otherwise, the incomplete composite location storage LS specified by SL is
updated to be a complete composite location storage with the same parts.
				2448
				2449	4. ``DW_OP_LLVM_extend`` New
				2450
				2451	``DW_OP_LLVM_extend`` has two operands. The first is an unsigned LEB128
				2452	integer that represents the element bit size S. The second is an unsigned
				2453	LEB128 integer that represents a count C.
				2454
				2455	It pops one stack entry that must be a location description and is treated
				2456	as the part location description PL.
				2457
				2458	A location description L comprised of one complete composite location
				2459	description SL is pushed on the stack.
				2460
				2461	A complete composite location storage LS is created with C identical parts
				2462	P. Each P specifies PL and has a bit size of S.
				2463
				2464	SL specifies LS with a bit offset of 0.
				2465
The DWARF expression is ill-formed if the element bit size or count is 0.
				2467
				2468	5. ``DW_OP_LLVM_select_bit_piece`` New
				2469
				2470	``DW_OP_LLVM_select_bit_piece`` has two operands. The first is an unsigned
				2471	LEB128 integer that represents the element bit size S. The second is an
				2472	unsigned LEB128 integer that represents a count C.
				2473
				2474	It pops three stack entries. The first must be an integral type value that
				2475	represents a bit mask value M. The second must be a location description
				2476	that represents the one-location description L1. The third must be a
				2477	location description that represents the zero-location description L0.
				2478
				2479	A complete composite location storage LS is created with C parts P\ :sub:`N`
				2480	ordered in ascending N from 0 to C-1 inclusive. Each P\ :sub:`N` specifies
				2481	location description PL\ :sub:`N` and has a bit size of S.
				2482
				2483	PL\ :sub:`N` is as if the ``DW_OP_LLVM_bit_offset N*S`` operation was
				2484	applied to PLX\ :sub:`N`\ .
				2485
				2486	PLX\ :sub:`N` is the same as L0 if the N\ :sup:`th` least significant bit of
				2487	M is a zero, otherwise it is the same as L1.
				2488
				2489	A location description L comprised of one complete composite location
				2490	description SL is pushed on the stack. SL specifies LS with a bit offset of
				2491	0.
				2492
The DWARF expression is ill-formed if S or C is 0, or if the bit size of M
is less than C.
				2495
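The part selection performed by ``DW_OP_LLVM_extend`` and
``DW_OP_LLVM_select_bit_piece`` can be illustrated with a short Python sketch.
The function names and the tuple layout are hypothetical. Location descriptions
are reduced to the labels ``"PL"``, ``"L0"``, and ``"L1"``, and the mask bit
size is passed explicitly because a Python integer carries no type size.

.. code-block:: python

  from typing import List, Tuple


  def extend_parts(bit_size: int, count: int) -> List[Tuple[str, int]]:
      """Model of DW_OP_LLVM_extend: C identical parts of S bits each."""
      if bit_size == 0 or count == 0:
          raise ValueError("ill-formed: element bit size or count is 0")
      return [("PL", bit_size)] * count


  def select_bit_piece_parts(
      mask: int, mask_bit_size: int, bit_size: int, count: int
  ) -> List[Tuple[str, int, int]]:
      """Model of DW_OP_LLVM_select_bit_piece.

      Returns one (source, added bit offset, part bit size) tuple per part.
      """
      if bit_size == 0 or count == 0 or mask_bit_size < count:
          raise ValueError("ill-formed: S or C is 0, or M has fewer than C bits")
      parts = []
      for n in range(count):
          # Bit n of the mask selects L1 when set and L0 when clear.
          source = "L1" if (mask >> n) & 1 else "L0"
          # The chosen location is advanced by N*S bits.
          parts.append((source, n * bit_size, bit_size))
      return parts

With a mask of ``0b0101``, an element size of 32, and a count of 4, elements 0
and 2 come from L1 and elements 1 and 3 come from L0, each at a bit offset of
32 times its index.
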
				2496	.. _amdgpu-dwarf-location-list-expressions:
				2497
				2498	DWARF Location List Expressions
				2499	+++++++++++++++++++++++++++++++
				2500
				2501	*To meet the needs of recent computer architectures and optimization techniques,
				2502	debugging information must be able to describe the location of an object whose
				2503	location changes over the object’s lifetime, and may reside at multiple
				2504	locations during parts of an object's lifetime. Location list expressions are
				2505	used in place of operation expressions whenever the object whose location is
				2506	being described has these requirements.*
				2507
				2508	A location list expression consists of a series of location list entries. Each
				2509	location list entry is one of the following kinds:
				2510
				2511	Bounded location description
				2512
				2513	This kind of location list entry provides an operation expression that
				2514	evaluates to the location description of an object that is valid over a
				2515	lifetime bounded by a starting and ending address. The starting address is the
				2516	lowest address of the address range over which the location is valid. The
				2517	ending address is the address of the first location past the highest address
				2518	of the address range.
				2519
				2520	The location list entry matches when the current program location is within
				2521	the given range.
				2522
				2523	There are several kinds of bounded location description entries which differ
				2524	in the way that they specify the starting and ending addresses.
				2525
				2526	Default location description
				2527
				2528	This kind of location list entry provides an operation expression that
				2529	evaluates to the location description of an object that is valid when no
				2530	bounded location description entry applies.
				2531
				2532	The location list entry matches when the current program location is not
				2533	within the range of any bounded location description entry.
				2534
				2535	Base address
				2536
				2537	This kind of location list entry provides an address to be used as the base
				2538	address for beginning and ending address offsets given in certain kinds of
				2539	bounded location description entries. The applicable base address of a bounded
				2540	location description entry is the address specified by the closest preceding
				2541	base address entry in the same location list. If there is no preceding base
				2542	address entry, then the applicable base address defaults to the base address
				2543	of the compilation unit (see DWARF Version 5 section 3.1.1).
				2544
				2545	In the case of a compilation unit where all of the machine code is contained
				2546	in a single contiguous section, no base address entry is needed.
				2547
				2548	End-of-list
				2549
				2550	This kind of location list entry marks the end of the location list
				2551	expression.
				2552
				2553	The address ranges defined by the bounded location description entries of a
				2554	location list expression may overlap. When they do, they describe a situation in
				2555	which an object exists simultaneously in more than one place.
				2556
				2557	If all of the address ranges in a given location list expression do not
				2558	collectively cover the entire range over which the object in question is
				2559	defined, and there is no following default location description entry, it is
				2560	assumed that the object is not available for the portion of the range that is
				2561	not covered.
				2562
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	2563	The result of the evaluation of a DWARF location list expression is:
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	2564
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	2565	* If the current program location is not specified, then it is an evaluation
				2566	error.
				2567
				2568	.. note::
				2569
				2570	If the location list only has a single default entry, should that be
				2571	considered a match if there is no program location? If there are non-default
				2572	entries then it seems it has to be an evaluation error when there is no
				2573	program location as that indicates the location depends on the program
				2574	location which is not known.
				2575
				2576	* If there are no matching location list entries, then the result is a location
				2577	description that comprises one undefined location description.
				2578
				2579	* Otherwise, the operation expression E of each matching location list entry is
				2580	evaluated with the current context, except that the result kind is a location
				2581	description, the object is unspecified, and the initial stack is empty. The
				2582	location list entry result is the location description returned by the
				2583	evaluation of E.
				2584
				2585	The result is a location description that is comprised of the union of the
				2586	single location descriptions of the location description result of each
				2587	matching location list entry.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	2588
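The matching rules above can be summarized with a minimal Python sketch. It
assumes entries whose addresses are already resolved and whose operation
expression results are reduced to text labels. The ``Entry`` class and the
``evaluate_location_list`` function are hypothetical names.

.. code-block:: python

  from dataclasses import dataclass
  from typing import List, Optional


  @dataclass
  class Entry:
      """Hypothetical location list entry with a precomputed result label."""
      result: str
      start: Optional[int] = None  # None marks a default entry
      end: Optional[int] = None    # first address past the range


  def evaluate_location_list(entries: List[Entry], pc: Optional[int]) -> List[str]:
      """Return the union of the results of all matching entries."""
      if pc is None:
          raise ValueError("evaluation error: no current program location")
      bounded = [
          e for e in entries
          if e.start is not None and e.start <= pc < e.end
      ]
      if bounded:
          matches = bounded
      else:
          # A default entry matches only when no bounded entry does.
          matches = [e for e in entries if e.start is None]
      if not matches:
          return ["undefined"]  # one undefined location description
      return [e.result for e in matches]

Overlapping bounded entries both match, so the result lists more than one
location for a program location that falls inside the overlap.
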
A location list expression can only be used as the value of a debugging
information entry attribute that is encoded using class ``loclist`` or
``loclistsptr`` (see DWARF Version 5 section 7.5.5). The value of the attribute
provides an index into a separate object file section called ``.debug_loclists``
or ``.debug_loclists.dwo`` (for split DWARF object files) that contains the
location list entries.

The ``DW_OP_call*`` and ``DW_OP_implicit_pointer`` operations can be used to
specify a debugging information entry attribute that has a location list
expression. Several debugging information entry attributes allow DWARF
expressions that are evaluated with an initial stack that includes a location
description that may originate from the evaluation of a location list
expression.
				2602
*This location list representation, the* ``loclist`` *and* ``loclistsptr``
*class, and the related* ``DW_AT_loclists_base`` *attribute are new in DWARF
Version 5. Together they eliminate most or all of the code object relocations
previously needed for location list expressions.*
				2607
				2608	.. note::
				2609
				2610	The rest of this section is the same as DWARF Version 5 section 2.6.2.
				2611
				2612	.. _amdgpu-dwarf-segment_addresses:
				2613
				2614	Segmented Addresses
				2615	~~~~~~~~~~~~~~~~~~~
				2616
				2617	.. note::
				2618
				2619	This augments DWARF Version 5 section 2.12.
				2620
DWARF address classes are used for source languages that have the concept of
memory spaces. They are used in the ``DW_AT_address_class`` attribute for
pointer type, reference type, subprogram, and subprogram type debugging
information entries.

Each DWARF address class is conceptually a separate source language memory space
with its own lifetime and aliasing rules. DWARF address classes are used to
specify the source language memory spaces to which pointer type and reference
type values refer, and to specify the source language memory space in which
variables are allocated.
				2631
				2632	The set of currently defined source language DWARF address classes, together
				2633	with source language mappings, is given in
				2634	:ref:`amdgpu-dwarf-address-class-table`.
				2635
				2636	Vendor defined source language address classes may be defined using codes in the
				2637	range ``DW_ADDR_LLVM_lo_user`` to ``DW_ADDR_LLVM_hi_user``.
				2638
.. table:: Address class
   :name: amdgpu-dwarf-address-class-table

   ========================= ============ ========= ========= =========
   Address Class Name        Meaning      C/C++     OpenCL    CUDA/HIP
   ========================= ============ ========= ========= =========
   ``DW_ADDR_none``          generic      default   generic   default
   ``DW_ADDR_LLVM_global``   global                 global
   ``DW_ADDR_LLVM_constant`` constant               constant  constant
   ``DW_ADDR_LLVM_group``    thread-group           local     shared
   ``DW_ADDR_LLVM_private``  thread                 private
   ``DW_ADDR_LLVM_lo_user``
   ``DW_ADDR_LLVM_hi_user``
   ========================= ============ ========= ========= =========
				2653
				2654	DWARF address spaces correspond to target architecture specific linear
				2655	addressable memory areas. They are used in DWARF expression location
				2656	descriptions to describe in which target architecture specific memory area data
				2657	resides.
				2658
				2659	*Target architecture specific DWARF address spaces may correspond to hardware
				2660	supported facilities such as memory utilizing base address registers, scratchpad
				2661	memory, and memory with special interleaving. The size of addresses in these
				2662	address spaces may vary. Their access and allocation may be hardware managed
				2663	with each thread or group of threads having access to independent storage. For
				2664	these reasons they may have properties that do not allow them to be viewed as
				2665	part of the unified global virtual address space accessible by all threads.*
				2666
				2667	*It is target architecture specific whether multiple DWARF address spaces are
				2668	supported and how source language DWARF address classes map to target
				2669	architecture specific DWARF address spaces. A target architecture may map
				2670	multiple source language DWARF address classes to the same target architecture
				2671	specific DWARF address space. Optimization may determine that variable lifetime
				2672	and access pattern allows them to be allocated in faster scratchpad memory
				2673	represented by a different DWARF address space.*
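
The many-to-one mapping from source language address classes to target
architecture address spaces described above can be sketched as follows. This is
only an illustration: the address space labels and the particular mapping are
hypothetical and do not describe any real target.

```python
from enum import Enum


class AddressClass(Enum):
    """Source language address classes named in the address class table."""
    NONE = "DW_ADDR_none"
    GLOBAL = "DW_ADDR_LLVM_global"
    CONSTANT = "DW_ADDR_LLVM_constant"
    GROUP = "DW_ADDR_LLVM_group"
    PRIVATE = "DW_ADDR_LLVM_private"


# Hypothetical target address spaces. Only DW_ASPACE_none is named in the
# text; the other two labels are assumptions made up for this sketch.
ASPACE_NONE = "DW_ASPACE_none"
ASPACE_GROUP = "hypothetical_group_aspace"
ASPACE_PRIVATE = "hypothetical_private_aspace"

# An assumed many-to-one mapping: three address classes share one address
# space, while the group and private classes each get their own.
CLASS_TO_ASPACE = {
    AddressClass.NONE: ASPACE_NONE,
    AddressClass.GLOBAL: ASPACE_NONE,
    AddressClass.CONSTANT: ASPACE_NONE,
    AddressClass.GROUP: ASPACE_GROUP,
    AddressClass.PRIVATE: ASPACE_PRIVATE,
}


def address_space_for(address_class):
    """Return the target address space assumed for an address class."""
    return CLASS_TO_ASPACE[address_class]


def classes_sharing(aspace):
    """Return the sorted names of all address classes mapped to aspace."""
    return sorted(c.value for c, a in CLASS_TO_ASPACE.items() if a == aspace)
```

Because the mapping is not one-to-one, the address space alone cannot recover
the address class: in this sketch ``classes_sharing(ASPACE_NONE)`` returns three
class names.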
				2674
				2675	Although DWARF address space identifiers are target architecture specific,
				2676	``DW_ASPACE_none`` is a common address space supported by all target
				2677	architectures.
				2678
				2679	DWARF address space identifiers are used by:
				2680
YangZhihui	f2bb4b8	2020-09-11 17:51:36 +0200	[diff] [blame^]	2681	* The DWARF expression operations: ``DW_OP_LLVM_aspace_bregx``,
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	2682	``DW_OP_LLVM_form_aspace_address``, ``DW_OP_LLVM_aspace_implicit_pointer``,
				2683	and ``DW_OP_xderef*``.
				2684
				2685	* The CFI instructions: ``DW_CFA_def_aspace_cfa`` and
				2686	``DW_CFA_def_aspace_cfa_sf``.
				2687
				2688	.. note::
				2689
Tony	e24f5f3	2020-07-03 22:31:53 +0000	[diff] [blame]	2690	With the definition of DWARF address classes and DWARF address spaces in these
				2691	extensions, DWARF Version 5 table 2.7 needs to be updated. It seems it is an
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	2692	example of DWARF address spaces and not DWARF address classes.
				2693
				2694	.. note::
				2695
Tony	e24f5f3	2020-07-03 22:31:53 +0000	[diff] [blame]	2696	With the expanded support for DWARF address spaces in these extensions, it may
				2697	be worth examining if DWARF segments can be eliminated and DWARF address
				2698	spaces used instead.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	2699
				2700	That may involve extending DWARF address spaces to also be used to specify
				2701	code locations. In target architectures that use different memory areas for
				2702	code and data this would seem a natural use for DWARF address spaces. This
				2703	would allow DWARF expression location descriptions to be used to describe the
				2704	location of subprograms and entry points that are used in expressions
				2705	involving subprogram pointer type values.
				2706
				2707	Currently, DWARF expressions assume data and code reside in the same default
				2708	DWARF address space, and only the address ranges in DWARF location list
				2709	entries and in the ``.debug_aranges`` section for accelerated access for
				2710	addresses allow DWARF segments to be used to distinguish them.
				2711
				2712	.. note::
				2713
				2714	Currently, DWARF defines address class values as being target architecture
				2715	specific. It is unclear how language specific memory spaces are intended to be
				2716	represented in DWARF using these.
				2717
				2718	For example, OpenCL defines memory spaces (called address spaces in OpenCL)
				2719	for ``global``, ``local``, ``constant``, and ``private``. These are part of
				2720	the type system and are modifiers to pointer types. In addition, OpenCL
				2721	defines ``generic`` pointers that can reference either the ``global``,
				2722	``local``, or ``private`` memory spaces. To support the OpenCL language the
				2723	debugger would want to support casting pointers between the ``generic`` and
				2724	other memory spaces, querying what memory space a ``generic`` pointer value is
				2725	currently referencing, and possibly using pointer casting to form an address
				2726	for a specific memory space out of an integral value.
				2727
				2728	The method to use to dereference a pointer type or reference type value is
				2729	defined in DWARF expressions using ``DW_OP_xderef*`` which uses a target
				2730	architecture specific address space.
				2731
				2732	DWARF defines the ``DW_AT_address_class`` attribute on pointer type and
				2733	reference type debugger information entries. It specifies the method to use to
				2734	dereference them. Why is the value of this not the same as the address space
				2735	value used in ``DW_OP_xderef*``? In both cases it is target architecture
				2736	specific and the architecture presumably will use the same set of methods to
				2737	dereference pointers in both cases.
				2738
				2739	Since ``DW_AT_address_class`` uses a target architecture specific value, it
				2740	cannot in general capture the source language memory space type modifier
				2741	concept. On some architectures all source language memory space modifiers may
				2742	actually use the same method for dereferencing pointers.
				2743
				2744	One possibility is for DWARF to add a ``DW_TAG_LLVM_address_class_type``
				2745	debugger information entry type modifier that can be applied to a pointer type
				2746	and reference type. The ``DW_AT_address_class`` attribute could be re-defined
				2747	to not be target architecture specific and instead define generalized language
Tony	e24f5f3	2020-07-03 22:31:53 +0000	[diff] [blame]	2748	values (as presented above for DWARF address classes in the table
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	2749	:ref:`amdgpu-dwarf-address-class-table`) that will support OpenCL and other
				2750	languages using memory spaces. The ``DW_AT_address_class`` attribute could be
				2751	defined to not be applied to pointer types or reference types, but instead
				2752	only to the new ``DW_TAG_LLVM_address_class_type`` type modifier debugger
				2753	information entry.
				2754
				2755	If a pointer type or reference type is not modified by
				2756	``DW_TAG_LLVM_address_class_type`` or if ``DW_TAG_LLVM_address_class_type``
				2757	has no ``DW_AT_address_class`` attribute, then the pointer type or reference
				2758	type would be defined to use the ``DW_ADDR_none`` address class as currently.
				2759	Since modifiers can be chained, it would need to be defined if multiple
				2760	``DW_TAG_LLVM_address_class_type`` modifiers were legal, and if so if the
				2761	outermost one is the one that takes precedence.
				2762
				2763	A target architecture implementation that supports multiple address spaces
				2764	would need to map ``DW_ADDR_none`` appropriately to support CUDA-like
				2765	languages that have no address classes in the type system but do support
				2766	variable allocation in address classes. Such variable allocation would result
				2767	in the variable's location description needing an address space.
				2768
Tony	e24f5f3	2020-07-03 22:31:53 +0000	[diff] [blame]	2769	The approach presented in :ref:`amdgpu-dwarf-address-class-table` is to define
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	2770	the default ``DW_ADDR_none`` to be the generic address class and not the
				2771	global address class. This matches how Clang and LLVM have added support for
				2772	CUDA-like languages on top of existing C++ language support. This allows all
				2773	addresses to be generic by default which matches CUDA-like languages.
				2774
				2775	An alternative approach is to define ``DW_ADDR_none`` as being the global
				2776	address class and then change ``DW_ADDR_LLVM_global`` to
				2777	``DW_ADDR_LLVM_generic``. This would match the reality that languages that do
				2778	not support multiple memory spaces only have one default global memory space.
				2779	Generally, in these languages if they expose that the target architecture
				2780	supports multiple address spaces, the default one is still the global memory
				2781	space. Then a language that does support multiple memory spaces has to
				2782	explicitly indicate which pointers have the added ability to reference more
				2783	than the global memory space. However, compilers generating DWARF for
				2784	CUDA-like languages would then have to define every CUDA-like language pointer
				2785	type or reference type using ``DW_TAG_LLVM_address_class_type`` with a
				2786	``DW_AT_address_class`` attribute of ``DW_ADDR_LLVM_generic`` to match the
				2787	language semantics.
				2788
				2789	A new ``DW_AT_LLVM_address_space`` attribute could be defined that can be
				2790	applied to pointer type, reference type, subprogram, and subprogram type to
				2791	describe how objects having the given type are dereferenced or called (the
				2792	role that ``DW_AT_address_class`` currently provides). The values of
				2793	``DW_AT_LLVM_address_space`` would be target architecture specific and the same as
				2794	used in ``DW_OP_xderef*``.
				2795
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	2796	.. note::
				2797
				2798	Some additional changes will be made to support languages such as OpenCL/SYCL
				2799	that allow address class pointer casting and queries.
				2800
				2801	This requires the compiler to provide the mapping from address space to
				2802	address class which may be runtime and not target architecture dependent. Some
				2803	implementations may have a one-to-one mapping from source language address
				2804	class to target architecture address space, and some may have a many-to-one
				2805	mapping which requires knowledge of the address class when determining if
				2806	pointer address class casts are allowed.
				2807
				2808	The changes will likely add an attribute that has an expression provided by
				2809	the compiler to map from address class to address space. The
				2810	``DW_OP_implicit_pointer`` and ``DW_OP_LLVM_aspace_implicit_pointer``
				2811	operations may be changed as the current IPV definition may not provide enough
				2812	information when used to cast between address classes. Other attributes and
				2813	operations may be needed. The legal casts between address classes may need to
				2814	be defined on a per language address class basis.
				2815
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	2816	.. _amdgpu-dwarf-debugging-information-entry-attributes:
				2817
				2818	Debugging Information Entry Attributes
				2819	~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
				2820
				2821	.. note::
				2822
				2823	This section provides changes to existing debugger information entry
Tony	e24f5f3	2020-07-03 22:31:53 +0000	[diff] [blame]	2824	attributes and defines attributes added by these extensions. These would be
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	2825	incorporated into the appropriate DWARF Version 5 chapter 2 sections.
				2826
				2827	1. ``DW_AT_location``
				2828
				2829	Any debugging information entry describing a data object (which includes
				2830	variables and parameters) or common blocks may have a ``DW_AT_location``
				2831	attribute, whose value is a DWARF expression E.
				2832
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	2833	The result of the attribute is obtained by evaluating E with a context that
				2834	has a result kind of a location description, an unspecified object, the
				2835	compilation unit that contains E, an empty initial stack, and other context
				2836	elements corresponding to the source language thread of execution upon which
				2837	the user is focused, if any. The result of the evaluation is the location
				2838	description of the base of the data object.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	2839
				2840	See :ref:`amdgpu-dwarf-control-flow-operations` for special evaluation rules
				2841	used by the ``DW_OP_call*`` operations.
				2842
				2843	.. note::
				2844
				2845	Delete the description of how the ``DW_OP_call*`` operations evaluate a
				2846	``DW_AT_location`` attribute as that is now described in the operations.
				2847
				2848	.. note::
				2849
				2850	See the discussion about the ``DW_AT_location`` attribute in the
				2851	``DW_OP_call*`` operation. Having each attribute only have a single
				2852	purpose and single execution semantics seems desirable. It makes it easier
				2853	for the consumer that no longer has to track the context. It makes it
				2854	easier for the producer as it can rely on a single semantics for each
				2855	attribute.
				2856
				2857	For that reason, limiting the ``DW_AT_location`` attribute to only
				2858	supporting evaluating the location description of an object, and using a
				2859	different attribute and encoding class for the evaluation of DWARF
				2860	expression procedures on the same operation expression stack seems
				2861	desirable.
				2862
				2863	2. ``DW_AT_const_value``
				2864
				2865	.. note::
				2866
				2867	Could deprecate using the ``DW_AT_const_value`` attribute for
				2868	``DW_TAG_variable`` or ``DW_TAG_formal_parameter`` debugger information
				2869	entries that have been optimized to a constant. Instead,
				2870	``DW_AT_location`` could be used with a DWARF expression that produces an
				2871	implicit location description now that any location description can be
				2872	used within a DWARF expression. This allows the ``DW_OP_call*`` operations
				2873	to be used to push the location description of any variable regardless of
				2874	how it is optimized.
				2875
				2876	3. ``DW_AT_frame_base``
				2877
				2878	A ``DW_TAG_subprogram`` or ``DW_TAG_entry_point`` debugger information entry
				2879	may have a ``DW_AT_frame_base`` attribute, whose value is a DWARF expression
				2880	E.
				2881
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	2882	The result of the attribute is obtained by evaluating E with a context that
				2883	has a result kind of a location description, an unspecified object, the
				2884	compilation unit that contains E, an empty initial stack, and other context
				2885	elements corresponding to the source language thread of execution upon which
				2886	the user is focused, if any.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	2887
				2888	The DWARF is ill-formed if E contains a ``DW_OP_fbreg`` operation, or the
				2889	resulting location description L is not comprised of one single location
				2890	description SL.
				2891
				2892	If SL is a register location description for register R, then L is replaced
				2893	with the result of evaluating a ``DW_OP_bregx R, 0`` operation. This
				2894	computes the frame base memory location description in the target
				2895	architecture default address space.
				2896
				2897	*This allows the more compact* ``DW_OP_reg*`` *to be used instead of*
				2898	``DW_OP_breg* 0``\ *.*
				2899
				2900	.. note::
				2901
				2902	This rule could be removed and require the producer to create the required
				2903	location description directly using ``DW_OP_call_frame_cfa``,
				2904	``DW_OP_breg*``, or ``DW_OP_LLVM_aspace_bregx``. This would also then
				2905	allow a target to implement the call frames within a large register.
				2906
				2907	Otherwise, the DWARF is ill-formed if SL is not a memory location
				2908	description in any of the target architecture specific address spaces.
				2909
				2910	The resulting L is the frame base for the subprogram or entry point.
				2911
				2912	Typically, E will use the ``DW_OP_call_frame_cfa`` *operation or be a
				2913	stack pointer register plus or minus some offset.*
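
The ``DW_AT_frame_base`` rules above can be sketched as a small consumer-side
check. This is a simplified model, not an implementation from these
extensions: the ``RegisterLocation`` and ``MemoryLocation`` classes, the
``DEFAULT_ASPACE`` label, and the ``read_register`` callback are all
hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegisterLocation:
    """Hypothetical model of a register location description."""
    register: int


@dataclass(frozen=True)
class MemoryLocation:
    """Hypothetical model of a memory location description."""
    address_space: str
    address: int


# Placeholder label for the target architecture default address space.
DEFAULT_ASPACE = "default"


def frame_base(single_locations, read_register):
    """Apply the frame base rules to the result L of evaluating E.

    single_locations is the list of single location descriptions in L, and
    read_register maps a register number to its current contents.
    """
    if len(single_locations) != 1:
        raise ValueError("ill-formed: L must have exactly one single location")
    sl = single_locations[0]
    if isinstance(sl, RegisterLocation):
        # Models DW_OP_bregx R, 0: the register contents plus an offset of 0
        # form an address in the default address space.
        return MemoryLocation(DEFAULT_ASPACE, read_register(sl.register) + 0)
    if not isinstance(sl, MemoryLocation):
        raise ValueError("ill-formed: SL must be a memory location description")
    return sl
```

For example, with register 7 holding ``0x1000``,
``frame_base([RegisterLocation(7)], {7: 0x1000}.__getitem__)`` yields a memory
location at address ``0x1000`` in the default address space.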
				2914
				2915	4. ``DW_AT_data_member_location``
				2916
				2917	For a ``DW_AT_data_member_location`` attribute there are two cases:
				2918
				2919	1. If the attribute is an integer constant B, it provides the offset in
				2920	bytes from the beginning of the containing entity.
				2921
				2922	The result of the attribute is obtained by evaluating a
				2923	``DW_OP_LLVM_offset B`` operation with an initial stack comprising the
				2924	location description of the beginning of the containing entity. The
				2925	result of the evaluation is the location description of the base of the
				2926	member entry.
				2927
				2928	*If the beginning of the containing entity is not byte aligned, then the
				2929	beginning of the member entry has the same bit displacement within a
				2930	byte.*
				2931
				2932	2. Otherwise, the attribute must be a DWARF expression E which is evaluated
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	2933	with a context that has a result kind of a location description, an
				2934	unspecified object, the compilation unit that contains E, an initial
				2935	stack comprising the location description of the beginning of the
				2936	containing entity, and other context elements corresponding to the
				2937	source language thread of execution upon which the user is focused, if
				2938	any. The result of the evaluation is the location description of the
				2939	base of the member entry.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	2940
				2941	.. note::
				2942
				2943	The beginning of the containing entity can now be any location
				2944	description, including those with more than one single location
				2945	description, and those with single location descriptions that are of any
				2946	kind and have any bit offset.
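
The integer constant case of ``DW_AT_data_member_location`` can be illustrated
with a minimal sketch. The ``Location`` class is a hypothetical simplification
that tracks only a storage label and a bit offset, and ``offset_bytes`` merely
models the effect of ``DW_OP_LLVM_offset B`` under that simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    """Hypothetical single location description: storage plus bit offset."""
    storage: str
    bit_offset: int


def offset_bytes(base, byte_offset):
    """Model DW_OP_LLVM_offset B: advance the location by B bytes."""
    return Location(base.storage, base.bit_offset + 8 * byte_offset)


def member_location(containing_entity, byte_offset):
    """Location of a member whose DW_AT_data_member_location is a constant."""
    return offset_bytes(containing_entity, byte_offset)
```

Because the offset is a whole number of bytes, a containing entity that starts
3 bits into a byte produces a member that also starts 3 bits into a byte, which
is the bit displacement property stated above.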
				2947
				2948	5. ``DW_AT_use_location``
				2949
				2950	The ``DW_TAG_ptr_to_member_type`` debugging information entry has a
				2951	``DW_AT_use_location`` attribute whose value is a DWARF expression E. It is
				2952	used to compute the location description of the member of the class to which
				2953	the pointer to member entry points.
				2954
				2955	*The method used to find the location description of a given member of a
				2956	class, structure, or union is common to any instance of that class,
				2957	structure, or union and to any instance of the pointer to member type. The
				2958	method is thus associated with the pointer to member type, rather than with
				2959	each object that has a pointer to member type.*
				2960
				2961	The ``DW_AT_use_location`` DWARF expression is used in conjunction with the
				2962	location description for a particular object of the given pointer to member
				2963	type and for a particular structure or class instance.
				2964
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	2965	The result of the attribute is obtained by evaluating E with a context that
				2966	has a result kind of a location description, an unspecified object, the
				2967	compilation unit that contains E, an initial stack comprising two entries,
				2968	and other context elements corresponding to the source language thread of
				2969	execution upon which the user is focused, if any. The first stack entry is
				2970	the value of the pointer to member object itself. The second stack entry is
				2971	the location description of the base of the entire class, structure, or
				2972	union instance containing the member whose location is being calculated. The
				2973	result of the evaluation is the location description of the member of the
				2974	class to which the pointer to member entry points.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	2975
				2976	6. ``DW_AT_data_location``
				2977
				2978	The ``DW_AT_data_location`` attribute may be used with any type that
				2979	provides one or more levels of hidden indirection and/or run-time parameters
				2980	in its representation. Its value is a DWARF operation expression E which
				2981	computes the location description of the data for an object. When this
				2982	attribute is omitted, the location description of the data is the same as
				2983	the location description of the object.
				2984
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	2985	The result of the attribute is obtained by evaluating E with a context that
				2986	has a result kind of a location description, an object that is the location
				2987	description of the data descriptor, the compilation unit that contains E, an
				2988	empty initial stack, and other context elements corresponding to the source
				2989	language thread of execution upon which the user is focused, if any. The
				2990	result of the evaluation is the location description of the base of the
				2991	member entry.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	2992
				2993	*E will typically involve an operation expression that begins with a*
				2994	``DW_OP_push_object_address`` *operation which loads the location
				2995	description of the object which can then serve as a description in
				2996	subsequent calculation.*
				2997
				2998	.. note::
				2999
				3000	Since ``DW_AT_data_member_location``, ``DW_AT_use_location``, and
				3001	``DW_AT_vtable_elem_location`` allow both operation expressions and
				3002	location list expressions, why does ``DW_AT_data_location`` not allow
				3003	both? In all cases they apply to data objects, so it is less likely that
				3004	optimization would cause different operation expressions for different
				3005	program location ranges. But if both are supported for some attributes,
				3006	they should be supported for all.
				3007
				3008	It seems odd this attribute is not the same as
				3009	``DW_AT_data_member_location`` in having an initial stack with the
				3010	location description of the object since the expression needs it.
				3011
				3012	7. ``DW_AT_vtable_elem_location``
				3013
				3014	An entry for a virtual function also has a ``DW_AT_vtable_elem_location``
				3015	attribute whose value is a DWARF expression E.
				3016
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	3017	The result of the attribute is obtained by evaluating E with a context that
				3018	has a result kind of a location description, an unspecified object, the
				3019	compilation unit that contains E, an initial stack comprising the location
				3020	description of the object of the enclosing type, and other context elements
				3021	corresponding to the source language thread of execution upon which the user
				3022	is focused, if any. The result of the evaluation is the location description
				3023	of the slot for the function within the virtual function table for the
				3024	enclosing class.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	3025
				3026	8. ``DW_AT_static_link``
				3027
				3028	If a ``DW_TAG_subprogram`` or ``DW_TAG_entry_point`` debugger information
				3029	entry is lexically nested, it may have a ``DW_AT_static_link`` attribute,
				3030	whose value is a DWARF expression E.
				3031
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	3032	The result of the attribute is obtained by evaluating E with a context that
				3033	has a result kind of a location description, an unspecified object, the
				3034	compilation unit that contains E, an empty initial stack, and other context
				3035	elements corresponding to the source language thread of execution upon which
				3036	the user is focused, if any. The result of the evaluation is the location
				3037	description L of the canonical frame address (see
				3038	:ref:`amdgpu-dwarf-call-frame-information`) of the relevant call frame of
				3039	the subprogram instance that immediately lexically encloses the current call
				3040	frame's subprogram or entry point.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	3041
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	3042	The DWARF is ill-formed if L is not comprised of one memory location
				3043	description for one of the target architecture specific address spaces.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	3044
				3045	9. ``DW_AT_return_addr``
				3046
				3047	A ``DW_TAG_subprogram``, ``DW_TAG_inlined_subroutine``, or
				3048	``DW_TAG_entry_point`` debugger information entry may have a
				3049	``DW_AT_return_addr`` attribute, whose value is a DWARF expression E.
				3050
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	3051	The result of the attribute is obtained by evaluating E with a context that
				3052	has a result kind of a location description, an unspecified object, the
				3053	compilation unit that contains E, an empty initial stack, and other context
				3054	elements corresponding to the source language thread of execution upon which
				3055	the user is focused, if any. The result of the evaluation is the location
				3056	description L of the place where the return address for the current call
				3057	frame's subprogram or entry point is stored.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	3058
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	3059	The DWARF is ill-formed if L is not comprised of one memory location
				3060	description for one of the target architecture specific address spaces.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	3061
				3062	.. note::
				3063
				3064	It is unclear why ``DW_TAG_inlined_subroutine`` has a
				3065	``DW_AT_return_addr`` attribute but not a ``DW_AT_frame_base`` or
				3066	``DW_AT_static_link`` attribute. Seems it would either have all of them or
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	3067	none. Since inlined subprograms do not have a call frame it seems they
				3068	would have none of these attributes.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	3069
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	3070	10. ``DW_AT_call_value``, ``DW_AT_call_data_location``, and
				3071	``DW_AT_call_data_value``
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	3072
				3073	A ``DW_TAG_call_site_parameter`` debugger information entry may have a
				3074	``DW_AT_call_value`` attribute, whose value is a DWARF operation expression
				3075	E\ :sub:`1`\ .
				3076
				3077	The result of the ``DW_AT_call_value`` attribute is obtained by evaluating
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	3078	E\ :sub:`1` with a context that has a result kind of a value, an unspecified
				3079	object, the compilation unit that contains E\ :sub:`1`\ , an empty initial stack, and
				3080	other context elements corresponding to the source language thread of
				3081	execution upon which the user is focused, if any. The resulting value V\
				3082	:sub:`1` is the value of the parameter at the time of the call made by the
				3083	call site.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	3084
				3085	For parameters passed by reference, where the code passes a pointer to a
				3086	location which contains the parameter, or for reference type parameters, the
				3087	``DW_TAG_call_site_parameter`` debugger information entry may also have a
				3088	``DW_AT_call_data_location`` attribute whose value is a DWARF operation
				3089	expression E\ :sub:`2`\ , and a ``DW_AT_call_data_value`` attribute whose
				3090	value is a DWARF operation expression E\ :sub:`3`\ .
				3091
				3092	The value of the ``DW_AT_call_data_location`` attribute is obtained by
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	3093	evaluating E\ :sub:`2` with a context that has a result kind of a location
				3094	description, an unspecified object, the compilation unit that contains E\ :sub:`2`\ , an
				3095	empty initial stack, and other context elements corresponding to the source
				3096	language thread of execution upon which the user is focused, if any. The
				3097	resulting location description L\ :sub:`2` is the location where the
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	3098	referenced parameter lives during the call made by the call site. If E\
				3099	:sub:`2` would just be a ``DW_OP_push_object_address``, then the
				3100	``DW_AT_call_data_location`` attribute may be omitted.
				3101
				3102	The value of the ``DW_AT_call_data_value`` attribute is obtained by
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	3103	evaluating E\ :sub:`3` with a context that has a result kind of a value, an
				3104	unspecified object, the compilation unit that contains E\ :sub:`3`\ , an empty initial
				3105	stack, and other context elements corresponding to the source language
				3106	thread of execution upon which the user is focused, if any. The resulting
				3107	value V\ :sub:`3` is the value in L\ :sub:`2` at the time of the call made
				3108	by the call site.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	3109
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	3110	The result of these attributes is undefined if the current call frame is
				3111	not for the subprogram containing the ``DW_TAG_call_site_parameter``
				3112	debugger information entry or the current program location is not for the
				3113	call site containing the ``DW_TAG_call_site_parameter`` debugger information
				3114	entry in the current call frame.
				3115
				3116	*The consumer may have to virtually unwind to the call site (see*
				3117	:ref:`amdgpu-dwarf-call-frame-information`\ *) in order to evaluate these
				3118	attributes. This will ensure the source language thread of execution upon
				3119	which the user is focused corresponds to the call site needed to evaluate
				3120	the expression.*
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	3121
				3122	If the expressions of these attributes cannot avoid accessing registers or
				3123	memory locations that might be clobbered by the subprogram being called by
				3124	the call site, then the associated attribute should not be provided.
				3126
				3127	*The reason for the restriction is that the parameter may need to be
				3128	accessed during the execution of the callee. The consumer may virtually
				3129	unwind from the called subprogram back to the caller and then evaluate the
				3130	attribute expressions. The call frame information (see*
				3131	:ref:`amdgpu-dwarf-call-frame-information`\ *) will not be able to restore
				3132	registers that have been clobbered, and clobbered memory will no longer have
				3133	the value at the time of the call.*
				3134
				3135	11. ``DW_AT_LLVM_lanes`` *New*
				3136
				3137	For languages that are implemented using a SIMD or SIMT execution model, a
				3138	``DW_TAG_subprogram``, ``DW_TAG_inlined_subroutine``, or
				3139	``DW_TAG_entry_point`` debugger information entry may have a
				3140	``DW_AT_LLVM_lanes`` attribute whose value is an integer constant that is
				3141	the number of lanes per thread. This is the static number of lanes per
				3142	thread. It is not the dynamic number of lanes with which the thread was
				3143	initiated, for example, due to smaller or partial work-groups.
				3144
				3145	If not present, the default value of 1 is used.
				3146
				3147	The DWARF is ill-formed if the value is 0.
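
The defaulting and validity rules for ``DW_AT_LLVM_lanes`` can be sketched as
follows. The representation of a debugger information entry as a plain
dictionary keyed by attribute name is an assumption made only for this
illustration.

```python
def lane_count(attributes):
    """Return the static number of lanes per thread for an entry.

    attributes is a hypothetical dict mapping attribute names to values.
    """
    # If the attribute is not present, the default value of 1 is used.
    value = attributes.get("DW_AT_LLVM_lanes", 1)
    if value == 0:
        raise ValueError("ill-formed DWARF: DW_AT_LLVM_lanes is 0")
    return value
```

An entry without the attribute is treated as having one lane, while an
explicit value of 0 is rejected rather than silently replaced by the default.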
				3148
				3149	12. ``DW_AT_LLVM_lane_pc`` *New*
				3150
				3151	For languages that are implemented using a SIMD or SIMT execution model, a
				3152	``DW_TAG_subprogram``, ``DW_TAG_inlined_subroutine``, or
				3153	``DW_TAG_entry_point`` debugging information entry may have a
				3154	``DW_AT_LLVM_lane_pc`` attribute whose value is a DWARF expression E.
				3155
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	3156	The result of the attribute is obtained by evaluating E with a context that
				3157	has a result kind of a location description, an unspecified object, the
				3158	compilation unit that contains E, an empty initial stack, and other context
				3159	elements corresponding to the source language thread of execution upon which
				3160	the user is focused, if any.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	3161
				3162	The resulting location description L is for a thread lane count sized vector
				3163	of generic type elements. The thread lane count is the value of the
				3164	``DW_AT_LLVM_lanes`` attribute. Each element holds the conceptual program
				3165	location of the corresponding lane, where the least significant element
				3166	corresponds to the first target architecture specific lane identifier and so
				3167	forth. If the lane was not active when the current subprogram was called,
				3168	its element is an undefined location description.
				3169
				3170	``DW_AT_LLVM_lane_pc`` *allows the compiler to indicate conceptually where
				3171	each lane of a SIMT thread is positioned even when it is in divergent
				3172	control flow that is not active.*
				3173
				3174	*Typically, the result is a location description with one composite location
				3175	description with each part being a location description with either one
				3176	undefined location description or one memory location description.*
				3177
				3178	If not present, the thread is not being used in a SIMT manner, and the
				3179	thread's current program location is used.
				3180
				3181	13. ``DW_AT_LLVM_active_lane`` New
				3182
				3183	For languages that are implemented using a SIMD or SIMT execution model, a
				3184	``DW_TAG_subprogram``, ``DW_TAG_inlined_subroutine``, or
				3185	``DW_TAG_entry_point`` debugging information entry may have a
				3186	``DW_AT_LLVM_active_lane`` attribute whose value is a DWARF expression E.
				3187
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	3188	The result of the attribute is obtained by evaluating E with a context that
				3189	has a result kind of a value, an unspecified object, the compilation unit
				3190	that contains E, an empty initial stack, and other context elements
				3191	corresponding to the source language thread of execution upon which the user
				3192	is focused, if any.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	3193
				3194	The DWARF is ill-formed if the resulting value V is not an integral value.
				3195
				3196	The resulting V is a bit mask of active lanes for the current program
				3197	location. The N\ :sup:`th` least significant bit of the mask corresponds to
				3198	the N\ :sup:`th` lane. If the bit is 1 the lane is active, otherwise it is
				3199	inactive.
				3200
				3201	*Some targets may update the target architecture execution mask for regions
				3202	of code that must execute with different sets of lanes than the current
				3203	active lanes. For example, some code must execute with all lanes made
				3204	temporarily active.* ``DW_AT_LLVM_active_lane`` *allows the compiler to
				3205	provide the means to determine the source language active lanes.*
				3206
				3207	If not present and ``DW_AT_LLVM_lanes`` is greater than 1, then the target
				3208	architecture execution mask is used.
				3209
				3210	14. ``DW_AT_LLVM_vector_size`` New
				3211
				3212	A ``DW_TAG_base_type`` debugging information entry for a base type T may have
				3213	a ``DW_AT_LLVM_vector_size`` attribute whose value is an integer constant
				3214	that is the vector type size N.
				3215
				3216	The representation of a vector base type is as N contiguous elements, each
				3217	one having the representation of a base type T' that is the same as T
				3218	without the ``DW_AT_LLVM_vector_size`` attribute.
				3219
				3220	If a ``DW_TAG_base_type`` debugging information entry does not have a
				3221	``DW_AT_LLVM_vector_size`` attribute, then the base type is not a vector
				3222	type.
				3223
				3224	The DWARF is ill-formed if N is not greater than 0.
				3225
				3226	.. note::
				3227
				3228	LLVM mentions a non-upstreamed debugging information entry that is
				3229	intended to support vector types. However, that was not for a base type so
				3230	would not be suitable as the type of a stack value entry. But perhaps that
				3231	could be replaced by using this attribute.
				3232
				3233	15. ``DW_AT_LLVM_augmentation`` New
				3234
				3235	A ``DW_TAG_compile_unit`` debugging information entry for a compilation unit
				3236	may have a ``DW_AT_LLVM_augmentation`` attribute, whose value is an
				3237	augmentation string.
				3238
				3239	*The augmentation string allows producers to indicate that there is
				3240	additional vendor or target specific information in the debugging
				3241	information entries. For example, this might be information about the
				3242	version of vendor specific extensions that are being used.*
				3243
				3244	If not present, or if the string is empty, then the compilation unit has no
				3245	augmentation string.
				3246
				3247	The format for the augmentation string is:
				3248
Tony	756ba35	2020-04-20 16:55:34 -0400	[diff] [blame]	3249	\| ``[``\ vendor\ ``:v``\ X\ ``.``\ Y\ [\ ``:``\ options\ ]\ ``]``\ *
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	3250
				3251	Where vendor is the producer, ``vX.Y`` specifies the major X and minor Y
				3252	version number of the extensions used, and options is an optional string
				3253	providing additional information about the extensions. The version number
				3254	must conform to semantic versioning [:ref:`SEMVER <amdgpu-dwarf-SEMVER>`].
				3255	The options string must not contain the "\ ``]``\ " character.
				3256
				3257	For example:
				3258
				3259	::
				3260
				3261	[abc:v0.0][def:v1.2:feature-a=on,feature-b=3]
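
The augmentation string format above can be checked mechanically. The following is a minimal Python sketch, not part of the extensions themselves. The function name ``parse_augmentation`` is hypothetical, and the sketch assumes that the vendor name contains neither ":" nor "]", which the text does not state explicitly.

```python
import re

# One "[vendor:vX.Y[:options]]" group. Assumption: the vendor name contains
# neither ":" nor "]"; the text only forbids "]" in the options string.
_ENTRY = re.compile(r"\[([^:\]]+):v(\d+)\.(\d+)(?::([^\]]*))?\]")


def parse_augmentation(text):
    """Return a list of (vendor, major, minor, options) tuples.

    options is None when the group has no options part. Raises ValueError
    if the string is not a sequence of well-formed groups.
    """
    entries = []
    pos = 0
    while pos < len(text):
        match = _ENTRY.match(text, pos)
        if match is None:
            raise ValueError(f"malformed augmentation string at offset {pos}")
        vendor, major, minor, options = match.groups()
        entries.append((vendor, int(major), int(minor), options))
        pos = match.end()
    return entries
```

An empty string yields an empty list, which matches the rule that an empty string means there is no augmentation string.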
				3262
				3263	Program Scope Entities
				3264	----------------------
				3265
				3266	.. _amdgpu-dwarf-language-names:
				3267
				3268	Unit Entities
				3269	~~~~~~~~~~~~~
				3270
				3271	.. note::
				3272
				3273	This augments DWARF Version 5 section 3.1.1 and Table 3.1.
				3274
				3275	Additional language codes for use with the ``DW_AT_language`` attribute
				3276	are defined in :ref:`amdgpu-dwarf-language-names-table`.
				3277
				3278	.. table:: Language Names
				3279	:name: amdgpu-dwarf-language-names-table
				3280
				3281	==================== =============================
				3282	Language Name        Meaning
				3283	==================== =============================
				3284	``DW_LANG_LLVM_HIP`` HIP Language.
				3285	==================== =============================
				3286
				3287	The HIP language [:ref:`HIP <amdgpu-dwarf-HIP>`] can be supported by extending
				3288	the C++ language.
				3289
				3290	Other Debugging Information
				3291	---------------------------
				3292
				3293	Accelerated Access
				3294	~~~~~~~~~~~~~~~~~~
				3295
				3296	.. _amdgpu-dwarf-lookup-by-name:
				3297
				3298	Lookup By Name
				3299	++++++++++++++
				3300
				3301	Contents of the Name Index
				3302	##########################
				3303
				3304	.. note::
				3305
				3306	The following provides changes to DWARF Version 5 section 6.1.1.1.
				3307
				3308	The rule for debugging information entries included in the name index in the
				3309	optional ``.debug_names`` section is extended to also include named
				3310	``DW_TAG_variable`` debugging information entries with a ``DW_AT_location``
				3311	attribute that includes a ``DW_OP_LLVM_form_aspace_address`` operation.
				3312
				3313	The name index must contain an entry for each debugging information entry that
				3314	defines a named subprogram, label, variable, type, or namespace, subject to the
				3315	following rules:
				3316
				3317	* ``DW_TAG_variable`` debugging information entries with a ``DW_AT_location``
				3318	attribute that includes a ``DW_OP_addr``, ``DW_OP_LLVM_form_aspace_address``,
				3319	or ``DW_OP_form_tls_address`` operation are included; otherwise, they are
				3320	excluded.
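
The variable rule above can be expressed as a small predicate. This is an illustrative Python sketch only. It assumes a hypothetical representation in which a ``DW_AT_location`` expression is given as a list of operation mnemonics, and the function name is not taken from any real tool.

```python
# Operations whose presence in DW_AT_location qualifies a named variable.
ADDRESS_FORMING_OPS = frozenset({
    "DW_OP_addr",
    "DW_OP_LLVM_form_aspace_address",
    "DW_OP_form_tls_address",
})


def variable_in_name_index(name, location_ops):
    """Decide whether a DW_TAG_variable entry belongs in the name index.

    name is the entry's name, or None if it is unnamed. location_ops is the
    list of operation mnemonics in its DW_AT_location expression, or None
    if the entry has no DW_AT_location attribute.
    """
    if not name or location_ops is None:
        return False
    return any(op in ADDRESS_FORMING_OPS for op in location_ops)
```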
				3321
				3322	Data Representation of the Name Index
				3323	#####################################
				3324
				3325	Section Header
				3326	^^^^^^^^^^^^^^
				3327
				3328	.. note::
				3329
				3330	The following provides an addition to DWARF Version 5 section 6.1.1.4.1 item
				3331	14 ``augmentation_string``.
				3332
				3333	A null-terminated UTF-8 vendor specific augmentation string, which provides
				3334	additional information about the contents of this index. If provided, the
				3335	recommended format for augmentation string is:
				3336
Tony	756ba35	2020-04-20 16:55:34 -0400	[diff] [blame]	3337	\| ``[``\ vendor\ ``:v``\ X\ ``.``\ Y\ [\ ``:``\ options\ ]\ ``]``\ *
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	3338
				3339	Where vendor is the producer, ``vX.Y`` specifies the major X and minor Y
				3340	version number of the extensions used in the DWARF of the compilation unit, and
				3341	options is an optional string providing additional information about the
				3342	extensions. The version number must conform to semantic versioning [:ref:`SEMVER
				3343	<amdgpu-dwarf-SEMVER>`]. The options string must not contain the "\ ``]``\ "
				3344	character.
				3345
				3346	For example:
				3347
				3348	::
				3349
				3350	[abc:v0.0][def:v1.2:feature-a=on,feature-b=3]
				3351
				3352	.. note::
				3353
				3354	This is different to the definition in DWARF Version 5 but is consistent with
				3355	the other augmentation strings and allows multiple vendor extensions to be
				3356	supported.
				3357
				3358	.. _amdgpu-dwarf-line-number-information:
				3359
				3360	Line Number Information
				3361	~~~~~~~~~~~~~~~~~~~~~~~
				3362
				3363	The Line Number Program Header
				3364	++++++++++++++++++++++++++++++
				3365
				3366	Standard Content Descriptions
				3367	#############################
				3368
				3369	.. note::
				3370
				3371	This augments DWARF Version 5 section 6.2.4.1.
				3372
				3373	.. _amdgpu-dwarf-line-number-information-dw-lnct-llvm-source:
				3374
				3375	1. ``DW_LNCT_LLVM_source``
				3376
				3377	The component is a null-terminated UTF-8 source text string with "\ ``\n``\
				3378	" line endings. This content code is paired with the same forms as
				3379	``DW_LNCT_path``. It can be used for file name entries.
				3380
				3381	The value is an empty null-terminated string if no source is available. If
				3382	the source is available but is an empty file then the value is a
				3383	null-terminated single "\ ``\n``\ ".
				3384
				3385	*When the source field is present, consumers can use the embedded source
				3386	instead of attempting to discover the source on disk using the file path
				3387	provided by the* ``DW_LNCT_path`` *field. When the source field is absent,
				3388	consumers can access the file to get the source text.*
				3389
YangZhihui	f2bb4b8	2020-09-11 17:51:36 +0200	[diff] [blame^]	3390	*This is particularly useful for programming languages that support runtime
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	3391	compilation and runtime generation of source text. In these cases, the
				3392	source text does not reside in any permanent file. For example, the OpenCL
				3393	language [:ref:`OpenCL <amdgpu-dwarf-OpenCL>`] supports online compilation.*
				3394
				3395	2. ``DW_LNCT_LLVM_is_MD5``
				3396
				3397	``DW_LNCT_LLVM_is_MD5`` indicates if the ``DW_LNCT_MD5`` content kind, if
				3398	present, is valid: when 0 it is not valid and when 1 it is valid. If
				3399	``DW_LNCT_LLVM_is_MD5`` content kind is not present, and ``DW_LNCT_MD5``
				3400	content kind is present, then the MD5 checksum is valid.
				3401
				3402	``DW_LNCT_LLVM_is_MD5`` is always paired with the ``DW_FORM_udata`` form.
				3403
				3404	*This allows a compilation unit to have a mixture of files with and without
				3405	MD5 checksums. This can happen when multiple relocatable files are linked
				3406	together.*
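
The interaction between ``DW_LNCT_MD5`` and ``DW_LNCT_LLVM_is_MD5`` can be summarized in a few lines. The sketch below is a hypothetical Python helper, not an existing API. Rejecting values other than 0 and 1 is an assumption, since the text only defines those two values.

```python
def md5_is_valid(md5, is_md5=None):
    """Report whether a file entry carries a usable MD5 checksum.

    md5 is the DW_LNCT_MD5 content, or None if that content kind is absent.
    is_md5 is the DW_LNCT_LLVM_is_MD5 value, or None if it is absent.
    """
    if md5 is None:
        # No checksum at all, so there is nothing to validate.
        return False
    if is_md5 is None:
        # DW_LNCT_MD5 present without DW_LNCT_LLVM_is_MD5: checksum is valid.
        return True
    if is_md5 not in (0, 1):
        # Assumption: any other value is treated as an error.
        raise ValueError("DW_LNCT_LLVM_is_MD5 must be 0 or 1")
    return is_md5 == 1
```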
				3407
				3408	.. _amdgpu-dwarf-call-frame-information:
				3409
				3410	Call Frame Information
				3411	~~~~~~~~~~~~~~~~~~~~~~
				3412
				3413	.. note::
				3414
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	3415	This section provides changes to existing call frame information and defines
Tony	e24f5f3	2020-07-03 22:31:53 +0000	[diff] [blame]	3416	instructions added by these extensions. Additional support is added for
				3417	address spaces. Register unwind DWARF expressions are generalized to allow any
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	3418	location description, including those with composite and implicit location
				3419	descriptions.
				3420
				3421	These changes would be incorporated into the DWARF Version 5 section 6.4.
				3422
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	3423	.. _amdgpu-dwarf-structure_of-call-frame-information:
				3424
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	3425	Structure of Call Frame Information
				3426	+++++++++++++++++++++++++++++++++++
				3427
				3428	The register rules are:
				3429
				3430	undefined
				3431	A register that has this rule has no recoverable value in the previous frame.
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	3432	The previous value of this register is the undefined location description (see
				3433	:ref:`amdgpu-dwarf-undefined-location-description-operations`).
				3434
				3435	By convention, the register is not preserved by a callee.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	3436
				3437	same value
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	3438	This register has not been modified from the previous caller frame.
				3439
				3440	If the current frame is the top frame, then the previous value of this
				3441	register is the location description L that specifies one register location
				3442	description SL. SL specifies the register location storage that corresponds to
				3443	the register with a bit offset of 0 for the current thread.
				3444
				3445	If the current frame is not the top frame, then the previous value of this
				3446	register is the location description obtained using the call frame information
				3447	for the callee frame and callee program location invoked by the current caller
				3448	frame for the same register.
				3449
				3450	*By convention, the register is preserved by the callee, but the callee has
				3451	not modified it.*
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	3452
				3453	offset(N)
				3454	N is a signed byte offset. The previous value of this register is saved at the
				3455	location description computed as if the DWARF operation expression
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	3456	``DW_OP_LLVM_offset N`` is evaluated with the current context, except the
				3457	result kind is a location description, the compilation unit is unspecified,
				3458	the object is unspecified, and an initial stack comprising the location
				3459	description of the current CFA (see
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	3460	:ref:`amdgpu-dwarf-operation-expressions`).
				3461
				3462	val_offset(N)
				3463	N is a signed byte offset. The previous value of this register is the memory
				3464	byte address of the location description computed as if the DWARF operation
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	3465	expression ``DW_OP_LLVM_offset N`` is evaluated with the current context,
				3466	except the result kind is a location description, the compilation unit is
				3467	unspecified, the object is unspecified, and an initial stack comprising the
				3468	location description of the current CFA (see
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	3469	:ref:`amdgpu-dwarf-operation-expressions`).
				3470
				3471	The DWARF is ill-formed if the CFA location description is not a memory byte
				3472	address location description, or if the register size does not match the size
				3473	of an address in the address space of the current CFA location description.
				3474
				3475	*Since the CFA location description is required to be a memory byte address
				3476	location description, the value of val_offset(N) will also be a memory byte
				3477	address location description since it is offsetting the CFA location
				3478	description by N bytes. Furthermore, the value of val_offset(N) will be a
				3479	memory byte address in the same address space as the CFA location
				3480	description.*
				3481
				3482	.. note::
				3483
				3484	Should DWARF allow the address size to be a different size to the size of
				3485	the register? Requiring them to be the same bit size avoids any issue of
				3486	conversion as the bit contents of the register are simply interpreted as a
				3487	value of the address.
				3488
				3489	GDB has a per register hook that allows a target specific conversion on a
				3490	register by register basis. It defaults to truncation of bigger registers,
				3491	and to actually reading bytes from the next register (or reads out of bounds
				3492	for the last register) for smaller registers. There are no GDB tests that
				3493	read a register out of bounds (except an illegal hand written assembly
				3494	test).
				3495
				3496	register(R)
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	3497	This register has been stored in another register numbered R.
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	3498
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	3499	The previous value of this register is the location description obtained using
				3500	the call frame information for the current frame and current program location
				3501	for register R.
				3502
				3503	The DWARF is ill-formed if the size of this register does not match the size
				3504	of register R or if there is a cyclic dependency in the call frame
				3505	information.
				3506
				3507	.. note::
				3508
				3509	Should this also allow R to be larger than this register? If so is the value
				3510	stored in the low order bits and it is undefined what is stored in the
				3511	extra upper bits?
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	3512
				3513	expression(E)
				3514	The previous value of this register is located at the location description
				3515	produced by evaluating the DWARF operation expression E (see
				3516	:ref:`amdgpu-dwarf-operation-expressions`).
				3517
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	3518	E is evaluated with the current context, except the result kind is a location
				3519	description, the compilation unit is unspecified, the object is unspecified,
				3520	and an initial stack comprising the location description of the current CFA
				3521	(see :ref:`amdgpu-dwarf-operation-expressions`).
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	3522
				3523	val_expression(E)
				3524	The previous value of this register is the value produced by evaluating the
				3525	DWARF operation expression E (see :ref:`amdgpu-dwarf-operation-expressions`).
				3526
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	3527	E is evaluated with the current context, except the result kind is a value,
				3528	the compilation unit is unspecified, the object is unspecified, and an initial
				3529	stack comprising the location description of the current CFA (see
				3530	:ref:`amdgpu-dwarf-operation-expressions`).
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	3531
				3532	The DWARF is ill-formed if the resulting value type size does not match the
				3533	register size.
				3534
				3535	.. note::
				3536
				3537	This has limited usefulness as the DWARF expression E can only produce
				3538	values up to the size of the generic type. This is due to not allowing any
				3539	operations that specify a type in a CFI operation expression. This makes it
				3540	unusable for registers that are larger than the generic type. However,
				3541	expression(E) can be used to create an implicit location description of
				3542	any size.
				3543
				3544	architectural
				3545	The rule is defined externally to this specification by the augmenter.
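
The difference between the offset(N) and val_offset(N) rules above can be illustrated with a simplified model. This Python sketch is an assumption-laden illustration: ``MemoryLocation`` is a hypothetical stand-in for a memory byte address location description, and the register and address sizes are passed in by the caller rather than looked up from a target description.

```python
from collections import namedtuple

# Hypothetical model of a memory byte address location description.
MemoryLocation = namedtuple("MemoryLocation", ["address_space", "address"])


def offset_rule(cfa, n):
    """offset(N): the location where the previous register value is saved."""
    return MemoryLocation(cfa.address_space, cfa.address + n)


def val_offset_rule(cfa, n, register_size, address_size):
    """val_offset(N): the previous register value is the address CFA + N.

    address_size is the size of an address in the CFA's address space.
    """
    if register_size != address_size:
        raise ValueError("ill-formed: register size differs from address size")
    return offset_rule(cfa, n).address
```

With offset(N) the consumer still has to read memory at the returned location. With val_offset(N) the returned address is itself the register value.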
				3546
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	3547	A Common Information Entry (CIE) holds information that is shared among many
				3548	Frame Description Entries (FDE). There is at least one CIE in every non-empty
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	3549	``.debug_frame`` section. A CIE contains the following fields, in order:
				3550
				3551	1. ``length`` (initial length)
				3552
				3553	A constant that gives the number of bytes of the CIE structure, not
				3554	including the length field itself. The size of the length field plus the
				3555	value of length must be an integral multiple of the address size specified
				3556	in the ``address_size`` field.
				3557
				3558	2. ``CIE_id`` (4 or 8 bytes, see
				3559	:ref:`amdgpu-dwarf-32-bit-and-64-bit-dwarf-formats`)
				3560
				3561	A constant that is used to distinguish CIEs from FDEs.
				3562
				3563	In the 32-bit DWARF format, the value of the CIE id in the CIE header is
				3564	0xffffffff; in the 64-bit DWARF format, the value is 0xffffffffffffffff.
				3565
				3566	3. ``version`` (ubyte)
				3567
				3568	A version number. This number is specific to the call frame information and
				3569	is independent of the DWARF version number.
				3570
				3571	The value of the CIE version number is 4.
				3572
				3573	.. note::
				3574
Tony	e24f5f3	2020-07-03 22:31:53 +0000	[diff] [blame]	3575	Would this be increased to 5 to reflect the changes in these extensions?
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	3576
				3577	4. ``augmentation`` (sequence of UTF-8 characters)
				3578
				3579	A null-terminated UTF-8 string that identifies the augmentation to this CIE
				3580	or to the FDEs that use it. If a reader encounters an augmentation string
				3581	that is unexpected, then only the following fields can be read:
				3582
				3583	* CIE: length, CIE_id, version, augmentation
				3584	* FDE: length, CIE_pointer, initial_location, address_range
				3585
				3586	If there is no augmentation, this value is a zero byte.
				3587
				3588	*The augmentation string allows users to indicate that there is additional
				3589	vendor and target architecture specific information in the CIE or FDE which
				3590	is needed to virtually unwind a stack frame. For example, this might be
				3591	information about dynamically allocated data which needs to be freed on exit
				3592	from the routine.*
				3593
				3594	*Because the* ``.debug_frame`` *section is useful independently of any*
				3595	``.debug_info`` *section, the augmentation string always uses UTF-8
				3596	encoding.*
				3597
				3598	The recommended format for the augmentation string is:
				3599
Tony	756ba35	2020-04-20 16:55:34 -0400	[diff] [blame]	3600	\| ``[``\ vendor\ ``:v``\ X\ ``.``\ Y\ [\ ``:``\ options\ ]\ ``]``\ *
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	3601
				3602	Where vendor is the producer, ``vX.Y`` specifies the major X and minor Y
				3603	version number of the extensions used, and options is an optional string
				3604	providing additional information about the extensions. The version number
				3605	must conform to semantic versioning [:ref:`SEMVER <amdgpu-dwarf-SEMVER>`].
				3606	The options string must not contain the "\ ``]``\ " character.
				3607
				3608	For example:
				3609
				3610	::
				3611
				3612	[abc:v0.0][def:v1.2:feature-a=on,feature-b=3]
				3613
				3614	5. ``address_size`` (ubyte)
				3615
				3616	The size of a target address in this CIE and any FDEs that use it, in bytes.
				3617	If a compilation unit exists for this frame, its address size must match the
				3618	address size here.
				3619
				3620	6. ``segment_selector_size`` (ubyte)
				3621
				3622	The size of a segment selector in this CIE and any FDEs that use it, in
				3623	bytes.
				3624
				3625	7. ``code_alignment_factor`` (unsigned LEB128)
				3626
				3627	A constant that is factored out of all advance location instructions (see
				3628	:ref:`amdgpu-dwarf-row-creation-instructions`). The resulting value is
				3629	``(operand * code_alignment_factor)``.
				3630
				3631	8. ``data_alignment_factor`` (signed LEB128)
				3632
				3633	A constant that is factored out of certain offset instructions (see
				3634	:ref:`amdgpu-dwarf-cfa-definition-instructions` and
				3635	:ref:`amdgpu-dwarf-register-rule-instructions`). The resulting value is
				3636	``(operand * data_alignment_factor)``.
				3637
				3638	9. ``return_address_register`` (unsigned LEB128)
				3639
				3640	An unsigned LEB128 constant that indicates which column in the rule table
				3641	represents the return address of the subprogram. Note that this column might
				3642	not correspond to an actual machine register.
				3643
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	3644	The value of the return address register is used to determine the program
				3645	location of the caller frame. The program location of the top frame is the
				3646	target architecture program counter value of the current thread.
				3647
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	3648	10. ``initial_instructions`` (array of ubyte)
				3649
				3650	A sequence of rules that are interpreted to create the initial setting of
				3651	each column in the table.
				3652
				3653	The default rule for all columns before interpretation of the initial
				3654	instructions is the undefined rule. However, an ABI authoring body or a
				3655	compilation system authoring body may specify an alternate default value for
				3656	any or all columns.
				3657
				3658	11. ``padding`` (array of ubyte)
				3659
				3660	Enough ``DW_CFA_nop`` instructions to make the size of this entry match the
				3661	length value above.
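
As a small worked illustration of the ``CIE_id`` and alignment factor fields above, the following Python sketch shows how a reader might tell a CIE from an FDE and recover the unfactored operand values. The function names are hypothetical and the sketch assumes the fields have already been decoded into integers.

```python
CIE_ID_32 = 0xFFFFFFFF
CIE_ID_64 = 0xFFFFFFFFFFFFFFFF


def is_cie(id_field, dwarf64=False):
    """True if the field after the length marks the entry as a CIE."""
    return id_field == (CIE_ID_64 if dwarf64 else CIE_ID_32)


def scale_code_operand(operand, code_alignment_factor):
    """Unfactored advance location delta: operand * code_alignment_factor."""
    return operand * code_alignment_factor


def scale_data_operand(operand, data_alignment_factor):
    """Unfactored byte offset: operand * data_alignment_factor."""
    return operand * data_alignment_factor
```

Because ``data_alignment_factor`` is signed, a factor such as -8 lets small positive operands describe negative offsets from the CFA.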
				3662
				3663	An FDE contains the following fields, in order:
				3664
				3665	1. ``length`` (initial length)
				3666
				3667	A constant that gives the number of bytes of the header and instruction
				3668	stream for this subprogram, not including the length field itself. The size
				3669	of the length field plus the value of length must be an integral multiple of
				3670	the address size.
				3671
				3672	2. ``CIE_pointer`` (4 or 8 bytes, see
				3673	:ref:`amdgpu-dwarf-32-bit-and-64-bit-dwarf-formats`)
				3674
				3675	A constant offset into the ``.debug_frame`` section that denotes the CIE
				3676	that is associated with this FDE.
				3677
				3678	3. ``initial_location`` (segment selector and target address)
				3679
				3680	The address of the first location associated with this table entry. If the
				3681	segment_selector_size field of this FDE’s CIE is non-zero, the initial
				3682	location is preceded by a segment selector of the given length.
				3683
				3684	4. ``address_range`` (target address)
				3685
				3686	The number of bytes of program instructions described by this entry.
				3687
				3688	5. ``instructions`` (array of ubyte)
				3689
				3690	A sequence of table defining instructions that are described in
				3691	:ref:`amdgpu-dwarf-call-frame-instructions`.
				3692
				3693	6. ``padding`` (array of ubyte)
				3694
				3695	Enough ``DW_CFA_nop`` instructions to make the size of this entry match the
				3696	length value above.
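
The ``length`` and ``padding`` fields of a CIE or FDE interact: the padding is whatever is needed for the size of the length field plus the value of length to be a multiple of the address size. The Python sketch below shows one way a producer might compute it. The function name is hypothetical. It relies on the initial length field occupying 4 bytes in the 32-bit DWARF format and 12 bytes in the 64-bit DWARF format.

```python
def padded_length(unpadded_length, address_size, dwarf64=False):
    """Return (length, padding) for a CIE or FDE.

    unpadded_length is the number of bytes after the length field, before
    any DW_CFA_nop padding. padding is the number of DW_CFA_nop bytes to
    append, and length is the value to store in the length field.
    """
    length_field_size = 12 if dwarf64 else 4
    total = length_field_size + unpadded_length
    # Smallest non-negative count that rounds total up to a multiple.
    padding = -total % address_size
    return unpadded_length + padding, padding
```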
				3697
				3698	.. _amdgpu-dwarf-call-frame-instructions:
				3699
				3700	Call Frame Instructions
				3701	+++++++++++++++++++++++
				3702
				3703	Some call frame instructions have operands that are encoded as DWARF operation
				3704	expressions E (see :ref:`amdgpu-dwarf-operation-expressions`). The DWARF
				3705	operations that can be used in E have the following restrictions:
				3706
				3707	* ``DW_OP_addrx``, ``DW_OP_call2``, ``DW_OP_call4``, ``DW_OP_call_ref``,
				3708	``DW_OP_const_type``, ``DW_OP_constx``, ``DW_OP_convert``,
				3709	``DW_OP_deref_type``, ``DW_OP_fbreg``, ``DW_OP_implicit_pointer``,
				3710	``DW_OP_regval_type``, ``DW_OP_reinterpret``, and ``DW_OP_xderef_type``
				3711	operations are not allowed because the call frame information must not depend
				3712	on other debug sections.
				3713
				3714	* ``DW_OP_push_object_address`` is not allowed because there is no object
				3715	context to provide a value to push.
				3716
				3717	* ``DW_OP_LLVM_push_lane`` is not allowed because the call frame instructions
				3718	describe the actions for the whole thread, not the lanes independently.
				3719
				3720	* ``DW_OP_call_frame_cfa`` and ``DW_OP_entry_value`` are not allowed because
				3721	their use would be circular.
				3722
				3723	* ``DW_OP_LLVM_call_frame_entry_reg`` is not allowed if evaluating E causes a
				3724	circular dependency between ``DW_OP_LLVM_call_frame_entry_reg`` operations.
				3725
				3726	*For example, a circular dependency exists if register R1 has a*
				3727	``DW_CFA_def_cfa_expression`` *instruction that evaluates a*
				3728	``DW_OP_LLVM_call_frame_entry_reg`` *operation that specifies register R2,*
				3729	*and register R2 has a* ``DW_CFA_def_cfa_expression`` *instruction that evaluates a*
				3730	``DW_OP_LLVM_call_frame_entry_reg`` *operation that specifies register R1.*
				3731
				3732	Call frame instructions to which these restrictions apply include
				3733	``DW_CFA_def_cfa_expression``\ , ``DW_CFA_expression``\ , and
				3734	``DW_CFA_val_expression``\ .
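
The unconditional restrictions above lend themselves to a simple check. This Python sketch is illustrative only and assumes a hypothetical representation of E as a list of operation mnemonics. It does not cover ``DW_OP_LLVM_call_frame_entry_reg``, whose restriction depends on detecting cycles across expressions.

```python
# Operations that are never allowed in a call frame instruction expression.
FORBIDDEN_IN_CFI = frozenset({
    # Would make the call frame information depend on other debug sections.
    "DW_OP_addrx", "DW_OP_call2", "DW_OP_call4", "DW_OP_call_ref",
    "DW_OP_const_type", "DW_OP_constx", "DW_OP_convert",
    "DW_OP_deref_type", "DW_OP_fbreg", "DW_OP_implicit_pointer",
    "DW_OP_regval_type", "DW_OP_reinterpret", "DW_OP_xderef_type",
    # No object context is available.
    "DW_OP_push_object_address",
    # Call frame instructions describe the whole thread, not single lanes.
    "DW_OP_LLVM_push_lane",
    # Use would be circular.
    "DW_OP_call_frame_cfa", "DW_OP_entry_value",
})


def forbidden_ops(expression_ops):
    """Return the operations in E that the restrictions above rule out."""
    return [op for op in expression_ops if op in FORBIDDEN_IN_CFI]
```

An empty result means the expression passes these checks. It does not by itself prove that the expression is well-formed.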
				3735
				3736	.. _amdgpu-dwarf-row-creation-instructions:
				3737
				3738	Row Creation Instructions
				3739	#########################
				3740
				3741	.. note::
				3742
				3743	These instructions are the same as in DWARF Version 5 section 6.4.2.1.
				3744
				3745	.. _amdgpu-dwarf-cfa-definition-instructions:
				3746
				3747	CFA Definition Instructions
				3748	###########################
				3749
				3750	1. ``DW_CFA_def_cfa``
				3751
				3752	The ``DW_CFA_def_cfa`` instruction takes two unsigned LEB128 operands
				3753	representing a register number R and a (non-factored) byte displacement B.
				3754	AS is set to the target architecture default address space identifier. The
				3755	required action is to define the current CFA rule to be the result of
				3756	evaluating the DWARF operation expression ``DW_OP_constu AS;
				3757	DW_OP_aspace_bregx R, B`` as a location description.
				3758
				3759	2. ``DW_CFA_def_cfa_sf``
				3760
				3761	The ``DW_CFA_def_cfa_sf`` instruction takes two operands: an unsigned LEB128
				3762	value representing a register number R and a signed LEB128 factored byte
				3763	displacement B. AS is set to the target architecture default address space
				3764	identifier. The required action is to define the current CFA rule to be the
				3765	result of evaluating the DWARF operation expression ``DW_OP_constu AS;
				3766	DW_OP_aspace_bregx R, B*data_alignment_factor`` as a location description.
				3767
Tony	5aa2fd8	2020-07-03 22:31:53 +0000	[diff] [blame]	3768	*The action is the same as* ``DW_CFA_def_cfa``\ *, except that the second
Tony	1eac2c5	2020-04-14 00:55:43 -0400	[diff] [blame]	3769	operand is signed and factored.*
				3770
3.  ``DW_CFA_def_aspace_cfa`` *New*

    The ``DW_CFA_def_aspace_cfa`` instruction takes three unsigned LEB128
    operands representing a register number R, a (non-factored) byte
    displacement B, and a target architecture specific address space identifier
    AS. The required action is to define the current CFA rule to be the result
    of evaluating the DWARF operation expression ``DW_OP_constu AS;
    DW_OP_LLVM_aspace_bregx R, B`` as a location description.

    If AS is not one of the values defined by the target architecture specific
    ``DW_ASPACE_*`` values, then the DWARF expression is ill-formed.

4.  ``DW_CFA_def_aspace_cfa_sf`` *New*

    The ``DW_CFA_def_aspace_cfa_sf`` instruction takes three operands: an
    unsigned LEB128 value representing a register number R, a signed LEB128
    factored byte displacement B, and an unsigned LEB128 value representing a
    target architecture specific address space identifier AS. The required
    action is to define the current CFA rule to be the result of evaluating the
    DWARF operation expression ``DW_OP_constu AS; DW_OP_LLVM_aspace_bregx R,
    B*data_alignment_factor`` as a location description.

    If AS is not one of the values defined by the target architecture specific
    ``DW_ASPACE_*`` values, then the DWARF expression is ill-formed.

    *The action is the same as* ``DW_CFA_def_aspace_cfa``\ *, except that the
    second operand is signed and factored.*
				3798
5.  ``DW_CFA_def_cfa_register``

    The ``DW_CFA_def_cfa_register`` instruction takes a single unsigned LEB128
    operand representing a register number R. The required action is to define
    the current CFA rule to be the result of evaluating the DWARF operation
    expression ``DW_OP_constu AS; DW_OP_LLVM_aspace_bregx R, B`` as a location
    description. B and AS are the old CFA byte displacement and address space
    respectively.

    If the subprogram has no current CFA rule, or the rule was defined by a
    ``DW_CFA_def_cfa_expression`` instruction, then the DWARF is ill-formed.

6.  ``DW_CFA_def_cfa_offset``

    The ``DW_CFA_def_cfa_offset`` instruction takes a single unsigned LEB128
    operand representing a (non-factored) byte displacement B. The required
    action is to define the current CFA rule to be the result of evaluating the
    DWARF operation expression ``DW_OP_constu AS; DW_OP_LLVM_aspace_bregx R, B``
    as a location description. R and AS are the old CFA register number and
    address space respectively.

    If the subprogram has no current CFA rule, or the rule was defined by a
    ``DW_CFA_def_cfa_expression`` instruction, then the DWARF is ill-formed.

7.  ``DW_CFA_def_cfa_offset_sf``

    The ``DW_CFA_def_cfa_offset_sf`` instruction takes a signed LEB128 operand
    representing a factored byte displacement B. The required action is to
    define the current CFA rule to be the result of evaluating the DWARF
    operation expression ``DW_OP_constu AS; DW_OP_LLVM_aspace_bregx R,
    B*data_alignment_factor`` as a location description. R and AS are the old
    CFA register number and address space respectively.

    If the subprogram has no current CFA rule, or the rule was defined by a
    ``DW_CFA_def_cfa_expression`` instruction, then the DWARF is ill-formed.

    *The action is the same as* ``DW_CFA_def_cfa_offset``\ *, except that the
    operand is signed and factored.*
				3837
8.  ``DW_CFA_def_cfa_expression``

    The ``DW_CFA_def_cfa_expression`` instruction takes a single operand encoded
    as a ``DW_FORM_exprloc`` value representing a DWARF operation expression E.
    The required action is to define the current CFA rule to be the result of
    evaluating E with the current context, except the result kind is a location
    description, the compilation unit is unspecified, the object is unspecified,
    and the initial stack is empty.

    *See* :ref:`amdgpu-dwarf-call-frame-instructions` *regarding restrictions on
    the DWARF expression operations that can be used in E.*

    The DWARF is ill-formed if the result of evaluating E is not a memory byte
    address location description.
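The register-based CFA definition instructions above can be read as updates to a small (R, B, AS) record. The following Python sketch illustrates that reading only. ``CfaRule``, the function names, ``DEFAULT_ASPACE`` and ``KNOWN_ASPACES`` are hypothetical names chosen for this example, and the address space values are placeholders rather than real ``DW_ASPACE_*`` values. A rule set by ``DW_CFA_def_cfa_expression`` is modeled simply as "anything that is not a ``CfaRule``".

```python
from dataclasses import dataclass, replace

DEFAULT_ASPACE = 0            # assumption: target default address space identifier
KNOWN_ASPACES = {0, 1, 2, 3}  # assumption: stand-in for the target's DW_ASPACE_* values


@dataclass(frozen=True)
class CfaRule:
    """Models the expression DW_OP_constu AS; DW_OP_LLVM_aspace_bregx R, B."""
    register: int      # R
    displacement: int  # B in bytes, already multiplied by any factor
    aspace: int        # AS


def def_aspace_cfa(reg, disp, aspace):
    # DW_CFA_def_aspace_cfa: all three values are given explicitly.
    if aspace not in KNOWN_ASPACES:
        raise ValueError("ill-formed DWARF: unknown address space")
    return CfaRule(reg, disp, aspace)


def def_aspace_cfa_sf(reg, factored_disp, aspace, data_alignment_factor):
    # Same as def_aspace_cfa, but the displacement is signed and factored.
    return def_aspace_cfa(reg, factored_disp * data_alignment_factor, aspace)


def def_cfa(reg, disp):
    # DW_CFA_def_cfa: AS is the target default address space.
    return def_aspace_cfa(reg, disp, DEFAULT_ASPACE)


def def_cfa_sf(reg, factored_disp, data_alignment_factor):
    return def_aspace_cfa_sf(reg, factored_disp, DEFAULT_ASPACE,
                             data_alignment_factor)


def _require_register_rule(current):
    # No current rule, or a rule set by DW_CFA_def_cfa_expression, is ill-formed.
    if not isinstance(current, CfaRule):
        raise ValueError("ill-formed DWARF: no register-based CFA rule")
    return current


def def_cfa_register(current, reg):
    # Keeps the old B and AS, replaces R.
    return replace(_require_register_rule(current), register=reg)


def def_cfa_offset(current, disp):
    # Keeps the old R and AS, replaces B.
    return replace(_require_register_rule(current), displacement=disp)


def def_cfa_offset_sf(current, factored_disp, data_alignment_factor):
    return def_cfa_offset(current, factored_disp * data_alignment_factor)
```

Keeping the record immutable means each instruction produces a new rule for the current row, so earlier rows are not disturbed.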
				3852
.. _amdgpu-dwarf-register-rule-instructions:

Register Rule Instructions
##########################

1.  ``DW_CFA_undefined``

    The ``DW_CFA_undefined`` instruction takes a single unsigned LEB128 operand
    that represents a register number R. The required action is to set the rule
    for the register specified by R to ``undefined``.

2.  ``DW_CFA_same_value``

    The ``DW_CFA_same_value`` instruction takes a single unsigned LEB128 operand
    that represents a register number R. The required action is to set the rule
    for the register specified by R to ``same value``.
				3869
3.  ``DW_CFA_offset``

    The ``DW_CFA_offset`` instruction takes two operands: a register number R
    (encoded with the opcode) and an unsigned LEB128 constant representing a
    factored displacement B. The required action is to change the rule for the
    register specified by R to be an *offset(B\*data_alignment_factor)* rule.

    .. note::

      Seems this should be named ``DW_CFA_offset_uf`` since the offset is
      unsigned factored.

4.  ``DW_CFA_offset_extended``

    The ``DW_CFA_offset_extended`` instruction takes two unsigned LEB128
    operands representing a register number R and a factored displacement B.
    This instruction is identical to ``DW_CFA_offset``, except for the encoding
    and size of the register operand.

    .. note::

      Seems this should be named ``DW_CFA_offset_extended_uf`` since the
      displacement is unsigned factored.
				3893
5.  ``DW_CFA_offset_extended_sf``

    The ``DW_CFA_offset_extended_sf`` instruction takes two operands: an
    unsigned LEB128 value representing a register number R and a signed LEB128
    factored displacement B. This instruction is identical to
    ``DW_CFA_offset_extended``, except that B is signed.

6.  ``DW_CFA_val_offset``

    The ``DW_CFA_val_offset`` instruction takes two unsigned LEB128 operands
    representing a register number R and a factored displacement B. The required
    action is to change the rule for the register indicated by R to be a
    *val_offset(B\*data_alignment_factor)* rule.

    .. note::

      Seems this should be named ``DW_CFA_val_offset_uf`` since the displacement
      is unsigned factored.

    .. note::

      An alternative is to define ``DW_CFA_val_offset`` to implicitly use the
      target architecture default address space, and add another operation that
      specifies the address space.

7.  ``DW_CFA_val_offset_sf``

    The ``DW_CFA_val_offset_sf`` instruction takes two operands: an unsigned
    LEB128 value representing a register number R and a signed LEB128 factored
    displacement B. This instruction is identical to ``DW_CFA_val_offset``,
    except that B is signed.
				3925
8.  ``DW_CFA_register``

    The ``DW_CFA_register`` instruction takes two unsigned LEB128 operands
    representing register numbers R1 and R2 respectively. The required action is
    to set the rule for the register specified by R1 to be a *register(R2)* rule.

9.  ``DW_CFA_expression``

    The ``DW_CFA_expression`` instruction takes two operands: an unsigned LEB128
    value representing a register number R, and a ``DW_FORM_block`` value
    representing a DWARF operation expression E. The required action is to
    change the rule for the register specified by R to be an *expression(E)*
    rule.

    *That is, E computes the location description where the register value can
    be retrieved.*

    *See* :ref:`amdgpu-dwarf-call-frame-instructions` *regarding restrictions on
    the DWARF expression operations that can be used in E.*

10. ``DW_CFA_val_expression``

    The ``DW_CFA_val_expression`` instruction takes two operands: an unsigned
    LEB128 value representing a register number R, and a ``DW_FORM_block`` value
    representing a DWARF operation expression E. The required action is to
    change the rule for the register specified by R to be a *val_expression(E)*
    rule.

    *That is, E computes the value of register R.*

    *See* :ref:`amdgpu-dwarf-call-frame-instructions` *regarding restrictions on
    the DWARF expression operations that can be used in E.*

    If the result of evaluating E is not a value with a base type size that
    matches the register size, then the DWARF is ill-formed.
				3961
11. ``DW_CFA_restore``

    The ``DW_CFA_restore`` instruction takes a single operand (encoded with the
    opcode) that represents a register number R. The required action is to
    change the rule for the register specified by R to the rule assigned to it
    by the ``initial_instructions`` in the CIE.

12. ``DW_CFA_restore_extended``

    The ``DW_CFA_restore_extended`` instruction takes a single unsigned LEB128
    operand that represents a register number R. This instruction is identical
    to ``DW_CFA_restore``, except for the encoding and size of the register
    operand.
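As an informal illustration of the register rule instructions above, the following Python sketch keeps a per-row mapping from register number to rule and applies one instruction at a time. The function name, the tuple representation of rules, and the choice to drop a register from the table when the CIE assigned it no rule are all assumptions made for this example. Operands are taken as already decoded, so the ``_extended`` and ``_sf`` variants differ only in how their operands were read.

```python
def apply_register_rule(rules, initial_rules, data_alignment_factor,
                        instr, *operands):
    """Return a copy of `rules` updated by one register rule instruction.

    `initial_rules` holds the rules set by the CIE initial_instructions.
    """
    new_rules = dict(rules)
    reg = operands[0]
    if instr == "DW_CFA_undefined":
        new_rules[reg] = ("undefined",)
    elif instr == "DW_CFA_same_value":
        new_rules[reg] = ("same value",)
    elif instr in ("DW_CFA_offset", "DW_CFA_offset_extended",
                   "DW_CFA_offset_extended_sf"):
        # offset(B * data_alignment_factor)
        new_rules[reg] = ("offset", operands[1] * data_alignment_factor)
    elif instr in ("DW_CFA_val_offset", "DW_CFA_val_offset_sf"):
        # val_offset(B * data_alignment_factor)
        new_rules[reg] = ("val_offset", operands[1] * data_alignment_factor)
    elif instr == "DW_CFA_register":
        new_rules[reg] = ("register", operands[1])
    elif instr == "DW_CFA_expression":
        new_rules[reg] = ("expression", operands[1])
    elif instr == "DW_CFA_val_expression":
        new_rules[reg] = ("val_expression", operands[1])
    elif instr in ("DW_CFA_restore", "DW_CFA_restore_extended"):
        if reg in initial_rules:
            new_rules[reg] = initial_rules[reg]
        else:
            # Assumption: no CIE rule means the register falls back to
            # whatever default the consumer uses for absent entries.
            new_rules.pop(reg, None)
    else:
        raise ValueError(f"not a register rule instruction: {instr}")
    return new_rules
```

For example, with a ``data_alignment_factor`` of -8, ``DW_CFA_offset`` with R = 1 and B = 2 yields the rule ``("offset", -16)`` for register 1.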
				3975
Row State Instructions
######################

.. note::

  These instructions are the same as in DWARF Version 5 section 6.4.2.4.

Padding Instruction
###################

.. note::

  This instruction is the same as in DWARF Version 5 section 6.4.2.5.

Call Frame Instruction Usage
++++++++++++++++++++++++++++

.. note::

  The same as in DWARF Version 5 section 6.4.3.

.. _amdgpu-dwarf-call-frame-calling-address:

Call Frame Calling Address
++++++++++++++++++++++++++

.. note::

  The same as in DWARF Version 5 section 6.4.4.
				4005
Data Representation
-------------------

.. _amdgpu-dwarf-32-bit-and-64-bit-dwarf-formats:

32-Bit and 64-Bit DWARF Formats
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. note::

  This augments DWARF Version 5 section 7.4.

1.  Within the body of the ``.debug_info`` section, certain forms of attribute
    value depend on the choice of DWARF format as follows. For the 32-bit DWARF
    format, the value is a 4-byte unsigned integer; for the 64-bit DWARF format,
    the value is an 8-byte unsigned integer.

    .. table:: ``.debug_info`` section attribute form roles
      :name: amdgpu-dwarf-debug-info-section-attribute-form-roles-table

      ================================== ===================================
      Form                               Role
      ================================== ===================================
      DW_FORM_line_strp                  offset in ``.debug_line_str``
      DW_FORM_ref_addr                   offset in ``.debug_info``
      DW_FORM_sec_offset                 offset in a section other than
                                         ``.debug_info`` or ``.debug_str``
      DW_FORM_strp                       offset in ``.debug_str``
      DW_FORM_strp_sup                   offset in ``.debug_str`` section of
                                         supplementary object file
      DW_OP_call_ref                     offset in ``.debug_info``
      DW_OP_implicit_pointer             offset in ``.debug_info``
      DW_OP_LLVM_aspace_implicit_pointer offset in ``.debug_info``
      ================================== ===================================
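A consumer reading any of the forms in the table above has to pick the operand width from the DWARF format of the enclosing unit. The Python sketch below shows one way to do that. The function name and parameters are hypothetical, and the caller is assumed to already know the format and byte order of the unit.

```python
import struct


def read_section_offset(data, pos, is_64bit, little_endian=True):
    """Read a format-dependent section offset starting at data[pos].

    Returns (value, next_pos). The value is a 4-byte unsigned integer in the
    32-bit DWARF format and an 8-byte unsigned integer in the 64-bit format.
    """
    fmt = ("<" if little_endian else ">") + ("Q" if is_64bit else "I")
    (value,) = struct.unpack_from(fmt, data, pos)
    return value, pos + struct.calcsize(fmt)
```

The same helper would apply to the DIE offset operand of ``DW_OP_LLVM_aspace_implicit_pointer``, since the table lists it alongside ``DW_OP_implicit_pointer``.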
				4040
Format of Debugging Information
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Attribute Encodings
+++++++++++++++++++

.. note::

  This augments DWARF Version 5 section 7.5.4 and Table 7.5.

The following table gives the encoding of the additional debugging information
entry attributes.

.. table:: Attribute encodings
  :name: amdgpu-dwarf-attribute-encodings-table

  ================================== ====== ===================================
  Attribute Name                     Value  Classes
  ================================== ====== ===================================
  DW_AT_LLVM_active_lane             0x3e08 exprloc, loclist
  DW_AT_LLVM_augmentation            0x3e09 string
  DW_AT_LLVM_lanes                   0x3e0a constant
  DW_AT_LLVM_lane_pc                 0x3e0b exprloc, loclist
  DW_AT_LLVM_vector_size             0x3e0c constant
  ================================== ====== ===================================

DWARF Expressions
~~~~~~~~~~~~~~~~~

.. note::

  Rename DWARF Version 5 section 7.7 to reflect the unification of location
  descriptions into DWARF expressions.
				4074
Operation Expressions
+++++++++++++++++++++

.. note::

  Rename DWARF Version 5 section 7.7.1 and delete section 7.7.2 to reflect the
  unification of location descriptions into DWARF expressions.

  This augments DWARF Version 5 section 7.7.1 and Table 7.9.

The following table gives the encoding of the additional DWARF expression
operations.

.. table:: DWARF Operation Encodings
  :name: amdgpu-dwarf-operation-encodings-table

  ================================== ===== ======== ===============================
  Operation                          Code  Number   Notes
                                           of
                                           Operands
  ================================== ===== ======== ===============================
  DW_OP_LLVM_form_aspace_address     0xe1  0
  DW_OP_LLVM_push_lane               0xe2  0
  DW_OP_LLVM_offset                  0xe3  0
  DW_OP_LLVM_offset_uconst           0xe4  1        ULEB128 byte displacement
  DW_OP_LLVM_bit_offset              0xe5  0
  DW_OP_LLVM_call_frame_entry_reg    0xe6  1        ULEB128 register number
  DW_OP_LLVM_undefined               0xe7  0
  DW_OP_LLVM_aspace_bregx            0xe8  2        ULEB128 register number,
                                                    ULEB128 byte displacement
  DW_OP_LLVM_aspace_implicit_pointer 0xe9  2        4-byte or 8-byte offset of DIE,
                                                    SLEB128 byte displacement
  DW_OP_LLVM_piece_end               0xea  0
  DW_OP_LLVM_extend                  0xeb  2        ULEB128 bit size,
                                                    ULEB128 count
  DW_OP_LLVM_select_bit_piece        0xec  2        ULEB128 bit size,
                                                    ULEB128 count
  ================================== ===== ======== ===============================
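To make the encodings concrete, the following Python sketch serializes the expression ``DW_OP_constu AS; DW_OP_LLVM_aspace_bregx R, B``, which the CFA definition instructions use to describe the CFA rule. The function names are hypothetical. ``DW_OP_constu`` uses its DWARF Version 5 opcode 0x10, and both ``DW_OP_LLVM_aspace_bregx`` operands are written as ULEB128 because that is what the table above lists, so this sketch only accepts non-negative displacements.

```python
DW_OP_constu = 0x10             # DWARF Version 5 opcode
DW_OP_LLVM_aspace_bregx = 0xE8  # from the operation encodings table


def encode_uleb128(value):
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("ULEB128 cannot encode negative values")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_aspace_bregx_expression(aspace, reg, disp):
    """Encode DW_OP_constu AS; DW_OP_LLVM_aspace_bregx R, B."""
    return (bytes([DW_OP_constu]) + encode_uleb128(aspace)
            + bytes([DW_OP_LLVM_aspace_bregx])
            + encode_uleb128(reg) + encode_uleb128(disp))
```

With AS = 5, R = 7 and B = 16, every operand fits in a single LEB128 byte, so the expression is five bytes long.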
				4113
Location List Expressions
+++++++++++++++++++++++++

.. note::

  Rename DWARF Version 5 section 7.7.3 to reflect that location lists are a kind
  of DWARF expression.

Source Languages
~~~~~~~~~~~~~~~~

.. note::

  This augments DWARF Version 5 section 7.12 and Table 7.17.

The following table gives the encoding of the additional DWARF languages.

.. table:: Language encodings
  :name: amdgpu-dwarf-language-encodings-table

  ==================== ====== ===================
  Language Name        Value  Default Lower Bound
  ==================== ====== ===================
  ``DW_LANG_LLVM_HIP`` 0x8100 0
  ==================== ====== ===================
				4139
Address Class and Address Space Encodings
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. note::

  This replaces DWARF Version 5 section 7.13.

The encodings of the constants used for the currently defined address classes
are given in :ref:`amdgpu-dwarf-address-class-encodings-table`.

.. table:: Address class encodings
  :name: amdgpu-dwarf-address-class-encodings-table

  ========================== ======
  Address Class Name         Value
  ========================== ======
  ``DW_ADDR_none``           0x0000
  ``DW_ADDR_LLVM_global``    0x0001
  ``DW_ADDR_LLVM_constant``  0x0002
  ``DW_ADDR_LLVM_group``     0x0003
  ``DW_ADDR_LLVM_private``   0x0004
  ``DW_ADDR_LLVM_lo_user``   0x8000
  ``DW_ADDR_LLVM_hi_user``   0xffff
  ========================== ======
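A consumer might use the table above to turn an address class value into a name. The Python sketch below is one possible lookup. The function name and the ``"user-defined"`` and ``"unassigned"`` labels are illustrative choices, not terms from these extensions.

```python
ADDRESS_CLASSES = {
    0x0000: "DW_ADDR_none",
    0x0001: "DW_ADDR_LLVM_global",
    0x0002: "DW_ADDR_LLVM_constant",
    0x0003: "DW_ADDR_LLVM_group",
    0x0004: "DW_ADDR_LLVM_private",
}
DW_ADDR_LLVM_lo_user = 0x8000
DW_ADDR_LLVM_hi_user = 0xFFFF


def describe_address_class(value):
    """Map an address class value to a name or a coarse category."""
    if value in ADDRESS_CLASSES:
        return ADDRESS_CLASSES[value]
    if DW_ADDR_LLVM_lo_user <= value <= DW_ADDR_LLVM_hi_user:
        return "user-defined"  # inside the lo_user..hi_user range
    return "unassigned"        # not given a meaning by the table
```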
				4164
Line Number Information
~~~~~~~~~~~~~~~~~~~~~~~

.. note::

  This augments DWARF Version 5 section 7.22 and Table 7.27.

The following table gives the encoding of the additional line number header
entry formats.

.. table:: Line number header entry format encodings
  :name: amdgpu-dwarf-line-number-header-entry-format-encodings-table

  ==================================== ====================
  Line number header entry format name Value
  ==================================== ====================
  ``DW_LNCT_LLVM_source``              0x2001
  ``DW_LNCT_LLVM_is_MD5``              0x2002
  ==================================== ====================
				4184
Call Frame Information
~~~~~~~~~~~~~~~~~~~~~~

.. note::

  This augments DWARF Version 5 section 7.24 and Table 7.29.

The following table gives the encoding of the additional call frame information
instructions.

.. table:: Call frame instruction encodings
  :name: amdgpu-dwarf-call-frame-instruction-encodings-table

  ======================== ====== ====== ================ ================ =====================
  Instruction              High 2 Low 6  Operand 1        Operand 2        Operand 3
                           Bits   Bits
  ======================== ====== ====== ================ ================ =====================
  DW_CFA_def_aspace_cfa    0      0x30   ULEB128 register ULEB128 offset   ULEB128 address space
  DW_CFA_def_aspace_cfa_sf 0      0x31   ULEB128 register SLEB128 offset   ULEB128 address space
  ======================== ====== ====== ================ ================ =====================
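The two encodings in the table above can be decoded with ordinary LEB128 readers. The Python sketch below is a minimal illustration. All function names are hypothetical, and the caller is assumed to supply the ``data_alignment_factor`` from the CIE. It returns the (R, B, AS) triple with the displacement already unfactored for the ``_sf`` form.

```python
def read_uleb128(data, pos):
    """Decode an unsigned LEB128 value; returns (value, next_pos)."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return result, pos


def read_sleb128(data, pos):
    """Decode a signed LEB128 value; returns (value, next_pos)."""
    result = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            if byte & 0x40:           # sign bit of the final byte
                result -= 1 << shift
            return result, pos


def decode_def_aspace_cfa(data, pos, data_alignment_factor):
    """Decode DW_CFA_def_aspace_cfa (0x30) or DW_CFA_def_aspace_cfa_sf (0x31).

    Returns ((register, byte_displacement, address_space), next_pos).
    """
    opcode = data[pos]
    high2, low6 = opcode >> 6, opcode & 0x3F
    if high2 != 0 or low6 not in (0x30, 0x31):
        raise ValueError("not a DW_CFA_def_aspace_cfa* instruction")
    reg, pos = read_uleb128(data, pos + 1)
    if low6 == 0x30:
        disp, pos = read_uleb128(data, pos)          # non-factored
    else:
        factored, pos = read_sleb128(data, pos)      # signed and factored
        disp = factored * data_alignment_factor
    aspace, pos = read_uleb128(data, pos)
    return (reg, disp, aspace), pos
```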
				4205
Attributes by Tag Value (Informative)
-------------------------------------

.. note::

  This augments DWARF Version 5 Appendix A and Table A.1.

The following table provides the additional attributes that are applicable to
debugging information entries.

.. table:: Attributes by tag value
  :name: amdgpu-dwarf-attributes-by-tag-value-table

  ============================= =============================
  Tag Name                      Applicable Attributes
  ============================= =============================
  ``DW_TAG_base_type``          * ``DW_AT_LLVM_vector_size``
  ``DW_TAG_compile_unit``       * ``DW_AT_LLVM_augmentation``
  ``DW_TAG_entry_point``        * ``DW_AT_LLVM_active_lane``
                                * ``DW_AT_LLVM_lane_pc``
                                * ``DW_AT_LLVM_lanes``
  ``DW_TAG_inlined_subroutine`` * ``DW_AT_LLVM_active_lane``
                                * ``DW_AT_LLVM_lane_pc``
                                * ``DW_AT_LLVM_lanes``
  ``DW_TAG_subprogram``         * ``DW_AT_LLVM_active_lane``
                                * ``DW_AT_LLVM_lane_pc``
                                * ``DW_AT_LLVM_lanes``
  ============================= =============================
				4234
.. _amdgpu-dwarf-examples:

Examples
========

The AMD GPU specific usage of the features in these extensions, including
examples, is available in the User Guide for AMDGPU Backend, section
:ref:`amdgpu-dwarf-debug-information`.

.. note::

  Change examples to use ``DW_OP_LLVM_offset`` instead of ``DW_OP_add`` when
  acting on a location description.

  Need to provide examples of new features.
				4250
.. _amdgpu-dwarf-references:

References
==========

    .. _amdgpu-dwarf-AMD:

1.  [AMD] `Advanced Micro Devices <https://www.amd.com/>`__

    .. _amdgpu-dwarf-AMD-ROCm:

2.  [AMD-ROCm] `AMD ROCm Platform <https://rocm-documentation.readthedocs.io>`__

    .. _amdgpu-dwarf-AMD-ROCgdb:

3.  [AMD-ROCgdb] `AMD ROCm Debugger (ROCgdb) <https://github.com/ROCm-Developer-Tools/ROCgdb>`__

    .. _amdgpu-dwarf-AMDGPU-LLVM:

4.  [AMDGPU-LLVM] `User Guide for AMDGPU LLVM Backend <https://llvm.org/docs/AMDGPUUsage.html>`__

    .. _amdgpu-dwarf-CUDA:

5.  [CUDA] `Nvidia CUDA Language <https://docs.nvidia.com/cuda/cuda-c-programming-guide/>`__

    .. _amdgpu-dwarf-DWARF:

6.  [DWARF] `DWARF Debugging Information Format <http://dwarfstd.org/>`__

    .. _amdgpu-dwarf-ELF:

7.  [ELF] `Executable and Linkable Format (ELF) <http://www.sco.com/developers/gabi/>`__

    .. _amdgpu-dwarf-GCC:

8.  [GCC] `GCC: The GNU Compiler Collection <https://www.gnu.org/software/gcc/>`__

    .. _amdgpu-dwarf-GDB:

9.  [GDB] `GDB: The GNU Project Debugger <https://www.gnu.org/software/gdb/>`__

    .. _amdgpu-dwarf-HIP:

10. [HIP] `HIP Programming Guide <https://rocm-documentation.readthedocs.io/en/latest/Programming_Guides/Programming-Guides.html#hip-programing-guide>`__

    .. _amdgpu-dwarf-HSA:

11. [HSA] `Heterogeneous System Architecture (HSA) Foundation <http://www.hsafoundation.com/>`__

    .. _amdgpu-dwarf-LLVM:

12. [LLVM] `The LLVM Compiler Infrastructure <https://llvm.org/>`__

    .. _amdgpu-dwarf-OpenCL:

13. [OpenCL] `The OpenCL Specification Version 2.0 <http://www.khronos.org/registry/cl/specs/opencl-2.0.pdf>`__

    .. _amdgpu-dwarf-Perforce-TotalView:

14. [Perforce-TotalView] `Perforce TotalView HPC Debugging Software <https://totalview.io/products/totalview>`__

    .. _amdgpu-dwarf-SEMVER:

15. [SEMVER] `Semantic Versioning <https://semver.org/>`__