/* -*- mode: C; c-basic-offset: 3; -*- */

/*--------------------------------------------------------------------*/
/*--- Read DWARF3/4 ".debug_info" sections (DIE trees).            ---*/
/*---                                                 readdwarf3.c ---*/
/*--------------------------------------------------------------------*/

/*
   This file is part of Valgrind, a dynamic binary instrumentation
   framework.

   Copyright (C) 2008-2013 OpenWorks LLP
      info@open-works.co.uk

   This program is free software; you can redistribute it and/or
   modify it under the terms of the GNU General Public License as
   published by the Free Software Foundation; either version 2 of the
   License, or (at your option) any later version.

   This program is distributed in the hope that it will be useful, but
   WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
   General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program; if not, write to the Free Software
   Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
   02111-1307, USA.

   The GNU General Public License is contained in the file COPYING.

   Neither the names of the U.S. Department of Energy nor the
   University of California nor the names of its contributors may be
   used to endorse or promote products derived from this software
   without prior written permission.
*/

#if defined(VGO_linux) || defined(VGO_darwin)

/* REFERENCE (without which this code will not make much sense):

   DWARF Debugging Information Format, Version 3,
   dated 20 December 2005 (the "D3 spec").

   Available at http://www.dwarfstd.org/Dwarf3.pdf.  There's also a
   .doc (MS Word) version, but for some reason the section numbers
   between the Word and PDF versions differ by 1 in the first digit.
   All section references in this code are to the PDF version.

   CURRENT HACKS:

   A DW_TAG_{const,volatile}_type with no DW_AT_type is allowed; it is
      assumed to mean "const void" or "volatile void" respectively.
      GDB appears to interpret them like this, anyway.

   In many cases it is important to know the svma of a CU (the "base
   address of the CU", as the D3 spec calls it).  There are some
   situations in which the spec implies this value is unknown, but the
   Dwarf3 produced by gcc-4.1 seems to assume it is not unknown but
   merely zero when not explicitly stated.  So we too have to make
   that assumption.

   POTENTIAL BUG?  Spotted 6 Sept 08.  Why doesn't
   unitary_range_list() bias the resulting range list in the same way
   that its more general cousin, get_range_list(), does?  I don't
   know.

   TODO, 2008 Feb 17:

   get rid of cu_svma_known and document the assumed-zero svma hack.

   ML_(sizeOfType): differentiate between zero sized types and types
   for which the size is unknown.  Is this important?  I don't know.

   DW_TAG_array_types: deal with explicit sizes (currently we compute
   the size from the bounds and the element size, although that's
   fragile if the bounds are incompletely specified or completely
   absent)

   Document reason for difference (by 1) of stack preening depth in
   parse_var_DIE vs parse_type_DIE.

   Don't hand to ML_(addVars), vars whose locations are entirely in
   registers (DW_OP_reg*).  This is merely a space-saving
   optimisation, as ML_(evaluate_Dwarf3_Expr) should handle these
   expressions correctly, by failing to evaluate them and hence
   effectively ignoring the variable with which they are associated.

   Deal with DW_TAG_array_types which have element size != stride

   In some cases, the info for a variable is split between two
   different DIEs (generally a declarer and a definer).  We punt on
   these.  Could do better here.

   The 'data_bias' argument passed to the expression evaluator
   (ML_(evaluate_Dwarf3_Expr)) should really be changed to a
   MaybeUWord, to make it clear when we do vs don't know what it is
   for the evaluation of an expression.  At the moment zero is passed
   for this parameter in the don't know case.  That's a bit fragile
   and obscure; using a MaybeUWord would be clearer.

   POTENTIAL PERFORMANCE IMPROVEMENTS:

   Currently, duplicate removal and all other queries for the type
   entities array are done using cuOffset-based pointing, which
   involves a binary search (VG_(lookupXA)) for each access.  This is
   wildly inefficient, although simple.  It would be better to
   translate all the cuOffset-based references (iow, all the "R" and
   "Rs" fields in the TyEnts in 'tyents') to direct index numbers in
   'tyents' right at the start of dedup_types(), and use direct
   indexing (VG_(indexXA)) wherever possible after that.

   cmp__XArrays_of_AddrRange is also a performance bottleneck.  Move
   VG_(indexXA) into pub_tool_xarray.h so it can be inlined at all use
   points, and possibly also make an _UNCHECKED version which skips
   the range checks in performance-critical situations such as this.

   Handle interaction between read_DIE and parse_{var,type}_DIE
   better.  Currently read_DIE reads the entire DIE just to find where
   the end is (and for debug printing), so that it can later reliably
   move the cursor to the end regardless of what parse_{var,type}_DIE
   do.  This means many DIEs (most, even?) are read twice.  It would
   be smarter to make parse_{var,type}_DIE return a Bool indicating
   whether or not they advanced the DIE cursor, and only if they
   didn't should read_DIE itself read through the DIE.

   ML_(addVar) and add_var_to_arange: quite a lot of DiAddrRanges have
   zero variables in their .vars XArray.  Rather than have an XArray
   with zero elements (which uses 2 malloc'd blocks), allow the .vars
   pointer to be NULL in this case.

   More generally, reduce the amount of memory allocated and freed
   while reading Dwarf3 type/variable information.  Even modest (20MB)
   objects cause this module to allocate and free hundreds of
   thousands of small blocks, and ML_(arena_malloc) and its various
   groupies always show up at the top of performance profiles. */

#include "pub_core_basics.h"
#include "pub_core_debuginfo.h"
#include "pub_core_libcbase.h"
#include "pub_core_libcassert.h"
#include "pub_core_libcprint.h"
#include "pub_core_libcsetjmp.h"   // setjmp facilities
#include "pub_core_hashtable.h"
#include "pub_core_options.h"
#include "pub_core_tooliface.h"    /* VG_(needs) */
#include "pub_core_xarray.h"
#include "pub_core_wordfm.h"
#include "priv_misc.h"             /* dinfo_zalloc/free */
#include "priv_image.h"
#include "priv_tytypes.h"
#include "priv_d3basics.h"
#include "priv_storage.h"
#include "priv_readdwarf3.h"       /* self */


/*------------------------------------------------------------*/
/*---                                                      ---*/
/*--- Basic machinery for parsing DIEs.                    ---*/
/*---                                                      ---*/
/*------------------------------------------------------------*/

#define TRACE_D3(format, args...) \
   if (UNLIKELY(td3)) { VG_(printf)(format, ## args); }

#define D3_INVALID_CUOFF  ((UWord)(-1UL))
#define D3_FAKEVOID_CUOFF ((UWord)(-2UL))

typedef
   struct {
      DiSlice sli;      // to which this cursor applies
      DiOffT  sli_next; // offset in underlying DiImage; must be >= sli.ioff
      void (*barf)( const HChar* ) __attribute__((noreturn));
      const HChar* barfstr;
   }
   Cursor;

static inline Bool is_sane_Cursor ( Cursor* c ) {
   if (!c)                return False;
   if (!c->barf)          return False;
   if (!c->barfstr)       return False;
   if (!ML_(sli_is_valid)(c->sli))    return False;
   if (c->sli.ioff == DiOffT_INVALID) return False;
   if (c->sli_next < c->sli.ioff)     return False;
   return True;
}

// Initialise a cursor from a DiSlice (ELF section, really) so as to
// start reading at offset |sli_initial_offset| from the start of the
// slice.
static void init_Cursor ( /*OUT*/Cursor* c,
                          DiSlice sli,
                          ULong   sli_initial_offset,
                          __attribute__((noreturn)) void (*barf)(const HChar*),
                          const HChar* barfstr )
{
   vg_assert(c);
   VG_(bzero_inline)(c, sizeof(*c));
   c->sli      = sli;
   c->sli_next = c->sli.ioff + sli_initial_offset;
   c->barf     = barf;
   c->barfstr  = barfstr;
   vg_assert(is_sane_Cursor(c));
}

static Bool is_at_end_Cursor ( Cursor* c ) {
   vg_assert(is_sane_Cursor(c));
   return c->sli_next >= c->sli.ioff + c->sli.szB;
}

static inline ULong get_position_of_Cursor ( Cursor* c ) {
   vg_assert(is_sane_Cursor(c));
   return c->sli_next - c->sli.ioff;
}
static inline void set_position_of_Cursor ( Cursor* c, ULong pos ) {
   c->sli_next = c->sli.ioff + pos;
   vg_assert(is_sane_Cursor(c));
}

static /*signed*/Long get_remaining_length_Cursor ( Cursor* c ) {
   vg_assert(is_sane_Cursor(c));
   return c->sli.ioff + c->sli.szB - c->sli_next;
}

//static void* get_address_of_Cursor ( Cursor* c ) {
//   vg_assert(is_sane_Cursor(c));
//   return &c->region_start_img[ c->region_next ];
//}

static DiCursor get_DiCursor_from_Cursor ( Cursor* c ) {
   return mk_DiCursor(c->sli.img, c->sli_next);
}

/* FIXME: document assumptions on endianness for
   get_UShort/UInt/ULong. */
static inline UChar get_UChar ( Cursor* c ) {
   UChar r;
   vg_assert(is_sane_Cursor(c));
   if (c->sli_next + sizeof(UChar) > c->sli.ioff + c->sli.szB) {
      c->barf(c->barfstr);
      /*NOTREACHED*/
      vg_assert(0);
   }
   r = ML_(img_get_UChar)(c->sli.img, c->sli_next);
   c->sli_next += sizeof(UChar);
   return r;
}
static UShort get_UShort ( Cursor* c ) {
   UShort r;
   vg_assert(is_sane_Cursor(c));
   if (c->sli_next + sizeof(UShort) > c->sli.ioff + c->sli.szB) {
      c->barf(c->barfstr);
      /*NOTREACHED*/
      vg_assert(0);
   }
   r = ML_(img_get_UShort)(c->sli.img, c->sli_next);
   c->sli_next += sizeof(UShort);
   return r;
}
static UInt get_UInt ( Cursor* c ) {
   UInt r;
   vg_assert(is_sane_Cursor(c));
   if (c->sli_next + sizeof(UInt) > c->sli.ioff + c->sli.szB) {
      c->barf(c->barfstr);
      /*NOTREACHED*/
      vg_assert(0);
   }
   r = ML_(img_get_UInt)(c->sli.img, c->sli_next);
   c->sli_next += sizeof(UInt);
   return r;
}
static ULong get_ULong ( Cursor* c ) {
   ULong r;
   vg_assert(is_sane_Cursor(c));
   if (c->sli_next + sizeof(ULong) > c->sli.ioff + c->sli.szB) {
      c->barf(c->barfstr);
      /*NOTREACHED*/
      vg_assert(0);
   }
   r = ML_(img_get_ULong)(c->sli.img, c->sli_next);
   c->sli_next += sizeof(ULong);
   return r;
}
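
/* Illustrative sketch only, not part of readdwarf3.c.  The fixed-width
   getters above share one pattern: check that the read stays inside
   the region, fetch the value, then advance the read offset.  The
   standalone version below shows that pattern over a plain byte
   buffer.  DemoCursor and the demo_* functions are hypothetical
   names; a 'failed' flag stands in for the noreturn barf() callback,
   and values are copied in host byte order (an assumption). */

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for Cursor: a byte region plus a read offset. */
typedef struct {
   const uint8_t* buf;    /* start of the region */
   size_t         len;    /* size of the region in bytes */
   size_t         next;   /* offset of the next unread byte */
   int            failed; /* set on overrun, instead of calling barf() */
} DemoCursor;

/* Copy n bytes out of the region, refusing to run past its end.
   Returns 1 on success, 0 on overrun (the offset is then unchanged). */
int demo_get ( DemoCursor* c, void* out, size_t n ) {
   if (c->failed || n > c->len - c->next) {
      c->failed = 1;
      return 0;
   }
   memcpy(out, c->buf + c->next, n);
   c->next += n;
   return 1;
}

/* Fixed-width readers built on demo_get; they return 0 on overrun. */
uint16_t demo_get_u16 ( DemoCursor* c ) {
   uint16_t r = 0;
   demo_get(c, &r, sizeof r);
   return r;
}
uint32_t demo_get_u32 ( DemoCursor* c ) {
   uint32_t r = 0;
   demo_get(c, &r, sizeof r);
   return r;
}
```
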
static ULong get_ULEB128 ( Cursor* c ) {
   ULong result;
   Int   shift;
   UChar byte;
   /* unroll first iteration */
   byte = get_UChar( c );
   result = (ULong)(byte & 0x7f);
   if (LIKELY(!(byte & 0x80))) return result;
   shift = 7;
   /* end unroll first iteration */
   do {
      byte = get_UChar( c );
      result |= ((ULong)(byte & 0x7f)) << shift;
      shift += 7;
   } while (byte & 0x80);
   return result;
}
static Long get_SLEB128 ( Cursor* c ) {
   ULong  result = 0;
   Int    shift = 0;
   UChar  byte;
   do {
      byte = get_UChar(c);
      result |= ((ULong)(byte & 0x7f)) << shift;
      shift += 7;
   } while (byte & 0x80);
   if (shift < 64 && (byte & 0x40))
      result |= -(1ULL << shift);
   return result;
}
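
/* Illustrative sketch only, not part of readdwarf3.c.  It restates the
   two LEB128 decoders above over a plain byte array so they can be
   tried in isolation.  demo_uleb128 and demo_sleb128 are hypothetical
   names.  Unlike the Cursor-based versions there is no overrun check,
   so the caller must supply a properly terminated encoding; groups
   beyond bit 63 are ignored rather than shifted. */

```c
#include <stdint.h>
#include <stddef.h>

/* Decode an unsigned LEB128 value at p; *used receives the number of
   bytes consumed. */
uint64_t demo_uleb128 ( const uint8_t* p, size_t* used ) {
   uint64_t result = 0;
   unsigned shift  = 0;
   size_t   i      = 0;
   uint8_t  byte;
   do {
      byte = p[i++];
      if (shift < 64)
         result |= (uint64_t)(byte & 0x7f) << shift;
      shift += 7;
   } while (byte & 0x80);   /* high bit set: more bytes follow */
   *used = i;
   return result;
}

/* Decode a signed LEB128 value at p; *used receives the number of
   bytes consumed. */
int64_t demo_sleb128 ( const uint8_t* p, size_t* used ) {
   uint64_t result = 0;
   unsigned shift  = 0;
   size_t   i      = 0;
   uint8_t  byte;
   do {
      byte = p[i++];
      if (shift < 64)
         result |= (uint64_t)(byte & 0x7f) << shift;
      shift += 7;
   } while (byte & 0x80);
   /* Sign-extend if bit 6 of the final byte is set. */
   if (shift < 64 && (byte & 0x40))
      result |= ~(uint64_t)0 << shift;
   *used = i;
   return (int64_t)result;
}
```
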

/* Assume 'c' points to the start of a string.  Return a DiCursor of
   whatever it points at, and advance it past the terminating zero.
   This makes it safe for the caller to then copy the string with
   ML_(addStr), since (w.r.t. image overruns) the process of advancing
   past the terminating zero will already have "vetted" the string. */
static DiCursor get_AsciiZ ( Cursor* c ) {
   UChar uc;
   DiCursor res = get_DiCursor_from_Cursor(c);
   do { uc = get_UChar(c); } while (uc != 0);
   return res;
}

static ULong peek_ULEB128 ( Cursor* c ) {
   DiOffT here = c->sli_next;
   ULong  r    = get_ULEB128( c );
   c->sli_next = here;
   return r;
}
static UChar peek_UChar ( Cursor* c ) {
   DiOffT here = c->sli_next;
   UChar  r    = get_UChar( c );
   c->sli_next = here;
   return r;
}

static ULong get_Dwarfish_UWord ( Cursor* c, Bool is_dw64 ) {
   return is_dw64 ? get_ULong(c) : (ULong) get_UInt(c);
}

static UWord get_UWord ( Cursor* c ) {
   vg_assert(sizeof(UWord) == sizeof(void*));
   if (sizeof(UWord) == 4) return get_UInt(c);
   if (sizeof(UWord) == 8) return get_ULong(c);
   vg_assert(0);
}

/* Read a DWARF3 'Initial Length' field */
static ULong get_Initial_Length ( /*OUT*/Bool* is64,
                                  Cursor* c,
                                  const HChar* barfMsg )
{
   ULong w64;
   UInt  w32;
   *is64 = False;
   w32 = get_UInt( c );
   if (w32 >= 0xFFFFFFF0 && w32 < 0xFFFFFFFF) {
      c->barf( barfMsg );
   }
   else if (w32 == 0xFFFFFFFF) {
      *is64 = True;
      w64   = get_ULong( c );
   } else {
      *is64 = False;
      w64 = (ULong)w32;
   }
   return w64;
}
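
/* Illustrative sketch only, not part of readdwarf3.c.  It shows the
   three cases get_Initial_Length distinguishes: a plain 32-bit length,
   the 0xFFFFFFFF escape that introduces a 64-bit length, and the
   reserved values 0xFFFFFFF0..0xFFFFFFFE.  demo_initial_length is a
   hypothetical name; it reports failure through its return value
   rather than through a barf() callback, and it assumes the field is
   stored in host byte order. */

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Parse an initial-length field from at most 'avail' bytes at p.
   Returns the number of bytes the field occupies (4 or 12), or 0 if
   the field is reserved or truncated. */
size_t demo_initial_length ( const uint8_t* p, size_t avail,
                             /*OUT*/uint64_t* length, /*OUT*/int* is64 )
{
   uint32_t w32;
   if (avail < 4) return 0;
   memcpy(&w32, p, 4);
   if (w32 == 0xFFFFFFFFu) {
      /* 64-bit DWARF: the real length is in the next 8 bytes. */
      if (avail < 12) return 0;
      memcpy(length, p + 4, 8);
      *is64 = 1;
      return 12;
   }
   if (w32 >= 0xFFFFFFF0u) return 0;   /* reserved by the spec */
   *length = w32;
   *is64   = 0;
   return 4;
}
```
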


/*------------------------------------------------------------*/
/*---                                                      ---*/
/*--- "CUConst" structure                                  ---*/
/*---                                                      ---*/
/*------------------------------------------------------------*/

typedef
   struct _name_form {
      ULong at_name;
      ULong at_form;
   } name_form;

typedef
   struct _g_abbv {
      struct _g_abbv *next; // read/write by hash table.
      UWord  abbv_code;     // key, read by hash table
      ULong  atag;
      ULong  has_children;
      name_form nf[0];
      /* Variable-length array of name/form pairs, terminated
         by a 0/0 pair. */
   } g_abbv;

/* Holds information that is constant through the parsing of a
   Compilation Unit.  This is basically plumbed through to
   everywhere. */
typedef
   struct {
      /* Call here if anything goes wrong */
      void (*barf)( const HChar* ) __attribute__((noreturn));
      /* Is this 64-bit DWARF ? */
      Bool   is_dw64;
      /* Which DWARF version ?  (2, 3 or 4) */
      UShort version;
      /* Length of this Compilation Unit, as stated in the
         .unit_length :: InitialLength field of the CU Header.
         However, this size (as specified by the D3 spec) does not
         include the size of the .unit_length field itself, which is
         either 4 or 12 bytes (32-bit or 64-bit Dwarf3).  That value
         can be obtained through the expression ".is_dw64 ? 12 : 4". */
      ULong  unit_length;
      /* Offset of start of this unit in .debug_info */
      UWord  cu_start_offset;
      /* SVMA for this CU.  In the D3 spec, this is known as the "base
         address of the compilation unit" (last para sec 3.1.1).
         Needed for (amongst things) interpretation of location-list
         values. */
      Addr   cu_svma;
      Bool   cu_svma_known;

      /* The debug_abbreviations table to be used for this Unit */
      //UChar* debug_abbv;
      /* Upper bound on size thereof (an overestimate, in general) */
      //UWord  debug_abbv_maxszB;
      /* A bounded area of the image, to be used as the
         debug_abbreviations table to be used for this Unit. */
      DiSlice debug_abbv;

      /* Image information for various sections. */
      DiSlice escn_debug_str;
      DiSlice escn_debug_ranges;
      DiSlice escn_debug_loc;
      DiSlice escn_debug_line;
      DiSlice escn_debug_info;
      DiSlice escn_debug_types;
      DiSlice escn_debug_info_alt;
      DiSlice escn_debug_str_alt;
      /* How much to add to .debug_types resp. alternate .debug_info offsets
         in cook_die*.  */
      UWord  types_cuOff_bias;
      UWord  alt_cuOff_bias;
      /* --- Needed so we can add stuff to the string table. --- */
      struct _DebugInfo* di;
      /* --- a hash table of g_abbv (i.e. parsed abbreviations) --- */
      VgHashTable ht_abbvs;

      /* True if this came from .debug_types; otherwise it came from
         .debug_info.  */
      Bool is_type_unit;
      /* For a unit coming from .debug_types, these hold the TU's type
         signature and the uncooked DIE offset of the TU's signatured
         type.  For a unit coming from .debug_info, these are unused.  */
      ULong type_signature;
      ULong type_offset;

      /* Signatured type hash; computed once and then shared by all
         CUs.  */
      VgHashTable signature_types;

      /* True if this came from alternate .debug_info; otherwise
         it came from normal .debug_info or .debug_types.  */
      Bool is_alt_info;
   }
   CUConst;


/* Return the cooked value of DIE depending on whether CC represents a
   .debug_types unit.  To cook a DIE, we pretend that the .debug_info,
   .debug_types and optional alternate .debug_info sections form
   a contiguous whole, so that DIEs coming from .debug_types are numbered
   starting at the end of .debug_info and DIEs coming from alternate
   .debug_info are numbered starting at the end of .debug_types.  */
static UWord cook_die( CUConst* cc, UWord die )
{
   if (cc->is_type_unit)
      die += cc->types_cuOff_bias;
   else if (cc->is_alt_info)
      die += cc->alt_cuOff_bias;
   return die;
}

/* Like cook_die, but understand that DIEs coming from a
   DW_FORM_ref_sig8 reference are already cooked.  Also, handle
   DW_FORM_GNU_ref_alt from within primary .debug_info or .debug_types
   as reference to alternate .debug_info.  */
static UWord cook_die_using_form( CUConst *cc, UWord die, DW_FORM form)
{
   if (form == DW_FORM_ref_sig8)
      return die;
   if (form == DW_FORM_GNU_ref_alt)
      return die + cc->alt_cuOff_bias;
   return cook_die( cc, die );
}

/* Return the uncooked offset of DIE and set *TYPE_FLAG to true if the DIE
   came from the .debug_types section and *ALT_FLAG to true if the DIE
   came from alternate .debug_info section.  */
static UWord uncook_die( CUConst *cc, UWord die, /*OUT*/Bool *type_flag,
                         Bool *alt_flag )
{
   *alt_flag  = False;
   *type_flag = False;
   /* The use of escn_debug_{info,types}.szB seems safe to me even if
      escn_debug_{info,types} are DiSlice_INVALID (meaning the
      sections were not found), because DiSlice_INVALID.szB is always
      zero.  That said, it seems unlikely we'd ever get here if
      .debug_info or .debug_types were missing. */
   if (die >= cc->escn_debug_info.szB) {
      if (die >= cc->escn_debug_info.szB + cc->escn_debug_types.szB) {
         *alt_flag = True;
         die -= cc->escn_debug_info.szB + cc->escn_debug_types.szB;
      } else {
         *type_flag = True;
         die -= cc->escn_debug_info.szB;
      }
   }
   return die;
}
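
/* Illustrative sketch only, not part of readdwarf3.c.  It shows the
   "cooked" numbering as a round trip: offsets from .debug_info are left
   alone, offsets from .debug_types are shifted past .debug_info, and
   offsets from the alternate .debug_info are shifted past both.
   DemoSections, DemoOrigin and the demo_* functions are hypothetical
   names.  The sketch assumes the two biases equal the section sizes,
   as the comments above imply; the real biases are set elsewhere. */

```c
#include <stdint.h>

/* Hypothetical stand-ins for escn_debug_info.szB and
   escn_debug_types.szB. */
typedef struct { uint64_t info_szB; uint64_t types_szB; } DemoSections;
typedef enum { DEMO_INFO, DEMO_TYPES, DEMO_ALT } DemoOrigin;

/* Map a section-relative DIE offset into the single cooked space. */
uint64_t demo_cook ( const DemoSections* s, DemoOrigin from, uint64_t die )
{
   if (from == DEMO_TYPES) return die + s->info_szB;
   if (from == DEMO_ALT)   return die + s->info_szB + s->types_szB;
   return die;
}

/* Inverse of demo_cook: recover the section and the offset within it. */
uint64_t demo_uncook ( const DemoSections* s, uint64_t cooked,
                       /*OUT*/DemoOrigin* from )
{
   if (cooked >= s->info_szB + s->types_szB) {
      *from = DEMO_ALT;
      return cooked - (s->info_szB + s->types_szB);
   }
   if (cooked >= s->info_szB) {
      *from = DEMO_TYPES;
      return cooked - s->info_szB;
   }
   *from = DEMO_INFO;
   return cooked;
}
```
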

/*------------------------------------------------------------*/
/*---                                                      ---*/
/*--- Helper functions for Guarded Expressions             ---*/
/*---                                                      ---*/
/*------------------------------------------------------------*/

/* Parse the location list starting at img-offset 'debug_loc_offset'
   in .debug_loc.  Results are biased with 'svma_of_referencing_CU'
   and so I believe are correct SVMAs for the object as a whole.  This
   function allocates the UChar*, and the caller must deallocate it.
   The resulting block is in so-called Guarded-Expression format.

   Guarded-Expression format is similar but not identical to the DWARF3
   location-list format.  The format of each returned block is:

      UChar biasMe;
      UChar isEnd;
      followed by zero or more of

      (Addr aMin;  Addr aMax;  UShort nbytes;  ..bytes..;  UChar isEnd)

   '..bytes..' is a standard DWARF3 location expression which is
   valid when aMin <= pc <= aMax (possibly after suitable biasing).

   The number of bytes in '..bytes..' is nbytes.

   The end of the sequence is marked by an isEnd == 1 value.  All
   previous isEnd values must be zero.

   biasMe is 1 if the aMin/aMax fields need this DebugInfo's
   text_bias added before use, and 0 if this is not necessary (the
   GX is ready to go).

   Hence the block can be quickly parsed and is self-describing.  Note
   that aMax is 1 less than the corresponding value in a DWARF3
   location list.  Zero length ranges, with aMax == aMin-1, are not
   allowed.
*/
/* 2008-sept-12: moved ML_(pp_GX) from here to d3basics.c, where
   it more logically belongs. */


/* Apply a text bias to a GX. */
static void bias_GX ( /*MOD*/GExpr* gx, struct _DebugInfo* di )
{
   UShort nbytes;
   UChar* p = &gx->payload[0];
   UChar* pA;
   UChar  uc;
   uc = *p++; /*biasMe*/
   if (uc == 0)
      return;
   vg_assert(uc == 1);
   p[-1] = 0; /* mark it as done */
   while (True) {
      uc = *p++;
      if (uc == 1)
         break; /*isEnd*/
      vg_assert(uc == 0);
      /* t-bias aMin */
      pA = (UChar*)p;
      ML_(write_Addr)(pA, ML_(read_Addr)(pA) + di->text_debug_bias);
      p += sizeof(Addr);
      /* t-bias aMax */
      pA = (UChar*)p;
      ML_(write_Addr)(pA, ML_(read_Addr)(pA) + di->text_debug_bias);
      p += sizeof(Addr);
      /* nbytes, and actual expression */
      nbytes = ML_(read_UShort)(p); p += sizeof(UShort);
      p += nbytes;
   }
}

__attribute__((noinline))
static GExpr* make_singleton_GX ( DiCursor block, ULong nbytes )
{
   SizeT  bytesReqd;
   GExpr* gx;
   UChar *p, *pstart;

   vg_assert(sizeof(UWord) == sizeof(Addr));
   vg_assert(nbytes <= 0xFFFF); /* else we overflow the nbytes field */
   bytesReqd
      = sizeof(UChar)  /*biasMe*/    + sizeof(UChar) /*!isEnd*/
        + sizeof(UWord) /*aMin*/     + sizeof(UWord) /*aMax*/
        + sizeof(UShort) /*nbytes*/  + (SizeT)nbytes
        + sizeof(UChar); /*isEnd*/

   gx = ML_(dinfo_zalloc)( "di.readdwarf3.msGX.1",
                           sizeof(GExpr) + bytesReqd );
   vg_assert(gx);

   p = pstart = &gx->payload[0];

   p = ML_(write_UChar)(p, 0);        /*biasMe*/
   p = ML_(write_UChar)(p, 0);        /*!isEnd*/
   p = ML_(write_Addr)(p, 0);         /*aMin*/
   p = ML_(write_Addr)(p, ~0);        /*aMax*/
   p = ML_(write_UShort)(p, nbytes);  /*nbytes*/
   ML_(cur_read_get)(p, block, nbytes); p += nbytes;
   p = ML_(write_UChar)(p, 1);        /*isEnd*/

   vg_assert( (SizeT)(p - pstart) == bytesReqd);
   vg_assert( &gx->payload[bytesReqd]
              == ((UChar*)gx) + sizeof(GExpr) + bytesReqd );

   return gx;
}
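
/* Illustrative sketch only, not part of readdwarf3.c.  It builds a
   small Guarded-Expression payload in the layout described above
   (biasMe, then guards of the form !isEnd/aMin/aMax/nbytes/bytes,
   then isEnd) and searches it for the guard covering a given pc.
   demo_gx_add and demo_gx_lookup are hypothetical names, uintptr_t
   stands in for Addr, and the biasMe byte is assumed to have been
   dealt with already. */

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Append one guard to a payload under construction.  Returns the new
   end of the payload. */
uint8_t* demo_gx_add ( uint8_t* p, uintptr_t aMin, uintptr_t aMax,
                       const uint8_t* expr, uint16_t n )
{
   *p++ = 0;                                  /* !isEnd */
   memcpy(p, &aMin, sizeof aMin); p += sizeof aMin;
   memcpy(p, &aMax, sizeof aMax); p += sizeof aMax;
   memcpy(p, &n,    sizeof n);    p += sizeof n;
   memcpy(p, expr, n);            p += n;
   return p;
}

/* Find the guard with aMin <= pc <= aMax.  Returns a pointer to its
   expression bytes and stores their count in *nbytes, or returns NULL
   if no guard matches. */
const uint8_t* demo_gx_lookup ( const uint8_t* payload, uintptr_t pc,
                                /*OUT*/uint16_t* nbytes )
{
   const uint8_t* p = payload + 1;            /* skip biasMe */
   while (*p++ == 0) {                        /* isEnd == 0: guard follows */
      uintptr_t aMin, aMax;
      uint16_t  n;
      memcpy(&aMin, p, sizeof aMin); p += sizeof aMin;
      memcpy(&aMax, p, sizeof aMax); p += sizeof aMax;
      memcpy(&n,    p, sizeof n);    p += sizeof n;
      if (aMin <= pc && pc <= aMax) {
         *nbytes = n;
         return p;
      }
      p += n;                                 /* skip this expression */
   }
   return NULL;
}
```
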

__attribute__((noinline))
static GExpr* make_general_GX ( CUConst* cc,
                                Bool     td3,
                                ULong    debug_loc_offset,
                                Addr     svma_of_referencing_CU )
{
   Addr      base;
   Cursor    loc;
   XArray*   xa; /* XArray of UChar */
   GExpr*    gx;
   Word      nbytes;

   vg_assert(sizeof(UWord) == sizeof(Addr));
   if (!ML_(sli_is_valid)(cc->escn_debug_loc) || cc->escn_debug_loc.szB == 0)
      cc->barf("make_general_GX: .debug_loc is empty/missing");

   init_Cursor( &loc, cc->escn_debug_loc, 0, cc->barf,
                "Overrun whilst reading .debug_loc section(2)" );
   set_position_of_Cursor( &loc, debug_loc_offset );

   TRACE_D3("make_general_GX (.debug_loc_offset = %llu, ioff = %llu) {\n",
            debug_loc_offset, (ULong)get_DiCursor_from_Cursor(&loc).ioff );

   /* Who frees this xa?  It is freed before this fn exits. */
   xa = VG_(newXA)( ML_(dinfo_zalloc), "di.readdwarf3.mgGX.1",
                    ML_(dinfo_free),
                    sizeof(UChar) );

   { UChar c = 1; /*biasMe*/ VG_(addBytesToXA)( xa, &c, sizeof(c) ); }

   base = 0;
   while (True) {
      Bool  acquire;
      UWord len;
      /* Read a (host-)word pair.  This is something of a hack since
         the word size to read is really dictated by the ELF file;
         however, we assume we're reading a file with the same
         word-sizeness as the host.  Reasonably enough. */
      UWord w1 = get_UWord( &loc );
      UWord w2 = get_UWord( &loc );

      TRACE_D3("   %08lx %08lx\n", w1, w2);
      if (w1 == 0 && w2 == 0)
         break; /* end of list */

      if (w1 == -1UL) {
         /* new value for 'base' */
         base = w2;
         continue;
      }

      /* else a location expression follows */
      /* else enumerate [w1+base, w2+base) */
      /* w2 is 1 past end of range, as per D3 defn for "DW_AT_high_pc"
         (sec 2.17.2) */
      if (w1 > w2) {
         TRACE_D3("negative range is for .debug_loc expr at "
                  "file offset %llu\n",
                  debug_loc_offset);
         cc->barf( "negative range in .debug_loc section" );
      }

      /* ignore zero length ranges */
      acquire = w1 < w2;
      len     = (UWord)get_UShort( &loc );

      if (acquire) {
         UWord  w;
         UShort s;
         UChar  c;
         c = 0; /* !isEnd*/
         VG_(addBytesToXA)( xa, &c, sizeof(c) );
         w = w1    + base + svma_of_referencing_CU;
         VG_(addBytesToXA)( xa, &w, sizeof(w) );
         w = w2 -1 + base + svma_of_referencing_CU;
         VG_(addBytesToXA)( xa, &w, sizeof(w) );
         s = (UShort)len;
         VG_(addBytesToXA)( xa, &s, sizeof(s) );
      }

      while (len > 0) {
         UChar byte = get_UChar( &loc );
         TRACE_D3("%02x", (UInt)byte);
         if (acquire)
            VG_(addBytesToXA)( xa, &byte, 1 );
         len--;
      }
      TRACE_D3("\n");
   }

   { UChar c = 1; /*isEnd*/ VG_(addBytesToXA)( xa, &c, sizeof(c) ); }

   nbytes = VG_(sizeXA)( xa );
   vg_assert(nbytes >= 1);

   gx = ML_(dinfo_zalloc)( "di.readdwarf3.mgGX.2", sizeof(GExpr) + nbytes );
   vg_assert(gx);
   VG_(memcpy)( &gx->payload[0], (UChar*)VG_(indexXA)(xa,0), nbytes );
   vg_assert( &gx->payload[nbytes]
              == ((UChar*)gx) + sizeof(GExpr) + nbytes );

   VG_(deleteXA)( xa );

   TRACE_D3("}\n");

   return gx;
}
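
/* Illustrative sketch only, not part of readdwarf3.c.  It isolates the
   address arithmetic of the loop above: a (0,0) pair ends the list, a
   pair whose first word is all ones selects a new base, and any other
   pair [w1,w2) becomes the inclusive range w1+base+cu_base ..
   w2-1+base+cu_base, with zero-length pairs dropped.  DemoRange and
   demo_convert_pairs are hypothetical names.  For brevity the pairs
   come from a plain array and carry no expression bytes, so this is
   closer in shape to a range list than to a full location list. */

```c
#include <stdint.h>
#include <stddef.h>

typedef struct { uintptr_t aMin, aMax; } DemoRange;   /* inclusive */

/* Convert up to npairs raw (w1,w2) pairs into inclusive ranges.
   cu_base plays the role of svma_of_referencing_CU.  Returns the
   number of ranges written to out, or -1 on a "negative" range. */
int demo_convert_pairs ( const uintptr_t* words, size_t npairs,
                         uintptr_t cu_base, /*OUT*/DemoRange* out )
{
   uintptr_t base = 0;
   int       nout = 0;
   size_t    i;
   for (i = 0; i < npairs; i++) {
      uintptr_t w1 = words[2*i];
      uintptr_t w2 = words[2*i + 1];
      if (w1 == 0 && w2 == 0) break;                   /* end of list */
      if (w1 == UINTPTR_MAX) { base = w2; continue; }  /* new base */
      if (w1 > w2) return -1;                          /* negative range */
      if (w1 == w2) continue;                          /* zero length */
      out[nout].aMin = w1 + base + cu_base;
      out[nout].aMax = w2 - 1 + base + cu_base;
      nout++;
   }
   return nout;
}
```
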


/*------------------------------------------------------------*/
/*---                                                      ---*/
/*--- Helper functions for range lists and CU headers      ---*/
/*---                                                      ---*/
/*------------------------------------------------------------*/

/* Denotes an address range.  Both aMin and aMax are included in the
   range; hence a complete range is (0, ~0) and an empty range is any
   (X, X-1) for X > 0.*/
typedef
   struct { Addr aMin; Addr aMax; }
   AddrRange;


sewardj9c606bd2008-09-18 18:12:50 +0000755/* Generate an arbitrary structural total ordering on
756 XArray* of AddrRange. */
757static Word cmp__XArrays_of_AddrRange ( XArray* rngs1, XArray* rngs2 )
758{
759 Word n1, n2, i;
760 vg_assert(rngs1 && rngs2);
761 n1 = VG_(sizeXA)( rngs1 );
762 n2 = VG_(sizeXA)( rngs2 );
763 if (n1 < n2) return -1;
764 if (n1 > n2) return 1;
765 for (i = 0; i < n1; i++) {
766 AddrRange* rng1 = (AddrRange*)VG_(indexXA)( rngs1, i );
767 AddrRange* rng2 = (AddrRange*)VG_(indexXA)( rngs2, i );
768 if (rng1->aMin < rng2->aMin) return -1;
769 if (rng1->aMin > rng2->aMin) return 1;
770 if (rng1->aMax < rng2->aMax) return -1;
771 if (rng1->aMax > rng2->aMax) return 1;
772 }
773 return 0;
774}
775
776
sewardjb8b79ad2008-03-03 01:35:41 +0000777__attribute__((noinline))
778static XArray* /* of AddrRange */ empty_range_list ( void )
779{
780 XArray* xa; /* XArray of AddrRange */
781 /* Who frees this xa? varstack_preen() does. */
sewardj9c606bd2008-09-18 18:12:50 +0000782 xa = VG_(newXA)( ML_(dinfo_zalloc), "di.readdwarf3.erl.1",
783 ML_(dinfo_free),
sewardjb8b79ad2008-03-03 01:35:41 +0000784 sizeof(AddrRange) );
785 return xa;
786}
787
788
sewardj9c606bd2008-09-18 18:12:50 +0000789__attribute__((noinline))
sewardjb8b79ad2008-03-03 01:35:41 +0000790static XArray* unitary_range_list ( Addr aMin, Addr aMax )
791{
792 XArray* xa;
793 AddrRange pair;
794 vg_assert(aMin <= aMax);
795 /* Who frees this xa? varstack_preen() does. */
sewardj9c606bd2008-09-18 18:12:50 +0000796 xa = VG_(newXA)( ML_(dinfo_zalloc), "di.readdwarf3.url.1",
797 ML_(dinfo_free),
sewardjb8b79ad2008-03-03 01:35:41 +0000798 sizeof(AddrRange) );
799 pair.aMin = aMin;
800 pair.aMax = aMax;
801 VG_(addToXA)( xa, &pair );
802 return xa;
803}
804
805
806/* Enumerate the address ranges starting at img-offset
807 'debug_ranges_offset' in .debug_ranges. Results are biased with
808 'svma_of_referencing_CU' and so I believe are correct SVMAs for the
809 object as a whole. This function allocates the XArray, and the
810 caller must deallocate it. */
811__attribute__((noinline))
812static XArray* /* of AddrRange */
813 get_range_list ( CUConst* cc,
814 Bool td3,
815 UWord debug_ranges_offset,
816 Addr svma_of_referencing_CU )
817{
818 Addr base;
819 Cursor ranges;
820 XArray* xa; /* XArray of AddrRange */
821 AddrRange pair;
822
sewardj5d616df2013-07-02 08:07:15 +0000823 if (!ML_(sli_is_valid)(cc->escn_debug_ranges)
824 || cc->escn_debug_ranges.szB == 0)
sewardjb8b79ad2008-03-03 01:35:41 +0000825 cc->barf("get_range_list: .debug_ranges is empty/missing");
826
sewardj5d616df2013-07-02 08:07:15 +0000827 init_Cursor( &ranges, cc->escn_debug_ranges, 0, cc->barf,
sewardjb8b79ad2008-03-03 01:35:41 +0000828 "Overrun whilst reading .debug_ranges section(2)" );
829 set_position_of_Cursor( &ranges, debug_ranges_offset );
830
831 /* Who frees this xa? varstack_preen() does. */
sewardj9c606bd2008-09-18 18:12:50 +0000832 xa = VG_(newXA)( ML_(dinfo_zalloc), "di.readdwarf3.grl.1", ML_(dinfo_free),
sewardjb8b79ad2008-03-03 01:35:41 +0000833 sizeof(AddrRange) );
834 base = 0;
835 while (True) {
836 /* Read a (host-)word pair. This is something of a hack since
837 the word size to read is really dictated by the ELF file;
838 however, we assume we're reading a file with the same
839 word-sizeness as the host. Reasonably enough. */
840 UWord w1 = get_UWord( &ranges );
841 UWord w2 = get_UWord( &ranges );
842
843 if (w1 == 0 && w2 == 0)
844 break; /* end of list. */
845
846 if (w1 == -1UL) {
847 /* new value for 'base' */
848 base = w2;
849 continue;
850 }
851
852 /* else enumerate [w1+base, w2+base) */
853 /* w2 is 1 past end of range, as per D3 defn for "DW_AT_high_pc"
854 (sec 2.17.2) */
855 if (w1 > w2)
856 cc->barf( "negative range in .debug_ranges section" );
857 if (w1 < w2) {
858 pair.aMin = w1 + base + svma_of_referencing_CU;
859 pair.aMax = w2 - 1 + base + svma_of_referencing_CU;
860 vg_assert(pair.aMin <= pair.aMax);
861 VG_(addToXA)( xa, &pair );
862 }
863 }
864 return xa;
865}
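
/* Illustrative sketch only, not part of Valgrind: a standalone
   approximation of the decoding loop in get_range_list() above, using
   plain C types instead of Cursor/XArray.  The names Range and
   decode_ranges are hypothetical, and returning (size_t)-1 stands in
   for cc->barf().  It assumes, as the real code does, that the words
   in the section have the host's word size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Inclusive address range, mirroring AddrRange.
typedef struct { uintptr_t aMin; uintptr_t aMax; } Range;

// Decode (w1,w2) word pairs laid out as in .debug_ranges.
// Returns the number of ranges stored in 'out', or (size_t)-1 if a
// pair has w1 > w2 (the real code reports this via cc->barf).
static size_t decode_ranges(const uintptr_t *words, size_t n_words,
                            uintptr_t bias, Range *out, size_t max_out)
{
    uintptr_t base = 0;
    size_t n = 0;
    for (size_t i = 0; i + 1 < n_words; i += 2) {
        uintptr_t w1 = words[i];
        uintptr_t w2 = words[i + 1];
        if (w1 == 0 && w2 == 0)
            break;                      // end-of-list marker
        if (w1 == UINTPTR_MAX) {        // all-ones: base address selection
            base = w2;
            continue;
        }
        if (w1 > w2)
            return (size_t)-1;          // "negative" range
        if (w1 == w2)
            continue;                   // empty range, ignored
        if (n == max_out)
            break;                      // sketch-only capacity limit
        out[n].aMin = w1 + base + bias; // w2 is one past the end, so
        out[n].aMax = w2 - 1 + base + bias; // subtract 1 for inclusive max
        n++;
    }
    return n;
}
```

   Feeding it { 0x10, 0x20, ~0, 0x1000, 0x8, 0xC, 0, 0 } with a bias of
   0x100 gives two inclusive ranges: one at base 0 and one at base
   0x1000. */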
866
philippe746e97e2014-06-15 10:51:14 +0000867/* Initialises the hash table of abbreviations.
868 We do a single scan of the abbv slice to parse and
869 build all abbreviations, for the following reasons:
870 * all or most abbreviations will be needed in any case
871 (at least for var-info reading).
872 * re-parsing an abbreviation each time it is needed causes a lot
873 of calls to get_ULEB128.
874 * a CU should not have many abbreviations. */
875static void init_ht_abbvs (CUConst* cc,
876 Bool td3)
877{
878 Cursor c;
879 g_abbv *ta; // temporary abbreviation, reallocated if needed.
880 UInt ta_nf_maxE; // max nr of pairs in ta.nf[], doubled when reallocated.
881 UInt ta_nf_n; // nr of pairs in ta->nf that are initialised.
882 g_abbv *ht_ta; // abbv to insert in hash table.
883
884 #define SZ_G_ABBV(_nf_szE) (sizeof(g_abbv) + _nf_szE * sizeof(name_form))
885
886 ta_nf_maxE = 10; // starting with enough for 9 pairs+terminating pair.
887 ta = ML_(dinfo_zalloc) ("di.readdwarf3.ht_ta_nf", SZ_G_ABBV(ta_nf_maxE));
888 cc->ht_abbvs = VG_(HT_construct) ("di.readdwarf3.ht_abbvs");
889
890 init_Cursor( &c, cc->debug_abbv, 0, cc->barf,
891 "Overrun whilst parsing .debug_abbrev section(2)" );
892 while (True) {
893 ta->abbv_code = get_ULEB128( &c );
894 if (ta->abbv_code == 0) break; /* end of the table */
895
896 ta->atag = get_ULEB128( &c );
897 ta->has_children = get_UChar( &c );
898 ta_nf_n = 0;
899 while (True) {
900 if (ta_nf_n >= ta_nf_maxE) {
901 g_abbv *old_ta = ta;
902 ta = ML_(dinfo_zalloc) ("di.readdwarf3.ht_ta_nf",
903 SZ_G_ABBV(2 * ta_nf_maxE));
904 ta_nf_maxE = 2 * ta_nf_maxE;
905 VG_(memcpy) (ta, old_ta, SZ_G_ABBV(ta_nf_n));
906 ML_(dinfo_free) (old_ta);
907 }
908 ta->nf[ta_nf_n].at_name = get_ULEB128( &c );
909 ta->nf[ta_nf_n].at_form = get_ULEB128( &c );
910 if (ta->nf[ta_nf_n].at_name == 0 && ta->nf[ta_nf_n].at_form == 0) {
911 ta_nf_n++;
912 break;
913 }
914 ta_nf_n++;
915 }
916 ht_ta = ML_(dinfo_zalloc) ("di.readdwarf3.ht_ta", SZ_G_ABBV(ta_nf_n));
917 VG_(memcpy) (ht_ta, ta, SZ_G_ABBV(ta_nf_n));
918 VG_(HT_add_node) ( cc->ht_abbvs, ht_ta );
919 TRACE_D3(" Adding abbv_code %llu TAG %s [%s] nf %u\n",
920 (ULong) ht_ta->abbv_code, ML_(pp_DW_TAG)(ht_ta->atag),
921 ML_(pp_DW_children)(ht_ta->has_children),
922 ta_nf_n);
923 }
924
925 ML_(dinfo_free) (ta);
926 #undef SZ_G_ABBV
927}
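
/* Illustrative sketch only, not part of Valgrind: the buffering scheme
   used by init_ht_abbvs() above, reduced to standard C.  A temporary
   array is doubled whenever it fills up, and once the (0,0)
   terminating pair has been read an exact-sized copy is made for
   long-term storage.  Pair and read_pairs are hypothetical names, and
   malloc/realloc/free stand in for ML_(dinfo_zalloc)/ML_(dinfo_free).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct { unsigned name; unsigned form; } Pair;

// Read (name,form) pairs from 'in' up to and including the (0,0)
// terminator.  Returns an exact-sized heap copy and stores the pair
// count (terminator included) in *n_out.  Returns NULL on allocation
// failure or if no terminator is found.
static Pair *read_pairs(const unsigned *in, size_t n_in, size_t *n_out)
{
    size_t cap = 2, n = 0;
    Pair *tmp = malloc(cap * sizeof(Pair));
    if (!tmp) return NULL;
    for (size_t i = 0; i + 1 < n_in; i += 2) {
        if (n == cap) {                 // temporary is full: double it
            Pair *bigger = realloc(tmp, 2 * cap * sizeof(Pair));
            if (!bigger) { free(tmp); return NULL; }
            tmp = bigger;
            cap *= 2;
        }
        tmp[n].name = in[i];
        tmp[n].form = in[i + 1];
        n++;
        if (in[i] == 0 && in[i + 1] == 0) {
            // terminator stored; make the exact-sized permanent copy
            Pair *exact = malloc(n * sizeof(Pair));
            if (exact) {
                memcpy(exact, tmp, n * sizeof(Pair));
                *n_out = n;
            }
            free(tmp);
            return exact;
        }
    }
    free(tmp);
    return NULL;                        // ran out of input
}
```

   The real function keeps one temporary alive across all
   abbreviations, so the doubling cost is paid at most a few times
   per CU. */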
928
929static g_abbv* get_abbv (CUConst* cc, ULong abbv_code)
930{
931 g_abbv *abbv;
932
933 abbv = VG_(HT_lookup) (cc->ht_abbvs, abbv_code);
934 if (!abbv)
935 cc->barf ("abbv_code not found in ht_abbvs table");
936 return abbv;
937}
938
939/* Free the memory allocated in CUConst. */
940static void clear_CUConst (CUConst* cc)
941{
942 VG_(HT_destruct) ( cc->ht_abbvs, ML_(dinfo_free));
943 cc->ht_abbvs = NULL;
944}
sewardjb8b79ad2008-03-03 01:35:41 +0000945
946/* Parse the Compilation Unit header indicated at 'c' and
947 initialise 'cc' accordingly. */
948static __attribute__((noinline))
949void parse_CU_Header ( /*OUT*/CUConst* cc,
950 Bool td3,
951 Cursor* c,
sewardj5d616df2013-07-02 08:07:15 +0000952 DiSlice escn_debug_abbv,
sewardjf7c97142012-07-14 09:59:01 +0000953 Bool type_unit,
954 Bool alt_info )
sewardjb8b79ad2008-03-03 01:35:41 +0000955{
956 UChar address_size;
sewardj5d616df2013-07-02 08:07:15 +0000957 ULong debug_abbrev_offset;
sewardjb8b79ad2008-03-03 01:35:41 +0000958
959 VG_(memset)(cc, 0, sizeof(*cc));
960 vg_assert(c && c->barf);
961 cc->barf = c->barf;
962
963 /* initial_length field */
964 cc->unit_length
965 = get_Initial_Length( &cc->is_dw64, c,
966 "parse_CU_Header: invalid initial-length field" );
967
968 TRACE_D3(" Length: %lld\n", cc->unit_length );
969
970 /* version */
971 cc->version = get_UShort( c );
tomfba428c2010-04-28 08:09:30 +0000972 if (cc->version != 2 && cc->version != 3 && cc->version != 4)
973 cc->barf( "parse_CU_Header: is neither DWARF2 nor DWARF3 nor DWARF4" );
sewardjb8b79ad2008-03-03 01:35:41 +0000974 TRACE_D3(" Version: %d\n", (Int)cc->version );
975
976 /* debug_abbrev_offset */
977 debug_abbrev_offset = get_Dwarfish_UWord( c, cc->is_dw64 );
sewardj5d616df2013-07-02 08:07:15 +0000978 if (debug_abbrev_offset >= escn_debug_abbv.szB)
sewardjb8b79ad2008-03-03 01:35:41 +0000979 cc->barf( "parse_CU_Header: invalid debug_abbrev_offset" );
sewardj5d616df2013-07-02 08:07:15 +0000980 TRACE_D3(" Abbrev Offset: %llu\n", debug_abbrev_offset );
sewardjb8b79ad2008-03-03 01:35:41 +0000981
982 /* address size. If this isn't equal to the host word size, just
983 give up. This makes it safe to assume elsewhere that
sewardj31452302009-01-25 23:50:32 +0000984 DW_FORM_addr and DW_FORM_ref_addr can be treated as a host
985 word. */
sewardjb8b79ad2008-03-03 01:35:41 +0000986 address_size = get_UChar( c );
987 if (address_size != sizeof(void*))
988 cc->barf( "parse_CU_Header: invalid address_size" );
989 TRACE_D3(" Pointer Size: %d\n", (Int)address_size );
990
sewardjd9350682012-04-05 07:55:47 +0000991 cc->is_type_unit = type_unit;
sewardjf7c97142012-07-14 09:59:01 +0000992 cc->is_alt_info = alt_info;
sewardjd9350682012-04-05 07:55:47 +0000993
994 if (type_unit) {
995 cc->type_signature = get_ULong( c );
996 cc->type_offset = get_Dwarfish_UWord( c, cc->is_dw64 );
997 }
998
sewardj5d616df2013-07-02 08:07:15 +0000999 /* Set up cc->debug_abbv to point to the relevant table for this
1000 CU. Set its .szB so that at least we can't read off the end of
1001 the debug_abbrev section -- potentially (and quite likely) too
1002 big, if this isn't the last table in the section, but at least
1003 it's safe.
1004
1005 This amounts to taking escn_debug_abbv and moving the start
1006 position along by debug_abbrev_offset bytes, hence forming a
1007 smaller DiSlice which has the same end point. Since we checked
1008 just above that debug_abbrev_offset is less than the size of
1009 escn_debug_abbv, this should leave us with a nonempty slice. */
1010 vg_assert(debug_abbrev_offset < escn_debug_abbv.szB);
1011 cc->debug_abbv = escn_debug_abbv;
1012 cc->debug_abbv.ioff += debug_abbrev_offset;
1013 cc->debug_abbv.szB -= debug_abbrev_offset;
1014
philippe746e97e2014-06-15 10:51:14 +00001015 init_ht_abbvs(cc, td3);
sewardjb8b79ad2008-03-03 01:35:41 +00001016}
1017
sewardjd9350682012-04-05 07:55:47 +00001018/* This represents a single signatured type. It maps a type signature
1019 (a ULong) to a cooked DIE offset. Objects of this type are stored
1020 in the type signature hash table. */
1021typedef
1022 struct D3SignatureType {
1023 struct D3SignatureType *next;
1024 UWord data;
1025 ULong type_signature;
1026 UWord die;
1027 }
1028 D3SignatureType;
1029
1030/* Record a signatured type in the hash table. */
1031static void record_signatured_type ( VgHashTable tab,
1032 ULong type_signature,
1033 UWord die )
1034{
1035 D3SignatureType *dstype = ML_(dinfo_zalloc) ( "di.readdwarf3.sigtype",
1036 sizeof(D3SignatureType) );
1037 dstype->data = (UWord) type_signature;
1038 dstype->type_signature = type_signature;
1039 dstype->die = die;
1040 VG_(HT_add_node) ( tab, dstype );
1041}
1042
1043/* Given a type signature hash table and a type signature, return the
1044 cooked DIE offset of the type. If the type cannot be found, call
1045 BARF. */
1046static UWord lookup_signatured_type ( VgHashTable tab,
1047 ULong type_signature,
florian6bd9dc12012-11-23 16:17:43 +00001048 void (*barf)( const HChar* ) __attribute__((noreturn)) )
sewardjd9350682012-04-05 07:55:47 +00001049{
1050 D3SignatureType *dstype = VG_(HT_lookup) ( tab, (UWord) type_signature );
1051 /* This may be unwarranted chumminess with the hash table
1052 implementation. */
1053 while ( dstype != NULL && dstype->type_signature != type_signature)
1054 dstype = dstype->next;
1055 if (dstype == NULL) {
1056 barf("lookup_signatured_type: could not find signatured type");
1057 /*NOTREACHED*/
1058 vg_assert(0);
1059 }
1060 return dstype->die;
1061}
sewardjb8b79ad2008-03-03 01:35:41 +00001062
sewardjb8b79ad2008-03-03 01:35:41 +00001063
sewardj5d616df2013-07-02 08:07:15 +00001064/* Represents Form data. If szB is 1/2/4/8 then the result is in the
1065 lowest 1/2/4/8 bytes of u.val. If szB is zero or negative then the
1066 result is an image section beginning at u.cur and with size -szB.
1067 No other szB values are allowed. */
1068typedef
1069 struct {
1070 Long szB; // 1, 2, 4, 8 or non-positive values only.
1071 union { ULong val; DiCursor cur; } u;
1072 }
1073 FormContents;
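
/* Illustrative sketch only, not part of Valgrind: the sign convention
   of FormContents.szB described above, restated with standard C
   types.  Form, form_scalar, form_block, form_is_scalar and
   form_block_len are hypothetical names, and a plain byte pointer
   stands in for DiCursor.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    int64_t szB;   // 1, 2, 4 or 8 for scalars; zero or negative for blocks
    union { uint64_t val; const unsigned char *cur; } u;
} Form;

// A scalar of szB bytes held in the low bytes of u.val.
static Form form_scalar(uint64_t v, int64_t szB)
{
    Form f;
    memset(&f, 0, sizeof f);
    f.u.val = v;
    f.szB = szB;
    return f;
}

// A block of 'len' bytes starting at p; the length is stored negated.
static Form form_block(const unsigned char *p, uint64_t len)
{
    Form f;
    memset(&f, 0, sizeof f);
    f.u.cur = p;
    f.szB = -(int64_t)len;
    return f;
}

static int form_is_scalar(const Form *f) { return f->szB > 0; }

// Only meaningful when !form_is_scalar(f).
static uint64_t form_block_len(const Form *f) { return (uint64_t)(-f->szB); }
```

   For a string form the block length is 1 + strlen, so "abc" is
   recorded with szB == -4.  A zero-length block has szB == 0 and is
   therefore still distinguishable from every scalar. */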
sewardjb8b79ad2008-03-03 01:35:41 +00001074
sewardj5d616df2013-07-02 08:07:15 +00001075/* From 'c', get the Form data into 'cts'. Either it gets a 1/2/4/8
1076 byte scalar value, or (a reference to) zero or more bytes starting
1077 at a DiCursor.*/
sewardjb8b79ad2008-03-03 01:35:41 +00001078static
sewardj5d616df2013-07-02 08:07:15 +00001079void get_Form_contents ( /*OUT*/FormContents* cts,
sewardjb8b79ad2008-03-03 01:35:41 +00001080 CUConst* cc, Cursor* c,
1081 Bool td3, DW_FORM form )
1082{
sewardj5d616df2013-07-02 08:07:15 +00001083 VG_(bzero_inline)(cts, sizeof(*cts));
sewardjb8b79ad2008-03-03 01:35:41 +00001084 switch (form) {
1085 case DW_FORM_data1:
sewardj5d616df2013-07-02 08:07:15 +00001086 cts->u.val = (ULong)(UChar)get_UChar(c);
1087 cts->szB = 1;
1088 TRACE_D3("%u", (UInt)cts->u.val);
sewardjb8b79ad2008-03-03 01:35:41 +00001089 break;
1090 case DW_FORM_data2:
sewardj5d616df2013-07-02 08:07:15 +00001091 cts->u.val = (ULong)(UShort)get_UShort(c);
1092 cts->szB = 2;
1093 TRACE_D3("%u", (UInt)cts->u.val);
sewardjb8b79ad2008-03-03 01:35:41 +00001094 break;
1095 case DW_FORM_data4:
sewardj5d616df2013-07-02 08:07:15 +00001096 cts->u.val = (ULong)(UInt)get_UInt(c);
1097 cts->szB = 4;
1098 TRACE_D3("%u", (UInt)cts->u.val);
sewardjb8b79ad2008-03-03 01:35:41 +00001099 break;
sewardj0b5bf912008-03-07 20:07:58 +00001100 case DW_FORM_data8:
sewardj5d616df2013-07-02 08:07:15 +00001101 cts->u.val = get_ULong(c);
1102 cts->szB = 8;
1103 TRACE_D3("%llu", cts->u.val);
sewardj0b5bf912008-03-07 20:07:58 +00001104 break;
tomfba428c2010-04-28 08:09:30 +00001105 case DW_FORM_sec_offset:
sewardj5d616df2013-07-02 08:07:15 +00001106 cts->u.val = (ULong)get_Dwarfish_UWord( c, cc->is_dw64 );
1107 cts->szB = cc->is_dw64 ? 8 : 4;
1108 TRACE_D3("%llu", cts->u.val);
tomfba428c2010-04-28 08:09:30 +00001109 break;
sewardjb8b79ad2008-03-03 01:35:41 +00001110 case DW_FORM_sdata:
sewardj5d616df2013-07-02 08:07:15 +00001111 cts->u.val = (ULong)(Long)get_SLEB128(c);
1112 cts->szB = 8;
1113 TRACE_D3("%lld", (Long)cts->u.val);
sewardjb8b79ad2008-03-03 01:35:41 +00001114 break;
tomfba428c2010-04-28 08:09:30 +00001115 case DW_FORM_udata:
sewardj5d616df2013-07-02 08:07:15 +00001116 cts->u.val = (ULong)get_ULEB128(c);
1117 cts->szB = 8;
1118 TRACE_D3("%llu", cts->u.val);
tomfba428c2010-04-28 08:09:30 +00001119 break;
sewardjb8b79ad2008-03-03 01:35:41 +00001120 case DW_FORM_addr:
1121 /* note, this is a hack. DW_FORM_addr is defined as getting
1122 a word the size of the target machine as defined by the
1123 address_size field in the CU Header. However,
1124 parse_CU_Header() rejects all inputs except those for
1125 which address_size == sizeof(Word), hence we can just
1126 treat it as a (host) Word. */
sewardj5d616df2013-07-02 08:07:15 +00001127 cts->u.val = (ULong)(UWord)get_UWord(c);
1128 cts->szB = sizeof(UWord);
1129 TRACE_D3("0x%lx", (UWord)cts->u.val);
sewardjb8b79ad2008-03-03 01:35:41 +00001130 break;
sewardj31452302009-01-25 23:50:32 +00001131
1132 case DW_FORM_ref_addr:
1133 /* We make the same word-size assumption as DW_FORM_addr. */
1134 /* What does this really mean? From D3 Sec 7.5.4,
1135 description of "reference", it would appear to reference
1136 some other DIE, by specifying the offset from the
1137 beginning of a .debug_info section. The D3 spec mentions
1138 that this might be in some other shared object or
1139 executable. But I don't see how the name of the other
1140 object/exe is specified.
1141
1142 At least for the DW_FORM_ref_addrs created by icc11, the
1143 references seem to be within the same object/executable.
1144 So for the moment we merely range-check, to see that they
1145 actually do specify a plausible offset within this
1146 object's .debug_info, and return the value unchanged.
sewardjee93cdb2012-04-29 11:35:37 +00001147
1148 In DWARF 2, DW_FORM_ref_addr is address-sized, but in
1149 DWARF 3 and later, it is offset-sized.
sewardj31452302009-01-25 23:50:32 +00001150 */
sewardjee93cdb2012-04-29 11:35:37 +00001151 if (cc->version == 2) {
sewardj5d616df2013-07-02 08:07:15 +00001152 cts->u.val = (ULong)(UWord)get_UWord(c);
1153 cts->szB = sizeof(UWord);
sewardjee93cdb2012-04-29 11:35:37 +00001154 } else {
sewardj5d616df2013-07-02 08:07:15 +00001155 cts->u.val = get_Dwarfish_UWord(c, cc->is_dw64);
1156 cts->szB = cc->is_dw64 ? sizeof(ULong) : sizeof(UInt);
sewardjee93cdb2012-04-29 11:35:37 +00001157 }
sewardj5d616df2013-07-02 08:07:15 +00001158 TRACE_D3("0x%lx", (UWord)cts->u.val);
1159 if (0) VG_(printf)("DW_FORM_ref_addr 0x%lx\n", (UWord)cts->u.val);
1160 if (/* the following is surely impossible, but ... */
1161 !ML_(sli_is_valid)(cc->escn_debug_info)
1162 || cts->u.val >= (ULong)cc->escn_debug_info.szB) {
sewardj31452302009-01-25 23:50:32 +00001163 /* Hmm. Offset is nonsensical for this object's .debug_info
1164 section. Be safe and reject it. */
1165 cc->barf("get_Form_contents: DW_FORM_ref_addr points "
1166 "outside .debug_info");
1167 }
1168 break;
1169
sewardjb8b79ad2008-03-03 01:35:41 +00001170 case DW_FORM_strp: {
1171 /* this is an offset into .debug_str */
sewardjb8b79ad2008-03-03 01:35:41 +00001172 UWord uw = (UWord)get_Dwarfish_UWord( c, cc->is_dw64 );
sewardj5d616df2013-07-02 08:07:15 +00001173 if (!ML_(sli_is_valid)(cc->escn_debug_str)
1174 || uw >= cc->escn_debug_str.szB)
sewardj31452302009-01-25 23:50:32 +00001175 cc->barf("get_Form_contents: DW_FORM_strp "
sewardjb8b79ad2008-03-03 01:35:41 +00001176 "points outside .debug_str");
1177 /* FIXME: check the entire string lies inside debug_str,
1178 not just the first byte of it. */
sewardj5d616df2013-07-02 08:07:15 +00001179 DiCursor str
1180 = ML_(cur_plus)( ML_(cur_from_sli)(cc->escn_debug_str), uw );
1181 if (td3) {
1182 HChar* tmp = ML_(cur_read_strdup)(str, "di.getFC.1");
1183 TRACE_D3("(indirect string, offset: 0x%lx): %s", uw, tmp);
1184 ML_(dinfo_free)(tmp);
1185 }
1186 cts->u.cur = str;
1187 cts->szB = - (Long)(1 + (ULong)ML_(cur_strlen)(str));
sewardjb8b79ad2008-03-03 01:35:41 +00001188 break;
1189 }
1190 case DW_FORM_string: {
sewardj5d616df2013-07-02 08:07:15 +00001191 DiCursor str = get_AsciiZ(c);
1192 if (td3) {
1193 HChar* tmp = ML_(cur_read_strdup)(str, "di.getFC.2");
1194 TRACE_D3("%s", tmp);
1195 ML_(dinfo_free)(tmp);
1196 }
1197 cts->u.cur = str;
sewardjb8b79ad2008-03-03 01:35:41 +00001198 /* strlen is safe because get_AsciiZ already 'vetted' the
1199 entire string */
sewardj5d616df2013-07-02 08:07:15 +00001200 cts->szB = - (Long)(1 + (ULong)ML_(cur_strlen)(str));
sewardjb8b79ad2008-03-03 01:35:41 +00001201 break;
1202 }
tomfba428c2010-04-28 08:09:30 +00001203 case DW_FORM_ref1: {
sewardj5d616df2013-07-02 08:07:15 +00001204 UChar u8 = get_UChar(c);
1205 UWord res = cc->cu_start_offset + (UWord)u8;
1206 cts->u.val = (ULong)res;
1207 cts->szB = sizeof(UWord);
tomfba428c2010-04-28 08:09:30 +00001208 TRACE_D3("<%lx>", res);
1209 break;
1210 }
1211 case DW_FORM_ref2: {
sewardj5d616df2013-07-02 08:07:15 +00001212 UShort u16 = get_UShort(c);
1213 UWord res = cc->cu_start_offset + (UWord)u16;
1214 cts->u.val = (ULong)res;
1215 cts->szB = sizeof(UWord);
tomfba428c2010-04-28 08:09:30 +00001216 TRACE_D3("<%lx>", res);
1217 break;
1218 }
sewardjb8b79ad2008-03-03 01:35:41 +00001219 case DW_FORM_ref4: {
sewardj5d616df2013-07-02 08:07:15 +00001220 UInt u32 = get_UInt(c);
1221 UWord res = cc->cu_start_offset + (UWord)u32;
1222 cts->u.val = (ULong)res;
1223 cts->szB = sizeof(UWord);
sewardjb8b79ad2008-03-03 01:35:41 +00001224 TRACE_D3("<%lx>", res);
1225 break;
1226 }
tomfba428c2010-04-28 08:09:30 +00001227 case DW_FORM_ref8: {
sewardj5d616df2013-07-02 08:07:15 +00001228 ULong u64 = get_ULong(c);
1229 UWord res = cc->cu_start_offset + (UWord)u64;
1230 cts->u.val = (ULong)res;
1231 cts->szB = sizeof(UWord);
tomfba428c2010-04-28 08:09:30 +00001232 TRACE_D3("<%lx>", res);
1233 break;
1234 }
1235 case DW_FORM_ref_udata: {
sewardj5d616df2013-07-02 08:07:15 +00001236 ULong u64 = get_ULEB128(c);
1237 UWord res = cc->cu_start_offset + (UWord)u64;
1238 cts->u.val = (ULong)res;
1239 cts->szB = sizeof(UWord);
tomfba428c2010-04-28 08:09:30 +00001240 TRACE_D3("<%lx>", res);
1241 break;
1242 }
sewardjb8b79ad2008-03-03 01:35:41 +00001243 case DW_FORM_flag: {
1244 UChar u8 = get_UChar(c);
1245 TRACE_D3("%u", (UInt)u8);
sewardj5d616df2013-07-02 08:07:15 +00001246 cts->u.val = (ULong)u8;
1247 cts->szB = 1;
sewardjb8b79ad2008-03-03 01:35:41 +00001248 break;
1249 }
tomfba428c2010-04-28 08:09:30 +00001250 case DW_FORM_flag_present:
1251 TRACE_D3("1");
sewardj5d616df2013-07-02 08:07:15 +00001252 cts->u.val = 1;
1253 cts->szB = 1;
tomfba428c2010-04-28 08:09:30 +00001254 break;
sewardjb8b79ad2008-03-03 01:35:41 +00001255 case DW_FORM_block1: {
sewardj5d616df2013-07-02 08:07:15 +00001256 ULong u64b;
1257 ULong u64 = (ULong)get_UChar(c);
1258 DiCursor block = get_DiCursor_from_Cursor(c);
sewardjb8b79ad2008-03-03 01:35:41 +00001259 TRACE_D3("%llu byte block: ", u64);
1260 for (u64b = u64; u64b > 0; u64b--) {
1261 UChar u8 = get_UChar(c);
1262 TRACE_D3("%x ", (UInt)u8);
1263 }
sewardj5d616df2013-07-02 08:07:15 +00001264 cts->u.cur = block;
1265 cts->szB = - (Long)u64;
sewardjb8b79ad2008-03-03 01:35:41 +00001266 break;
1267 }
sewardjb2250d32008-10-23 11:13:05 +00001268 case DW_FORM_block2: {
sewardj5d616df2013-07-02 08:07:15 +00001269 ULong u64b;
1270 ULong u64 = (ULong)get_UShort(c);
1271 DiCursor block = get_DiCursor_from_Cursor(c);
sewardjb2250d32008-10-23 11:13:05 +00001272 TRACE_D3("%llu byte block: ", u64);
1273 for (u64b = u64; u64b > 0; u64b--) {
1274 UChar u8 = get_UChar(c);
1275 TRACE_D3("%x ", (UInt)u8);
1276 }
sewardj5d616df2013-07-02 08:07:15 +00001277 cts->u.cur = block;
1278 cts->szB = - (Long)u64;
sewardjb2250d32008-10-23 11:13:05 +00001279 break;
1280 }
tomfba428c2010-04-28 08:09:30 +00001281 case DW_FORM_block4: {
sewardj5d616df2013-07-02 08:07:15 +00001282 ULong u64b;
1283 ULong u64 = (ULong)get_UInt(c);
1284 DiCursor block = get_DiCursor_from_Cursor(c);
tomfba428c2010-04-28 08:09:30 +00001285 TRACE_D3("%llu byte block: ", u64);
1286 for (u64b = u64; u64b > 0; u64b--) {
1287 UChar u8 = get_UChar(c);
1288 TRACE_D3("%x ", (UInt)u8);
1289 }
sewardj5d616df2013-07-02 08:07:15 +00001290 cts->u.cur = block;
1291 cts->szB = - (Long)u64;
tomfba428c2010-04-28 08:09:30 +00001292 break;
1293 }
1294 case DW_FORM_exprloc:
1295 case DW_FORM_block: {
sewardj5d616df2013-07-02 08:07:15 +00001296 ULong u64b;
1297 ULong u64 = (ULong)get_ULEB128(c);
1298 DiCursor block = get_DiCursor_from_Cursor(c);
tomfba428c2010-04-28 08:09:30 +00001299 TRACE_D3("%llu byte block: ", u64);
1300 for (u64b = u64; u64b > 0; u64b--) {
1301 UChar u8 = get_UChar(c);
1302 TRACE_D3("%x ", (UInt)u8);
1303 }
sewardj5d616df2013-07-02 08:07:15 +00001304 cts->u.cur = block;
1305 cts->szB = - (Long)u64;
tomfba428c2010-04-28 08:09:30 +00001306 break;
1307 }
1308 case DW_FORM_ref_sig8: {
1309 ULong u64b;
sewardjd9350682012-04-05 07:55:47 +00001310 ULong signature = get_ULong (c);
1311 ULong work = signature;
tomfba428c2010-04-28 08:09:30 +00001312 TRACE_D3("8 byte signature: ");
1313 for (u64b = 8; u64b > 0; u64b--) {
sewardjd9350682012-04-05 07:55:47 +00001314 UChar u8 = work & 0xff;
tomfba428c2010-04-28 08:09:30 +00001315 TRACE_D3("%x ", (UInt)u8);
sewardjd9350682012-04-05 07:55:47 +00001316 work >>= 8;
tomfba428c2010-04-28 08:09:30 +00001317 }
sewardjd9350682012-04-05 07:55:47 +00001318 /* Due to the way that the hash table is constructed, the
1319 resulting DIE offset here is already "cooked". See
1320 cook_die_using_form. */
sewardj5d616df2013-07-02 08:07:15 +00001321 cts->u.val = lookup_signatured_type (cc->signature_types, signature,
1322 c->barf);
1323 cts->szB = sizeof(UWord);
tomfba428c2010-04-28 08:09:30 +00001324 break;
1325 }
1326 case DW_FORM_indirect:
sewardj5d616df2013-07-02 08:07:15 +00001327 get_Form_contents (cts, cc, c, td3, (DW_FORM)get_ULEB128(c));
tomfba428c2010-04-28 08:09:30 +00001328 return;
1329
sewardjf7c97142012-07-14 09:59:01 +00001330 case DW_FORM_GNU_ref_alt:
sewardj5d616df2013-07-02 08:07:15 +00001331 cts->u.val = get_Dwarfish_UWord(c, cc->is_dw64);
1332 cts->szB = cc->is_dw64 ? sizeof(ULong) : sizeof(UInt);
1333 TRACE_D3("0x%lx", (UWord)cts->u.val);
1334 if (0) VG_(printf)("DW_FORM_GNU_ref_alt 0x%lx\n", (UWord)cts->u.val);
1335 if (/* the following is surely impossible, but ... */
1336 !ML_(sli_is_valid)(cc->escn_debug_info_alt)
1337 || cts->u.val >= (ULong)cc->escn_debug_info_alt.szB) {
sewardjf7c97142012-07-14 09:59:01 +00001338 /* Hmm. Offset is nonsensical for the alternate .debug_info
1339 section. Be safe and reject it. */
1340 cc->barf("get_Form_contents: DW_FORM_GNU_ref_alt points "
1341 "outside alternate .debug_info");
1342 }
1343 break;
1344
1345 case DW_FORM_GNU_strp_alt: {
1346 /* this is an offset into alternate .debug_str */
sewardj5d616df2013-07-02 08:07:15 +00001347 SizeT uw = (UWord)get_Dwarfish_UWord( c, cc->is_dw64 );
1348 if (!ML_(sli_is_valid)(cc->escn_debug_str_alt)
1349 || uw >= cc->escn_debug_str_alt.szB)
sewardjf7c97142012-07-14 09:59:01 +00001350 cc->barf("get_Form_contents: DW_FORM_GNU_strp_alt "
1351 "points outside alternate .debug_str");
1352 /* FIXME: check the entire string lies inside the alternate
1353 debug_str, not just the first byte of it. */
sewardj5d616df2013-07-02 08:07:15 +00001354 DiCursor str
1355 = ML_(cur_plus)( ML_(cur_from_sli)(cc->escn_debug_str_alt), uw);
1356 if (td3) {
1357 HChar* tmp = ML_(cur_read_strdup)(str, "di.getFC.3");
1358 TRACE_D3("(indirect alt string, offset: 0x%lx): %s", uw, tmp);
1359 ML_(dinfo_free)(tmp);
1360 }
1361 cts->u.cur = str;
1362 cts->szB = - (Long)(1 + (ULong)ML_(cur_strlen)(str));
sewardjf7c97142012-07-14 09:59:01 +00001363 break;
1364 }
1365
sewardjb8b79ad2008-03-03 01:35:41 +00001366 default:
sewardj31452302009-01-25 23:50:32 +00001367 VG_(printf)(
sewardj5d616df2013-07-02 08:07:15 +00001368 "get_Form_contents: unhandled %d (%s) at <%llx>\n",
sewardj31452302009-01-25 23:50:32 +00001369 form, ML_(pp_DW_FORM)(form), get_position_of_Cursor(c));
sewardjb8b79ad2008-03-03 01:35:41 +00001370 c->barf("get_Form_contents: unhandled DW_FORM");
1371 }
1372}
1373
1374
1375/*------------------------------------------------------------*/
1376/*--- ---*/
1377/*--- Parsing of variable-related DIEs ---*/
1378/*--- ---*/
1379/*------------------------------------------------------------*/
1380
1381typedef
1382 struct _TempVar {
philippe7293d252014-06-14 16:30:09 +00001383 HChar* name; /* in DebugInfo's .strpool */
sewardjb8b79ad2008-03-03 01:35:41 +00001384 /* Represent ranges economically. nRanges is the number of
1385 ranges. Cases:
1386 0: .rngOneMin .rngOneMax .rngMany are all zero
1387 1: .rngOneMin .rngOneMax hold the range; .rngMany is NULL
1388 2 or more: .rngOneMin .rngOneMax are zero; .rngMany holds the ranges.
1389 This is merely an optimisation to avoid having to allocate
1390 and free the XArray in the common case (98%) where there
1391 is zero or one address range. */
1392 UWord nRanges;
1393 Addr rngOneMin;
1394 Addr rngOneMax;
sewardj9c606bd2008-09-18 18:12:50 +00001395 XArray* rngMany; /* of AddrRange. NON-UNIQUE PTR in AR_DINFO. */
1396 /* Do not free .rngMany, since many TempVars will have the same
1397 value. Instead the associated storage is to be freed by
1398 deleting 'rangetree', which stores a single copy of each
1399 range. */
sewardjb8b79ad2008-03-03 01:35:41 +00001400 /* --- */
1401 Int level;
sewardj9c606bd2008-09-18 18:12:50 +00001402 UWord typeR; /* a cuOff */
sewardjb8b79ad2008-03-03 01:35:41 +00001403 GExpr* gexpr; /* for this variable */
1404 GExpr* fbGX; /* to find the frame base of the enclosing fn, if
1405 any */
florian1636d332012-11-15 04:27:04 +00001406 HChar* fName; /* declaring file name, or NULL */
sewardjb8b79ad2008-03-03 01:35:41 +00001407 Int fLine; /* declaring file line number, or zero */
1408 /* offset in .debug_info, so that abstract instances can be
1409 found to satisfy references from concrete instances. */
1410 UWord dioff;
1411 UWord absOri; /* so the absOri fields refer to dioff fields
1412 in some other, related TempVar. */
1413 }
1414 TempVar;
1415
sewardj7cf4e6b2008-05-01 20:24:26 +00001416#define N_D3_VAR_STACK 48
sewardjb8b79ad2008-03-03 01:35:41 +00001417
1418typedef
1419 struct {
1420 /* Contains the range stack: a stack of address ranges, one
1421 stack entry for each nested scope.
1422
1423 Some scope entries are created by function definitions
1424 (DW_TAG_subprogram), and for those, we also note the GExpr
1425 derived from its DW_AT_frame_base attribute, if any.
1426 Consequently it should be possible to find, for any
1427 variable's DIE, the GExpr for the containing function's
1428 DW_AT_frame_base by scanning back through the stack to find
1429 the nearest entry associated with a function. This somewhat
1430 elaborate scheme is provided so as to make it possible to
1431 obtain the correct DW_AT_frame_base expression even in the
1432 presence of nested functions (or to be more precise, in the
1433 presence of nested DW_TAG_subprogram DIEs).
1434 */
1435 Int sp; /* [sp] is innermost active entry; sp==-1 for empty
1436 stack */
1437 XArray* ranges[N_D3_VAR_STACK]; /* XArray of AddrRange */
1438 Int level[N_D3_VAR_STACK]; /* D3 DIE levels */
1439 Bool isFunc[N_D3_VAR_STACK]; /* from DW_TAG_subprogram? */
1440 GExpr* fbGX[N_D3_VAR_STACK]; /* if isFunc, contains the FB
1441 expr, else NULL */
1442 /* The file name table. Is a mapping from integer index to the
philippe7293d252014-06-14 16:30:09 +00001443 (permanent) copy of the string in DebugInfo's .strpool. */
sewardjb8b79ad2008-03-03 01:35:41 +00001444 XArray* /* of UChar* */ filenameTable;
1445 }
1446 D3VarParser;
1447
florian6bd9dc12012-11-23 16:17:43 +00001448static void varstack_show ( D3VarParser* parser, const HChar* str ) {
sewardjb8b79ad2008-03-03 01:35:41 +00001449 Word i, j;
1450 VG_(printf)(" varstack (%s) {\n", str);
1451 for (i = 0; i <= parser->sp; i++) {
1452 XArray* xa = parser->ranges[i];
1453 vg_assert(xa);
1454 VG_(printf)(" [%ld] (level %d)", i, parser->level[i]);
1455 if (parser->isFunc[i]) {
1456 VG_(printf)(" (fbGX=%p)", parser->fbGX[i]);
1457 } else {
1458 vg_assert(parser->fbGX[i] == NULL);
1459 }
1460 VG_(printf)(": ");
1461 if (VG_(sizeXA)( xa ) == 0) {
1462 VG_(printf)("** empty PC range array **");
1463 } else {
1464 for (j = 0; j < VG_(sizeXA)( xa ); j++) {
1465 AddrRange* range = (AddrRange*) VG_(indexXA)( xa, j );
1466 vg_assert(range);
barta0b6b2c2008-07-07 06:49:24 +00001467 VG_(printf)("[%#lx,%#lx] ", range->aMin, range->aMax);
sewardjb8b79ad2008-03-03 01:35:41 +00001468 }
1469 }
1470 VG_(printf)("\n");
1471 }
1472 VG_(printf)(" }\n");
1473}
1474
1475/* Remove from the stack, all entries with .level > 'level' */
1476static
1477void varstack_preen ( D3VarParser* parser, Bool td3, Int level )
1478{
1479 Bool changed = False;
1480 vg_assert(parser->sp < N_D3_VAR_STACK);
1481 while (True) {
1482 vg_assert(parser->sp >= -1);
1483 if (parser->sp == -1) break;
1484 if (parser->level[parser->sp] <= level) break;
1485 if (0)
1486 TRACE_D3("BBBBAAAA varstack_pop [newsp=%d]\n", parser->sp-1);
1487 vg_assert(parser->ranges[parser->sp]);
1488 /* Who allocated this xa? get_range_list(),
1489 unitary_range_list() or empty_range_list(). */
1490 VG_(deleteXA)( parser->ranges[parser->sp] );
1491 parser->ranges[parser->sp] = NULL;
1492 parser->level[parser->sp] = 0;
1493 parser->isFunc[parser->sp] = False;
1494 parser->fbGX[parser->sp] = NULL;
1495 parser->sp--;
1496 changed = True;
1497 }
1498 if (changed && td3)
1499 varstack_show( parser, "after preen" );
1500}
1501
1502static void varstack_push ( CUConst* cc,
1503 D3VarParser* parser,
1504 Bool td3,
1505 XArray* ranges, Int level,
1506 Bool isFunc, GExpr* fbGX ) {
1507 if (0)
1508 TRACE_D3("BBBBAAAA varstack_push[newsp=%d]: %d %p\n",
1509 parser->sp+1, level, ranges);
1510
1511 /* First we need to zap everything >= 'level', as we are about to
1512 replace any previous entry at 'level', so .. */
1513 varstack_preen(parser, /*td3*/False, level-1);
1514
1515 vg_assert(parser->sp >= -1);
1516 vg_assert(parser->sp < N_D3_VAR_STACK);
1517 if (parser->sp == N_D3_VAR_STACK-1)
1518 cc->barf("varstack_push: N_D3_VAR_STACK is too low; "
1519 "increase and recompile");
1520 if (parser->sp >= 0)
1521 vg_assert(parser->level[parser->sp] < level);
1522 parser->sp++;
1523 vg_assert(parser->ranges[parser->sp] == NULL);
1524 vg_assert(parser->level[parser->sp] == 0);
1525 vg_assert(parser->isFunc[parser->sp] == False);
1526 vg_assert(parser->fbGX[parser->sp] == NULL);
1527 vg_assert(ranges != NULL);
1528 if (!isFunc) vg_assert(fbGX == NULL);
1529 parser->ranges[parser->sp] = ranges;
1530 parser->level[parser->sp] = level;
1531 parser->isFunc[parser->sp] = isFunc;
1532 parser->fbGX[parser->sp] = fbGX;
1533 if (td3)
1534 varstack_show( parser, "after push" );
1535}
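
/* Illustration only, not part of readdwarf3.c: a minimal sketch of the
   preen-then-push discipline used by varstack_preen and varstack_push
   above, reduced to the 'level' field alone. All Toy* / toy_* names and
   TOY_STACK_MAX are hypothetical stand-ins; the real stack also carries
   ranges, isFunc and fbGX, and barfs instead of returning 0 when full. */

```c
#include <assert.h>
#include <stddef.h>

#define TOY_STACK_MAX 8   /* hypothetical stand-in for N_D3_VAR_STACK */

typedef struct {
   int sp;                     /* index of the top entry, -1 when empty */
   int level[TOY_STACK_MAX];   /* DIE nesting level of each entry */
} ToyScopeStack;

static void toy_init ( ToyScopeStack* s )
{
   assert(s != NULL);
   s->sp = -1;
}

/* Pop every entry whose level is strictly greater than 'level'. */
static void toy_preen ( ToyScopeStack* s, int level )
{
   while (s->sp >= 0 && s->level[s->sp] > level)
      s->sp--;
}

/* Push a scope at 'level'. Any existing entries at 'level' or deeper are
   dropped first, so levels stay strictly increasing from bottom to top.
   Returns 1 on success, 0 if the stack is full. */
static int toy_push ( ToyScopeStack* s, int level )
{
   toy_preen(s, level - 1);
   if (s->sp == TOY_STACK_MAX - 1)
      return 0;
   assert(s->sp < 0 || s->level[s->sp] < level);
   s->sp++;
   s->level[s->sp] = level;
   return 1;
}
```

/* Pushing levels 0, 1, 2 and then 1 again leaves two entries (levels 0
   and 1): the sibling scope at level 1 replaces the earlier level-1
   scope together with its level-2 child. */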
1536
1537
sewardj5d616df2013-07-02 08:07:15 +00001538/* cts is derived from a DW_AT_location and so refers either to a
1539 location expression or to a location list. Figure out which, and
1540 in both cases bundle the expression or location list into a
1541 so-called GExpr (guarded expression). */
sewardjb8b79ad2008-03-03 01:35:41 +00001542__attribute__((noinline))
sewardj5d616df2013-07-02 08:07:15 +00001543static GExpr* get_GX ( CUConst* cc, Bool td3, const FormContents* cts )
sewardjb8b79ad2008-03-03 01:35:41 +00001544{
1545 GExpr* gexpr = NULL;
sewardj5d616df2013-07-02 08:07:15 +00001546 if (cts->szB < 0) {
1547 /* represents a non-empty in-line location expression, and
1548 cts->u.cur points at the image bytes */
1549 gexpr = make_singleton_GX( cts->u.cur, (ULong)(- cts->szB) );
sewardjb8b79ad2008-03-03 01:35:41 +00001550 }
1551 else
sewardj5d616df2013-07-02 08:07:15 +00001552 if (cts->szB > 0) {
1553 /* represents a location list. cts->u.val is the offset of it
1554 in .debug_loc. */
sewardjb8b79ad2008-03-03 01:35:41 +00001555 if (!cc->cu_svma_known)
1556 cc->barf("get_GX: location list, but CU svma is unknown");
sewardj5d616df2013-07-02 08:07:15 +00001557 gexpr = make_general_GX( cc, td3, cts->u.val, cc->cu_svma );
sewardjb8b79ad2008-03-03 01:35:41 +00001558 }
1559 else {
1560 vg_assert(0); /* else caller is bogus */
1561 }
1562 return gexpr;
1563}
1564
1565
1566static
philippea0a73932014-06-15 15:42:20 +00001567void read_filename_table( /*MOD*/XArray* /* of UChar* */ filenameTable,
sewardj5d616df2013-07-02 08:07:15 +00001568 CUConst* cc, ULong debug_line_offset,
sewardjb8b79ad2008-03-03 01:35:41 +00001569 Bool td3 )
1570{
1571 Bool is_dw64;
1572 Cursor c;
1573 Word i;
sewardjb8b79ad2008-03-03 01:35:41 +00001574 UShort version;
sewardjb8b79ad2008-03-03 01:35:41 +00001575 UChar opcode_base;
florian1636d332012-11-15 04:27:04 +00001576 HChar* str;
sewardjb8b79ad2008-03-03 01:35:41 +00001577
philippea0a73932014-06-15 15:42:20 +00001578 vg_assert(filenameTable && cc && cc->barf);
sewardj5d616df2013-07-02 08:07:15 +00001579 if (!ML_(sli_is_valid)(cc->escn_debug_line)
1580 || cc->escn_debug_line.szB <= debug_line_offset) {
sewardjb8b79ad2008-03-03 01:35:41 +00001581 cc->barf("read_filename_table: .debug_line is missing?");
sewardj5d616df2013-07-02 08:07:15 +00001582 }
sewardjb8b79ad2008-03-03 01:35:41 +00001583
sewardj5d616df2013-07-02 08:07:15 +00001584 init_Cursor( &c, cc->escn_debug_line, debug_line_offset, cc->barf,
sewardjb8b79ad2008-03-03 01:35:41 +00001585 "Overrun whilst reading .debug_line section(1)" );
1586
njn4c245e52009-03-15 23:25:38 +00001587 /* unit_length = */
1588 get_Initial_Length( &is_dw64, &c,
sewardjb8b79ad2008-03-03 01:35:41 +00001589 "read_filename_table: invalid initial-length field" );
1590 version = get_UShort( &c );
tomfba428c2010-04-28 08:09:30 +00001591 if (version != 2 && version != 3 && version != 4)
1592 cc->barf("read_filename_table: Only DWARF version 2, 3 and 4 line info "
sewardjb8b79ad2008-03-03 01:35:41 +00001593 "is currently supported.");
njn4c245e52009-03-15 23:25:38 +00001594 /*header_length = (ULong)*/ get_Dwarfish_UWord( &c, is_dw64 );
1595 /*minimum_instruction_length = */ get_UChar( &c );
tomfba428c2010-04-28 08:09:30 +00001596 if (version >= 4)
1597 /*maximum_operations_per_insn = */ get_UChar( &c );
njn4c245e52009-03-15 23:25:38 +00001598 /*default_is_stmt = */ get_UChar( &c );
1599 /*line_base = (Char)*/ get_UChar( &c );
1600 /*line_range = */ get_UChar( &c );
sewardjb8b79ad2008-03-03 01:35:41 +00001601 opcode_base = get_UChar( &c );
1602 /* skip over "standard_opcode_lengths" */
1603 for (i = 1; i < (Word)opcode_base; i++)
1604 (void)get_UChar( &c );
1605
1606 /* skip over the directory names table */
1607 while (peek_UChar(&c) != 0) {
1608 (void)get_AsciiZ(&c);
1609 }
1610 (void)get_UChar(&c); /* skip terminating zero */
1611
1612 /* Read and record the file names table */
philippea0a73932014-06-15 15:42:20 +00001613 vg_assert( VG_(sizeXA)( filenameTable ) == 0 );
sewardjb8b79ad2008-03-03 01:35:41 +00001614 /* Add a dummy index-zero entry. DWARF3 numbers its files
1615 from 1, for some reason. */
1616 str = ML_(addStr)( cc->di, "<unknown_file>", -1 );
philippea0a73932014-06-15 15:42:20 +00001617 VG_(addToXA)( filenameTable, &str );
sewardjb8b79ad2008-03-03 01:35:41 +00001618 while (peek_UChar(&c) != 0) {
sewardj5d616df2013-07-02 08:07:15 +00001619 DiCursor cur = get_AsciiZ(&c);
1620 str = ML_(addStrFromCursor)( cc->di, cur );
sewardjb8b79ad2008-03-03 01:35:41 +00001621 TRACE_D3(" read_filename_table: %ld %s\n",
philippea0a73932014-06-15 15:42:20 +00001622 VG_(sizeXA)(filenameTable), str);
1623 VG_(addToXA)( filenameTable, &str );
sewardjb8b79ad2008-03-03 01:35:41 +00001624 (void)get_ULEB128( &c ); /* skip directory index # */
1625 (void)get_ULEB128( &c ); /* skip last mod time */
1626 (void)get_ULEB128( &c ); /* file size */
1627 }
1628 /* We're done! The rest of it is not interesting. */
1629}
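
/* Illustration only, not part of readdwarf3.c: a minimal sketch of the
   byte layout that the final loop of read_filename_table walks, namely a
   NUL-terminated name followed by three ULEB128 fields (directory index,
   modification time, file length), with a single zero byte ending the
   table. This is the DWARF 2-4 .debug_line file_names layout. The
   toy_* names are hypothetical; the real code reads through a Cursor
   and reports overruns via cc->barf rather than return codes. */

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decode one unsigned LEB128 value from p[0 .. avail-1]. On success
   stores the number of bytes consumed in *used. If the input ends before
   the final byte of the value, stores 0 in *used and returns 0. */
static uint64_t toy_read_uleb128 ( const uint8_t* p, size_t avail,
                                   size_t* used )
{
   uint64_t result = 0;
   unsigned shift  = 0;
   size_t   i      = 0;
   assert(p != NULL && used != NULL);
   while (i < avail) {
      uint8_t byte = p[i++];
      if (shift < 64)
         result |= (uint64_t)(byte & 0x7f) << shift;
      shift += 7;
      if ((byte & 0x80) == 0) {
         *used = i;
         return result;
      }
   }
   *used = 0;
   return 0;
}

/* Count the entries of a file_names table held in p[0 .. avail-1].
   Returns -1 if the table is truncated or lacks its terminating zero. */
static int toy_count_file_names ( const uint8_t* p, size_t avail )
{
   size_t pos = 0;
   int    n   = 0;
   int    k;
   while (pos < avail && p[pos] != 0) {
      const uint8_t* nul = memchr(p + pos, 0, avail - pos);
      if (nul == NULL)
         return -1;
      pos = (size_t)(nul - p) + 1;   /* skip the name and its NUL */
      for (k = 0; k < 3; k++) {      /* dir index, mtime, file length */
         size_t used;
         (void)toy_read_uleb128(p + pos, avail - pos, &used);
         if (used == 0)
            return -1;
         pos += used;
      }
      n++;
   }
   return (pos < avail) ? n : -1;
}
```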
1630
philippea0a73932014-06-15 15:42:20 +00001631/* setup_cu_svma to be called when a cu is found at level 0,
1632 to establish the cu_svma. */
1633static void setup_cu_svma(CUConst* cc, Bool have_lo, Addr ip_lo, Bool td3)
1634{
1635 Addr cu_svma;
1636 /* We have potentially more than one type of parser parsing the
1637 dwarf information. At least currently, each parser establishes
1638 the cu_svma. So, in case cu_svma_known, we check that the same
1639 result is obtained by the 2nd parsing of the cu.
1640
1641 Alternatively, we could reset cu_svma_known after each parsing
1642 and then check that we only see a single DW_TAG_compile_unit DIE
1643 at level 0; DWARF3 allows exactly one top level DIE per
1644 CU. */
1645
1646 if (have_lo)
1647 cu_svma = ip_lo;
1648 else {
1649 /* Now, it may be that this DIE doesn't tell us the CU's
1650 SVMA, by way of not having a DW_AT_low_pc. That's OK --
1651 the CU doesn't *have* to have its SVMA specified.
1652
1653 But as per the last para of D3 spec sec 3.1.1 ("Normal and
1654 Partial Compilation Unit Entries"), "If the base address
1655 (viz, the SVMA) is undefined, then any DWARF entry or
1656 structure defined in terms of the base address of that
1657 compilation unit is not valid." So that means, if whilst
1658 processing the children of this top level DIE (or their
1659 children, etc) we see a DW_AT_ranges, and cu_svma_known is
1660 False, then the DIE that contains it is (per the spec)
1661 invalid, and we can legitimately stop and complain. */
1662 /* .. whereas The Reality is, simply assume the SVMA is zero
1663 if it isn't specified. */
1664 cu_svma = 0;
1665 }
1666
1667 if (cc->cu_svma_known) {
1668 vg_assert (cu_svma == cc->cu_svma);
1669 } else {
1670 cc->cu_svma_known = True;
1671 cc->cu_svma = cu_svma;
1672 if (0)
1673 TRACE_D3("setup_cu_svma: acquire CU_SVMA of %p\n", (void*) cc->cu_svma);
1674 }
1675}
1676
1677__attribute__((noreturn))
1678static void dump_bad_die_and_barf(
1679 DW_TAG dtag,
1680 UWord posn,
1681 Int level,
1682 Cursor* c_die, UWord saved_die_c_offset,
1683 g_abbv *abbv,
1684 CUConst* cc)
1685{
1686 FormContents cts;
1687 UInt nf_i;
1688 Bool debug_types_flag;
1689 Bool alt_flag;
1690
1691 set_position_of_Cursor( c_die, saved_die_c_offset );
1692 posn = uncook_die( cc, posn, &debug_types_flag, &alt_flag );
1693 VG_(printf)(" <%d><%lx>: %s", level, posn, ML_(pp_DW_TAG)( dtag ) );
1694 if (debug_types_flag) {
1695 VG_(printf)(" (in .debug_types)");
1696 }
1697 else if (alt_flag) {
1698 VG_(printf)(" (in alternate .debug_info)");
1699 }
1700 VG_(printf)("\n");
1701 nf_i = 0;
1702 while (True) {
1703 DW_AT attr = (DW_AT) abbv->nf[nf_i].at_name;
1704 DW_FORM form = (DW_FORM)abbv->nf[nf_i].at_form;
1705 nf_i++;
1706 if (attr == 0 && form == 0) break;
1707 VG_(printf)(" %18s: ", ML_(pp_DW_AT)(attr));
1708 /* Get the form contents, so as to print them */
1709 get_Form_contents( &cts, cc, c_die, True, form );
1710 VG_(printf)("\t\n");
1711 }
1712 VG_(printf)("\n");
1713 cc->barf("parse_var_DIE: confused by the above DIE");
1714}
1715
philippe5c5b8fc2014-05-06 20:15:55 +00001716__attribute__((noinline))
1717static void bad_DIE_confusion(int linenr)
1718{
1719 VG_(printf)("\nparse_var_DIE(%d): confused by:\n", linenr);
1720}
1721#define goto_bad_DIE do {bad_DIE_confusion(__LINE__); goto bad_DIE;} while (0)
sewardjb8b79ad2008-03-03 01:35:41 +00001722
1723__attribute__((noinline))
sewardj9c606bd2008-09-18 18:12:50 +00001724static void parse_var_DIE (
1725 /*MOD*/WordFM* /* of (XArray* of AddrRange, void) */ rangestree,
1726 /*MOD*/XArray* /* of TempVar* */ tempvars,
1727 /*MOD*/XArray* /* of GExpr* */ gexprs,
1728 /*MOD*/D3VarParser* parser,
1729 DW_TAG dtag,
1730 UWord posn,
1731 Int level,
1732 Cursor* c_die,
philippe746e97e2014-06-15 10:51:14 +00001733 g_abbv *abbv,
sewardj9c606bd2008-09-18 18:12:50 +00001734 CUConst* cc,
1735 Bool td3
1736)
sewardjb8b79ad2008-03-03 01:35:41 +00001737{
sewardj5d616df2013-07-02 08:07:15 +00001738 FormContents cts;
philippe746e97e2014-06-15 10:51:14 +00001739 UInt nf_i;
sewardjb8b79ad2008-03-03 01:35:41 +00001740
1741 UWord saved_die_c_offset = get_position_of_Cursor( c_die );
sewardjb8b79ad2008-03-03 01:35:41 +00001742
1743 varstack_preen( parser, td3, level-1 );
1744
sewardjf7c97142012-07-14 09:59:01 +00001745 if (dtag == DW_TAG_compile_unit
1746 || dtag == DW_TAG_type_unit
1747 || dtag == DW_TAG_partial_unit) {
sewardjb8b79ad2008-03-03 01:35:41 +00001748 Bool have_lo = False;
1749 Bool have_hi1 = False;
sewardjde065a02012-05-09 23:09:05 +00001750 Bool hiIsRelative = False;
sewardjb8b79ad2008-03-03 01:35:41 +00001751 Bool have_range = False;
1752 Addr ip_lo = 0;
1753 Addr ip_hi1 = 0;
1754 Addr rangeoff = 0;
philippe746e97e2014-06-15 10:51:14 +00001755 nf_i = 0;
sewardjb8b79ad2008-03-03 01:35:41 +00001756 while (True) {
philippe746e97e2014-06-15 10:51:14 +00001757 DW_AT attr = (DW_AT) abbv->nf[nf_i].at_name;
1758 DW_FORM form = (DW_FORM)abbv->nf[nf_i].at_form;
1759 nf_i++;
sewardjb8b79ad2008-03-03 01:35:41 +00001760 if (attr == 0 && form == 0) break;
sewardj5d616df2013-07-02 08:07:15 +00001761 get_Form_contents( &cts, cc, c_die, False/*td3*/, form );
1762 if (attr == DW_AT_low_pc && cts.szB > 0) {
1763 ip_lo = cts.u.val;
sewardjb8b79ad2008-03-03 01:35:41 +00001764 have_lo = True;
1765 }
sewardj5d616df2013-07-02 08:07:15 +00001766 if (attr == DW_AT_high_pc && cts.szB > 0) {
1767 ip_hi1 = cts.u.val;
sewardjb8b79ad2008-03-03 01:35:41 +00001768 have_hi1 = True;
sewardjde065a02012-05-09 23:09:05 +00001769 if (form != DW_FORM_addr)
1770 hiIsRelative = True;
sewardjb8b79ad2008-03-03 01:35:41 +00001771 }
sewardj5d616df2013-07-02 08:07:15 +00001772 if (attr == DW_AT_ranges && cts.szB > 0) {
1773 rangeoff = cts.u.val;
sewardjb8b79ad2008-03-03 01:35:41 +00001774 have_range = True;
1775 }
sewardj5d616df2013-07-02 08:07:15 +00001776 if (attr == DW_AT_stmt_list && cts.szB > 0) {
philippea0a73932014-06-15 15:42:20 +00001777 read_filename_table( parser->filenameTable, cc, cts.u.val, td3 );
sewardjb8b79ad2008-03-03 01:35:41 +00001778 }
1779 }
sewardjde065a02012-05-09 23:09:05 +00001780 if (have_lo && have_hi1 && hiIsRelative)
1781 ip_hi1 += ip_lo;
philippea0a73932014-06-15 15:42:20 +00001782
sewardjb8b79ad2008-03-03 01:35:41 +00001783 /* Now, does this give us an opportunity to find this
1784 CU's svma? */
philippea0a73932014-06-15 15:42:20 +00001785 if (level == 0)
1786 setup_cu_svma(cc, have_lo, ip_lo, td3);
sewardjb8b79ad2008-03-03 01:35:41 +00001787
sewardjb8b79ad2008-03-03 01:35:41 +00001788 /* Do we have something that looks sane? */
1789 if (have_lo && have_hi1 && (!have_range)) {
1790 if (ip_lo < ip_hi1)
1791 varstack_push( cc, parser, td3,
1792 unitary_range_list(ip_lo, ip_hi1 - 1),
1793 level,
1794 False/*isFunc*/, NULL/*fbGX*/ );
philippe8130f912014-05-07 21:09:16 +00001795 else if (ip_lo == 0 && ip_hi1 == 0)
1796 /* CU has no code, presumably?
1797 Such situations have been encountered for code
1798 compiled with -ffunction-sections -fdata-sections
1799 and linked with --gc-sections. Completely
1800 eliminated CU gives such 0 lo/hi pc. Similarly
1801 to a CU which has no lo/hi/range pc, we push
1802 an empty range list. */
1803 varstack_push( cc, parser, td3,
1804 empty_range_list(),
1805 level,
1806 False/*isFunc*/, NULL/*fbGX*/ );
sewardjb8b79ad2008-03-03 01:35:41 +00001807 } else
1808 if ((!have_lo) && (!have_hi1) && have_range) {
1809 varstack_push( cc, parser, td3,
1810 get_range_list( cc, td3,
1811 rangeoff, cc->cu_svma ),
1812 level,
1813 False/*isFunc*/, NULL/*fbGX*/ );
1814 } else
1815 if ((!have_lo) && (!have_hi1) && (!have_range)) {
1816 /* CU has no code, presumably? */
1817 varstack_push( cc, parser, td3,
1818 empty_range_list(),
1819 level,
1820 False/*isFunc*/, NULL/*fbGX*/ );
1821 } else
sewardjf578a692008-06-24 09:51:55 +00001822 if (have_lo && (!have_hi1) && have_range && ip_lo == 0) {
1823 /* broken DIE created by gcc-4.3.X ? Ignore the
1824 apparently-redundant DW_AT_low_pc and use the DW_AT_ranges
1825 instead. */
1826 varstack_push( cc, parser, td3,
1827 get_range_list( cc, td3,
1828 rangeoff, cc->cu_svma ),
1829 level,
1830 False/*isFunc*/, NULL/*fbGX*/ );
1831 } else {
1832 if (0) VG_(printf)("I got hlo %d hhi1 %d hrange %d\n",
1833 (Int)have_lo, (Int)have_hi1, (Int)have_range);
philippe5c5b8fc2014-05-06 20:15:55 +00001834 goto_bad_DIE;
sewardjf578a692008-06-24 09:51:55 +00001835 }
sewardjb8b79ad2008-03-03 01:35:41 +00001836 }
1837
1838 if (dtag == DW_TAG_lexical_block || dtag == DW_TAG_subprogram) {
1839 Bool have_lo = False;
1840 Bool have_hi1 = False;
1841 Bool have_range = False;
sewardjde065a02012-05-09 23:09:05 +00001842 Bool hiIsRelative = False;
sewardjb8b79ad2008-03-03 01:35:41 +00001843 Addr ip_lo = 0;
1844 Addr ip_hi1 = 0;
1845 Addr rangeoff = 0;
1846 Bool isFunc = dtag == DW_TAG_subprogram;
1847 GExpr* fbGX = NULL;
philippe746e97e2014-06-15 10:51:14 +00001848 nf_i = 0;
sewardjb8b79ad2008-03-03 01:35:41 +00001849 while (True) {
philippe746e97e2014-06-15 10:51:14 +00001850 DW_AT attr = (DW_AT) abbv->nf[nf_i].at_name;
1851 DW_FORM form = (DW_FORM)abbv->nf[nf_i].at_form;
1852 nf_i++;
sewardjb8b79ad2008-03-03 01:35:41 +00001853 if (attr == 0 && form == 0) break;
sewardj5d616df2013-07-02 08:07:15 +00001854 get_Form_contents( &cts, cc, c_die, False/*td3*/, form );
1855 if (attr == DW_AT_low_pc && cts.szB > 0) {
1856 ip_lo = cts.u.val;
sewardjb8b79ad2008-03-03 01:35:41 +00001857 have_lo = True;
1858 }
sewardj5d616df2013-07-02 08:07:15 +00001859 if (attr == DW_AT_high_pc && cts.szB > 0) {
1860 ip_hi1 = cts.u.val;
sewardjb8b79ad2008-03-03 01:35:41 +00001861 have_hi1 = True;
sewardjde065a02012-05-09 23:09:05 +00001862 if (form != DW_FORM_addr)
1863 hiIsRelative = True;
sewardjb8b79ad2008-03-03 01:35:41 +00001864 }
sewardj5d616df2013-07-02 08:07:15 +00001865 if (attr == DW_AT_ranges && cts.szB > 0) {
1866 rangeoff = cts.u.val;
sewardjb8b79ad2008-03-03 01:35:41 +00001867 have_range = True;
1868 }
1869 if (isFunc
1870 && attr == DW_AT_frame_base
sewardj5d616df2013-07-02 08:07:15 +00001871 && cts.szB != 0 /* either scalar or nonempty block */) {
1872 fbGX = get_GX( cc, False/*td3*/, &cts );
sewardjb8b79ad2008-03-03 01:35:41 +00001873 vg_assert(fbGX);
sewardj59a2d182008-08-22 23:18:02 +00001874 VG_(addToXA)(gexprs, &fbGX);
sewardjb8b79ad2008-03-03 01:35:41 +00001875 }
1876 }
sewardjde065a02012-05-09 23:09:05 +00001877 if (have_lo && have_hi1 && hiIsRelative)
1878 ip_hi1 += ip_lo;
sewardjb8b79ad2008-03-03 01:35:41 +00001879 /* Do we have something that looks sane? */
1880 if (dtag == DW_TAG_subprogram
1881 && (!have_lo) && (!have_hi1) && (!have_range)) {
1882 /* This is legit - ignore it. Sec 3.3.3: "A subroutine entry
1883 representing a subroutine declaration that is not also a
1884 definition does not have code address or range
1885 attributes." */
1886 } else
1887 if (dtag == DW_TAG_lexical_block
1888 && (!have_lo) && (!have_hi1) && (!have_range)) {
1889 /* I believe this is legit, and means the lexical block
1890 contains no insns (whatever that might mean). Ignore. */
1891 } else
1892 if (have_lo && have_hi1 && (!have_range)) {
1893 /* This scope supplies just a single address range. */
1894 if (ip_lo < ip_hi1)
1895 varstack_push( cc, parser, td3,
1896 unitary_range_list(ip_lo, ip_hi1 - 1),
1897 level, isFunc, fbGX );
1898 } else
1899 if ((!have_lo) && (!have_hi1) && have_range) {
1900 /* This scope supplies multiple address ranges via the use of
1901 a range list. */
1902 varstack_push( cc, parser, td3,
1903 get_range_list( cc, td3,
1904 rangeoff, cc->cu_svma ),
1905 level, isFunc, fbGX );
1906 } else
1907 if (have_lo && (!have_hi1) && (!have_range)) {
1908 /* This scope is bogus. The D3 spec sec 3.4 (Lexical Block
1909 Entries) says fairly clearly that a scope must have either
1910 _range or (_low_pc and _high_pc). */
1911 /* The spec is a bit ambiguous though. Perhaps a single byte
1912 range is intended? See sec 2.17 (Code Addresses And Ranges) */
1913 /* This case is here because icc9 produced this:
1914 <2><13bd>: DW_TAG_lexical_block
1915 DW_AT_decl_line : 5229
1916 DW_AT_decl_column : 37
1917 DW_AT_decl_file : 1
1918 DW_AT_low_pc : 0x401b03
1919 */
1920 /* Ignore (seems safer than pushing a single byte range) */
1921 } else
philippe5c5b8fc2014-05-06 20:15:55 +00001922 goto_bad_DIE;
sewardjb8b79ad2008-03-03 01:35:41 +00001923 }
1924
1925 if (dtag == DW_TAG_variable || dtag == DW_TAG_formal_parameter) {
florian1636d332012-11-15 04:27:04 +00001926 HChar* name = NULL;
sewardj9c606bd2008-09-18 18:12:50 +00001927 UWord typeR = D3_INVALID_CUOFF;
philippe81d24c32012-12-05 21:08:24 +00001928 Bool global = False;
sewardjb8b79ad2008-03-03 01:35:41 +00001929 GExpr* gexpr = NULL;
1930 Int n_attrs = 0;
1931 UWord abs_ori = (UWord)D3_INVALID_CUOFF;
sewardjb8b79ad2008-03-03 01:35:41 +00001932 Int lineNo = 0;
florian1636d332012-11-15 04:27:04 +00001933 HChar* fileName = NULL;
philippe746e97e2014-06-15 10:51:14 +00001934 nf_i = 0;
sewardjb8b79ad2008-03-03 01:35:41 +00001935 while (True) {
philippe746e97e2014-06-15 10:51:14 +00001936 DW_AT attr = (DW_AT) abbv->nf[nf_i].at_name;
1937 DW_FORM form = (DW_FORM)abbv->nf[nf_i].at_form;
1938 nf_i++;
sewardjb8b79ad2008-03-03 01:35:41 +00001939 if (attr == 0 && form == 0) break;
sewardj5d616df2013-07-02 08:07:15 +00001940 get_Form_contents( &cts, cc, c_die, False/*td3*/, form );
sewardjb8b79ad2008-03-03 01:35:41 +00001941 n_attrs++;
sewardj5d616df2013-07-02 08:07:15 +00001942 if (attr == DW_AT_name && cts.szB < 0) {
1943 name = ML_(addStrFromCursor)( cc->di, cts.u.cur );
sewardjb8b79ad2008-03-03 01:35:41 +00001944 }
1945 if (attr == DW_AT_location
sewardj5d616df2013-07-02 08:07:15 +00001946 && cts.szB != 0 /* either scalar or nonempty block */) {
1947 gexpr = get_GX( cc, False/*td3*/, &cts );
sewardjb8b79ad2008-03-03 01:35:41 +00001948 vg_assert(gexpr);
sewardj59a2d182008-08-22 23:18:02 +00001949 VG_(addToXA)(gexprs, &gexpr);
sewardjb8b79ad2008-03-03 01:35:41 +00001950 }
sewardj5d616df2013-07-02 08:07:15 +00001951 if (attr == DW_AT_type && cts.szB > 0) {
1952 typeR = cook_die_using_form( cc, cts.u.val, form );
sewardjb8b79ad2008-03-03 01:35:41 +00001953 }
sewardj5d616df2013-07-02 08:07:15 +00001954 if (attr == DW_AT_external && cts.szB > 0 && cts.u.val > 0) {
philippe81d24c32012-12-05 21:08:24 +00001955 global = True;
sewardjb8b79ad2008-03-03 01:35:41 +00001956 }
sewardj5d616df2013-07-02 08:07:15 +00001957 if (attr == DW_AT_abstract_origin && cts.szB > 0) {
1958 abs_ori = (UWord)cts.u.val;
sewardjb8b79ad2008-03-03 01:35:41 +00001959 }
sewardj5d616df2013-07-02 08:07:15 +00001960 if (attr == DW_AT_declaration && cts.szB > 0 && cts.u.val > 0) {
njn4c245e52009-03-15 23:25:38 +00001961 /*declaration = True;*/
sewardjb8b79ad2008-03-03 01:35:41 +00001962 }
sewardj5d616df2013-07-02 08:07:15 +00001963 if (attr == DW_AT_decl_line && cts.szB > 0) {
1964 lineNo = (Int)cts.u.val;
sewardjb8b79ad2008-03-03 01:35:41 +00001965 }
sewardj5d616df2013-07-02 08:07:15 +00001966 if (attr == DW_AT_decl_file && cts.szB > 0) {
1967 Int ftabIx = (Int)cts.u.val;
sewardjb8b79ad2008-03-03 01:35:41 +00001968 if (ftabIx >= 1
1969 && ftabIx < VG_(sizeXA)( parser->filenameTable )) {
florian1636d332012-11-15 04:27:04 +00001970 fileName = *(HChar**)
sewardjb8b79ad2008-03-03 01:35:41 +00001971 VG_(indexXA)( parser->filenameTable, ftabIx );
1972 vg_assert(fileName);
1973 }
1974 if (0) VG_(printf)("XXX filename = %s\n", fileName);
1975 }
1976 }
philippe81d24c32012-12-05 21:08:24 +00001977 if (!global && dtag == DW_TAG_variable && level == 1) {
1978 /* Case of a static variable. It is better to declare
1979 it global as the variable is not really related to
1980 a PC range, as its address can be used by program
1981 counters outside of the ranges where it is visible. */
1982 global = True;
1983 }
1984
sewardjb8b79ad2008-03-03 01:35:41 +00001985 /* We'll collect it if one of the following three
1986 conditions holds:
1987 (1) has location and type -> completed
1988 (2) has type only -> is an abstract instance
1989 (3) has location and abs_ori -> is a concrete instance
sewardj9c606bd2008-09-18 18:12:50 +00001990 Name, filename and line number are all optional frills.
sewardjb8b79ad2008-03-03 01:35:41 +00001991 */
1992 if ( /* 1 */ (gexpr && typeR != D3_INVALID_CUOFF)
1993 /* 2 */ || (typeR != D3_INVALID_CUOFF)
1994 /* 3 */ || (gexpr && abs_ori != (UWord)D3_INVALID_CUOFF) ) {
1995
1996 /* Add this variable to the list of interesting looking
1997 variables. Crucially, note along with it the address
1998 range(s) associated with the variable, which for locals
1999 will be the address ranges at the top of the varparser's
2000 stack. */
2001 GExpr* fbGX = NULL;
2002 Word i, nRanges;
2003 XArray* /* of AddrRange */ xa;
2004 TempVar* tv;
2005 /* Stack can't be empty; we put a dummy entry on it for the
2006 entire address range before starting with the DIEs for
2007 this CU. */
2008 vg_assert(parser->sp >= 0);
2009
philippe81d24c32012-12-05 21:08:24 +00002010 /* If this is a local variable (non-global), try to find
sewardjb8b79ad2008-03-03 01:35:41 +00002011 the GExpr for the DW_AT_frame_base of the containing
2012 function. It should have been pushed on the stack at the
2013 time we encountered its DW_TAG_subprogram DIE, so the way
2014 to find it is to scan back down the stack looking for it.
2015 If there isn't an enclosing stack entry marked 'isFunc'
2016 then we must be seeing variable or formal param DIEs
2017 outside of a function, so we deem the Dwarf to be
2018 malformed if that happens. Note that the fbGX may be NULL
2019 if the containing DW_TAG_subprogram didn't supply a
2020 DW_AT_frame_base -- that's OK, but there must actually be
2021 a containing DW_TAG_subprogram. */
philippe81d24c32012-12-05 21:08:24 +00002022 if (!global) {
sewardjb8b79ad2008-03-03 01:35:41 +00002023 Bool found = False;
2024 for (i = parser->sp; i >= 0; i--) {
2025 if (parser->isFunc[i]) {
2026 fbGX = parser->fbGX[i];
2027 found = True;
2028 break;
2029 }
2030 }
2031 if (!found) {
2032 if (0 && VG_(clo_verbosity) >= 0) {
2033 VG_(message)(Vg_DebugMsg,
philippe81d24c32012-12-05 21:08:24 +00002034 "warning: parse_var_DIE: non-global variable "
sewardj738856f2009-07-15 14:48:32 +00002035 "outside DW_TAG_subprogram\n");
sewardjb8b79ad2008-03-03 01:35:41 +00002036 }
philippe5c5b8fc2014-05-06 20:15:55 +00002037 /* goto_bad_DIE; */
sewardjb8b79ad2008-03-03 01:35:41 +00002038 /* This seems to happen a lot. Just ignore it -- if,
2039 when we come to evaluation of the location (guarded)
2040 expression, it requires a frame base value, and
2041 there's no expression for that, then evaluation as a
2042 whole will fail. Harmless - a bit of a waste of
2043 cycles but nothing more. */
2044 }
2045 }
2046
philippe81d24c32012-12-05 21:08:24 +00002047 /* re "global ? 0 : parser->sp" (twice), if the var is
2048 marked 'global' then we must put it at the global scope,
sewardjb8b79ad2008-03-03 01:35:41 +00002049 as only the global scope (level 0) covers the entire PC
2050 address space. It is asserted elsewhere that level 0
2051 always covers the entire address space. */
philippe81d24c32012-12-05 21:08:24 +00002052 xa = parser->ranges[global ? 0 : parser->sp];
sewardjb8b79ad2008-03-03 01:35:41 +00002053 nRanges = VG_(sizeXA)(xa);
2054 vg_assert(nRanges >= 0);
2055
sewardj9c606bd2008-09-18 18:12:50 +00002056 tv = ML_(dinfo_zalloc)( "di.readdwarf3.pvD.1", sizeof(TempVar) );
sewardjb8b79ad2008-03-03 01:35:41 +00002057 tv->name = name;
philippe81d24c32012-12-05 21:08:24 +00002058 tv->level = global ? 0 : parser->sp;
sewardjb8b79ad2008-03-03 01:35:41 +00002059 tv->typeR = typeR;
2060 tv->gexpr = gexpr;
2061 tv->fbGX = fbGX;
2062 tv->fName = fileName;
2063 tv->fLine = lineNo;
2064 tv->dioff = posn;
2065 tv->absOri = abs_ori;
2066
2067 /* See explanation on definition of type TempVar for the
2068 reason for this elaboration. */
2069 tv->nRanges = nRanges;
2070 tv->rngOneMin = 0;
2071 tv->rngOneMax = 0;
2072 tv->rngMany = NULL;
2073 if (nRanges == 1) {
2074 AddrRange* range = VG_(indexXA)(xa, 0);
2075 tv->rngOneMin = range->aMin;
2076 tv->rngOneMax = range->aMax;
2077 }
2078 else if (nRanges > 1) {
sewardj9c606bd2008-09-18 18:12:50 +00002079 /* See if we already have a range list which is
2080 structurally identical. If so, use that; if not, clone
2081 this one, and add it to our collection. */
2082 UWord keyW, valW;
2083 if (VG_(lookupFM)( rangestree, &keyW, &valW, (UWord)xa )) {
2084 XArray* old = (XArray*)keyW;
2085 tl_assert(valW == 0);
2086 tl_assert(old != xa);
2087 tv->rngMany = old;
2088 } else {
2089 XArray* cloned = VG_(cloneXA)( "di.readdwarf3.pvD.2", xa );
2090 tv->rngMany = cloned;
2091 VG_(addToFM)( rangestree, (UWord)cloned, 0 );
2092 }
sewardjb8b79ad2008-03-03 01:35:41 +00002093 }
2094
sewardj59a2d182008-08-22 23:18:02 +00002095 VG_(addToXA)( tempvars, &tv );
sewardjb8b79ad2008-03-03 01:35:41 +00002096
2097 TRACE_D3(" Recording this variable, with %ld PC range(s)\n",
2098 VG_(sizeXA)(xa) );
2099 /* collect stats on how effective the ->ranges special
2100 casing is */
2101 if (0) {
sewardj9c606bd2008-09-18 18:12:50 +00002102 static Int ntot=0, ngt=0;
2103 ntot++;
2104 if (tv->rngMany) ngt++;
2105 if (0 == (ntot % 100000))
2106 VG_(printf)("XXXX %d tot, %d cloned\n", ntot, ngt);
sewardjb8b79ad2008-03-03 01:35:41 +00002107 }
2108
2109 }
2110
2111 /* Here are some other weird cases seen in the wild:
2112
2113 We have a variable with a name and a type, but no
2114 location. I guess that's a sign that it has been
2115 optimised away. Ignore it. Here's an example:
2116
2117 static Int lc_compar(void* n1, void* n2) {
2118 MC_Chunk* mc1 = *(MC_Chunk**)n1;
2119 MC_Chunk* mc2 = *(MC_Chunk**)n2;
2120 return (mc1->data < mc2->data ? -1 : 1);
2121 }
2122
2123 Both mc1 and mc2 are like this
2124 <2><5bc>: Abbrev Number: 21 (DW_TAG_variable)
2125 DW_AT_name : mc1
2126 DW_AT_decl_file : 1
2127 DW_AT_decl_line : 216
2128 DW_AT_type : <5d3>
2129
2130 whereas n1 and n2 do have locations specified.
2131
2132 ---------------------------------------------
2133
2134 We see a DW_TAG_formal_parameter with a type, but
2135 no name and no location. It's probably part of a function type
2136 construction, thusly, hence ignore it:
2137 <1><2b4>: Abbrev Number: 12 (DW_TAG_subroutine_type)
2138 DW_AT_sibling : <2c9>
2139 DW_AT_prototyped : 1
2140 DW_AT_type : <114>
2141 <2><2be>: Abbrev Number: 13 (DW_TAG_formal_parameter)
2142 DW_AT_type : <13e>
2143 <2><2c3>: Abbrev Number: 13 (DW_TAG_formal_parameter)
2144 DW_AT_type : <133>
2145
2146 ---------------------------------------------
2147
2148 Is very minimal, like this:
2149 <4><81d>: Abbrev Number: 44 (DW_TAG_variable)
2150 DW_AT_abstract_origin: <7ba>
2151 What that signifies I have no idea. Ignore.
2152
2153 ----------------------------------------------
2154
2155 Is very minimal, like this:
2156 <200f>: DW_TAG_formal_parameter
2157 DW_AT_abstract_ori: <1f4c>
2158 DW_AT_location : 13440
2159 What that signifies I have no idea. Ignore.
2160 It might be significant, though: the variable at least
2161 has a location and so might exist somewhere.
2162 Maybe we should handle this.
2163
2164 ---------------------------------------------
2165
2166 <22407>: DW_TAG_variable
2167 DW_AT_name : (indirect string, offset: 0x6579):
2168 vgPlain_trampoline_stuff_start
2169 DW_AT_decl_file : 29
2170 DW_AT_decl_line : 56
2171 DW_AT_external : 1
2172 DW_AT_declaration : 1
2173
2174 Nameless and typeless variable that has a location? Who
2175 knows. Not me.
2176 <2><3d178>: Abbrev Number: 22 (DW_TAG_variable)
2177 DW_AT_location : 9 byte block: 3 c0 c7 13 38 0 0 0 0
2178 (DW_OP_addr: 3813c7c0)
2179
2180 No, really. Check it out. gcc is quite simply borked.
2181 <3><168cc>: Abbrev Number: 141 (DW_TAG_variable)
2182 // followed by no attributes, and the next DIE is a sibling,
2183 // not a child
2184 */
2185 }
2186 return;
2187
2188 bad_DIE:
philippea0a73932014-06-15 15:42:20 +00002189 dump_bad_die_and_barf(dtag, posn, level,
2190 c_die, saved_die_c_offset,
2191 abbv,
2192 cc);
2193 /*NOTREACHED*/
2194}
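
/* Illustration only, not part of readdwarf3.c: a minimal sketch of how
   parse_var_DIE above turns DW_AT_low_pc / DW_AT_high_pc into the
   inclusive range handed to unitary_range_list. When DW_AT_high_pc is
   not encoded with DW_FORM_addr it is treated as an offset from low_pc;
   either way high_pc is exclusive, so the last covered address is
   high_pc - 1. ToyRange and toy_pc_range are hypothetical names. */

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
   uint64_t aMin;   /* first address covered */
   uint64_t aMax;   /* last address covered (inclusive) */
} ToyRange;

/* Returns 1 and fills *out if the attributes describe a non-empty range,
   0 otherwise (for example lo == hi == 0 for a CU that has no code). */
static int toy_pc_range ( uint64_t low_pc, uint64_t high_pc,
                          int hi_is_offset, ToyRange* out )
{
   assert(out != NULL);
   if (hi_is_offset)
      high_pc += low_pc;        /* constant form: offset from low_pc */
   if (low_pc >= high_pc)
      return 0;                 /* empty or inverted: nothing to cover */
   out->aMin = low_pc;
   out->aMax = high_pc - 1;     /* exclusive end becomes inclusive */
   return 1;
}
```

/* For example, low_pc 0x1000 with an offset-form high_pc of 0x20 and
   low_pc 0x1000 with an address-form high_pc of 0x1020 both give the
   range [0x1000, 0x101f]. */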
2195
2196typedef
2197 struct {
2198 /* The file name table. Is a mapping from integer index to the
2199 (permanent) copy of the string in DebugInfo's .strchunks. */
2200 XArray* /* of UChar* */ filenameTable;
sewardjd9350682012-04-05 07:55:47 +00002201 }
philippea0a73932014-06-15 15:42:20 +00002202 D3InlParser;
2203
2204/* Return the function name corresponding to absori.
2205 The return value is a (permanent) string in DebugInfo's .strchunks. */
static HChar* get_inlFnName (Int absori, CUConst* cc, Bool td3)
{
   Cursor c;
   g_abbv *abbv;
   ULong  atag, abbv_code;
   UInt   has_children;
   UWord  posn;
   HChar *ret = NULL;
   FormContents cts;
   UInt nf_i;

   init_Cursor (&c, cc->escn_debug_info, absori, cc->barf,
                "Overrun get_inlFnName absori");

   posn      = cook_die( cc, get_position_of_Cursor( &c ) );
   abbv_code = get_ULEB128( &c );
   abbv      = get_abbv ( cc, abbv_code);
   atag      = abbv->atag;
   TRACE_D3("\n");
   TRACE_D3(" <get_inlFnName><%lx>: Abbrev Number: %llu (%s)\n",
            posn, abbv_code, ML_(pp_DW_TAG)( atag ) );

   if (atag == 0)
      cc->barf("get_inlFnName: invalid zero tag on DIE");

   has_children = abbv->has_children;
   if (has_children != DW_children_no && has_children != DW_children_yes)
      cc->barf("get_inlFnName: invalid has_children value");

   if (atag != DW_TAG_subprogram)
      cc->barf("get_inlFnName: absori not a subprogram");

   nf_i = 0;
   while (True) {
      DW_AT   attr = (DW_AT)  abbv->nf[nf_i].at_name;
      DW_FORM form = (DW_FORM)abbv->nf[nf_i].at_form;
      nf_i++;
      if (attr == 0 && form == 0) break;
      get_Form_contents( &cts, cc, &c, False/*td3*/, form );
      if (attr == DW_AT_name) {
         HChar *fnname;
         if (cts.szB >= 0)
            cc->barf("get_inlFnName: expecting indirect string");
         fnname = ML_(cur_read_strdup)( cts.u.cur,
                                        "get_inlFnName.1" );
         ret = ML_(addStr)(cc->di, fnname, -1);
         ML_(dinfo_free) (fnname);
         break;
      }
   }

   if (ret)
      return ret;
   else
      return ML_(addStr)(cc->di, "AbsOriFnNameNotFound", -1);
}
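
/* ILLUSTRATIVE SKETCH, not part of Valgrind.  The attribute loops in
   this file all walk a table of (name, form) pairs that ends with a
   (0, 0) entry.  The stand-alone code below shows only that walking
   pattern.  SketchNF and sketch_find_attr are hypothetical names made up
   for this example.  Unlike the real loops, the sketch does not read any
   attribute data; the real code has to call get_Form_contents for every
   entry, interesting or not, so that the cursor stays in step. */
#include <assert.h>

typedef struct { unsigned at_name; unsigned at_form; } SketchNF;

/* Return the index of the first entry whose name is 'wanted', or -1 if
   the (0, 0) terminator is reached first. */
static int sketch_find_attr ( const SketchNF* nf, unsigned wanted )
{
   int i = 0;
   while (1) {
      unsigned attr = nf[i].at_name;
      unsigned form = nf[i].at_form;
      if (attr == 0 && form == 0) break;   /* end of table */
      if (attr == wanted) return i;
      i++;
   }
   return -1;
}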

/* Returns True if the (possible) children of the current DIE are
   interesting to parse.  Returns False otherwise.
   If the current DIE has a sibling, the uninteresting children can
   perhaps be skipped (if the DIE has a DW_AT_sibling). */
__attribute__((noinline))
static Bool parse_inl_DIE (
   /*MOD*/D3InlParser* parser,
   DW_TAG dtag,
   UWord posn,
   Int level,
   Cursor* c_die,
   g_abbv *abbv,
   CUConst* cc,
   Bool td3
)
{
   FormContents cts;
   UInt nf_i;

   UWord saved_die_c_offset = get_position_of_Cursor( c_die );

   /* Get info about DW_TAG_compile_unit and DW_TAG_partial_unit (which
      in theory could also contain inlined fn calls). */
   if (dtag == DW_TAG_compile_unit || dtag == DW_TAG_partial_unit) {
      Bool have_lo = False;
      Addr ip_lo = 0;

      nf_i = 0;
      while (True) {
         DW_AT   attr = (DW_AT)  abbv->nf[nf_i].at_name;
         DW_FORM form = (DW_FORM)abbv->nf[nf_i].at_form;
         nf_i++;
         if (attr == 0 && form == 0) break;
         get_Form_contents( &cts, cc, c_die, False/*td3*/, form );
         if (attr == DW_AT_low_pc && cts.szB > 0) {
            ip_lo = cts.u.val;
            have_lo = True;
         }
         if (attr == DW_AT_stmt_list && cts.szB > 0) {
            read_filename_table( parser->filenameTable, cc, cts.u.val, td3 );
         }
      }
      if (level == 0)
         setup_cu_svma (cc, have_lo, ip_lo, td3);
   }

   if (dtag == DW_TAG_inlined_subroutine) {
      Bool have_lo = False;
      Bool have_hi1 = False;
      Bool have_range = False;
      Bool hiIsRelative = False;
      Addr ip_lo = 0;
      Addr ip_hi1 = 0;
      Addr rangeoff = 0;
      HChar* caller_filename = NULL;
      Int caller_lineno = 0;
      Int inlinedfn_abstract_origin = 0;

      nf_i = 0;
      while (True) {
         DW_AT   attr = (DW_AT)  abbv->nf[nf_i].at_name;
         DW_FORM form = (DW_FORM)abbv->nf[nf_i].at_form;
         nf_i++;
         if (attr == 0 && form == 0) break;
         get_Form_contents( &cts, cc, c_die, False/*td3*/, form );
         if (attr == DW_AT_call_file && cts.szB > 0) {
            Int ftabIx = (Int)cts.u.val;
            if (ftabIx >= 1
                && ftabIx < VG_(sizeXA)( parser->filenameTable )) {
               caller_filename = *(HChar**)
                          VG_(indexXA)( parser->filenameTable, ftabIx );
               vg_assert(caller_filename);
            }
            if (0) VG_(printf)("XXX caller_filename = %s\n", caller_filename);
         }
         if (attr == DW_AT_call_line && cts.szB > 0) {
            caller_lineno = cts.u.val;
         }

         if (attr == DW_AT_abstract_origin && cts.szB > 0) {
            inlinedfn_abstract_origin = cts.u.val;
         }

         if (attr == DW_AT_low_pc && cts.szB > 0) {
            ip_lo = cts.u.val;
            have_lo = True;
         }
         if (attr == DW_AT_high_pc && cts.szB > 0) {
            ip_hi1 = cts.u.val;
            have_hi1 = True;
            /* Any form other than DW_FORM_addr means the value is an
               offset from DW_AT_low_pc rather than an address. */
            if (form != DW_FORM_addr)
               hiIsRelative = True;
         }
         if (attr == DW_AT_ranges && cts.szB > 0) {
            rangeoff = cts.u.val;
            have_range = True;
         }
      }
      if (have_lo && have_hi1 && hiIsRelative)
         ip_hi1 += ip_lo;
      /* Do we have something that looks sane? */
      if (dtag == DW_TAG_inlined_subroutine
          && (!have_lo) && (!have_hi1) && (!have_range)) {
         /* Seems strange.  How can an inlined subroutine have
            no code? */
         goto_bad_DIE;
      } else
      if (have_lo && have_hi1 && (!have_range)) {
         /* This inlined call is just a single address range. */
         if (ip_lo < ip_hi1) {
            ML_(addInlInfo) (cc->di,
                             ip_lo, ip_hi1,
                             get_inlFnName (inlinedfn_abstract_origin, cc, td3),
                             caller_filename,
                             NULL, // INLINED TBD dirname ?????
                             caller_lineno, level);
         }
      } else if (have_range) {
         /* This inlined call is several address ranges. */
         XArray *ranges;
         Word j;
         HChar *inlfnname = get_inlFnName (inlinedfn_abstract_origin, cc, td3);

         ranges = get_range_list( cc, td3,
                                  rangeoff, cc->cu_svma );
         for (j = 0; j < VG_(sizeXA)( ranges ); j++) {
            AddrRange* range = (AddrRange*) VG_(indexXA)( ranges, j );
            ML_(addInlInfo) (cc->di,
                             range->aMin, range->aMax+1,
                             // aMax+1 as range has its last bound included
                             // while ML_(addInlInfo) expects last bound not
                             // included.
                             inlfnname,
                             caller_filename,
                             NULL, // INLINED TBD dirname ?????
                             caller_lineno, level);
         }
         VG_(deleteXA)( ranges );
      } else
         goto_bad_DIE;
   }

   // Only recursively parse the (possible) children of DIEs which
   // might contain a DW_TAG_inlined_subroutine:
   return dtag == DW_TAG_lexical_block || dtag == DW_TAG_subprogram
          || dtag == DW_TAG_inlined_subroutine
          || dtag == DW_TAG_compile_unit || dtag == DW_TAG_partial_unit;

  bad_DIE:
   dump_bad_die_and_barf(dtag, posn, level,
                         c_die, saved_die_c_offset,
                         abbv,
                         cc);
   /*NOTREACHED*/
}
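
/* ILLUSTRATIVE SKETCH, not part of Valgrind.  parse_inl_DIE above turns
   two different address encodings into the half-open range [lo, hi) that
   ML_(addInlInfo) is documented (in the comment above) to expect.  The
   stand-alone helpers below restate that arithmetic.  SketchAddr and the
   sketch_* functions are hypothetical names made up for this example. */
#include <assert.h>

typedef unsigned long SketchAddr;

/* Build [*lo, *hi) from a (low_pc, high_pc) pair.  If 'hi_is_relative'
   is nonzero, high_pc is an offset from low_pc rather than an address.
   Returns 1 if the range is non-empty, else 0 (outputs untouched). */
static int sketch_half_open_from_pcs ( SketchAddr low_pc, SketchAddr high_pc,
                                       int hi_is_relative,
                                       SketchAddr* lo, SketchAddr* hi )
{
   SketchAddr end = hi_is_relative ? low_pc + high_pc : high_pc;
   if (low_pc >= end) return 0;
   *lo = low_pc;
   *hi = end;
   return 1;
}

/* A range-list entry [aMin, aMax] includes its last address, so the
   matching half-open upper bound is aMax + 1. */
static SketchAddr sketch_half_open_hi ( SketchAddr aMax )
{
   return aMax + 1;
}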


/*------------------------------------------------------------*/
/*---                                                      ---*/
/*--- Parsing of type-related DIEs                         ---*/
/*---                                                      ---*/
/*------------------------------------------------------------*/

#define N_D3_TYPE_STACK 16

typedef
   struct {
      /* What source language?  'A'=Ada83/95,
                                'C'=C/C++,
                                'F'=Fortran,
                                '?'=other
         Established once per compilation unit. */
      UChar language;
      /* A stack of types which are currently under construction */
      Int   sp; /* [sp] is innermost active entry; sp==-1 for empty
                   stack */
      /* Note that the TyEnts in qparentE are temporary copies of the
         ones accumulating in the main tyent array.  So it is not safe
         to free up anything on them when popping them off the stack
         (iow, it isn't safe to use TyEnt__make_EMPTY on them).  Just
         memset them to zero when done. */
      TyEnt qparentE[N_D3_TYPE_STACK]; /* parent TyEnts */
      Int   qlevel[N_D3_TYPE_STACK];

   }
   D3TypeParser;

static void typestack_show ( D3TypeParser* parser, const HChar* str ) {
   Word i;
   VG_(printf)(" typestack (%s) {\n", str);
   for (i = 0; i <= parser->sp; i++) {
      VG_(printf)(" [%ld] (level %d): ", i, parser->qlevel[i]);
      ML_(pp_TyEnt)( &parser->qparentE[i] );
      VG_(printf)("\n");
   }
   VG_(printf)(" }\n");
}

/* Remove from the stack all entries with .level > 'level' */
static
void typestack_preen ( D3TypeParser* parser, Bool td3, Int level )
{
   Bool changed = False;
   vg_assert(parser->sp < N_D3_TYPE_STACK);
   while (True) {
      vg_assert(parser->sp >= -1);
      if (parser->sp == -1) break;
      if (parser->qlevel[parser->sp] <= level) break;
      if (0)
         TRACE_D3("BBBBAAAA typestack_pop [newsp=%d]\n", parser->sp-1);
      vg_assert(ML_(TyEnt__is_type)(&parser->qparentE[parser->sp]));
      VG_(memset)(&parser->qparentE[parser->sp], 0, sizeof(TyEnt));
      parser->qparentE[parser->sp].cuOff = D3_INVALID_CUOFF;
      parser->qparentE[parser->sp].tag = Te_EMPTY;
      parser->qlevel[parser->sp] = 0;
      parser->sp--;
      changed = True;
   }
   if (changed && td3)
      typestack_show( parser, "after preen" );
}

static Bool typestack_is_empty ( D3TypeParser* parser ) {
   vg_assert(parser->sp >= -1 && parser->sp < N_D3_TYPE_STACK);
   return parser->sp == -1;
}

static void typestack_push ( CUConst* cc,
                             D3TypeParser* parser,
                             Bool td3,
                             TyEnt* parentE, Int level ) {
   if (0)
      TRACE_D3("BBBBAAAA typestack_push[newsp=%d]: %d %05lx\n",
               parser->sp+1, level, parentE->cuOff);

   /* First we need to zap everything >= 'level', as we are about to
      replace any previous entry at 'level', so .. */
   typestack_preen(parser, /*td3*/False, level-1);

   vg_assert(parser->sp >= -1);
   vg_assert(parser->sp < N_D3_TYPE_STACK);
   if (parser->sp == N_D3_TYPE_STACK-1)
      cc->barf("typestack_push: N_D3_TYPE_STACK is too low; "
               "increase and recompile");
   if (parser->sp >= 0)
      vg_assert(parser->qlevel[parser->sp] < level);
   parser->sp++;
   vg_assert(parser->qparentE[parser->sp].tag == Te_EMPTY);
   vg_assert(parser->qlevel[parser->sp] == 0);
   vg_assert(parentE);
   vg_assert(ML_(TyEnt__is_type)(parentE));
   vg_assert(parentE->cuOff != D3_INVALID_CUOFF);
   parser->qparentE[parser->sp] = *parentE;
   parser->qlevel[parser->sp] = level;
   if (td3)
      typestack_show( parser, "after push" );
}
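
/* ILLUSTRATIVE SKETCH, not part of Valgrind.  The type stack above keeps
   one entry per "parent under construction", tagged with its DIE nesting
   level; pushing at level L first discards everything at level >= L.
   The stand-alone code below shows just that discipline, with an int
   payload standing in for the TyEnt copy.  SketchStack and the sketch_*
   functions are hypothetical names made up for this example, and a full
   stack is reported by return value rather than by calling barf. */
#include <assert.h>

#define SKETCH_STACK_N 16

typedef struct {
   int sp;                       /* index of innermost entry; -1 if empty */
   int level[SKETCH_STACK_N];    /* nesting level of each entry */
   int payload[SKETCH_STACK_N];  /* stands in for the parent TyEnt */
} SketchStack;

static void sketch_init ( SketchStack* s ) { s->sp = -1; }

/* Drop every entry whose level is greater than 'level'. */
static void sketch_preen ( SketchStack* s, int level )
{
   while (s->sp >= 0 && s->level[s->sp] > level)
      s->sp--;
}

/* Make 'payload' the parent at 'level'.  Entries at 'level' or deeper are
   removed first.  Returns 1 on success, 0 if the stack is full. */
static int sketch_push ( SketchStack* s, int payload, int level )
{
   sketch_preen(s, level - 1);
   if (s->sp == SKETCH_STACK_N - 1) return 0;
   /* After preening, levels are strictly increasing up the stack. */
   assert(s->sp == -1 || s->level[s->sp] < level);
   s->sp++;
   s->level[s->sp] = level;
   s->payload[s->sp] = payload;
   return 1;
}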

/* True if the subrange type being parsed gives the bounds of an array. */
static Bool subrange_type_denotes_array_bounds ( D3TypeParser* parser,
                                                 DW_TAG dtag ) {
   vg_assert(dtag == DW_TAG_subrange_type);
   /* For most languages, a subrange_type dtag always gives the
      bounds of an array.
      For Ada, there are additional conditions as a subrange_type
      is also used for other purposes. */
   if (parser->language != 'A')
      /* not Ada, so it definitely denotes an array bound. */
      return True;
   else
      /* Extra constraints for Ada: it only denotes an array bound if .. */
      return (! typestack_is_empty(parser)
              && parser->qparentE[parser->sp].tag == Te_TyArray);
}

/* Parse a type-related DIE.  'parser' holds the current parser state.
   'admin' is where the completed types are dumped.  'dtag' is the tag
   for this DIE.  'c_die' points to the start of the data fields (FORM
   stuff) for the DIE.  'abbv' is the parsed abbreviation which
   describes the DIE.

   We may find the DIE uninteresting, in which case we should ignore
   it.

   What happens: the DIE is examined.  If uninteresting, it is ignored.
   Otherwise, the DIE gives rise to two things:

   (1) the offset of this DIE in the CU -- the cuOffset, a UWord
   (2) a TyAdmin structure, which holds the type, or related stuff

   (2) is added at the end of 'tyadmins', at some index, say 'i'.

   A pair (cuOffset, i) is added to 'tydict'.

   Hence 'tyadmins' holds the actual type entities, and 'tydict' holds
   a mapping from cuOffset to the index of the corresponding entry in
   'tyadmins'.

   When resolving a cuOffset to a TyAdmin, first look up the cuOffset
   in the tydict (by binary search).  This gives an index into
   tyadmins, and the required entity lives in tyadmins at that index.
*/
__attribute__((noinline))
static void parse_type_DIE ( /*MOD*/XArray* /* of TyEnt */ tyents,
                             /*MOD*/D3TypeParser* parser,
                             DW_TAG dtag,
                             UWord posn,
                             Int level,
                             Cursor* c_die,
                             g_abbv *abbv,
                             CUConst* cc,
                             Bool td3 )
{
   FormContents cts;
   UInt nf_i;
   TyEnt typeE;
   TyEnt atomE;
   TyEnt fieldE;
   TyEnt boundE;

   UWord saved_die_c_offset = get_position_of_Cursor( c_die );

   VG_(memset)( &typeE, 0xAA, sizeof(typeE) );
   VG_(memset)( &atomE, 0xAA, sizeof(atomE) );
   VG_(memset)( &fieldE, 0xAA, sizeof(fieldE) );
   VG_(memset)( &boundE, 0xAA, sizeof(boundE) );

   /* If we've returned to a level at or above any previously noted
      parent, un-note it, so we don't believe we're still collecting
      its children. */
   typestack_preen( parser, td3, level-1 );

   if (dtag == DW_TAG_compile_unit
       || dtag == DW_TAG_type_unit
       || dtag == DW_TAG_partial_unit) {
      /* See if we can find DW_AT_language, since it is important for
         establishing array bounds (see DW_TAG_subrange_type below in
         this fn) */
      nf_i = 0;
      while (True) {
         DW_AT   attr = (DW_AT)  abbv->nf[nf_i].at_name;
         DW_FORM form = (DW_FORM)abbv->nf[nf_i].at_form;
         nf_i++;
         if (attr == 0 && form == 0) break;
         get_Form_contents( &cts, cc, c_die, False/*td3*/, form );
         if (attr != DW_AT_language)
            continue;
         if (cts.szB <= 0)
            goto_bad_DIE;
         switch (cts.u.val) {
            case DW_LANG_C89: case DW_LANG_C:
            case DW_LANG_C_plus_plus: case DW_LANG_ObjC:
            case DW_LANG_ObjC_plus_plus: case DW_LANG_UPC:
            case DW_LANG_Upc: case DW_LANG_C99:
               parser->language = 'C'; break;
            case DW_LANG_Fortran77: case DW_LANG_Fortran90:
            case DW_LANG_Fortran95:
               parser->language = 'F'; break;
            case DW_LANG_Ada83: case DW_LANG_Ada95:
               parser->language = 'A'; break;
            case DW_LANG_Cobol74:
            case DW_LANG_Cobol85: case DW_LANG_Pascal83:
            case DW_LANG_Modula2: case DW_LANG_Java:
            case DW_LANG_PLI:
            case DW_LANG_D: case DW_LANG_Python:
            case DW_LANG_Mips_Assembler:
               parser->language = '?'; break;
            default:
               goto_bad_DIE;
         }
      }
   }

   if (dtag == DW_TAG_base_type) {
      /* We can pick up a new base type any time. */
      VG_(memset)(&typeE, 0, sizeof(typeE));
      typeE.cuOff = D3_INVALID_CUOFF;
      typeE.tag = Te_TyBase;
      nf_i = 0;
      while (True) {
         DW_AT   attr = (DW_AT)  abbv->nf[nf_i].at_name;
         DW_FORM form = (DW_FORM)abbv->nf[nf_i].at_form;
         nf_i++;
         if (attr == 0 && form == 0) break;
         get_Form_contents( &cts, cc, c_die, False/*td3*/, form );
         if (attr == DW_AT_name && cts.szB < 0) {
            typeE.Te.TyBase.name
               = ML_(cur_read_strdup)( cts.u.cur,
                                       "di.readdwarf3.ptD.base_type.1" );
         }
         if (attr == DW_AT_byte_size && cts.szB > 0) {
            typeE.Te.TyBase.szB = cts.u.val;
         }
         if (attr == DW_AT_encoding && cts.szB > 0) {
            switch (cts.u.val) {
               case DW_ATE_unsigned: case DW_ATE_unsigned_char:
               case DW_ATE_UTF: /* since DWARF4, e.g. char16_t from C++ */
               case DW_ATE_boolean:/* FIXME - is this correct? */
               case DW_ATE_unsigned_fixed:
                  typeE.Te.TyBase.enc = 'U'; break;
               case DW_ATE_signed: case DW_ATE_signed_char:
               case DW_ATE_signed_fixed:
                  typeE.Te.TyBase.enc = 'S'; break;
               case DW_ATE_float:
                  typeE.Te.TyBase.enc = 'F'; break;
               case DW_ATE_complex_float:
                  typeE.Te.TyBase.enc = 'C'; break;
               default:
                  goto_bad_DIE;
            }
         }
      }

      /* Invent a name if it doesn't have one.  gcc-4.3
         -ftree-vectorize is observed to emit nameless base types. */
      if (!typeE.Te.TyBase.name)
         typeE.Te.TyBase.name
            = ML_(dinfo_strdup)( "di.readdwarf3.ptD.base_type.2",
                                 "<anon_base_type>" );

      /* Do we have something that looks sane? */
      if (/* must have a name */
          typeE.Te.TyBase.name == NULL
          /* and a plausible size.  Yes, really 32: "complex long
             double" apparently has size=32 */
          || typeE.Te.TyBase.szB < 0 || typeE.Te.TyBase.szB > 32
          /* and a plausible encoding */
          || (typeE.Te.TyBase.enc != 'U'
              && typeE.Te.TyBase.enc != 'S'
              && typeE.Te.TyBase.enc != 'F'
              && typeE.Te.TyBase.enc != 'C'))
         goto_bad_DIE;
      /* Last minute hack: if we see this
         <1><515>: DW_TAG_base_type
             DW_AT_byte_size   : 0
             DW_AT_encoding    : 5
             DW_AT_name        : void
         convert it into a real Void type. */
      if (typeE.Te.TyBase.szB == 0
          && 0 == VG_(strcmp)("void", typeE.Te.TyBase.name)) {
         ML_(TyEnt__make_EMPTY)(&typeE);
         typeE.tag = Te_TyVoid;
         typeE.Te.TyVoid.isFake = False; /* it's a real one! */
      }

      goto acquire_Type;
   }

   /*
    * An example of DW_TAG_rvalue_reference_type:
    *
    * $ readelf --debug-dump /usr/lib/debug/usr/lib/libstdc++.so.6.0.16.debug
    *  <1><1014>: Abbrev Number: 55 (DW_TAG_rvalue_reference_type)
    *     <1015>   DW_AT_byte_size   : 4
    *     <1016>   DW_AT_type        : <0xe52>
    */
   if (dtag == DW_TAG_pointer_type || dtag == DW_TAG_reference_type
       || dtag == DW_TAG_ptr_to_member_type
       || dtag == DW_TAG_rvalue_reference_type) {
      /* This seems legit for _pointer_type and _reference_type.  I
         don't know if rolling _ptr_to_member_type in here really is
         legit, but it's better than not handling it at all. */
      VG_(memset)(&typeE, 0, sizeof(typeE));
      typeE.cuOff = D3_INVALID_CUOFF;
      switch (dtag) {
         case DW_TAG_pointer_type:
            typeE.tag = Te_TyPtr;
            break;
         case DW_TAG_reference_type:
            typeE.tag = Te_TyRef;
            break;
         case DW_TAG_ptr_to_member_type:
            typeE.tag = Te_TyPtrMbr;
            break;
         case DW_TAG_rvalue_reference_type:
            typeE.tag = Te_TyRvalRef;
            break;
         default:
            vg_assert(False);
      }
      /* target type defaults to void */
      typeE.Te.TyPorR.typeR = D3_FAKEVOID_CUOFF;
      /* These four type kinds don't *have* to specify their size, in
         which case we assume it's a machine word.  But if they do
         specify it, it must be a machine word :-)  This probably
         assumes that the word size of the Dwarf3 we're reading is the
         same size as that on the machine.  gcc appears to give a size
         whereas icc9 doesn't. */
      typeE.Te.TyPorR.szB = sizeof(UWord);
      nf_i = 0;
      while (True) {
         DW_AT   attr = (DW_AT)  abbv->nf[nf_i].at_name;
         DW_FORM form = (DW_FORM)abbv->nf[nf_i].at_form;
         nf_i++;
         if (attr == 0 && form == 0) break;
         get_Form_contents( &cts, cc, c_die, False/*td3*/, form );
         if (attr == DW_AT_byte_size && cts.szB > 0) {
            typeE.Te.TyPorR.szB = cts.u.val;
         }
         if (attr == DW_AT_type && cts.szB > 0) {
            typeE.Te.TyPorR.typeR
               = cook_die_using_form( cc, (UWord)cts.u.val, form );
         }
      }
      /* Do we have something that looks sane? */
      if (typeE.Te.TyPorR.szB != sizeof(UWord))
         goto_bad_DIE;
      else
         goto acquire_Type;
   }

   if (dtag == DW_TAG_enumeration_type) {
      /* Create a new Type to hold the results. */
      VG_(memset)(&typeE, 0, sizeof(typeE));
      typeE.cuOff = posn;
      typeE.tag = Te_TyEnum;
      Bool is_decl = False;
      typeE.Te.TyEnum.atomRs
         = VG_(newXA)( ML_(dinfo_zalloc), "di.readdwarf3.ptD.enum_type.1",
                       ML_(dinfo_free),
                       sizeof(UWord) );
      nf_i = 0;
      while (True) {
         DW_AT   attr = (DW_AT)  abbv->nf[nf_i].at_name;
         DW_FORM form = (DW_FORM)abbv->nf[nf_i].at_form;
         nf_i++;
         if (attr == 0 && form == 0) break;
         get_Form_contents( &cts, cc, c_die, False/*td3*/, form );
         if (attr == DW_AT_name && cts.szB < 0) {
            typeE.Te.TyEnum.name
               = ML_(cur_read_strdup)( cts.u.cur,
                                       "di.readdwarf3.pTD.enum_type.2" );
         }
         if (attr == DW_AT_byte_size && cts.szB > 0) {
            typeE.Te.TyEnum.szB = cts.u.val;
         }
         if (attr == DW_AT_declaration) {
            is_decl = True;
         }
      }

      if (!typeE.Te.TyEnum.name)
         typeE.Te.TyEnum.name
            = ML_(dinfo_strdup)( "di.readdwarf3.pTD.enum_type.3",
                                 "<anon_enum_type>" );

      /* Do we have something that looks sane? */
      if (typeE.Te.TyEnum.szB == 0
          /* we must know the size */
          /* but not for Ada, which uses such dummy
             enumerations as a helper for gdb's Ada mode.
             Also, GCC allows incomplete enums as a GNU extension.
             http://gcc.gnu.org/onlinedocs/gcc/Incomplete-Enums.html
             These are marked as DW_AT_declaration and won't have
             a size.  They can only be used in declarations or as
             pointer types.  You can't allocate variables or storage
             using such an enum type.  (Also, GCC seems to have a bug
             that will put such an enumeration_type into a .debug_types
             unit, which should only contain complete types.) */
          && (parser->language != 'A' && !is_decl)) {
         goto_bad_DIE;
      }

      /* On't stack! */
      typestack_push( cc, parser, td3, &typeE, level );
      goto acquire_Type;
   }

   /* gcc (GCC) 4.4.0 20081017 (experimental) occasionally produces
      DW_TAG_enumerator with only a DW_AT_name but no
      DW_AT_const_value.  This is in violation of the Dwarf3 standard,
      and appears to be a new "feature" of gcc - versions 4.3.x and
      earlier do not appear to do this.  So accept DW_TAG_enumerator
      which only have a name but no value.  An example:

      <1><180>: Abbrev Number: 6 (DW_TAG_enumeration_type)
         <181>   DW_AT_name        : (indirect string, offset: 0xda70):
                                     QtMsgType
         <185>   DW_AT_byte_size   : 4
         <186>   DW_AT_decl_file   : 14
         <187>   DW_AT_decl_line   : 1480
         <189>   DW_AT_sibling     : <0x1a7>
      <2><18d>: Abbrev Number: 7 (DW_TAG_enumerator)
         <18e>   DW_AT_name        : (indirect string, offset: 0x9e18):
                                     QtDebugMsg
      <2><192>: Abbrev Number: 7 (DW_TAG_enumerator)
         <193>   DW_AT_name        : (indirect string, offset: 0x1505f):
                                     QtWarningMsg
      <2><197>: Abbrev Number: 7 (DW_TAG_enumerator)
         <198>   DW_AT_name        : (indirect string, offset: 0x16f4a):
                                     QtCriticalMsg
      <2><19c>: Abbrev Number: 7 (DW_TAG_enumerator)
         <19d>   DW_AT_name        : (indirect string, offset: 0x156dd):
                                     QtFatalMsg
      <2><1a1>: Abbrev Number: 7 (DW_TAG_enumerator)
         <1a2>   DW_AT_name        : (indirect string, offset: 0x13660):
                                     QtSystemMsg
   */
   if (dtag == DW_TAG_enumerator) {
      VG_(memset)( &atomE, 0, sizeof(atomE) );
      atomE.cuOff = posn;
      atomE.tag = Te_Atom;
      nf_i = 0;
      while (True) {
         DW_AT   attr = (DW_AT)  abbv->nf[nf_i].at_name;
         DW_FORM form = (DW_FORM)abbv->nf[nf_i].at_form;
         nf_i++;
         if (attr == 0 && form == 0) break;
         get_Form_contents( &cts, cc, c_die, False/*td3*/, form );
         if (attr == DW_AT_name && cts.szB < 0) {
            atomE.Te.Atom.name
              = ML_(cur_read_strdup)( cts.u.cur,
                                      "di.readdwarf3.pTD.enumerator.1" );
         }
         if (attr == DW_AT_const_value && cts.szB > 0) {
            atomE.Te.Atom.value = cts.u.val;
            atomE.Te.Atom.valueKnown = True;
         }
      }
      /* Do we have something that looks sane? */
      if (atomE.Te.Atom.name == NULL)
         goto_bad_DIE;
      /* Do we have a plausible parent? */
      if (typestack_is_empty(parser)) goto_bad_DIE;
      vg_assert(ML_(TyEnt__is_type)(&parser->qparentE[parser->sp]));
      vg_assert(parser->qparentE[parser->sp].cuOff != D3_INVALID_CUOFF);
      if (level != parser->qlevel[parser->sp]+1) goto_bad_DIE;
      if (parser->qparentE[parser->sp].tag != Te_TyEnum) goto_bad_DIE;
      /* Record this child in the parent */
      vg_assert(parser->qparentE[parser->sp].Te.TyEnum.atomRs);
      VG_(addToXA)( parser->qparentE[parser->sp].Te.TyEnum.atomRs,
                    &atomE );
      /* And record the child itself */
      goto acquire_Atom;
   }

sewardj6a3a2842008-08-20 08:14:07 +00002899 /* Treat DW_TAG_class_type as if it was a DW_TAG_structure_type. I
2900 don't know if this is correct, but it at least makes this reader
2901 usable for gcc-4.3 produced Dwarf3. */
2902 if (dtag == DW_TAG_structure_type || dtag == DW_TAG_class_type
2903 || dtag == DW_TAG_union_type) {
sewardjb8b79ad2008-03-03 01:35:41 +00002904 Bool have_szB = False;
2905 Bool is_decl = False;
2906 Bool is_spec = False;
2907 /* Create a new Type to hold the results. */
sewardj9c606bd2008-09-18 18:12:50 +00002908 VG_(memset)(&typeE, 0, sizeof(typeE));
2909 typeE.cuOff = posn;
2910 typeE.tag = Te_TyStOrUn;
2911 typeE.Te.TyStOrUn.name = NULL;
dejanj0abc4192014-04-04 10:20:03 +00002912 typeE.Te.TyStOrUn.typeR = D3_INVALID_CUOFF;
sewardj9c606bd2008-09-18 18:12:50 +00002913 typeE.Te.TyStOrUn.fieldRs
2914 = VG_(newXA)( ML_(dinfo_zalloc), "di.readdwarf3.pTD.struct_type.1",
2915 ML_(dinfo_free),
2916 sizeof(UWord) );
2917 typeE.Te.TyStOrUn.complete = True;
2918 typeE.Te.TyStOrUn.isStruct = dtag == DW_TAG_structure_type
2919 || dtag == DW_TAG_class_type;
philippe746e97e2014-06-15 10:51:14 +00002920 nf_i = 0;
sewardjb8b79ad2008-03-03 01:35:41 +00002921 while (True) {
philippe746e97e2014-06-15 10:51:14 +00002922 DW_AT attr = (DW_AT) abbv->nf[nf_i].at_name;
2923 DW_FORM form = (DW_FORM)abbv->nf[nf_i].at_form;
2924 nf_i++;
sewardjb8b79ad2008-03-03 01:35:41 +00002925 if (attr == 0 && form == 0) break;
sewardj5d616df2013-07-02 08:07:15 +00002926 get_Form_contents( &cts, cc, c_die, False/*td3*/, form );
2927 if (attr == DW_AT_name && cts.szB < 0) {
sewardj9c606bd2008-09-18 18:12:50 +00002928 typeE.Te.TyStOrUn.name
sewardj5d616df2013-07-02 08:07:15 +00002929 = ML_(cur_read_strdup)( cts.u.cur,
2930 "di.readdwarf3.ptD.struct_type.2" );
sewardjb8b79ad2008-03-03 01:35:41 +00002931 }
sewardj5d616df2013-07-02 08:07:15 +00002932 if (attr == DW_AT_byte_size && cts.szB >= 0) {
2933 typeE.Te.TyStOrUn.szB = cts.u.val;
sewardjb8b79ad2008-03-03 01:35:41 +00002934 have_szB = True;
2935 }
sewardj5d616df2013-07-02 08:07:15 +00002936 if (attr == DW_AT_declaration && cts.szB > 0 && cts.u.val > 0) {
sewardjb8b79ad2008-03-03 01:35:41 +00002937 is_decl = True;
2938 }
sewardj5d616df2013-07-02 08:07:15 +00002939 if (attr == DW_AT_specification && cts.szB > 0 && cts.u.val > 0) {
sewardjb8b79ad2008-03-03 01:35:41 +00002940 is_spec = True;
2941 }
dejanj0abc4192014-04-04 10:20:03 +00002942 if (attr == DW_AT_signature && form == DW_FORM_ref_sig8
2943 && cts.szB > 0) {
2944 have_szB = True;
2945 typeE.Te.TyStOrUn.szB = 8;
2946 typeE.Te.TyStOrUn.typeR
2947 = cook_die_using_form( cc, (UWord)cts.u.val, form );
2948 }
sewardjb8b79ad2008-03-03 01:35:41 +00002949 }
2950 /* Do we have something that looks sane? */
2951 if (is_decl && (!is_spec)) {
2952 /* It's a DW_AT_declaration. We require the name but
2953 nothing else. */
sewardj00888882012-06-30 20:21:58 +00002954 /* JRS 2012-06-28: following discussion w/ tromey, if the the
2955 type doesn't have name, just make one up, and accept it.
2956 It might be referred to by other DIEs, so ignoring it
2957 doesn't seem like a safe option. */
sewardj9c606bd2008-09-18 18:12:50 +00002958 if (typeE.Te.TyStOrUn.name == NULL)
sewardj00888882012-06-30 20:21:58 +00002959 typeE.Te.TyStOrUn.name
2960 = ML_(dinfo_strdup)( "di.readdwarf3.ptD.struct_type.3",
2961 "<anon_struct_type>" );
sewardj9c606bd2008-09-18 18:12:50 +00002962 typeE.Te.TyStOrUn.complete = False;
sewardjb34bb1d2009-08-10 18:59:54 +00002963 /* JRS 2009 Aug 10: <possible kludge>? */
2964 /* Push this tyent on the stack, even though it's incomplete.
2965 It appears that gcc-4.4 on Fedora 11 will sometimes create
2966 DW_TAG_member entries for it, and so we need to have a
2967 plausible parent present in order for that to work. See
2968 #200029 comments 8 and 9. */
2969 typestack_push( cc, parser, td3, &typeE, level );
2970 /* </possible kludge> */
sewardjb8b79ad2008-03-03 01:35:41 +00002971 goto acquire_Type;
2972 }
2973 if ((!is_decl) /* && (!is_spec) */) {
2974 /* this is the common, ordinary case */
philippe5c5b8fc2014-05-06 20:15:55 +00002975 /* The name can be present, or not */
2976 if (!have_szB) {
2977 /* Normally we must know the size.
2978 However, an Ada record with discriminants might have no size,
2979 and neither might a C struct with a VLA in the middle
2980 (a gcc extension).
2981 In such cases some GNAT DWARF extensions and/or DWARF entries
2982 allow the struct size to be calculated at runtime.
2983 We cannot do that (yet?), so the temporary kludge is to use
2984 a small size. */
2985 typeE.Te.TyStOrUn.szB = 1;
2986 }
sewardjb8b79ad2008-03-03 01:35:41 +00002987 /* Onto the stack! */
sewardj9c606bd2008-09-18 18:12:50 +00002988 typestack_push( cc, parser, td3, &typeE, level );
sewardjb8b79ad2008-03-03 01:35:41 +00002989 goto acquire_Type;
2990 }
2991 else {
2992 /* don't know how to handle any other variants just now */
philippe5c5b8fc2014-05-06 20:15:55 +00002993 goto_bad_DIE;
sewardjb8b79ad2008-03-03 01:35:41 +00002994 }
2995 }
2996
2997 if (dtag == DW_TAG_member) {
2998 /* Acquire member entries for both DW_TAG_structure_type and
2999 DW_TAG_union_type. They differ slightly, in that struct
3000 members must have a DW_AT_data_member_location expression
3001 whereas union members must not. */
3002 Bool parent_is_struct;
sewardj9c606bd2008-09-18 18:12:50 +00003003 VG_(memset)( &fieldE, 0, sizeof(fieldE) );
3004 fieldE.cuOff = posn;
3005 fieldE.tag = Te_Field;
3006 fieldE.Te.Field.typeR = D3_INVALID_CUOFF;
philippe746e97e2014-06-15 10:51:14 +00003007 nf_i = 0;
sewardjb8b79ad2008-03-03 01:35:41 +00003008 while (True) {
philippe746e97e2014-06-15 10:51:14 +00003009 DW_AT attr = (DW_AT) abbv->nf[nf_i].at_name;
3010 DW_FORM form = (DW_FORM)abbv->nf[nf_i].at_form;
3011 nf_i++;
sewardjb8b79ad2008-03-03 01:35:41 +00003012 if (attr == 0 && form == 0) break;
sewardj5d616df2013-07-02 08:07:15 +00003013 get_Form_contents( &cts, cc, c_die, False/*td3*/, form );
3014 if (attr == DW_AT_name && cts.szB < 0) {
sewardj9c606bd2008-09-18 18:12:50 +00003015 fieldE.Te.Field.name
sewardj5d616df2013-07-02 08:07:15 +00003016 = ML_(cur_read_strdup)( cts.u.cur,
3017 "di.readdwarf3.ptD.member.1" );
sewardjb8b79ad2008-03-03 01:35:41 +00003018 }
sewardj5d616df2013-07-02 08:07:15 +00003019 if (attr == DW_AT_type && cts.szB > 0) {
3020 fieldE.Te.Field.typeR
3021 = cook_die_using_form( cc, (UWord)cts.u.val, form );
sewardjb8b79ad2008-03-03 01:35:41 +00003022 }
tom3c9cf342009-11-12 13:28:34 +00003023 /* There are 2 different cases for DW_AT_data_member_location.
3024 If it is a constant-class attribute, it contains the byte offset
3025 from the beginning of the containing entity.
3026 Otherwise it is a location expression. */
sewardj5d616df2013-07-02 08:07:15 +00003027 if (attr == DW_AT_data_member_location && cts.szB > 0) {
tom3c9cf342009-11-12 13:28:34 +00003028 fieldE.Te.Field.nLoc = -1;
sewardj5d616df2013-07-02 08:07:15 +00003029 fieldE.Te.Field.pos.offset = cts.u.val;
3030 }
3031 if (attr == DW_AT_data_member_location && cts.szB <= 0) {
3032 fieldE.Te.Field.nLoc = (UWord)(-cts.szB);
tom3c9cf342009-11-12 13:28:34 +00003033 fieldE.Te.Field.pos.loc
sewardj5d616df2013-07-02 08:07:15 +00003034 = ML_(cur_read_memdup)( cts.u.cur,
3035 (SizeT)fieldE.Te.Field.nLoc,
3036 "di.readdwarf3.ptD.member.2" );
sewardjb8b79ad2008-03-03 01:35:41 +00003037 }
3038 }
3039 /* Do we have a plausible parent? */
philippe5c5b8fc2014-05-06 20:15:55 +00003040 if (typestack_is_empty(parser)) goto_bad_DIE;
sewardj9c606bd2008-09-18 18:12:50 +00003041 vg_assert(ML_(TyEnt__is_type)(&parser->qparentE[parser->sp]));
3042 vg_assert(parser->qparentE[parser->sp].cuOff != D3_INVALID_CUOFF);
philippe5c5b8fc2014-05-06 20:15:55 +00003043 if (level != parser->qlevel[parser->sp]+1) goto_bad_DIE;
3044 if (parser->qparentE[parser->sp].tag != Te_TyStOrUn) goto_bad_DIE;
sewardjb8b79ad2008-03-03 01:35:41 +00003045 /* Do we have something that looks sane? If this is a member of a
3046 struct, we must have a location expression; but if a member
3047 of a union that is irrelevant (D3 spec sec 5.6.6). We ought
3048 to reject in the latter case, but some compilers have been
3049 observed to emit constant-zero expressions. So just ignore
3050 them. */
3051 parent_is_struct
sewardj9c606bd2008-09-18 18:12:50 +00003052 = parser->qparentE[parser->sp].Te.TyStOrUn.isStruct;
3053 if (!fieldE.Te.Field.name)
3054 fieldE.Te.Field.name
3055 = ML_(dinfo_strdup)( "di.readdwarf3.ptD.member.3",
3056 "<anon_field>" );
3057 vg_assert(fieldE.Te.Field.name);
3058 if (fieldE.Te.Field.typeR == D3_INVALID_CUOFF)
philippe5c5b8fc2014-05-06 20:15:55 +00003059 goto_bad_DIE;
tom3c9cf342009-11-12 13:28:34 +00003060 if (fieldE.Te.Field.nLoc) {
tom07de5dd2009-08-03 14:39:54 +00003061 if (!parent_is_struct) {
3062 /* If this is a union type, pretend we haven't seen the data
3063 member location expression, as it is by definition
3064 redundant (it must be zero). */
tom3c9cf342009-11-12 13:28:34 +00003065 if (fieldE.Te.Field.nLoc > 0)
3066 ML_(dinfo_free)(fieldE.Te.Field.pos.loc);
3067 fieldE.Te.Field.pos.loc = NULL;
tom07de5dd2009-08-03 14:39:54 +00003068 fieldE.Te.Field.nLoc = 0;
3069 }
3070 /* Record this child in the parent */
3071 fieldE.Te.Field.isStruct = parent_is_struct;
3072 vg_assert(parser->qparentE[parser->sp].Te.TyStOrUn.fieldRs);
3073 VG_(addToXA)( parser->qparentE[parser->sp].Te.TyStOrUn.fieldRs,
3074 &posn );
3075 /* And record the child itself */
3076 goto acquire_Field;
3077 } else {
3078 /* Member with no location - this can happen with static
3079 const members in C++ code which are compile time constants
3080 that do not exist in the class. They're not of any interest
3081 to us so we ignore them. */
philipped9df0ea2012-02-28 20:10:05 +00003082 ML_(TyEnt__make_EMPTY)(&fieldE);
sewardjb8b79ad2008-03-03 01:35:41 +00003083 }
sewardjb8b79ad2008-03-03 01:35:41 +00003084 }
3085
3086 if (dtag == DW_TAG_array_type) {
sewardj9c606bd2008-09-18 18:12:50 +00003087 VG_(memset)(&typeE, 0, sizeof(typeE));
3088 typeE.cuOff = posn;
3089 typeE.tag = Te_TyArray;
3090 typeE.Te.TyArray.typeR = D3_INVALID_CUOFF;
3091 typeE.Te.TyArray.boundRs
3092 = VG_(newXA)( ML_(dinfo_zalloc), "di.readdwarf3.ptD.array_type.1",
3093 ML_(dinfo_free),
3094 sizeof(UWord) );
philippe746e97e2014-06-15 10:51:14 +00003095 nf_i = 0;
sewardjb8b79ad2008-03-03 01:35:41 +00003096 while (True) {
philippe746e97e2014-06-15 10:51:14 +00003097 DW_AT attr = (DW_AT) abbv->nf[nf_i].at_name;
3098 DW_FORM form = (DW_FORM)abbv->nf[nf_i].at_form;
3099 nf_i++;
sewardjb8b79ad2008-03-03 01:35:41 +00003100 if (attr == 0 && form == 0) break;
sewardj5d616df2013-07-02 08:07:15 +00003101 get_Form_contents( &cts, cc, c_die, False/*td3*/, form );
3102 if (attr == DW_AT_type && cts.szB > 0) {
3103 typeE.Te.TyArray.typeR
3104 = cook_die_using_form( cc, (UWord)cts.u.val, form );
sewardjb8b79ad2008-03-03 01:35:41 +00003105 }
3106 }
sewardj9c606bd2008-09-18 18:12:50 +00003107 if (typeE.Te.TyArray.typeR == D3_INVALID_CUOFF)
philippe5c5b8fc2014-05-06 20:15:55 +00003108 goto_bad_DIE;
sewardjb8b79ad2008-03-03 01:35:41 +00003109 /* Onto the stack! */
sewardj9c606bd2008-09-18 18:12:50 +00003110 typestack_push( cc, parser, td3, &typeE, level );
sewardjb8b79ad2008-03-03 01:35:41 +00003111 goto acquire_Type;
3112 }
3113
sewardj2acc87c2011-02-01 23:10:14 +00003114 /* this is a subrange type defining the bounds of an array. */
3115 if (dtag == DW_TAG_subrange_type
3116 && subrange_type_denotes_array_bounds(parser, dtag)) {
sewardjb8b79ad2008-03-03 01:35:41 +00003117 Bool have_lower = False;
3118 Bool have_upper = False;
3119 Bool have_count = False;
3120 Long lower = 0;
3121 Long upper = 0;
sewardjb8b79ad2008-03-03 01:35:41 +00003122
3123 switch (parser->language) {
3124 case 'C': have_lower = True; lower = 0; break;
3125 case 'F': have_lower = True; lower = 1; break;
3126 case '?': have_lower = False; break;
sewardj2acc87c2011-02-01 23:10:14 +00003127 case 'A': have_lower = False; break;
sewardjb8b79ad2008-03-03 01:35:41 +00003128 default: vg_assert(0); /* assured us by handling of
3129 DW_TAG_compile_unit in this fn */
3130 }
sewardj9c606bd2008-09-18 18:12:50 +00003131
3132 VG_(memset)( &boundE, 0, sizeof(boundE) );
3133 boundE.cuOff = D3_INVALID_CUOFF;
3134 boundE.tag = Te_Bound;
philippe746e97e2014-06-15 10:51:14 +00003135 nf_i = 0;
sewardjb8b79ad2008-03-03 01:35:41 +00003136 while (True) {
philippe746e97e2014-06-15 10:51:14 +00003137 DW_AT attr = (DW_AT) abbv->nf[nf_i].at_name;
3138 DW_FORM form = (DW_FORM)abbv->nf[nf_i].at_form;
3139 nf_i++;
sewardjb8b79ad2008-03-03 01:35:41 +00003140 if (attr == 0 && form == 0) break;
sewardj5d616df2013-07-02 08:07:15 +00003141 get_Form_contents( &cts, cc, c_die, False/*td3*/, form );
3142 if (attr == DW_AT_lower_bound && cts.szB > 0) {
3143 lower = (Long)cts.u.val;
sewardjb8b79ad2008-03-03 01:35:41 +00003144 have_lower = True;
3145 }
sewardj5d616df2013-07-02 08:07:15 +00003146 if (attr == DW_AT_upper_bound && cts.szB > 0) {
3147 upper = (Long)cts.u.val;
sewardjb8b79ad2008-03-03 01:35:41 +00003148 have_upper = True;
3149 }
sewardj5d616df2013-07-02 08:07:15 +00003150 if (attr == DW_AT_count && cts.szB > 0) {
3151 /*count = (Long)cts.u.val;*/
sewardjb8b79ad2008-03-03 01:35:41 +00003152 have_count = True;
3153 }
3154 }
3155 /* FIXME: potentially skip the rest if no parent present, since
3156 it could be the case that this subrange type is free-standing
3157 (not being used to describe the bounds of a containing array
3158 type) */
3159 /* Do we have a plausible parent? */
philippe5c5b8fc2014-05-06 20:15:55 +00003160 if (typestack_is_empty(parser)) goto_bad_DIE;
sewardj9c606bd2008-09-18 18:12:50 +00003161 vg_assert(ML_(TyEnt__is_type)(&parser->qparentE[parser->sp]));
3162 vg_assert(parser->qparentE[parser->sp].cuOff != D3_INVALID_CUOFF);
philippe5c5b8fc2014-05-06 20:15:55 +00003163 if (level != parser->qlevel[parser->sp]+1) goto_bad_DIE;
3164 if (parser->qparentE[parser->sp].tag != Te_TyArray) goto_bad_DIE;
sewardjb8b79ad2008-03-03 01:35:41 +00003165
3166 /* Figure out if we have a definite range or not */
3167 if (have_lower && have_upper && (!have_count)) {
sewardj9c606bd2008-09-18 18:12:50 +00003168 boundE.Te.Bound.knownL = True;
3169 boundE.Te.Bound.knownU = True;
3170 boundE.Te.Bound.boundL = lower;
3171 boundE.Te.Bound.boundU = upper;
tom72259922009-08-03 08:50:58 +00003172 }
sewardjb8b79ad2008-03-03 01:35:41 +00003173 else if (have_lower && (!have_upper) && (!have_count)) {
sewardj9c606bd2008-09-18 18:12:50 +00003174 boundE.Te.Bound.knownL = True;
3175 boundE.Te.Bound.knownU = False;
3176 boundE.Te.Bound.boundL = lower;
3177 boundE.Te.Bound.boundU = 0;
tom72259922009-08-03 08:50:58 +00003178 }
3179 else if ((!have_lower) && have_upper && (!have_count)) {
3180 boundE.Te.Bound.knownL = False;
3181 boundE.Te.Bound.knownU = True;
3182 boundE.Te.Bound.boundL = 0;
3183 boundE.Te.Bound.boundU = upper;
3184 }
3185 else if ((!have_lower) && (!have_upper) && (!have_count)) {
3186 boundE.Te.Bound.knownL = False;
3187 boundE.Te.Bound.knownU = False;
3188 boundE.Te.Bound.boundL = 0;
3189 boundE.Te.Bound.boundU = 0;
sewardjb8b79ad2008-03-03 01:35:41 +00003190 } else {
3191 /* FIXME: handle more cases */
philippe5c5b8fc2014-05-06 20:15:55 +00003192 goto_bad_DIE;
sewardjb8b79ad2008-03-03 01:35:41 +00003193 }
3194
3195 /* Record this bound in the parent */
sewardj9c606bd2008-09-18 18:12:50 +00003196 boundE.cuOff = posn;
3197 vg_assert(parser->qparentE[parser->sp].Te.TyArray.boundRs);
3198 VG_(addToXA)( parser->qparentE[parser->sp].Te.TyArray.boundRs,
sewardj471d6b32012-04-05 07:15:22 +00003199 &boundE.cuOff );
sewardjb8b79ad2008-03-03 01:35:41 +00003200 /* And record the child itself */
sewardj9c606bd2008-09-18 18:12:50 +00003201 goto acquire_Bound;
sewardjb8b79ad2008-03-03 01:35:41 +00003202 }
3203
sewardj2acc87c2011-02-01 23:10:14 +00003204 /* typedef or subrange_type other than array bounds. */
3205 if (dtag == DW_TAG_typedef
3206 || (dtag == DW_TAG_subrange_type
3207 && !subrange_type_denotes_array_bounds(parser, dtag))) {
3208 /* subrange_type other than array bound is only for Ada. */
3209 vg_assert (dtag == DW_TAG_typedef || parser->language == 'A');
3210 /* We can pick up a new typedef/subrange_type any time. */
sewardj9c606bd2008-09-18 18:12:50 +00003211 VG_(memset)(&typeE, 0, sizeof(typeE));
3212 typeE.cuOff = D3_INVALID_CUOFF;
3213 typeE.tag = Te_TyTyDef;
3214 typeE.Te.TyTyDef.name = NULL;
3215 typeE.Te.TyTyDef.typeR = D3_INVALID_CUOFF;
philippe746e97e2014-06-15 10:51:14 +00003216 nf_i = 0;
sewardjb8b79ad2008-03-03 01:35:41 +00003217 while (True) {
philippe746e97e2014-06-15 10:51:14 +00003218 DW_AT attr = (DW_AT) abbv->nf[nf_i].at_name;
3219 DW_FORM form = (DW_FORM)abbv->nf[nf_i].at_form;
3220 nf_i++;
sewardjb8b79ad2008-03-03 01:35:41 +00003221 if (attr == 0 && form == 0) break;
sewardj5d616df2013-07-02 08:07:15 +00003222 get_Form_contents( &cts, cc, c_die, False/*td3*/, form );
3223 if (attr == DW_AT_name && cts.szB < 0) {
sewardj9c606bd2008-09-18 18:12:50 +00003224 typeE.Te.TyTyDef.name
sewardj5d616df2013-07-02 08:07:15 +00003225 = ML_(cur_read_strdup)( cts.u.cur,
3226 "di.readdwarf3.ptD.typedef.1" );
sewardjb8b79ad2008-03-03 01:35:41 +00003227 }
sewardj5d616df2013-07-02 08:07:15 +00003228 if (attr == DW_AT_type && cts.szB > 0) {
3229 typeE.Te.TyTyDef.typeR
3230 = cook_die_using_form( cc, (UWord)cts.u.val, form );
sewardjb8b79ad2008-03-03 01:35:41 +00003231 }
3232 }
mjwe5cf4512013-11-24 17:19:35 +00003233 /* Do we have something that looks sane?
3234 gcc's GNAT Ada front end generates minimal typedefs
3235 such as the one below:
3236 <6><91cc>: DW_TAG_typedef
3237 DW_AT_abstract_ori: <9066>
3238 g++ for OMP can generate artificial functions that have
3239 parameters that refer to pointers to unnamed typedefs.
3240 See https://bugs.kde.org/show_bug.cgi?id=273475
3241 So we cannot require a name for a DW_TAG_typedef.
3242 */
3243 goto acquire_Type;
sewardjb8b79ad2008-03-03 01:35:41 +00003244 }
3245
3246 if (dtag == DW_TAG_subroutine_type) {
3247 /* function type? just record that one fact and ask no
3248 further questions. */
sewardj9c606bd2008-09-18 18:12:50 +00003249 VG_(memset)(&typeE, 0, sizeof(typeE));
3250 typeE.cuOff = D3_INVALID_CUOFF;
3251 typeE.tag = Te_TyFn;
sewardjb8b79ad2008-03-03 01:35:41 +00003252 goto acquire_Type;
3253 }
3254
3255 if (dtag == DW_TAG_volatile_type || dtag == DW_TAG_const_type) {
3256 Int have_ty = 0;
sewardj9c606bd2008-09-18 18:12:50 +00003257 VG_(memset)(&typeE, 0, sizeof(typeE));
3258 typeE.cuOff = D3_INVALID_CUOFF;
3259 typeE.tag = Te_TyQual;
3260 typeE.Te.TyQual.qual
sewardjb8b79ad2008-03-03 01:35:41 +00003261 = dtag == DW_TAG_volatile_type ? 'V' : 'C';
3262 /* target type defaults to 'void' */
sewardj9c606bd2008-09-18 18:12:50 +00003263 typeE.Te.TyQual.typeR = D3_FAKEVOID_CUOFF;
philippe746e97e2014-06-15 10:51:14 +00003264 nf_i = 0;
sewardjb8b79ad2008-03-03 01:35:41 +00003265 while (True) {
philippe746e97e2014-06-15 10:51:14 +00003266 DW_AT attr = (DW_AT) abbv->nf[nf_i].at_name;
3267 DW_FORM form = (DW_FORM)abbv->nf[nf_i].at_form;
3268 nf_i++;
sewardjb8b79ad2008-03-03 01:35:41 +00003269 if (attr == 0 && form == 0) break;
sewardj5d616df2013-07-02 08:07:15 +00003270 get_Form_contents( &cts, cc, c_die, False/*td3*/, form );
3271 if (attr == DW_AT_type && cts.szB > 0) {
3272 typeE.Te.TyQual.typeR
3273 = cook_die_using_form( cc, (UWord)cts.u.val, form );
sewardjb8b79ad2008-03-03 01:35:41 +00003274 have_ty++;
3275 }
3276 }
3277 /* gcc sometimes generates DW_TAG_const/volatile_type without
3278 DW_AT_type and GDB appears to interpret the type as 'const
3279 void' (resp. 'volatile void'). So just allow it .. */
3280 if (have_ty == 1 || have_ty == 0)
3281 goto acquire_Type;
3282 else
philippe5c5b8fc2014-05-06 20:15:55 +00003283 goto_bad_DIE;
sewardjb8b79ad2008-03-03 01:35:41 +00003284 }
3285
barte64c5a92012-01-16 17:11:07 +00003286 /*
3287 * Treat DW_TAG_unspecified_type as type void. An example of DW_TAG_unspecified_type:
3288 *
3289 * $ readelf --debug-dump /usr/lib/debug/usr/lib/libstdc++.so.6.0.16.debug
3290 * <1><10d4>: Abbrev Number: 53 (DW_TAG_unspecified_type)
3291 * <10d5> DW_AT_name : (indirect string, offset: 0xdb7): decltype(nullptr)
3292 */
3293 if (dtag == DW_TAG_unspecified_type) {
3294 VG_(memset)(&typeE, 0, sizeof(typeE));
3295 typeE.cuOff = D3_INVALID_CUOFF;
3296 typeE.tag = Te_TyQual;
3297 typeE.Te.TyQual.typeR = D3_FAKEVOID_CUOFF;
3298 goto acquire_Type;
3299 }
3300
sewardjb8b79ad2008-03-03 01:35:41 +00003301 /* else ignore this DIE */
3302 return;
3303 /*NOTREACHED*/
3304
3305 acquire_Type:
3306 if (0) VG_(printf)("YYYY Acquire Type\n");
sewardj9c606bd2008-09-18 18:12:50 +00003307 vg_assert(ML_(TyEnt__is_type)( &typeE ));
3308 vg_assert(typeE.cuOff == D3_INVALID_CUOFF || typeE.cuOff == posn);
3309 typeE.cuOff = posn;
3310 VG_(addToXA)( tyents, &typeE );
sewardjb8b79ad2008-03-03 01:35:41 +00003311 return;
3312 /*NOTREACHED*/
3313
3314 acquire_Atom:
3315 if (0) VG_(printf)("YYYY Acquire Atom\n");
sewardj9c606bd2008-09-18 18:12:50 +00003316 vg_assert(atomE.tag == Te_Atom);
3317 vg_assert(atomE.cuOff == D3_INVALID_CUOFF || atomE.cuOff == posn);
3318 atomE.cuOff = posn;
3319 VG_(addToXA)( tyents, &atomE );
sewardjb8b79ad2008-03-03 01:35:41 +00003320 return;
3321 /*NOTREACHED*/
3322
sewardj9c606bd2008-09-18 18:12:50 +00003323 acquire_Field:
sewardjb8b79ad2008-03-03 01:35:41 +00003324 /* For union members, Expr should be absent */
sewardj9c606bd2008-09-18 18:12:50 +00003325 if (0) VG_(printf)("YYYY Acquire Field\n");
3326 vg_assert(fieldE.tag == Te_Field);
tom3c9cf342009-11-12 13:28:34 +00003327 vg_assert(fieldE.Te.Field.nLoc <= 0 || fieldE.Te.Field.pos.loc != NULL);
3328 vg_assert(fieldE.Te.Field.nLoc != 0 || fieldE.Te.Field.pos.loc == NULL);
sewardj9c606bd2008-09-18 18:12:50 +00003329 if (fieldE.Te.Field.isStruct) {
tom3c9cf342009-11-12 13:28:34 +00003330 vg_assert(fieldE.Te.Field.nLoc != 0);
sewardj9c606bd2008-09-18 18:12:50 +00003331 } else {
3332 vg_assert(fieldE.Te.Field.nLoc == 0);
sewardjb8b79ad2008-03-03 01:35:41 +00003333 }
sewardj9c606bd2008-09-18 18:12:50 +00003334 vg_assert(fieldE.cuOff == D3_INVALID_CUOFF || fieldE.cuOff == posn);
3335 fieldE.cuOff = posn;
3336 VG_(addToXA)( tyents, &fieldE );
sewardjb8b79ad2008-03-03 01:35:41 +00003337 return;
3338 /*NOTREACHED*/
3339
sewardj9c606bd2008-09-18 18:12:50 +00003340 acquire_Bound:
3341 if (0) VG_(printf)("YYYY Acquire Bound\n");
3342 vg_assert(boundE.tag == Te_Bound);
3343 vg_assert(boundE.cuOff == D3_INVALID_CUOFF || boundE.cuOff == posn);
3344 boundE.cuOff = posn;
3345 VG_(addToXA)( tyents, &boundE );
sewardjb8b79ad2008-03-03 01:35:41 +00003346 return;
3347 /*NOTREACHED*/
3348
3349 bad_DIE:
philippea0a73932014-06-15 15:42:20 +00003350 dump_bad_die_and_barf(dtag, posn, level,
3351 c_die, saved_die_c_offset,
3352 abbv,
3353 cc);
sewardjb8b79ad2008-03-03 01:35:41 +00003354 /*NOTREACHED*/
3355}
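/* Illustrative sketch only, not part of the Valgrind sources.  It
   restates, as a standalone function, how the DW_TAG_subrange_type
   handling above turns the have_lower / have_upper / have_count flags
   into a bound record: every combination without DW_AT_count is
   accepted, and anything carrying a count is rejected (where the real
   code does goto_bad_DIE).  DemoBound, demo_default_lower and
   demo_classify_bound are hypothetical names invented for this
   example. */

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the Te.Bound payload. */
typedef struct {
   bool      knownL, knownU;
   long long boundL, boundU;
} DemoBound;

/* Language-dependent default lower bound, mirroring the switch on
   parser->language: C-like languages start at 0, Fortran at 1, and
   Ada/unknown have no default.  Returns true if a default exists. */
bool demo_default_lower(char language, long long* lower)
{
   switch (language) {
      case 'C': *lower = 0; return true;
      case 'F': *lower = 1; return true;
      default:  return false;
   }
}

/* Fill *out from the flags.  Returns false for the combinations the
   parser does not handle, i.e. whenever a count attribute was seen. */
bool demo_classify_bound(bool have_lower, bool have_upper, bool have_count,
                         long long lower, long long upper, DemoBound* out)
{
   if (have_count)
      return false;
   out->knownL = have_lower;
   out->knownU = have_upper;
   out->boundL = have_lower ? lower : 0;   /* unknown bounds are zeroed */
   out->boundU = have_upper ? upper : 0;
   return true;
}
```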
3356
3357
3358/*------------------------------------------------------------*/
3359/*--- ---*/
sewardj9c606bd2008-09-18 18:12:50 +00003360/*--- Compression of type DIE information ---*/
3361/*--- ---*/
3362/*------------------------------------------------------------*/
3363
3364static UWord chase_cuOff ( Bool* changed,
3365 XArray* /* of TyEnt */ ents,
3366 TyEntIndexCache* ents_cache,
3367 UWord cuOff )
3368{
3369 TyEnt* ent;
3370 ent = ML_(TyEnts__index_by_cuOff)( ents, ents_cache, cuOff );
3371
3372 if (!ent) {
3373 VG_(printf)("chase_cuOff: no entry for 0x%05lx\n", cuOff);
3374 *changed = False;
3375 return cuOff;
3376 }
3377
3378 vg_assert(ent->tag != Te_EMPTY);
3379 if (ent->tag != Te_INDIR) {
3380 *changed = False;
3381 return cuOff;
3382 } else {
3383 vg_assert(ent->Te.INDIR.indR < cuOff);
3384 *changed = True;
3385 return ent->Te.INDIR.indR;
3386 }
3387}
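
/* Illustrative sketch only, not part of the Valgrind sources.  It
   models the single-step chase done by chase_cuOff above on a plain
   array sorted by cuOff: look the entry up, and if it is an
   indirection, return its target and report a change.  The DemoEnt
   type and the binary search are assumptions made for this example;
   the real code goes through ML_(TyEnts__index_by_cuOff) and its
   cache. */

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { DEMO_TYPE, DEMO_INDIR } DemoTag;

typedef struct {
   unsigned long cuOff;   /* key; the array is sorted by this */
   DemoTag       tag;
   unsigned long indR;    /* target, meaningful only for DEMO_INDIR */
} DemoEnt;

/* Binary search by cuOff; returns NULL if there is no such entry. */
const DemoEnt* demo_find(const DemoEnt* ents, size_t n, unsigned long cuOff)
{
   size_t lo = 0, hi = n;
   while (lo < hi) {
      size_t mid = lo + (hi - lo) / 2;
      if (ents[mid].cuOff == cuOff) return &ents[mid];
      if (ents[mid].cuOff < cuOff) lo = mid + 1; else hi = mid;
   }
   return NULL;
}

/* Follow at most one indirection.  Unknown offsets and non-INDIR
   entries are returned unchanged with *changed set to false. */
unsigned long demo_chase(bool* changed, const DemoEnt* ents, size_t n,
                         unsigned long cuOff)
{
   const DemoEnt* e = demo_find(ents, n, cuOff);
   *changed = false;
   if (e == NULL || e->tag != DEMO_INDIR)
      return cuOff;
   assert(e->indR < cuOff);   /* indirections only point backwards */
   *changed = true;
   return e->indR;
}
```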
3388
3389static
3390void chase_cuOffs_in_XArray ( Bool* changed,
3391 XArray* /* of TyEnt */ ents,
3392 TyEntIndexCache* ents_cache,
3393 /*MOD*/XArray* /* of UWord */ cuOffs )
3394{
3395 Bool b2 = False;
3396 Word i, n = VG_(sizeXA)( cuOffs );
3397 for (i = 0; i < n; i++) {
3398 Bool b = False;
3399 UWord* p = VG_(indexXA)( cuOffs, i );
3400 *p = chase_cuOff( &b, ents, ents_cache, *p );
3401 if (b)
3402 b2 = True;
3403 }
3404 *changed = b2;
3405}
3406
3407static Bool TyEnt__subst_R_fields ( XArray* /* of TyEnt */ ents,
3408 TyEntIndexCache* ents_cache,
3409 /*MOD*/TyEnt* te )
3410{
3411 Bool b, changed = False;
3412 switch (te->tag) {
3413 case Te_EMPTY:
3414 break;
3415 case Te_INDIR:
3416 te->Te.INDIR.indR
3417 = chase_cuOff( &b, ents, ents_cache, te->Te.INDIR.indR );
3418 if (b) changed = True;
3419 break;
3420 case Te_UNKNOWN:
3421 break;
3422 case Te_Atom:
3423 break;
3424 case Te_Field:
3425 te->Te.Field.typeR
3426 = chase_cuOff( &b, ents, ents_cache, te->Te.Field.typeR );
3427 if (b) changed = True;
3428 break;
3429 case Te_Bound:
3430 break;
3431 case Te_TyBase:
3432 break;
bart0e947cf2012-02-01 14:59:14 +00003433 case Te_TyPtr:
3434 case Te_TyRef:
3435 case Te_TyPtrMbr:
3436 case Te_TyRvalRef:
sewardj9c606bd2008-09-18 18:12:50 +00003437 te->Te.TyPorR.typeR
3438 = chase_cuOff( &b, ents, ents_cache, te->Te.TyPorR.typeR );
3439 if (b) changed = True;
3440 break;
3441 case Te_TyTyDef:
3442 te->Te.TyTyDef.typeR
3443 = chase_cuOff( &b, ents, ents_cache, te->Te.TyTyDef.typeR );
3444 if (b) changed = True;
3445 break;
3446 case Te_TyStOrUn:
3447 chase_cuOffs_in_XArray( &b, ents, ents_cache, te->Te.TyStOrUn.fieldRs );
3448 if (b) changed = True;
3449 break;
3450 case Te_TyEnum:
3451 chase_cuOffs_in_XArray( &b, ents, ents_cache, te->Te.TyEnum.atomRs );
3452 if (b) changed = True;
3453 break;
3454 case Te_TyArray:
3455 te->Te.TyArray.typeR
3456 = chase_cuOff( &b, ents, ents_cache, te->Te.TyArray.typeR );
3457 if (b) changed = True;
3458 chase_cuOffs_in_XArray( &b, ents, ents_cache, te->Te.TyArray.boundRs );
3459 if (b) changed = True;
3460 break;
3461 case Te_TyFn:
3462 break;
3463 case Te_TyQual:
3464 te->Te.TyQual.typeR
3465 = chase_cuOff( &b, ents, ents_cache, te->Te.TyQual.typeR );
3466 if (b) changed = True;
3467 break;
3468 case Te_TyVoid:
3469 break;
3470 default:
3471 ML_(pp_TyEnt)(te);
3472 vg_assert(0);
3473 }
3474 return changed;
3475}
3476
3477/* Make a pass over 'ents'. For each tyent, inspect the target of any
3478 'R' or 'Rs' fields (those which refer to other tyents), and replace
3479 any which point to INDIR nodes with the target of the indirection
3480 (which should not itself be an indirection). In summary, this
3481 routine shorts out all references to indirection nodes. */
3482static
3483Word dedup_types_substitution_pass ( /*MOD*/XArray* /* of TyEnt */ ents,
3484 TyEntIndexCache* ents_cache )
3485{
3486 Word i, n, nChanged = 0;
3487 Bool b;
3488 n = VG_(sizeXA)( ents );
3489 for (i = 0; i < n; i++) {
3490 TyEnt* ent = VG_(indexXA)( ents, i );
3491 vg_assert(ent->tag != Te_EMPTY);
3492 /* We have to substitute everything, even indirections, so as to
3493 ensure that chains of indirections don't build up. */
3494 b = TyEnt__subst_R_fields( ents, ents_cache, ent );
3495 if (b)
3496 nChanged++;
3497 }
3498
3499 return nChanged;
3500}
3501
3502
3503/* Make a pass over 'ents', building a dictionary of TyEnts as we go.
3504 Look up each new tyent in the dictionary in turn. If it is already
3505 in the dictionary, replace this tyent with an indirection to the
3506 existing one, and delete any malloc'd stuff hanging off this one.
3507 In summary, this routine commons up all tyents that are identical
3508 as defined by TyEnt__cmp_by_all_except_cuOff. */
3509static
3510Word dedup_types_commoning_pass ( /*MOD*/XArray* /* of TyEnt */ ents )
3511{
3512 Word n, i, nDeleted;
3513 WordFM* dict; /* TyEnt* -> void */
3514 TyEnt* ent;
3515 UWord keyW, valW;
3516
3517 dict = VG_(newFM)(
3518 ML_(dinfo_zalloc), "di.readdwarf3.dtcp.1",
3519 ML_(dinfo_free),
3520 (Word(*)(UWord,UWord)) ML_(TyEnt__cmp_by_all_except_cuOff)
3521 );
3522
3523 nDeleted = 0;
3524 n = VG_(sizeXA)( ents );
3525 for (i = 0; i < n; i++) {
3526 ent = VG_(indexXA)( ents, i );
3527 vg_assert(ent->tag != Te_EMPTY);
3528
3529 /* Ignore indirections, although check that they are
3530 not forming a cycle. */
3531 if (ent->tag == Te_INDIR) {
3532 vg_assert(ent->Te.INDIR.indR < ent->cuOff);
3533 continue;
3534 }
3535
3536 keyW = valW = 0;
3537 if (VG_(lookupFM)( dict, &keyW, &valW, (UWord)ent )) {
3538 /* it's already in the dictionary. */
3539 TyEnt* old = (TyEnt*)keyW;
3540 vg_assert(valW == 0);
3541 vg_assert(old != ent);
3542 vg_assert(old->tag != Te_INDIR);
3543 /* since we are traversing the array in increasing order of
3544 cuOff: */
3545 vg_assert(old->cuOff < ent->cuOff);
3546 /* So anyway, dump this entry and replace it with an
3547 indirection to the one in the dictionary. Note that the
3548 assertion above guarantees that we cannot create cycles of
3549 indirections, since we are always creating an indirection
3550 to a tyent with a cuOff lower than this one. */
3551 ML_(TyEnt__make_EMPTY)( ent );
3552 ent->tag = Te_INDIR;
3553 ent->Te.INDIR.indR = old->cuOff;
3554 nDeleted++;
3555 } else {
3556 /* not in dictionary; add it and keep going. */
3557 VG_(addToFM)( dict, (UWord)ent, 0 );
3558 }
3559 }
3560
3561 VG_(deleteFM)( dict, NULL, NULL );
3562
3563 return nDeleted;
3564}
3565
3566
3567static
3568void dedup_types ( Bool td3,
3569 /*MOD*/XArray* /* of TyEnt */ ents,
3570 TyEntIndexCache* ents_cache )
3571{
3572 Word m, n, i, nDel, nSubst, nThresh;
3573 if (0) td3 = True;
3574
3575 n = VG_(sizeXA)( ents );
3576
3577 /* If a commoning pass and a substitution pass both make fewer than
3578 this many changes, just stop. It's pointless to burn up CPU
3579 time trying to compress the last 1% or so out of the array. */
3580 nThresh = n / 200;
3581
3582 /* First we must sort .ents by its .cuOff fields, so we
3583 can index into it. */
florian6bd9dc12012-11-23 16:17:43 +00003584 VG_(setCmpFnXA)( ents, (XACmpFn_t) ML_(TyEnt__cmp_by_cuOff_only) );
sewardj9c606bd2008-09-18 18:12:50 +00003585 VG_(sortXA)( ents );
3586
3587 /* Now repeatedly do commoning and substitution passes over
3588 the array, until there are no more changes. */
3589 do {
3590 nDel = dedup_types_commoning_pass ( ents );
3591 nSubst = dedup_types_substitution_pass ( ents, ents_cache );
3592 vg_assert(nDel >= 0 && nSubst >= 0);
3593 TRACE_D3(" %ld deletions, %ld substitutions\n", nDel, nSubst);
3594 } while (nDel > nThresh || nSubst > nThresh);
3595
3596 /* Sanity check: all INDIR nodes should point at a non-INDIR thing.
3597 In fact this should be true at the end of every loop iteration
3598 above (a commoning pass followed by a substitution pass), but
3599 checking it on every iteration is excessively expensive. Note,
3600 this loop also computes 'm' for the stats printing below it. */
3601 m = 0;
3602 n = VG_(sizeXA)( ents );
3603 for (i = 0; i < n; i++) {
3604 TyEnt *ent, *ind;
3605 ent = VG_(indexXA)( ents, i );
3606 if (ent->tag != Te_INDIR) continue;
3607 m++;
3608 ind = ML_(TyEnts__index_by_cuOff)( ents, ents_cache,
3609 ent->Te.INDIR.indR );
3610 vg_assert(ind);
3611 vg_assert(ind->tag != Te_INDIR);
3612 }
3613
3614 TRACE_D3("Overall: %ld before, %ld after\n", n, n-m);
3615}
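
/* Illustrative sketch only, not part of the Valgrind sources.  It
   replays the commoning/substitution loop of dedup_types above on a
   toy entry type, to show why more than one round can be needed: a
   pointer type only becomes a duplicate once its target has been
   commoned up and the reference substituted.  Assumptions made for
   brevity: DemoEnt and all demo_* names are hypothetical, lookups are
   linear scans instead of a WordFM dictionary and index cache, and
   the loop runs to a fixed point rather than stopping at the n/200
   threshold. */

```c
#include <assert.h>
#include <stddef.h>

typedef enum { DEMO_BASE, DEMO_PTR, DEMO_INDIR } DemoTag;

typedef struct {
   unsigned long cuOff;    /* entries are kept in increasing cuOff order */
   DemoTag       tag;
   int           base_id;  /* identifies a base type; 0 otherwise */
   unsigned long ref;      /* PTR: pointee cuOff; INDIR: target cuOff */
} DemoEnt;

static DemoEnt* demo_lookup(DemoEnt* ents, size_t n, unsigned long cuOff)
{
   for (size_t i = 0; i < n; i++)
      if (ents[i].cuOff == cuOff) return &ents[i];
   return NULL;
}

/* Equality on everything except cuOff. */
static int demo_same(const DemoEnt* a, const DemoEnt* b)
{
   return a->tag == b->tag && a->base_id == b->base_id && a->ref == b->ref;
}

/* Replace each duplicate with an indirection to the earliest equal entry. */
size_t demo_commoning_pass(DemoEnt* ents, size_t n)
{
   size_t nDeleted = 0;
   for (size_t i = 0; i < n; i++) {
      if (ents[i].tag == DEMO_INDIR) continue;
      for (size_t j = 0; j < i; j++) {
         if (ents[j].tag != DEMO_INDIR && demo_same(&ents[j], &ents[i])) {
            ents[i].tag     = DEMO_INDIR;
            ents[i].base_id = 0;
            ents[i].ref     = ents[j].cuOff;  /* always a lower cuOff */
            nDeleted++;
            break;
         }
      }
   }
   return nDeleted;
}

/* Redirect every reference that lands on an indirection to its target. */
size_t demo_substitution_pass(DemoEnt* ents, size_t n)
{
   size_t nChanged = 0;
   for (size_t i = 0; i < n; i++) {
      if (ents[i].tag == DEMO_BASE) continue;
      DemoEnt* target = demo_lookup(ents, n, ents[i].ref);
      if (target && target->tag == DEMO_INDIR) {
         ents[i].ref = target->ref;
         nChanged++;
      }
   }
   return nChanged;
}

/* Alternate the two passes until neither changes anything; returns
   the number of entries that ended up as indirections. */
size_t demo_dedup(DemoEnt* ents, size_t n)
{
   size_t nDel, nSubst, nIndir = 0;
   do {
      nDel   = demo_commoning_pass(ents, n);
      nSubst = demo_substitution_pass(ents, n);
   } while (nDel > 0 || nSubst > 0);
   for (size_t i = 0; i < n; i++)
      if (ents[i].tag == DEMO_INDIR) nIndir++;
   return nIndir;
}
```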
3616
3617
3618/*------------------------------------------------------------*/
3619/*--- ---*/
sewardjb8b79ad2008-03-03 01:35:41 +00003620/*--- Resolution of references to type DIEs ---*/
3621/*--- ---*/
3622/*------------------------------------------------------------*/
3623
sewardj9c606bd2008-09-18 18:12:50 +00003624/* Make a pass through the (temporary) variables array. Examine the
3625 type of each variable, check that it is found, and chase any Te_INDIRs.
3626 Postcondition is: each variable has a typeR field that refers to a
3627 valid type in tyents, or a Te_UNKNOWN, and is certainly guaranteed
3628 not to refer to a Te_INDIR. (This is so that we can throw all the
3629 Te_INDIRs away later). */
sewardj59a2d182008-08-22 23:18:02 +00003630
sewardjb8b79ad2008-03-03 01:35:41 +00003631__attribute__((noinline))
sewardj9c606bd2008-09-18 18:12:50 +00003632static void resolve_variable_types (
florian6bd9dc12012-11-23 16:17:43 +00003633 void (*barf)( const HChar* ) __attribute__((noreturn)),
sewardj9c606bd2008-09-18 18:12:50 +00003634 /*R-O*/XArray* /* of TyEnt */ ents,
3635 /*MOD*/TyEntIndexCache* ents_cache,
3636 /*MOD*/XArray* /* of TempVar* */ vars
3637 )
sewardjb8b79ad2008-03-03 01:35:41 +00003638{
sewardj9c606bd2008-09-18 18:12:50 +00003639 Word i, n;
sewardj59a2d182008-08-22 23:18:02 +00003640 n = VG_(sizeXA)( vars );
3641 for (i = 0; i < n; i++) {
3642 TempVar* var = *(TempVar**)VG_(indexXA)( vars, i );
sewardj9c606bd2008-09-18 18:12:50 +00003643 /* This is the stated type of the variable. But it might be
3644 an indirection, so be careful. */
3645 TyEnt* ent = ML_(TyEnts__index_by_cuOff)( ents, ents_cache,
3646 var->typeR );
3647 if (ent && ent->tag == Te_INDIR) {
3648 ent = ML_(TyEnts__index_by_cuOff)( ents, ents_cache,
3649 ent->Te.INDIR.indR );
3650 vg_assert(ent);
3651 vg_assert(ent->tag != Te_INDIR);
3652 }
sewardjb8b79ad2008-03-03 01:35:41 +00003653
sewardj9c606bd2008-09-18 18:12:50 +00003654 /* Deal first with "normal" cases */
3655 if (ent && ML_(TyEnt__is_type)(ent)) {
3656 var->typeR = ent->cuOff;
3657 continue;
3658 }
3659
3660 /* If there's no ent, we probably did not manage to read a
3661 type at the cuOffset which is stated as being this variable's
3662 type. Maybe a deficiency in parse_type_DIE. Complain. */
3663 if (ent == NULL) {
3664 VG_(printf)("\n: Invalid cuOff = 0x%05lx\n", var->typeR );
3665 barf("resolve_variable_types: "
3666 "cuOff does not refer to a known type");
3667 }
3668 vg_assert(ent);
3669 /* If ent has any other tag, something bad happened, along the
3670 lines of var->typeR not referring to a type at all. */
3671 vg_assert(ent->tag == Te_UNKNOWN);
3672 /* Just accept it; the type will be useless, but at least keep
3673 going. */
3674 var->typeR = ent->cuOff;
sewardjb8b79ad2008-03-03 01:35:41 +00003675 }
sewardjb8b79ad2008-03-03 01:35:41 +00003676}
3677
3678
3679/*------------------------------------------------------------*/
3680/*--- ---*/
3681/*--- Parsing of Compilation Units ---*/
3682/*--- ---*/
3683/*------------------------------------------------------------*/
3684
florian6bd9dc12012-11-23 16:17:43 +00003685static Int cmp_TempVar_by_dioff ( const void* v1, const void* v2 ) {
florian3e798632012-11-24 19:41:54 +00003686 const TempVar* t1 = *(const TempVar *const *)v1;
3687 const TempVar* t2 = *(const TempVar *const *)v2;
sewardjb8b79ad2008-03-03 01:35:41 +00003688 if (t1->dioff < t2->dioff) return -1;
3689 if (t1->dioff > t2->dioff) return 1;
3690 return 0;
3691}
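
/* Illustrative sketch only, not part of the Valgrind sources.  The
   comparator above receives pointers to array elements, and the
   elements are themselves TempVar pointers, hence the extra
   dereference.  The same pattern is shown here with the standard C
   qsort; DemoVar and the demo_* functions are hypothetical names, and
   the use of qsort (rather than Valgrind's XArray sorting) is an
   assumption made for this example. */

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { unsigned long dioff; } DemoVar;

/* v1 and v2 point at array slots holding DemoVar pointers. */
static int demo_cmp_by_dioff(const void* v1, const void* v2)
{
   const DemoVar* t1 = *(const DemoVar* const*)v1;
   const DemoVar* t2 = *(const DemoVar* const*)v2;
   if (t1->dioff < t2->dioff) return -1;
   if (t1->dioff > t2->dioff) return 1;
   return 0;
}

/* Sort an array of DemoVar pointers by increasing dioff. */
void demo_sort_vars(DemoVar** vars, size_t n)
{
   qsort(vars, n, sizeof(vars[0]), demo_cmp_by_dioff);
}
```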
3692
sewardj9c606bd2008-09-18 18:12:50 +00003693static void read_DIE (
3694 /*MOD*/WordFM* /* of (XArray* of AddrRange, void) */ rangestree,
3695 /*MOD*/XArray* /* of TyEnt */ tyents,
3696 /*MOD*/XArray* /* of TempVar* */ tempvars,
3697 /*MOD*/XArray* /* of GExpr* */ gexprs,
3698 /*MOD*/D3TypeParser* typarser,
3699 /*MOD*/D3VarParser* varparser,
philippea0a73932014-06-15 15:42:20 +00003700 /*MOD*/D3InlParser* inlparser,
sewardj9c606bd2008-09-18 18:12:50 +00003701 Cursor* c, Bool td3, CUConst* cc, Int level
3702)
sewardjb8b79ad2008-03-03 01:35:41 +00003703{
philippe746e97e2014-06-15 10:51:14 +00003704 g_abbv *abbv;
sewardjb8b79ad2008-03-03 01:35:41 +00003705 ULong atag, abbv_code;
philippe746e97e2014-06-15 10:51:14 +00003706 UInt nf_i;
sewardjb8b79ad2008-03-03 01:35:41 +00003707 UWord posn;
3708 UInt has_children;
philippe746e97e2014-06-15 10:51:14 +00003709 UWord start_die_c_offset;
3710 UWord after_die_c_offset;
philippea0a73932014-06-15 15:42:20 +00003711 // If the DIE we are about to parse has a sibling, and the
3712 // parser(s) all indicate that parsing its children is not
3713 // necessary, then we skip the children by jumping straight
3714 // to that sibling.
3715 UWord sibling = 0;
3716 Bool parse_children = False;
sewardjb8b79ad2008-03-03 01:35:41 +00003717
3718 /* --- Deal with this DIE --- */
sewardjd9350682012-04-05 07:55:47 +00003719 posn = cook_die( cc, get_position_of_Cursor( c ) );
sewardjb8b79ad2008-03-03 01:35:41 +00003720 abbv_code = get_ULEB128( c );
philippe746e97e2014-06-15 10:51:14 +00003721 abbv = get_abbv(cc, abbv_code);
3722 atag = abbv->atag;
sewardjb8b79ad2008-03-03 01:35:41 +00003723 TRACE_D3("\n");
3724 TRACE_D3(" <%d><%lx>: Abbrev Number: %llu (%s)\n",
3725 level, posn, abbv_code, ML_(pp_DW_TAG)( atag ) );
3726
3727 if (atag == 0)
3728 cc->barf("read_DIE: invalid zero tag on DIE");
3729
philippe746e97e2014-06-15 10:51:14 +00003730 has_children = abbv->has_children;
sewardjb8b79ad2008-03-03 01:35:41 +00003731 if (has_children != DW_children_no && has_children != DW_children_yes)
3732 cc->barf("read_DIE: invalid has_children value");
3733
3734 /* We're set up to look at the fields of this DIE. Hand it off to
3735 any parser(s) that want to see it. Since they will in general
3736 advance both the DIE and abbrev cursors, remember their current
3737 settings so that we can then back up and do one final pass over
3738 the DIE, to print out its contents. */
3739
3740 start_die_c_offset = get_position_of_Cursor( c );
sewardjb8b79ad2008-03-03 01:35:41 +00003741
philippe746e97e2014-06-15 10:51:14 +00003742 nf_i = 0;
sewardjb8b79ad2008-03-03 01:35:41 +00003743 while (True) {
sewardj5d616df2013-07-02 08:07:15 +00003744 FormContents cts;
philippe746e97e2014-06-15 10:51:14 +00003745 ULong at_name = abbv->nf[nf_i].at_name;
3746 ULong at_form = abbv->nf[nf_i].at_form;
3747 nf_i++;
sewardjb8b79ad2008-03-03 01:35:41 +00003748 if (at_name == 0 && at_form == 0) break;
3749 TRACE_D3(" %18s: ", ML_(pp_DW_AT)(at_name));
3750 /* Get the form contents, but ignore them; the only purpose is
3751 to print them, if td3 is True */
sewardj5d616df2013-07-02 08:07:15 +00003752 get_Form_contents( &cts, cc, c, td3, (DW_FORM)at_form );
philippea0a73932014-06-15 15:42:20 +00003753 /* Except that we remember if this DIE has a sibling. */
3754 if (UNLIKELY(at_name == DW_AT_sibling && cts.szB > 0)) {
3755 sibling = cts.u.val;
3756 }
sewardjb8b79ad2008-03-03 01:35:41 +00003757 TRACE_D3("\t");
3758 TRACE_D3("\n");
3759 }
3760
3761 after_die_c_offset = get_position_of_Cursor( c );

   if (VG_(clo_read_var_info)) {
      set_position_of_Cursor( c, start_die_c_offset );

      parse_type_DIE( tyents,
                      typarser,
                      (DW_TAG)atag,
                      posn,
                      level,
                      c,     /* DIE cursor */
                      abbv,  /* abbrev */
                      cc,
                      td3 );

      set_position_of_Cursor( c, start_die_c_offset );

      parse_var_DIE( rangestree,
                     tempvars,
                     gexprs,
                     varparser,
                     (DW_TAG)atag,
                     posn,
                     level,
                     c,     /* DIE cursor */
                     abbv,  /* abbrev */
                     cc,
                     td3 );

      parse_children = True;
      // The type and var parsers have no logic to skip children.
   }

   if (VG_(clo_read_inline_info)) {
      set_position_of_Cursor( c, start_die_c_offset );

      parse_children =
         parse_inl_DIE( inlparser,
                        (DW_TAG)atag,
                        posn,
                        level,
                        c,     /* DIE cursor */
                        abbv,  /* abbrev */
                        cc,
                        td3 )
         || parse_children;
   }

   set_position_of_Cursor( c, after_die_c_offset );

   /* --- Now recurse into its children, if any, provided that some
      parser requested it or there is no sibling to jump to --- */
   if (has_children == DW_children_yes) {
      if (parse_children || sibling == 0) {
         if (0) TRACE_D3("BEGIN children of level %d\n", level);
         while (True) {
            atag = peek_ULEB128( c );
            if (atag == 0) break;
            read_DIE( rangestree, tyents, tempvars, gexprs,
                      typarser, varparser, inlparser,
                      c, td3, cc, level+1 );
         }
         /* Now we need to eat the terminating zero */
         atag = get_ULEB128( c );
         vg_assert(atag == 0);
         if (0) TRACE_D3("END children of level %d\n", level);
      } else {
         // We can skip the children by jumping to the sibling
         TRACE_D3("SKIPPING DIE's children, "
                  "jumping to sibling <%d><%lx>\n",
                  level, sibling);
         set_position_of_Cursor( c, sibling );
      }
   }

}
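
/* Illustration only, not Valgrind's implementation: read_DIE depends
   on get_ULEB128 and peek_ULEB128 to read abbreviation codes and to
   spot the zero code that terminates a list of children.  The sketch
   below shows the standard unsigned LEB128 decoding those helpers are
   assumed to perform.  It works on a plain byte buffer rather than a
   Cursor, and sketch_read_uleb128 is a hypothetical name. */

```c
#include <stddef.h>
#include <stdint.h>

/* Decode one unsigned LEB128 value from buf[0 .. len-1].  Each byte
   supplies 7 payload bits, least significant group first; a set top
   bit means another byte follows.  Returns the number of bytes
   consumed, or 0 if the buffer ends in the middle of a value. */
size_t sketch_read_uleb128 ( const uint8_t* buf, size_t len,
                             uint64_t* out )
{
   uint64_t val   = 0;
   unsigned shift = 0;
   size_t   i;
   for (i = 0; i < len; i++) {
      uint8_t byte = buf[i];
      /* Payload bits beyond bit 63 are ignored rather than shifted
         by an out-of-range amount. */
      if (shift < 64)
         val |= (uint64_t)(byte & 0x7f) << shift;
      shift += 7;
      if ((byte & 0x80) == 0) {
         *out = val;
         return i + 1;
      }
   }
   return 0;
}
```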

static void trace_debug_loc (struct _DebugInfo* di,
                             __attribute__((noreturn)) void (*barf)( const HChar* ),
                             DiSlice escn_debug_loc)
{
#if 0
   /* This doesn't work properly because it assumes all entries are
      packed end to end, with no holes. But that doesn't always
      appear to be the case, so it loses sync. And the D3 spec
      doesn't appear to require a no-hole situation either. */
   /* Display .debug_loc */
   Addr dl_base;
   UWord dl_offset;
   Cursor loc; /* for showing .debug_loc */
   Bool td3 = di->trace_symtab;

   TRACE_SYMTAB("\n");
   TRACE_SYMTAB("\n------ The contents of .debug_loc ------\n");
   TRACE_SYMTAB(" Offset Begin End Expression\n");
   if (ML_(sli_is_valid)(escn_debug_loc)) {
      init_Cursor( &loc, escn_debug_loc, 0, barf,
                   "Overrun whilst reading .debug_loc section(1)" );
      dl_base = 0;
      dl_offset = 0;
      while (True) {
         UWord w1, w2;
         UWord len;
         if (is_at_end_Cursor( &loc ))
            break;

         /* Read a (host-)word pair. This is something of a hack since
            the word size to read is really dictated by the ELF file;
            however, we assume we're reading a file with the same
            word-sizeness as the host. Reasonably enough. */
         w1 = get_UWord( &loc );
         w2 = get_UWord( &loc );

         if (w1 == 0 && w2 == 0) {
            /* end of list. reset 'base' */
            TRACE_D3(" %08lx <End of list>\n", dl_offset);
            dl_base = 0;
            dl_offset = get_position_of_Cursor( &loc );
            continue;
         }

         if (w1 == -1UL) {
            /* new value for 'base' */
            TRACE_D3(" %08lx %16lx %08lx (base address)\n",
                     dl_offset, w1, w2);
            dl_base = w2;
            continue;
         }

         /* else a location expression follows */
         TRACE_D3(" %08lx %08lx %08lx ",
                  dl_offset, w1 + dl_base, w2 + dl_base);
         len = (UWord)get_UShort( &loc );
         while (len > 0) {
            UChar byte = get_UChar( &loc );
            TRACE_D3("%02x", (UInt)byte);
            len--;
         }
         TRACE_SYMTAB("\n");
      }
   }
#endif
}

static void trace_debug_ranges (struct _DebugInfo* di,
                                __attribute__((noreturn)) void (*barf)( const HChar* ),
                                DiSlice escn_debug_ranges)
{
   Cursor ranges; /* for showing .debug_ranges */
   Addr dr_base;
   UWord dr_offset;
   Bool td3 = di->trace_symtab;

   /* Display .debug_ranges */
   TRACE_SYMTAB("\n");
   TRACE_SYMTAB("\n------ The contents of .debug_ranges ------\n");
   TRACE_SYMTAB(" Offset Begin End\n");
   if (ML_(sli_is_valid)(escn_debug_ranges)) {
      init_Cursor( &ranges, escn_debug_ranges, 0, barf,
                   "Overrun whilst reading .debug_ranges section(1)" );
      dr_base = 0;
      dr_offset = 0;
      while (True) {
         UWord w1, w2;

         if (is_at_end_Cursor( &ranges ))
            break;

         /* Read a (host-)word pair. This is something of a hack since
            the word size to read is really dictated by the ELF file;
            however, we assume we're reading a file with the same
            word-sizeness as the host. Reasonably enough. */
         w1 = get_UWord( &ranges );
         w2 = get_UWord( &ranges );

         if (w1 == 0 && w2 == 0) {
            /* end of list. reset 'base' */
            TRACE_D3(" %08lx <End of list>\n", dr_offset);
            dr_base = 0;
            dr_offset = get_position_of_Cursor( &ranges );
            continue;
         }

         if (w1 == -1UL) {
            /* new value for 'base' */
            TRACE_D3(" %08lx %16lx %08lx (base address)\n",
                     dr_offset, w1, w2);
            dr_base = w2;
            continue;
         }

         /* else a range [w1+base, w2+base) is denoted */
         TRACE_D3(" %08lx %08lx %08lx\n",
                  dr_offset, w1 + dr_base, w2 + dr_base);
      }
   }
}
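
/* Illustration only, not part of Valgrind: a minimal sketch of the
   three entry kinds that trace_debug_ranges distinguishes, applied to
   a single range list held in a plain array.  It assumes 64-bit
   words and stops at the first end-of-list pair, whereas
   trace_debug_ranges resets the base and carries on to the next
   list.  SketchRange and sketch_walk_range_list are hypothetical
   names. */

```c
#include <stddef.h>
#include <stdint.h>

typedef struct { uint64_t lo; uint64_t hi; } SketchRange;

/* Walk one range list stored as (w1, w2) pairs in
   words[0 .. 2*npairs-1].  (0, 0) ends the list; (all-ones, w2)
   selects w2 as the new base; any other pair denotes the range
   [w1+base, w2+base).  Writes at most max_out ranges to out and
   returns how many were written. */
size_t sketch_walk_range_list ( const uint64_t* words, size_t npairs,
                                SketchRange* out, size_t max_out )
{
   uint64_t base = 0;
   size_t   n = 0, i;
   for (i = 0; i < npairs; i++) {
      uint64_t w1 = words[2*i];
      uint64_t w2 = words[2*i + 1];
      if (w1 == 0 && w2 == 0)
         break;                /* end of list */
      if (w1 == UINT64_MAX) {
         base = w2;            /* base address selection entry */
         continue;
      }
      if (n < max_out) {
         out[n].lo = w1 + base;
         out[n].hi = w2 + base;
         n++;
      }
   }
   return n;
}
```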

static void trace_debug_abbrev (struct _DebugInfo* di,
                                __attribute__((noreturn)) void (*barf)( const HChar* ),
                                DiSlice escn_debug_abbv)
{
   Cursor abbv; /* for showing .debug_abbrev */
   Bool td3 = di->trace_symtab;

   /* Display .debug_abbrev */
   TRACE_SYMTAB("\n");
   TRACE_SYMTAB("\n------ The contents of .debug_abbrev ------\n");
   if (ML_(sli_is_valid)(escn_debug_abbv)) {
      init_Cursor( &abbv, escn_debug_abbv, 0, barf,
                   "Overrun whilst reading .debug_abbrev section" );
      while (True) {
         if (is_at_end_Cursor( &abbv ))
            break;
         /* Read one abbreviation table */
         TRACE_D3(" Number TAG\n");
         while (True) {
            ULong atag;
            UInt has_children;
            ULong acode = get_ULEB128( &abbv );
            if (acode == 0) break; /* end of the table */
            atag = get_ULEB128( &abbv );
            has_children = get_UChar( &abbv );
            TRACE_D3(" %llu %s [%s]\n",
                     acode, ML_(pp_DW_TAG)(atag),
                     ML_(pp_DW_children)(has_children));
            while (True) {
               ULong at_name = get_ULEB128( &abbv );
               ULong at_form = get_ULEB128( &abbv );
               if (at_name == 0 && at_form == 0) break;
               TRACE_D3(" %18s %s\n",
                        ML_(pp_DW_AT)(at_name), ML_(pp_DW_FORM)(at_form));
            }
         }
      }
   }
}
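
/* Illustration only, not part of Valgrind: a minimal sketch of the
   table layout that trace_debug_abbrev walks, namely a sequence of
   declarations (code, tag, has_children byte, then name/form pairs
   ending in 0,0) closed by a zero code.  It reads from a plain byte
   buffer and reports truncation by return value instead of calling a
   barf function.  sketch_uleb and sketch_count_abbrevs are
   hypothetical names. */

```c
#include <stddef.h>
#include <stdint.h>

/* Read one ULEB128 at *pos, advancing *pos.  Returns 1 on success
   and 0 if the buffer is exhausted first. */
static int sketch_uleb ( const uint8_t* buf, size_t len, size_t* pos,
                         uint64_t* out )
{
   uint64_t val   = 0;
   unsigned shift = 0;
   while (*pos < len) {
      uint8_t byte = buf[(*pos)++];
      if (shift < 64)
         val |= (uint64_t)(byte & 0x7f) << shift;
      shift += 7;
      if ((byte & 0x80) == 0) { *out = val; return 1; }
   }
   return 0;
}

/* Count the abbreviation declarations in the table starting at
   buf[0].  Returns -1 if the buffer is truncated. */
long sketch_count_abbrevs ( const uint8_t* buf, size_t len )
{
   size_t pos   = 0;
   long   count = 0;
   while (1) {
      uint64_t acode, atag, at_name, at_form;
      if (!sketch_uleb( buf, len, &pos, &acode )) return -1;
      if (acode == 0) return count;       /* end of the table */
      if (!sketch_uleb( buf, len, &pos, &atag )) return -1;
      if (pos >= len) return -1;
      pos++;                              /* has_children byte */
      do {
         if (!sketch_uleb( buf, len, &pos, &at_name )) return -1;
         if (!sketch_uleb( buf, len, &pos, &at_form )) return -1;
      } while (!(at_name == 0 && at_form == 0));
      count++;
   }
}
```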

static
void new_dwarf3_reader_wrk (
   struct _DebugInfo* di,
   __attribute__((noreturn)) void (*barf)( const HChar* ),
   DiSlice escn_debug_info, DiSlice escn_debug_types,
   DiSlice escn_debug_abbv, DiSlice escn_debug_line,
   DiSlice escn_debug_str, DiSlice escn_debug_ranges,
   DiSlice escn_debug_loc, DiSlice escn_debug_info_alt,
   DiSlice escn_debug_abbv_alt, DiSlice escn_debug_line_alt,
   DiSlice escn_debug_str_alt
)
{
   XArray* /* of TyEnt */ tyents = NULL;
   XArray* /* of TyEnt */ tyents_to_keep = NULL;
   XArray* /* of GExpr* */ gexprs = NULL;
   XArray* /* of TempVar* */ tempvars = NULL;
   WordFM* /* of (XArray* of AddrRange, void) */ rangestree = NULL;
   TyEntIndexCache* tyents_cache = NULL;
   TyEntIndexCache* tyents_to_keep_cache = NULL;
   TempVar *varp, *varp2;
   GExpr* gexpr;
   Cursor info; /* primary cursor for parsing .debug_info */
   D3TypeParser typarser;
   D3VarParser varparser;
   D3InlParser inlparser;
   Word i, j, n;
   Bool td3 = di->trace_symtab;
   XArray* /* of TempVar* */ dioff_lookup_tab;
   Int pass;
   VgHashTable signature_types = NULL;

   /* Display/trace various information, if requested. */
   if (td3) {
      trace_debug_loc (di, barf, escn_debug_loc);
      trace_debug_ranges (di, barf, escn_debug_ranges);
      trace_debug_abbrev (di, barf, escn_debug_abbv);
      TRACE_SYMTAB("\n");
   }

   if (VG_(clo_read_var_info)) {
      /* We'll park the harvested type information in here. Also create
         a fake "void" entry with offset D3_FAKEVOID_CUOFF, so we always
         have at least one type entry to refer to. D3_FAKEVOID_CUOFF is
         huge and presumably will not occur in any valid DWARF3 file --
         it would need to have a .debug_info section 4GB long for that to
         happen. These type entries end up in the DebugInfo. */
      tyents = VG_(newXA)( ML_(dinfo_zalloc),
                           "di.readdwarf3.ndrw.1 (TyEnt temp array)",
                           ML_(dinfo_free), sizeof(TyEnt) );
      { TyEnt tyent;
        VG_(memset)(&tyent, 0, sizeof(tyent));
        tyent.tag = Te_TyVoid;
        tyent.cuOff = D3_FAKEVOID_CUOFF;
        tyent.Te.TyVoid.isFake = True;
        VG_(addToXA)( tyents, &tyent );
      }
      { TyEnt tyent;
        VG_(memset)(&tyent, 0, sizeof(tyent));
        tyent.tag = Te_UNKNOWN;
        tyent.cuOff = D3_INVALID_CUOFF;
        VG_(addToXA)( tyents, &tyent );
      }

      /* This is a tree used to unique-ify the range lists that are
         manufactured by parse_var_DIE. References to the keys in the
         tree wind up in .rngMany fields in TempVars. We'll need to
         delete this tree, and the XArrays attached to it, at the end of
         this function. */
      rangestree = VG_(newFM)( ML_(dinfo_zalloc),
                               "di.readdwarf3.ndrw.2 (rangestree)",
                               ML_(dinfo_free),
                               (Word(*)(UWord,UWord))cmp__XArrays_of_AddrRange );

      /* List of variables we're accumulating. These don't end up in the
         DebugInfo; instead their contents are handed to ML_(addVar) and
         the list elements are then deleted. */
      tempvars = VG_(newXA)( ML_(dinfo_zalloc),
                             "di.readdwarf3.ndrw.3 (TempVar*s array)",
                             ML_(dinfo_free),
                             sizeof(TempVar*) );

      /* List of GExprs we're accumulating. These wind up in the
         DebugInfo. */
      gexprs = VG_(newXA)( ML_(dinfo_zalloc), "di.readdwarf3.ndrw.4",
                           ML_(dinfo_free), sizeof(GExpr*) );

      /* We need a D3TypeParser to keep track of partially constructed
         types. It'll be discarded as soon as we've completed the CU,
         since the resulting information is tipped in to 'tyents' as it
         is generated. */
      VG_(memset)( &typarser, 0, sizeof(typarser) );
      typarser.sp = -1;
      typarser.language = '?';
      for (i = 0; i < N_D3_TYPE_STACK; i++) {
         typarser.qparentE[i].tag = Te_EMPTY;
         typarser.qparentE[i].cuOff = D3_INVALID_CUOFF;
      }

      VG_(memset)( &varparser, 0, sizeof(varparser) );
      varparser.sp = -1;

      signature_types = VG_(HT_construct) ("signature_types");
   }

   if (VG_(clo_read_inline_info))
      VG_(memset)( &inlparser, 0, sizeof(inlparser) );

   /* Do an initial pass to scan the .debug_types section, if any, and
      fill in the signatured types hash table. This lets us handle
      mapping from a type signature to a (cooked) DIE offset directly
      in get_Form_contents. */
   if (VG_(clo_read_var_info) && ML_(sli_is_valid)(escn_debug_types)) {
      init_Cursor( &info, escn_debug_types, 0, barf,
                   "Overrun whilst reading .debug_types section" );
      TRACE_D3("\n------ Collecting signatures from "
               ".debug_types section ------\n");

      while (True) {
         UWord cu_start_offset, cu_offset_now;
         CUConst cc;

         cu_start_offset = get_position_of_Cursor( &info );
         TRACE_D3("\n");
         TRACE_D3(" Compilation Unit @ offset 0x%lx:\n", cu_start_offset);
         /* parse_CU_Header initialises the CU's abbv hash table. */
         parse_CU_Header( &cc, td3, &info, escn_debug_abbv, True, False );

         /* Needed by cook_die. */
         cc.types_cuOff_bias = escn_debug_info.szB;

         record_signatured_type( signature_types, cc.type_signature,
                                 cook_die( &cc, cc.type_offset ));

         /* Until proven otherwise we assume we don't need the icc9
            workaround in this case; see the DIE-reading loop below
            for details. */
         cu_offset_now = (cu_start_offset + cc.unit_length
                          + (cc.is_dw64 ? 12 : 4));

         /* Release this CU's abbv hash table on every iteration, not
            only on the last one; cc is not used beyond this point. */
         clear_CUConst ( &cc);

         if (cu_offset_now >= escn_debug_types.szB)
            break;

         set_position_of_Cursor ( &info, cu_offset_now );
      }
   }

   /* Perform three DIE-reading passes. The first pass reads DIEs from
      alternate .debug_info (if any), the second pass reads DIEs from
      .debug_info, and the third pass reads DIEs from .debug_types.
      Moving the body of this loop into a separate function would
      require a large number of arguments to be passed in, so it is
      kept inline instead. */
   for (pass = 0; pass < 3; ++pass) {
      ULong section_size;

      if (pass == 0) {
         if (!ML_(sli_is_valid)(escn_debug_info_alt))
            continue;
         /* Now loop over the Compilation Units listed in the alternate
            .debug_info section (see D3SPEC sec 7.5, paras 1 and 2).
            Each compilation unit contains a Compilation Unit Header
            followed by precisely one DW_TAG_compile_unit or
            DW_TAG_partial_unit DIE. */
         init_Cursor( &info, escn_debug_info_alt, 0, barf,
                      "Overrun whilst reading alternate .debug_info section" );
         section_size = escn_debug_info_alt.szB;

         TRACE_D3("\n------ Parsing alternate .debug_info section ------\n");
      } else if (pass == 1) {
         /* Now loop over the Compilation Units listed in the .debug_info
            section (see D3SPEC sec 7.5, paras 1 and 2). Each compilation
            unit contains a Compilation Unit Header followed by precisely
            one DW_TAG_compile_unit or DW_TAG_partial_unit DIE. */
         init_Cursor( &info, escn_debug_info, 0, barf,
                      "Overrun whilst reading .debug_info section" );
         section_size = escn_debug_info.szB;

         TRACE_D3("\n------ Parsing .debug_info section ------\n");
      } else {
         if (!ML_(sli_is_valid)(escn_debug_types))
            continue;
         init_Cursor( &info, escn_debug_types, 0, barf,
                      "Overrun whilst reading .debug_types section" );
         section_size = escn_debug_types.szB;

         TRACE_D3("\n------ Parsing .debug_types section ------\n");
      }

      while (True) {
         ULong cu_start_offset, cu_offset_now;
         CUConst cc;
         /* It may be that the stated size of this CU is larger than the
            amount of stuff actually in it. icc9 seems to generate CUs
            thusly. We use these variables to figure out if this is
            indeed the case, and if so how many bytes we need to skip to
            get to the start of the next CU. Not skipping those bytes
            causes us to misidentify the start of the next CU, and it all
            goes badly wrong after that (not surprisingly). */
         UWord cu_size_including_IniLen, cu_amount_used;

         /* It seems icc9 finishes the DIE info before debug_info_sz
            bytes have been used up. So be flexible, and declare the
            sequence complete if there are not enough remaining bytes to
            hold even the smallest conceivable CU header. (11 bytes I
            reckon). */
         /* JRS 23Jan09: I suspect this is no longer necessary now that
            the code below contains a 'while (cu_amount_used <
            cu_size_including_IniLen ...' style loop, which skips over
            any leftover bytes at the end of a CU in the case where the
            CU's stated size is larger than its actual size (as
            determined by reading all its DIEs). However, for prudence,
            I'll leave the following test in place. I can't see that a
            CU header can be smaller than 11 bytes, so I don't think
            there's any harm possible through the test -- it just adds
            robustness. */
         Word avail = get_remaining_length_Cursor( &info );
         if (avail < 11) {
            if (avail > 0)
               TRACE_D3("new_dwarf3_reader_wrk: warning: "
                        "%ld unused bytes after end of DIEs\n", avail);
            break;
         }

         if (VG_(clo_read_var_info)) {
            /* Check that the varparser's and typarser's stacks are in
               a sane state. */
            vg_assert(varparser.sp == -1);
            for (i = 0; i < N_D3_VAR_STACK; i++) {
               vg_assert(varparser.ranges[i] == NULL);
               vg_assert(varparser.level[i] == 0);
            }
            for (i = 0; i < N_D3_TYPE_STACK; i++) {
               vg_assert(typarser.qparentE[i].cuOff == D3_INVALID_CUOFF);
               vg_assert(typarser.qparentE[i].tag == Te_EMPTY);
               vg_assert(typarser.qlevel[i] == 0);
            }
         }

         cu_start_offset = get_position_of_Cursor( &info );
         TRACE_D3("\n");
         TRACE_D3(" Compilation Unit @ offset 0x%llx:\n", cu_start_offset);
         /* parse_CU_Header initialises the CU's abbv hash table. */
         if (pass == 0) {
            parse_CU_Header( &cc, td3, &info, escn_debug_abbv_alt,
                             False, True );
         } else {
            parse_CU_Header( &cc, td3, &info, escn_debug_abbv,
                             pass == 2, False );
         }
         cc.escn_debug_str = pass == 0 ? escn_debug_str_alt
                                       : escn_debug_str;
         cc.escn_debug_ranges = escn_debug_ranges;
         cc.escn_debug_loc = escn_debug_loc;
         cc.escn_debug_line = pass == 0 ? escn_debug_line_alt
                                        : escn_debug_line;
         cc.escn_debug_info = pass == 0 ? escn_debug_info_alt
                                        : escn_debug_info;
         cc.escn_debug_types = escn_debug_types;
         cc.escn_debug_info_alt = escn_debug_info_alt;
         cc.escn_debug_str_alt = escn_debug_str_alt;
         cc.types_cuOff_bias = escn_debug_info.szB;
         cc.alt_cuOff_bias = escn_debug_info.szB + escn_debug_types.szB;
         cc.cu_start_offset = cu_start_offset;
         cc.di = di;
         /* The CU's svma can be deduced by looking at the AT_low_pc
            value in the top level TAG_compile_unit, which is the topmost
            DIE. We'll leave it for the 'varparser' to acquire that info
            and fill it in -- since it is the only party to want to know
            it. */
         cc.cu_svma_known = False;
         cc.cu_svma = 0;

         if (VG_(clo_read_var_info)) {
            cc.signature_types = signature_types;

            /* Create a fake outermost-level range covering the entire
               address range. So we always have *something* to catch all
               variable declarations. */
            varstack_push( &cc, &varparser, td3,
                           unitary_range_list(0UL, ~0UL),
                           -1, False/*isFunc*/, NULL/*fbGX*/ );

            /* And set up the file name table. When we come across the top
               level DIE for this CU (which is what the next call to
               read_DIE should process) we will copy all the file names out
               of the .debug_line img area and use this table to look up the
               copies when we later see filename numbers in DW_TAG_variables
               etc. */
            vg_assert(!varparser.filenameTable );
            varparser.filenameTable
               = VG_(newXA)( ML_(dinfo_zalloc), "di.readdwarf3.ndrw.5var",
                             ML_(dinfo_free),
                             sizeof(UChar*) );
            vg_assert(varparser.filenameTable);
         }

         if (VG_(clo_read_inline_info)) {
            /* filename table for the inlined call parser */
            vg_assert(!inlparser.filenameTable );
            inlparser.filenameTable
               = VG_(newXA)( ML_(dinfo_zalloc), "di.readdwarf3.ndrw.5inl",
                             ML_(dinfo_free),
                             sizeof(UChar*) );
            vg_assert(inlparser.filenameTable);
         }

         /* Now read the one-and-only top-level DIE for this CU. */
         vg_assert(!VG_(clo_read_var_info) || varparser.sp == 0);
         read_DIE( rangestree,
                   tyents, tempvars, gexprs,
                   &typarser, &varparser, &inlparser,
                   &info, td3, &cc, 0 );

         cu_offset_now = get_position_of_Cursor( &info );

         if (0) VG_(printf)("Travelled: %llu size %llu\n",
                            cu_offset_now - cc.cu_start_offset,
                            cc.unit_length + (cc.is_dw64 ? 12 : 4));

         /* How big the CU claims it is .. */
         cu_size_including_IniLen = cc.unit_length + (cc.is_dw64 ? 12 : 4);
         /* .. vs how big we have found it to be */
         cu_amount_used = cu_offset_now - cc.cu_start_offset;

         if (1) TRACE_D3("offset now %llu, d-i-size %llu\n",
                         cu_offset_now, section_size);
         if (cu_offset_now > section_size)
            barf("toplevel DIEs beyond end of CU");

         /* If the CU is bigger than it claims to be, we've got a serious
            problem. */
         if (cu_amount_used > cu_size_including_IniLen)
            barf("CU's actual size appears to be larger than it claims it is");

         /* If the CU is smaller than it claims to be, we need to skip some
            bytes. Loop updates cu_offset_now and cu_amount_used. */
         while (cu_amount_used < cu_size_including_IniLen
                && get_remaining_length_Cursor( &info ) > 0) {
            if (0) VG_(printf)("SKIP\n");
            (void)get_UChar( &info );
            cu_offset_now = get_position_of_Cursor( &info );
            cu_amount_used = cu_offset_now - cc.cu_start_offset;
         }

         if (VG_(clo_read_var_info)) {
            /* Preen to level -2. DIEs have level >= 0 so -2 cannot occur
               anywhere else at all. Our fake the-entire-address-space
               range is at level -1, so preening to -2 should completely
               empty the stack out. */
            TRACE_D3("\n");
            varstack_preen( &varparser, td3, -2 );
            /* Similarly, empty the type stack out. */
            typestack_preen( &typarser, td3, -2 );
         }

         if (VG_(clo_read_var_info)) {
            vg_assert(varparser.filenameTable );
            VG_(deleteXA)( varparser.filenameTable );
            varparser.filenameTable = NULL;
         }
         if (VG_(clo_read_inline_info)) {
            /* The inline parser's table exists only when inline info
               is being read, so check it here rather than in the
               var-info block above. */
            vg_assert(inlparser.filenameTable );
            VG_(deleteXA)( inlparser.filenameTable );
            inlparser.filenameTable = NULL;
         }
         clear_CUConst(&cc);

         if (cu_offset_now == section_size)
            break;
         /* else keep going */
      }
   }

   if (VG_(clo_read_var_info)) {
      /* From here on we're post-processing the stuff we got
         out of the .debug_info section. */
      if (td3) {
         TRACE_D3("\n");
         ML_(pp_TyEnts)(tyents, "Initial type entity (TyEnt) array");
         TRACE_D3("\n");
         TRACE_D3("------ Compressing type entries ------\n");
      }

      tyents_cache = ML_(dinfo_zalloc)( "di.readdwarf3.ndrw.6",
                                        sizeof(TyEntIndexCache) );
      ML_(TyEntIndexCache__invalidate)( tyents_cache );
      dedup_types( td3, tyents, tyents_cache );
      if (td3) {
         TRACE_D3("\n");
         ML_(pp_TyEnts)(tyents, "After type entity (TyEnt) compression");
      }

      TRACE_D3("\n");
      TRACE_D3("------ Resolving the types of variables ------\n" );
      resolve_variable_types( barf, tyents, tyents_cache, tempvars );

      /* Copy all the non-INDIR tyents into a new table. For large
         .so's, about 90% of the tyents will by now have been resolved to
         INDIRs, and we no longer need them, and so don't need to store
         them. */
      tyents_to_keep
         = VG_(newXA)( ML_(dinfo_zalloc),
                       "di.readdwarf3.ndrw.7 (TyEnt to-keep array)",
                       ML_(dinfo_free), sizeof(TyEnt) );
      n = VG_(sizeXA)( tyents );
      for (i = 0; i < n; i++) {
         TyEnt* ent = VG_(indexXA)( tyents, i );
         if (ent->tag != Te_INDIR)
            VG_(addToXA)( tyents_to_keep, ent );
      }

      VG_(deleteXA)( tyents );
      tyents = NULL;
      ML_(dinfo_free)( tyents_cache );
      tyents_cache = NULL;

      /* Sort tyents_to_keep so we can look things up in it. A complete
         (if minor) waste of time, since tyents itself is sorted, but
         necessary since VG_(lookupXA) refuses to cooperate if we
         don't. */
      VG_(setCmpFnXA)( tyents_to_keep, (XACmpFn_t) ML_(TyEnt__cmp_by_cuOff_only) );
      VG_(sortXA)( tyents_to_keep );

      /* Enable caching on tyents_to_keep */
      tyents_to_keep_cache
         = ML_(dinfo_zalloc)( "di.readdwarf3.ndrw.8",
                              sizeof(TyEntIndexCache) );
      ML_(TyEntIndexCache__invalidate)( tyents_to_keep_cache );

      /* And record the tyents in the DebugInfo. We do this before
         starting to hand variables to ML_(addVar), since if ML_(addVar)
         wants to do debug printing (of the types of said vars) then it
         will need the tyents. */
      vg_assert(!di->admin_tyents);
      di->admin_tyents = tyents_to_keep;
4438
4439 /* Bias all the location expressions. */
4440 TRACE_D3("\n");
4441 TRACE_D3("------ Biasing the location expressions ------\n" );
4442
4443 n = VG_(sizeXA)( gexprs );
4444 for (i = 0; i < n; i++) {
4445 gexpr = *(GExpr**)VG_(indexXA)( gexprs, i );
4446 bias_GX( gexpr, di );
4447 }
4448
4449 TRACE_D3("\n");
4450 TRACE_D3("------ Acquired the following variables: ------\n\n");
4451
4452 /* Park (pointers to) all the vars in an XArray, so we can look up
4453 abstract origins quickly. The array is sorted (hence, looked-up
4454 by) the .dioff fields. Since the .dioffs should be in strictly
4455 ascending order, there is no need to sort the array after
4456 construction. The ascendingness is however asserted for. */
      dioff_lookup_tab
         = VG_(newXA)( ML_(dinfo_zalloc), "di.readdwarf3.ndrw.9",
                       ML_(dinfo_free),
                       sizeof(TempVar*) );
      vg_assert(dioff_lookup_tab);

      n = VG_(sizeXA)( tempvars );
      Word first_primary_var = 0;
      for (first_primary_var = 0;
           escn_debug_info_alt.szB/*really?*/ && first_primary_var < n;
           first_primary_var++) {
         varp = *(TempVar**)VG_(indexXA)( tempvars, first_primary_var );
         if (varp->dioff < escn_debug_info.szB + escn_debug_types.szB)
            break;
      }
      for (i = 0; i < n; i++) {
         varp = *(TempVar**)VG_(indexXA)( tempvars, (i + first_primary_var) % n );
         if (i > first_primary_var) {
            varp2 = *(TempVar**)VG_(indexXA)( tempvars,
                                              (i + first_primary_var - 1) % n );
            /* why should this hold? Only, I think, because we've
               constructed the array by reading .debug_info sequentially,
               and so the array .dioff fields should reflect that, and be
               strictly ascending. */
            vg_assert(varp2->dioff < varp->dioff);
         }
         VG_(addToXA)( dioff_lookup_tab, &varp );
      }
      VG_(setCmpFnXA)( dioff_lookup_tab, cmp_TempVar_by_dioff );
      VG_(sortXA)( dioff_lookup_tab ); /* POINTLESS; FIXME: rm */

      /* Now visit each var. Collect up as much info as possible for
         each var and hand it to ML_(addVar). */
      n = VG_(sizeXA)( tempvars );
      for (j = 0; j < n; j++) {
         TyEnt* ent;
         varp = *(TempVar**)VG_(indexXA)( tempvars, j );

         /* Possibly show .. */
         if (td3) {
            VG_(printf)("<%lx> addVar: level %d: %s :: ",
                        varp->dioff,
                        varp->level,
                        varp->name ? varp->name : "<anon_var>" );
            if (varp->typeR) {
               ML_(pp_TyEnt_C_ishly)( tyents_to_keep, varp->typeR );
            } else {
               VG_(printf)("NULL");
            }
            VG_(printf)("\n Loc=");
            if (varp->gexpr) {
               ML_(pp_GX)(varp->gexpr);
            } else {
               VG_(printf)("NULL");
            }
            VG_(printf)("\n");
            if (varp->fbGX) {
               VG_(printf)(" FrB=");
               ML_(pp_GX)( varp->fbGX );
               VG_(printf)("\n");
            } else {
               VG_(printf)(" FrB=none\n");
            }
            VG_(printf)(" declared at: %s:%d\n",
                        varp->fName ? varp->fName : "NULL",
                        varp->fLine );
            if (varp->absOri != (UWord)D3_INVALID_CUOFF)
               VG_(printf)(" abstract origin: <%lx>\n", varp->absOri);
         }

         /* Skip variables which have no location. These must be
            abstract instances; they are useless as-is since they have
            no specified memory location. They will presumably be
            referred to via the absOri fields of other variables. */
         if (!varp->gexpr) {
            TRACE_D3(" SKIP (no location)\n\n");
            continue;
         }

         /* So it has a location, at least. If it refers to some other
            entry through its absOri field, pull in further info through
            that. */
         if (varp->absOri != (UWord)D3_INVALID_CUOFF) {
            Bool found;
            Word ixFirst, ixLast;
            TempVar key;
            TempVar* keyp = &key;
            TempVar *varAI;
            VG_(memset)(&key, 0, sizeof(key)); /* not necessary */
            key.dioff = varp->absOri; /* this is what we want to find */
            found = VG_(lookupXA)( dioff_lookup_tab, &keyp,
                                   &ixFirst, &ixLast );
            if (!found) {
               /* barf("DW_AT_abstract_origin can't be resolved"); */
               TRACE_D3(" SKIP (DW_AT_abstract_origin can't be resolved)\n\n");
               continue;
            }
            /* If the following fails, there is more than one entry with
               the same dioff. Which can't happen. */
            vg_assert(ixFirst == ixLast);
            varAI = *(TempVar**)VG_(indexXA)( dioff_lookup_tab, ixFirst );
            /* stay sane */
            vg_assert(varAI);
            vg_assert(varAI->dioff == varp->absOri);

            /* Copy what useful info we can. */
            if (varAI->typeR && !varp->typeR)
               varp->typeR = varAI->typeR;
            if (varAI->name && !varp->name)
               varp->name = varAI->name;
            if (varAI->fName && !varp->fName)
               varp->fName = varAI->fName;
            if (varAI->fLine > 0 && varp->fLine == 0)
               varp->fLine = varAI->fLine;
         }

         /* Give it a name if it doesn't have one. */
         if (!varp->name)
            varp->name = ML_(addStr)( di, "<anon_var>", -1 );

         /* So now does it have enough info to be useful? */
         /* NOTE: re typeR: this is a hack. If typeR is Te_UNKNOWN then
            the type didn't get resolved. Really, in that case
            something's broken earlier on, and should be fixed, rather
            than just skipping the variable. */
         ent = ML_(TyEnts__index_by_cuOff)( tyents_to_keep,
                                            tyents_to_keep_cache,
                                            varp->typeR );
         /* The next two assertions should be guaranteed by
            our previous call to resolve_variable_types. */
         vg_assert(ent);
         vg_assert(ML_(TyEnt__is_type)(ent) || ent->tag == Te_UNKNOWN);

         if (ent->tag == Te_UNKNOWN) continue;

         vg_assert(varp->gexpr);
         vg_assert(varp->name);
         vg_assert(varp->typeR);
         vg_assert(varp->level >= 0);

         /* Ok. So we're going to keep it. Call ML_(addVar) once for
            each address range in which the variable exists. */
         TRACE_D3(" ACQUIRE for range(s) ");
         { AddrRange oneRange;
           AddrRange* varPcRanges;
           Word nVarPcRanges;
           /* Set up to iterate over address ranges, however
              represented. */
           if (varp->nRanges == 0 || varp->nRanges == 1) {
              vg_assert(!varp->rngMany);
              if (varp->nRanges == 0) {
                 vg_assert(varp->rngOneMin == 0);
                 vg_assert(varp->rngOneMax == 0);
              }
              nVarPcRanges = varp->nRanges;
              oneRange.aMin = varp->rngOneMin;
              oneRange.aMax = varp->rngOneMax;
              varPcRanges = &oneRange;
           } else {
              vg_assert(varp->rngMany);
              vg_assert(varp->rngOneMin == 0);
              vg_assert(varp->rngOneMax == 0);
              nVarPcRanges = VG_(sizeXA)(varp->rngMany);
              vg_assert(nVarPcRanges >= 2);
              vg_assert(nVarPcRanges == (Word)varp->nRanges);
              varPcRanges = VG_(indexXA)(varp->rngMany, 0);
           }
           if (varp->level == 0)
              vg_assert( nVarPcRanges == 1 );
           /* and iterate */
           for (i = 0; i < nVarPcRanges; i++) {
              Addr pcMin = varPcRanges[i].aMin;
              Addr pcMax = varPcRanges[i].aMax;
              vg_assert(pcMin <= pcMax);
              /* Level 0 is the global address range. So at level 0 we
                 don't want to bias pcMin/pcMax; but at all other levels
                 we do since those are derived from svmas in the Dwarf
                 we're reading. Be paranoid ... */
              if (varp->level == 0) {
                 vg_assert(pcMin == (Addr)0);
                 vg_assert(pcMax == ~(Addr)0);
              } else {
                 /* vg_assert(pcMin > (Addr)0);
                    No .. we can legitimately expect to see ranges like
                    0x0-0x11D (pre-biasing, of course). */
                 vg_assert(pcMax < ~(Addr)0);
              }

              /* Apply text biasing, for non-global variables. */
              if (varp->level > 0) {
                 pcMin += di->text_debug_bias;
                 pcMax += di->text_debug_bias;
              }

              if (i > 0 && (i%2) == 0)
                 TRACE_D3("\n ");
              TRACE_D3("[%#lx,%#lx] ", pcMin, pcMax );

              ML_(addVar)(
                 di, varp->level,
                 pcMin, pcMax,
                 varp->name, varp->typeR,
                 varp->gexpr, varp->fbGX,
                 varp->fName, varp->fLine, td3
              );
           }
         }

         TRACE_D3("\n\n");
         /* and move on to the next var */
      }

      /* Now free all the TempVars */
      n = VG_(sizeXA)( tempvars );
      for (i = 0; i < n; i++) {
         varp = *(TempVar**)VG_(indexXA)( tempvars, i );
         ML_(dinfo_free)(varp);
      }
      VG_(deleteXA)( tempvars );
      tempvars = NULL;

      /* and the temp lookup table */
      VG_(deleteXA)( dioff_lookup_tab );

      /* and the ranges tree. Note that we need to also free the XArrays
         which constitute the keys, hence pass VG_(deleteXA) as a
         key-finalizer. */
      VG_(deleteFM)( rangestree, (void(*)(UWord))VG_(deleteXA), NULL );

      /* and the tyents_to_keep cache */
      ML_(dinfo_free)( tyents_to_keep_cache );
      tyents_to_keep_cache = NULL;

      vg_assert( varparser.filenameTable == NULL );

      /* And the signatured type hash. */
      VG_(HT_destruct) ( signature_types, ML_(dinfo_free) );

      /* record the GExprs in di so they can be freed later */
      vg_assert(!di->admin_gexprs);
      di->admin_gexprs = gexprs;
   }
}


/*------------------------------------------------------------*/
/*---                                                      ---*/
/*--- The "new" DWARF3 reader -- top level control logic   ---*/
/*---                                                      ---*/
/*------------------------------------------------------------*/

static Bool d3rd_jmpbuf_valid = False;
static const HChar* d3rd_jmpbuf_reason = NULL;
static VG_MINIMAL_JMP_BUF(d3rd_jmpbuf);

static __attribute__((noreturn)) void barf ( const HChar* reason ) {
   vg_assert(d3rd_jmpbuf_valid);
   d3rd_jmpbuf_reason = reason;
   VG_MINIMAL_LONGJMP(d3rd_jmpbuf);
   /*NOTREACHED*/
   vg_assert(0);
}


void
ML_(new_dwarf3_reader) (
   struct _DebugInfo* di,
   DiSlice escn_debug_info, DiSlice escn_debug_types,
   DiSlice escn_debug_abbv, DiSlice escn_debug_line,
   DiSlice escn_debug_str, DiSlice escn_debug_ranges,
   DiSlice escn_debug_loc, DiSlice escn_debug_info_alt,
   DiSlice escn_debug_abbv_alt, DiSlice escn_debug_line_alt,
   DiSlice escn_debug_str_alt
)
{
   volatile Int jumped;
   volatile Bool td3 = di->trace_symtab;

   /* Run the _wrk function to read the dwarf3. If it succeeds, it
      just returns normally. If there is any failure, it longjmp's
      back here, having first set d3rd_jmpbuf_reason to something
      useful. */
   vg_assert(d3rd_jmpbuf_valid == False);
   vg_assert(d3rd_jmpbuf_reason == NULL);

   d3rd_jmpbuf_valid = True;
   jumped = VG_MINIMAL_SETJMP(d3rd_jmpbuf);
   if (jumped == 0) {
      /* try this ... */
      new_dwarf3_reader_wrk( di, barf,
                             escn_debug_info, escn_debug_types,
                             escn_debug_abbv, escn_debug_line,
                             escn_debug_str, escn_debug_ranges,
                             escn_debug_loc, escn_debug_info_alt,
                             escn_debug_abbv_alt, escn_debug_line_alt,
                             escn_debug_str_alt );
      d3rd_jmpbuf_valid = False;
      TRACE_D3("\n------ .debug_info reading was successful ------\n");
   } else {
      /* It longjmp'd. */
      d3rd_jmpbuf_valid = False;
      /* Can't longjmp without giving some sort of reason. */
      vg_assert(d3rd_jmpbuf_reason != NULL);

      TRACE_D3("\n------ .debug_info reading failed ------\n");

      ML_(symerr)(di, True, d3rd_jmpbuf_reason);
   }

   d3rd_jmpbuf_valid = False;
   d3rd_jmpbuf_reason = NULL;
}



/* --- Unused code fragments which might be useful one day. --- */

#if 0
   /* Read the arange tables */
   TRACE_SYMTAB("\n");
   TRACE_SYMTAB("\n------ The contents of .debug_aranges ------\n");
   init_Cursor( &aranges, debug_aranges_img,
                debug_aranges_sz, 0, barf,
                "Overrun whilst reading .debug_aranges section" );
   while (True) {
      ULong len, d_i_offset;
      Bool is64;
      UShort version;
      UChar asize, segsize;

      if (is_at_end_Cursor( &aranges ))
         break;
      /* Read one arange thingy */
      /* initial_length field */
      len = get_Initial_Length( &is64, &aranges,
               "in .debug_aranges: invalid initial-length field" );
      version = get_UShort( &aranges );
      d_i_offset = get_Dwarfish_UWord( &aranges, is64 );
      asize = get_UChar( &aranges );
      segsize = get_UChar( &aranges );
      TRACE_D3(" Length: %llu\n", len);
      TRACE_D3(" Version: %d\n", (Int)version);
      TRACE_D3(" Offset into .debug_info: %llx\n", d_i_offset);
      TRACE_D3(" Pointer Size: %d\n", (Int)asize);
      TRACE_D3(" Segment Size: %d\n", (Int)segsize);
      TRACE_D3("\n");
      TRACE_D3(" Address Length\n");

      while ((get_position_of_Cursor( &aranges ) % (2 * asize)) > 0) {
         (void)get_UChar( & aranges );
      }
      while (True) {
         ULong address = get_Dwarfish_UWord( &aranges, asize==8 );
         ULong length = get_Dwarfish_UWord( &aranges, asize==8 );
         TRACE_D3(" 0x%016llx 0x%llx\n", address, length);
         if (address == 0 && length == 0) break;
      }
   }
   TRACE_SYMTAB("\n");
#endif

#endif // defined(VGO_linux) || defined(VGO_darwin)

/*--------------------------------------------------------------------*/
/*--- end                                                          ---*/
/*--------------------------------------------------------------------*/