-----------------------------------------------------------------------------
Info about the relationship between Segments and SegInfos
-----------------------------------------------------------------------------

SegInfo is from the very original Valgrind code, and so it predates
Segments. It's poorly named now; it's really just a container for all
the object file metadata (symbols, debug info, etc).

Segments describe memory mapped into the address space, and so any
address-space changing operation needs to update the Segment structure.
After the process is initialized, this means one of:

* mmap
* munmap
* mprotect
* brk
* stack growth

A piece of address space may or may not be mmaped from a file.

A SegInfo specifically describes memory mmaped from an ELF object file.
Because a single ELF file may be mmaped with multiple Segments, multiple
Segments can point to one SegInfo. A SegInfo can relate to a memory
range which is not yet mmaped. For example, if the process mmaps the
first page of an ELF file (the one containing the header), a SegInfo
will be created for that ELF file's mappings, which will include memory
which will be later mmaped by the client's ELF loader. If a new mmap
appears in the address range of an existing SegInfo, it will have that
SegInfo attached to it, presumably because it's part of a .so file.
Similarly, if a Segment gets split (by mprotect, for example), the two
pieces will still be associated with the same SegInfo. For this reason,
the address/length info in a SegInfo is not a duplicate of the Segment
address/length.

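The many-to-one relationship described above can be sketched roughly as
follows. This is only an illustration: the type layouts and the helper
names (split_segment, attach_if_covered) are hypothetical and are not
taken from the Valgrind sources.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified types; the real Valgrind structures differ. */
typedef struct {
    unsigned long start;   /* start of the range covered by the ELF object */
    unsigned long size;    /* may include memory that is not yet mmaped */
} SegInfo;

typedef struct {
    unsigned long addr;    /* start of this mapping */
    unsigned long len;     /* length of this mapping */
    SegInfo      *seginfo; /* NULL if not associated with an ELF object */
} Segment;

/* Split *seg at address 'at'; the upper piece is written to *upper.
   Both pieces keep pointing at the same SegInfo. */
static void split_segment(Segment *seg, unsigned long at, Segment *upper)
{
    assert(at > seg->addr && at < seg->addr + seg->len);
    upper->addr    = at;
    upper->len     = seg->addr + seg->len - at;
    upper->seginfo = seg->seginfo;
    seg->len       = at - seg->addr;
}

/* Attach an existing SegInfo to a mapping that starts inside its range. */
static void attach_if_covered(Segment *seg, SegInfo *si)
{
    if (seg->addr >= si->start && seg->addr < si->start + si->size)
        seg->seginfo = si;
}
```

Note that the SegInfo range is kept separately from each Segment's
addr/len, which is why splitting a Segment does not touch the SegInfo.
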
This is complex for several reasons:

1. We assume that if a process is mmaping a file which contains an
   ELF header, it intends to use it as an ELF object. If a program
   just mmaps ELF files but uses them only as raw data (to copy them,
   for example), we still treat it as a shared-library opening.
2. Even if it is being loaded as a shared library/other ELF object,
   Valgrind doesn't control the mmaps. It just observes the mmaps
   being generated by the client and has to cope. One of the reasons
   that Valgrind has to make its own mmap of each .so for reading
   symtab information is that the client won't necessarily mmap the
   right pieces, or may mmap them in the wrong order for us.

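The assumption in point 1 amounts to a check of roughly this shape. It
is a minimal sketch, not Valgrind's actual test: the function name
looks_like_elf is hypothetical, and it only compares the four ELF magic
bytes (0x7f 'E' 'L' 'F') at the start of the mapped data.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the buffer begins with the ELF magic bytes, else 0.
   A real check would go on to validate the rest of the header. */
static int looks_like_elf(const unsigned char *buf, size_t len)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };
    assert(buf != NULL || len == 0);
    return len >= sizeof magic && memcmp(buf, magic, sizeof magic) == 0;
}
```

A test like this cannot tell a program that is loading a shared object
from one that is merely copying an ELF file, which is exactly the
ambiguity described in point 1.
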
SegInfos are reference counted, and freed when no Segments point to them
any more.

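As a rough sketch of that reference counting (all names here are
hypothetical and the real bookkeeping is more involved), attaching and
detaching could look like this:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified types for illustration only. */
typedef struct {
    int refcount;            /* number of Segments pointing at this SegInfo */
} CountedSegInfo;

typedef struct {
    CountedSegInfo *seginfo; /* NULL if none attached */
} CountedSegment;

/* Allocate a SegInfo with no referencing Segments yet. */
static CountedSegInfo *seginfo_new(void)
{
    return calloc(1, sizeof(CountedSegInfo));
}

/* Point seg at si and count the new reference. */
static void segment_attach(CountedSegment *seg, CountedSegInfo *si)
{
    seg->seginfo = si;
    si->refcount++;
}

/* Drop seg's reference. Returns 1 if this freed the SegInfo, else 0. */
static int segment_detach(CountedSegment *seg)
{
    CountedSegInfo *si = seg->seginfo;
    seg->seginfo = NULL;
    if (si == NULL)
        return 0;
    assert(si->refcount > 0);
    if (--si->refcount == 0) {
        free(si);
        return 1;
    }
    return 0;
}
```

In this sketch a munmap of the last Segment belonging to an object is
the point at which its symbols and debug info would be discarded.
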
> Aha. So the range of a SegInfo will always be equal to or greater
> than the range of its parent Segment? Or can you eg. mmap a whole
> file plus some extra pages, and then the SegInfo won't cover the extra
> part of the range?

That would be unusual, but possible. You could imagine ld generating an
ELF file via a mapping this way (which would probably upset Valgrind no
end).

-----------------------------------------------------------------------------
More from John Reiser
-----------------------------------------------------------------------------
> Can a Segment get split (eg. by mprotect)?

This happens when a debugger inserts a breakpoint, or when ld-linux
relocates a module that has DT_TEXTREL, or when a co-resident monitor
rewrites some instructions. On x86, a shared lib with relocations to
.text "works" just fine. The modified pages are no longer sharable,
but the instruction stream is functional. It's even rather common,
when a builder forgets to use -fpic for one or more files. It
can be done on purpose when the modularity is more important than
the page sharing. Non-pic code is faster, too: register %ebx is
not dedicated to _GLOBAL_OFFSET_TABLE_ addressing, and global variables
can be accessed by [relocated] inline 32-bit offset rather than by
address fetched from the GOT.

> Can a new mmap appear in the address range of an existing SegInfo?

On x86_64 the static linker ld inserts a 1MB "hole" between .text
and .data. This is on advice from the hardware performance mavens,
because various caching+prefetching hardware can look ahead that far.
Currently ld-linux leaves this as PROT_NONE, but anybody else is
free to override that assignment.

> From peering at various /proc/*/maps files, the following scheme
> sounds plausible:
>
> Load symbols following an mmap if:
>
>    map is to a file
>    map has r-x permissions
>    file has a valid ELF header
>    possibly: mapping is > 1 page (catches the case of mapping first
>              page just to examine the header)
>
> If the client wants to subsequently chop up the mapping, or change its
> permissions, we ignore that. I have never seen any evidence in
> /proc/*/maps that ld.so does such things.

glibc-2.3.5 ld-linux does. It finds the minimum interval of pages which
covers the p_memsz of all PT_LOAD, mmap()s that much from the file [even if
this maps beyond EOF of the file], then munmap()s [or mprotect(,,PROT_NONE)]
everything that is not covered by the first PT_LOAD, then
mmap(,,,MAP_FIXED,,) each remaining PT_LOAD. This is done to overcome the
possibility that a kernel which randomizes the placement of mmap(0, ...)
might place the first PT_LOAD so that subsequent PT_LOAD [must maintain
relative addressing to other PT_LOAD from the same file] would evict
something else. Needless to say, ld-linux assumes that it is the only actor
(well, dlopen() does try for mutual exclusion) and that any "holes" between
PT_LOAD from the same module are ignorable as far as allocation is
concerned. Also, there is nothing to stop a file from having PT_LOAD that
overlap, or appear in non-ascending order, etc. The results might depend on
order of processing, but always it has been by order of appearance in the
file. [Probably this is a good way to trigger "bugs" in ld-linux and/or the
kernel.]

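The first step above, finding the minimum interval of pages that covers
every PT_LOAD, can be sketched as below. This is an illustration only,
not glibc's code: the LoadSeg struct is a hypothetical stand-in for the
p_vaddr and p_memsz fields of a program header, and the 4KB page size is
an assumption.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE 0x1000UL  /* assumed page size (4KB) */

/* Hypothetical stand-in for the relevant fields of a PT_LOAD header. */
typedef struct {
    unsigned long vaddr;  /* like p_vaddr */
    unsigned long memsz;  /* like p_memsz */
} LoadSeg;

static unsigned long page_down(unsigned long a)
{
    return a & ~(SKETCH_PAGE - 1);
}

static unsigned long page_up(unsigned long a)
{
    return (a + SKETCH_PAGE - 1) & ~(SKETCH_PAGE - 1);
}

/* Compute the smallest page-aligned interval [*lo, *hi) covering all
   entries. Entries may overlap or appear in any order. */
static void load_span(const LoadSeg *segs, size_t n,
                      unsigned long *lo, unsigned long *hi)
{
    assert(n > 0);
    *lo = page_down(segs[0].vaddr);
    *hi = page_up(segs[0].vaddr + segs[0].memsz);
    for (size_t i = 1; i < n; i++) {
        unsigned long s = page_down(segs[i].vaddr);
        unsigned long e = page_up(segs[i].vaddr + segs[i].memsz);
        if (s < *lo) *lo = s;
        if (e > *hi) *hi = e;
    }
}
```

From Valgrind's point of view, the single large mmap of *hi - *lo bytes
is the first thing observed, followed by the munmap/mprotect and
MAP_FIXED calls that carve it up, so any "holes" between PT_LOADs show
up as gaps or PROT_NONE pieces inside the interval.
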
Some algorithms and data structures internal to glibc-2.3.5 assume that
modules do not overlap. In particular, ld-linux sometimes searches
for __builtin_return_address(0) in a set of intervals in order to determine
which shared lib called ld-linux. This matters for dlsym(), dlmopen(),
etc., and assumes that the intervals are a disjoint cover of any
"legal" callers. ld-linux tries to hide all of this from the prying
eyes of anyone else [the internal version of struct link_map contains
much more than specified in <link.h>]. Some of this is good because
it changes very frequently, but some parts are bad because in the past
ld-linux has been slow to provide needed services [such as
dl_iterate_phdr()] and even antagonistic towards anybody else
trying for peaceful co-existence without the blessing of ld-linux.