-*-org-*-
* TODO
** Keep exit code of traced process
   See https://bugzilla.redhat.com/show_bug.cgi?id=105371 for details.

** Automatic prototype discovery:
*** Use debuginfo if available
    Alternatively, use debuginfo to generate the config file.
*** Mangled identifiers contain partial prototypes themselves
    They don't contain return type info, which can change the
    parameter passing convention.  We could use what is there and
    hope for the best.  Also they don't include the potentially
    present hidden this pointer.
** Automatically update list of syscalls?
** More operating systems (Solaris?)
** Get rid of EVENT_ARCH_SYSCALL and EVENT_ARCH_SYSRET
** Implement displaced tracing
   A technique used in GDB (and in uprobes, I believe), whereby the
   instruction under the breakpoint is moved somewhere else, and
   followed by a jump back to the original place.  When the
   breakpoint hits, the IP is moved to the displaced instruction, and
   the process is continued.  We avoid all the fuss with
   singlestepping and reenablement.
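
A toy sketch of the idea in Python, not ltrace code.  "Memory" here is a
list of tuples rather than machine code, and all names (run,
plant_displaced_breakpoint, the BRK/ADD/JMP/HALT opcodes) are made up
for illustration.  It only shows the control flow: the breakpoint
records where the displaced copy lives, and a hit continues there
without singlestepping or re-inserting anything.

```python
def run(memory, hits):
    """Interpret the toy program; return the accumulator at HALT."""
    acc, ip = 0, 0
    while True:
        op = memory[ip]
        if op[0] == "ADD":
            acc += op[1]
            ip += 1
        elif op[0] == "JMP":
            ip = op[1]
        elif op[0] == "BRK":
            # Breakpoint hit: record it, then continue at the displaced copy.
            # The breakpoint itself stays in place the whole time.
            hits.append(ip)
            ip = op[1]
        elif op[0] == "HALT":
            return acc
        else:
            raise ValueError(f"unknown opcode {op[0]!r}")


def plant_displaced_breakpoint(memory, addr):
    """Move the instruction at addr to scratch space; put a breakpoint at addr."""
    scratch = len(memory)
    memory.append(memory[addr])        # displaced original instruction
    memory.append(("JMP", addr + 1))   # jump back to the original flow
    memory[addr] = ("BRK", scratch)    # breakpoint remembers its scratch slot


program = [("ADD", 1), ("ADD", 2), ("ADD", 3), ("HALT",)]
plain_result = run(list(program), [])

traced = list(program)
plant_displaced_breakpoint(traced, 1)
hits = []
traced_result = run(traced, hits)
```

A real implementation would additionally have to fix up instructions
whose behavior depends on their own address (PC-relative loads,
relative branches), which this model ignores.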
** Create different ltrace processes to trace different children
** Config file syntax
*** mark some symbols as exported
    For PLT hits, only exported prototypes would be considered.  For
    symtab entry point hits, all would be.

*** named arguments
    This would be useful for replacing the arg1, elt2 etc.

*** parameter pack improvements
    The above format tweaks require that packs that expand to no types
    at all be supported.  If this works, then it should be relatively
    painless to implement conditionals:

    | void ptrace(REQ=enum(PTRACE_TRACEME=0,...),
    |             if[REQ==0](pack(),pack(pid_t, void*, void*)))

    This is of course dangerously close to a programming language, and
    I think ltrace should be careful to stay as simple as possible.
    (We can hook into Lua, or TinyScheme, or some such if we want more
    general scripting capabilities.  Implementing something ad-hoc is
    undesirable.)  But the above can be nicely expressed by pattern
    matching:

    | void ptrace(REQ=enum[int](...)):
    |   [REQ==0] => ()
    |   [REQ==1 or REQ==2] => (pid_t, void*)
    |   [true] => (pid_t, void*, void*);

    Or:

    | int open(string, FLAGS=flags[int](O_RDONLY=00,...,O_CREAT=0100,...)):
    |   [(FLAGS & 0100) != 0] => (flags[int](S_IRWXU,...))

    This would still require pretty complete expression evaluation.
    _Including_ pointer dereferences and such.  And e.g. in accept, we
    need subtraction:

    | int accept(int, +struct(short, +array(hex(char), X-2))*, (X=uint)*);

    Perhaps we should hook to something after all.
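
To make the pattern-matching variant concrete, here is a small Python
sketch of the selection logic only.  It assumes the guards are
evaluated top to bottom and the first match wins; that ordering, the
rule table and the name select_pack are assumptions of this sketch, not
existing ltrace behavior.

```python
# Each rule is (guard, parameter pack); guards mirror the ptrace example.
PTRACE_RULES = [
    (lambda req: req == 0, ()),                         # [REQ==0] => ()
    (lambda req: req in (1, 2), ("pid_t", "void*")),    # [REQ==1 or REQ==2]
    (lambda req: True, ("pid_t", "void*", "void*")),    # [true]
]


def select_pack(rules, req):
    """Return the parameter pack of the first rule whose guard accepts req."""
    for guard, pack in rules:
        if guard(req):
            return pack
    raise LookupError("no rule matched")
```

Guards that are plain comparisons on one named argument stay this
simple.  The open and accept examples need bit operations, arithmetic
and dereferences, which is where a real expression evaluator or an
embedded scripting language would come in.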

*** system call error returns

    This is closely related to the above.  Take the following syscall
    prototype:

    | long read(int,+string0,ulong);

    string0 means the same as string(array(char, zero(retval))*).  But
    if read returns a negative value, that value is a negated errno.
    zero takes it at face value as an array length and complains:

    | read@SYS(3 <no return ...>
    | error: maximum array length seems negative
    | , "\n\003\224\003\n", 4096) = -11

    Ideally we would do what strace does, e.g.:

    | read@SYS(3, 0x12345678, 4096) = -EAGAIN
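
A sketch of the decoding step in Python, assuming the Linux convention
that raw syscall return values from -4095 to -1 are negated errno
codes.  The function names are hypothetical.  The error names come from
Python's errno module, so they reflect the host the sketch runs on and
not necessarily the traced architecture.

```python
import errno

MAX_ERRNO = 4095  # Linux: raw returns in -4095..-1 signal an error


def is_error(retval):
    """True if a raw syscall return value encodes an errno."""
    return -MAX_ERRNO <= retval < 0


def format_sysret(retval):
    """Render a raw syscall return value, naming the error if there is one."""
    if is_error(retval):
        name = errno.errorcode.get(-retval)
        if name is not None:
            return "-" + name
    return str(retval)


def zero_length(retval):
    """Length to use for zero(retval): nothing to show after a failed call."""
    return 0 if is_error(retval) else retval
```

With something like zero_length in place, a failed read would show the
buffer as an address or an empty string instead of tripping the
negative-length check.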

*** errno tracking
    Some calls result in setting errno.  Somehow mark those, and on
    failure, show errno.  System calls return errno as a negative
    value (see the previous point).

*** second conversions?
    This definitely calls for some general scripting.  The goal is to
    have seconds in adjtimex calls show as e.g. 10s, 1m15s or some
    such.
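
For reference, the conversion itself is small.  A Python sketch, with
the function name and the choice of h/m/s units being assumptions:

```python
def format_seconds(total):
    """Render a non-negative number of seconds as e.g. 10s or 1m15s."""
    if total < 0:
        raise ValueError("negative durations are not handled here")
    hours, rest = divmod(total, 3600)
    minutes, seconds = divmod(rest, 60)
    parts = []
    if hours:
        parts.append(f"{hours}h")
    if minutes:
        parts.append(f"{minutes}m")
    if seconds or not parts:
        # Always print something, so zero shows up as "0s".
        parts.append(f"{seconds}s")
    return "".join(parts)
```

The hard part is not this function but telling ltrace which struct
field to apply it to, hence the call for general scripting.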

*** format should take arguments like string does
    Format should take a value argument describing the value that
    should be analyzed.  The following rewriting rules would then
    apply:

    | format       | format(array(char, zero)*) |
    | format(LENS) | X=LENS, format[X]          |

    The latter expanded form would be canonical.

    This depends on named arguments and parameter pack improvements
    (we need to be able to construct parameter packs that expand to
    nothing).

*** More fine-tuned control of right arguments
    Combination of named arguments and some extensions could take care
    of that:

    | void func(X=hide(int*), long*, +pack(X)); |

    This would show long* as input argument (i.e. the function could
    mangle it), and later show the pre-fetched X.  The "pack" syntax is
    utterly undeveloped as of now.  The general idea is to produce
    arguments that expand to some mix of types and values.  But maybe
    all we need is something like

    | void func(out int*, long*); |

    ltrace would know that out/inout/in arguments are given in the
    right order, but left pass should display in and inout arguments
    only, and right pass then out and inout.  + would be
    backward-compatible syntactic sugar, expanded like so:

    | void func(int*, int*, +long*, long*);              |
    | void func(in int*, in int*, out long*, out long*); |

    This is useful in particular for:

    | ulong mbsrtowcs(+wstring3_t, string*, ulong, addr); |
    | ulong wcsrtombs(+string3, wstring_t*, ulong, addr); |

    Where we would like to render arg2 on the way in, and arg1 on the
    way out.

    But sometimes we may want to see a different type on the way in and
    on the way out.  E.g. in asprintf, what's interesting on the way in
    is the address, but on the way out we want to see buffer contents.
    Does something like the following make sense?

    | void func(X=void*, long*, out string(X)); |
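
The in/out/inout proposal from earlier in this item, sketched in
Python.  Parameters are plain strings, the helper names are
hypothetical, and "+" is taken to mean "this and all following
parameters are out", matching the expansion shown in the table above.

```python
def expand_plus(params):
    """Expand the "+" sugar into explicit (direction, type) pairs."""
    expanded, direction = [], "in"
    for param in params:
        if param.startswith("+"):
            direction = "out"      # sticks for all remaining parameters
            param = param[1:]
        expanded.append((direction, param))
    return expanded


def left_pass(params):
    """Types displayed when the call is entered."""
    return [t for d, t in params if d in ("in", "inout")]


def right_pass(params):
    """Types displayed when the call returns."""
    return [t for d, t in params if d in ("out", "inout")]


legacy = expand_plus(["int*", "int*", "+long*", "long*"])
```

With explicit directions, mbsrtowcs could mark arg1 as out and arg2 as
in even though arg1 comes first, which the "+" form cannot express.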

** Support for functions that never return
   This would be useful for __cxa_throw, presumably also for longjmp
   (do we handle that at all?) and perhaps a handful of others.

** Support flag fields
   enum-like syntax, except disjunction of several values is assumed.
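
A Python sketch of how such a field could be rendered.  The two flag
values are the ones from the open example earlier in this file.  The
function name, the "|" separator and the octal fallback for unknown
bits are assumptions.

```python
def decode_flags(value, names):
    """Render value as a disjunction of named bits; unknown bits stay octal."""
    if value == 0:
        # A zero-valued name (like O_RDONLY) can only match an all-zero field.
        zero_names = [name for name, bit in names.items() if bit == 0]
        return zero_names[0] if zero_names else "0"
    parts = []
    for name, bit in names.items():
        if bit != 0 and value & bit == bit:
            parts.append(name)
            value &= ~bit
    if value:
        parts.append(oct(value))   # leftover bits nobody named
    return "|".join(parts)


OPEN_FLAGS = {"O_RDONLY": 0o0, "O_CREAT": 0o100}
```

Zero-valued names are the awkward case: this sketch prints them only
when the whole field is zero, whereas strace would show
O_RDONLY|O_CREAT for 0100.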
** Support long long
   We currently can't define time_t on 32-bit machines.  That means we
   can't describe a range of time-related functions.

** Support signed char, unsigned char, char
   Also, don't format it as a character by default; the string lens
   can do that.  Perhaps introduce byte and ubyte and leave 'char' as
   an alias of one of those with the string lens applied by default.

** Support fixed-width types
   Really we should keep everything as {u,}int{8,16,32,64} internally,
   and have long, short and others be translated to one of those
   according to architecture rules.  Maybe this could be achieved by a
   per-arch config file with typedefs such as:

   | typedef ulong = uint64_t; |
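
What the per-arch translation could amount to, as a Python sketch.
The table layout and the resolve helper are hypothetical.  The widths
assume the usual Linux data models (LP64 on x86_64, ILP32 on i386).

```python
FIXED_WIDTH = {f"{sign}int{bits}" for sign in ("", "u") for bits in (8, 16, 32, 64)}

# One table per architecture, playing the role of a per-arch config file.
ARCH_TYPEDEFS = {
    "x86_64": {"short": "int16", "int": "int32", "long": "int64", "ulong": "uint64"},
    "i386": {"short": "int16", "int": "int32", "long": "int32", "ulong": "uint32"},
}


def resolve(arch, type_name):
    """Translate a type name to the fixed-width type used internally."""
    if type_name in FIXED_WIDTH:
        return type_name           # already fixed-width, nothing to do
    return ARCH_TYPEDEFS[arch][type_name]
```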

** Support for ARM/AARCH64 types
   - ARM and AARCH64 both support half-precision floating point
     - there are two different half-precision formats, IEEE 754-2008
       and "alternative".  Both have 10 bits of mantissa and 5 bits of
       exponent, and differ only in how exponent==0x1F is handled.  In
       IEEE format, we get NaNs and infinities; in alternative
       format, this encodes normalized value (-1)^S × 2^16 × (1.mant)
     - The Floating-Point Control Register, FPCR, selects the
       half-precision format where applicable (the FPCR.AHP bit).
   - AARCH64 supports fixed-point interpretation of {,double}words
     - e.g. fixed(int, X) (int interpreted as a number with X
       binary digits of fraction).
   - AARCH64 supports 128-bit quad words in SIMD
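
A Python sketch of both decodings, following the bit layout described
above (1 sign bit, 5 exponent bits with bias 15, 10 mantissa bits).
Function names are hypothetical, and decode_fixed is only one reading
of what fixed(int, X) would mean.

```python
import math


def decode_half(bits, alternative=False):
    """Decode a 16-bit half-precision pattern (IEEE 754-2008 or alternative)."""
    sign = -1.0 if bits & 0x8000 else 1.0
    exponent = (bits >> 10) & 0x1F
    mantissa = bits & 0x3FF
    if exponent == 0:
        # Zero and subnormals: no implicit leading one.
        return sign * 2.0 ** -14 * (mantissa / 1024)
    if exponent == 0x1F and not alternative:
        # IEEE format: infinities and NaNs.
        return sign * math.inf if mantissa == 0 else math.nan
    # Normalized values.  In the alternative format this also covers
    # exponent == 0x1F, giving (-1)^S * 2^16 * (1.mant).
    return sign * 2.0 ** (exponent - 15) * (1 + mantissa / 1024)


def decode_fixed(raw, fraction_bits):
    """Interpret an integer as fixed-point with fraction_bits fraction digits."""
    return raw / (1 << fraction_bits)
```

Which of the two half-precision formats applies is a property of the
traced thread's FPCR at the time of the call, so a prototype alone
cannot decide it.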

** Some more functions in vect might be made to take const*
   Or even marked __attribute__((pure)).

** pretty printer support
   GDB supports Python pretty printers.  We might want to hook this in
   and use it to format certain types.

** support new Linux kernel features
   - PTRACE_SEIZE
   - /proc/PID/map_files/* (but only root seems to be able to read
     this as of now)

* BUGS
** After a clone(), syscalls may be seen as sysrets in s390 (see trace.c:syscall_p())