Dealing with missing system call or ioctl wrappers in Valgrind
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
You're probably reading this because Valgrind bombed out whilst
running your program, and advised you to read this file.  The good
news is that, in general, it's easy to write the missing syscall or
ioctl wrappers you need, so that you can continue your debugging.  If
you send the resulting patches to me, then you'll be doing a favour to
all future Valgrind users too.

Note that an "ioctl" is just a special kind of system call, really; so
there's not a lot of need to distinguish them (at least conceptually)
in the discussion that follows.

All this machinery is in coregrind/m_syswrap.


What are syscall/ioctl wrappers?  What do they do?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Valgrind does what it does, in part, by keeping track of everything your
program does.  When a system call happens, for example a request to read
part of a file, control passes to the Linux kernel, which fulfills the
request, and returns control to your program.  The problem is that the
kernel will often change the status of some part of your program's memory
as a result, and tools (instrumentation plug-ins) may need to know about
this.

Syscall and ioctl wrappers have two jobs:

1. Tell a tool what's about to happen, before the syscall takes place.  A
   tool could perform checks beforehand, e.g. whether memory about to be
   written is actually writeable.  This part is useful, but not strictly
   essential.

2. Tell a tool what just happened, after a syscall takes place.  This is
   so it can update its view of the program's state, e.g. that memory has
   just been written to.  This step is essential.

The "happenings" mostly involve reading/writing of memory.

So, let's look at an example of a wrapper for a system call which
should be familiar to many Unix programmers.


The syscall wrapper for time()
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The wrapper for the time system call looks like this:

  PRE(sys_time)
  {
     /* time_t time(time_t *t); */
     PRINT("sys_time ( %p )",ARG1);
     PRE_REG_READ1(long, "time", int *, t);
     if (ARG1 != 0) {
        PRE_MEM_WRITE( "time(t)", ARG1, sizeof(vki_time_t) );
     }
  }

  POST(sys_time)
  {
     if (ARG1 != 0) {
        POST_MEM_WRITE( ARG1, sizeof(vki_time_t) );
     }
  }

The first thing we do happens before the syscall occurs, in the PRE() function.
The PRE() function typically starts by invoking the PRINT() macro.  This
PRINT() macro implements support for the --trace-syscalls command line option.
Next, the tool is told the return type of the syscall, that the syscall has
one argument, the type of the syscall argument and that the argument is being
read from a register:

     PRE_REG_READ1(long, "time", int *, t);

Next, if a non-NULL buffer is passed in as the argument, tell the tool that the
buffer is about to be written to:

     if (ARG1 != 0) {
        PRE_MEM_WRITE( "time(t)", ARG1, sizeof(vki_time_t) );
     }

Finally, the really important bit, after the syscall occurs, in the POST()
function:  if, and only if, the system call was successful, tell the tool that
the memory was written:

     if (ARG1 != 0) {
        POST_MEM_WRITE( ARG1, sizeof(vki_time_t) );
     }

The POST() function won't be called if the syscall failed, so you
don't need to worry about checking that in the POST() function.
(Note: this is sometimes a bug; some syscalls do return results when
they "fail" - for example, nanosleep returns the amount of unslept
time if interrupted.  TODO: add another per-syscall flag for this
case.)

Note that we use the type 'vki_time_t'.  This is a copy of the kernel
type, with 'vki_' prefixed.  Our copies of such types are kept in the
appropriate vki*.h file(s).  We don't include kernel headers or glibc headers
directly.

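To make the division of labour between PRE() and POST() concrete, here
is a small stand-alone model of the time() wrapper.  It is only a sketch
and does not use any Valgrind code: ToyTool and all the toy_* functions
are made-up names, and the "tool" tracks a single buffer with one
"defined" flag per byte.  The shape is the same as above, though: check
the destination before the call, and mark it as written only after the
call has succeeded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

#define TOY_MAX 64   /* hypothetical limit on the tracked region */

/* Toy stand-in for a tool's view of one region of program memory. */
typedef struct {
    uintptr_t     base;              /* address of the tracked region   */
    size_t        len;               /* its length, at most TOY_MAX     */
    unsigned char defined[TOY_MAX];  /* 1 = byte holds a known value    */
    int           errors;            /* complaints raised by PRE checks */
} ToyTool;

void toy_track(ToyTool *tool, void *base, size_t len)
{
    assert(len <= TOY_MAX);
    tool->base = (uintptr_t)base;
    tool->len = len;
    memset(tool->defined, 0, sizeof tool->defined);
    tool->errors = 0;
}

/* Is [addr, addr+len) entirely inside the tracked region? */
int toy_in_range(const ToyTool *tool, const void *addr, size_t len)
{
    uintptr_t a = (uintptr_t)addr;
    return a >= tool->base && len <= tool->len
        && a - tool->base <= tool->len - len;
}

/* Job 1: before the syscall, check the memory it is about to write. */
void toy_pre_mem_write(ToyTool *tool, const void *addr, size_t len)
{
    if (!toy_in_range(tool, addr, len))
        tool->errors++;
}

/* Job 2: after a successful syscall, record that the memory was written. */
void toy_post_mem_write(ToyTool *tool, const void *addr, size_t len)
{
    if (toy_in_range(tool, addr, len))
        memset(tool->defined + ((uintptr_t)addr - tool->base), 1, len);
}

int toy_is_defined(const ToyTool *tool, const void *addr, size_t len)
{
    if (!toy_in_range(tool, addr, len))
        return 0;
    size_t off = (uintptr_t)addr - tool->base;
    for (size_t i = 0; i < len; i++)
        if (!tool->defined[off + i])
            return 0;
    return 1;
}

/* time() with the PRE and POST steps wrapped around it. */
time_t toy_wrapped_time(ToyTool *tool, time_t *t)
{
    if (t != NULL)
        toy_pre_mem_write(tool, t, sizeof *t);
    time_t res = time(t);
    if (res != (time_t)-1 && t != NULL)   /* POST only on success */
        toy_post_mem_write(tool, t, sizeof *t);
    return res;
}
```

After toy_track(&tool, &slot, sizeof slot), a call to
toy_wrapped_time(&tool, &slot) leaves the bytes of slot marked as
defined, whereas passing a pointer the toy tool knows nothing about
bumps tool.errors in the PRE step.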

Writing your own syscall wrappers (see below for ioctl wrappers)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If Valgrind tells you that system call NNN is unimplemented, do the
following:

1.  Find out the name of the system call:

    grep NNN /usr/include/asm/unistd.h

    This should tell you something like  __NR_mysyscallname.
    Copy this entry to include/vki/vki-scnums-$(VG_PLATFORM).h.


2.  Do 'man 2 mysyscallname' to get some idea of what the syscall
    does.  Note that the actual kernel interface can differ from this,
    so you might also want to check a version of the Linux kernel
    source.

    NOTE: any syscall which has something to do with signals or
    threads is probably "special", and needs more careful handling.
    Post something to valgrind-developers if you aren't sure.


3.  Add a case to the already-huge collection of wrappers in
    the coregrind/m_syswrap/syswrap-*.c files.
    For each in-memory parameter which is read or written by
    the syscall, do one of

      PRE_MEM_READ( ... )
      PRE_MEM_RASCIIZ( ... )
      PRE_MEM_WRITE( ... )

    for that parameter.  Then do the syscall.  Then, if the syscall
    succeeds, issue suitable POST_MEM_WRITE( ... ) calls.
    (There's no need for POST_MEM_READ calls.)

    Also, add it to the syscall_table[] array; use one of GENX_, GENXY,
    LINX_, LINXY, PLAX_, PLAXY.
    Use GEN* for generic syscalls (in syswrap-generic.c), LIN* for
    Linux-specific ones (in syswrap-linux.c) and PLA* for the
    platform-dependent ones (in syswrap-$(PLATFORM)-linux.c).
    Use the *XY variant if the syscall requires both a PRE() and a
    POST() function, and the *X_ variant if it only requires a PRE()
    function.

    If you find this difficult, read the wrappers for other syscalls
    for ideas.  A good tip is to look for the wrapper for a syscall
    which has a similar behaviour to yours, and use it as a
    starting point.

    If you need structure definitions and/or constants for your syscall,
    copy them from the kernel headers into include/vki.h and co., with
    the appropriate vki_*/VKI_* name mangling.  Don't #include any
    kernel headers.  And certainly don't #include any glibc headers.

    Test it.

    Note that a common error is to call POST_MEM_WRITE( ... )
    with 0 (NULL) as the first (address) argument.  This usually means
    your logic is slightly inadequate.  It's a sufficiently common bug
    that there's a built-in check for it, and you'll get a "probable
    sanity check failure" for the syscall wrapper you just made, if this
    is the case.


4.  Once happy, send us the patch.  Pretty please.

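The difference between the *X_ and *XY table entries in step 3 can be
pictured with a toy dispatch table.  This is a sketch only: ToyEntry,
TOYX_, TOYXY and toy_dispatch are invented for illustration and are not
how syscall_table[] is really laid out.  What it does show is the rule
described above: every entry has a PRE hook, only some have a POST hook,
and the POST hook runs only if the syscall succeeded.

```c
#include <assert.h>
#include <stddef.h>

/* Counters standing in for whatever a real tool would record. */
typedef struct {
    int pre_calls;
    int post_calls;
} ToyLog;

typedef void (*ToyHook)(ToyLog *log);

/* One table slot: 'before' is mandatory, 'after' may be NULL. */
typedef struct {
    const char *name;
    ToyHook     before;
    ToyHook     after;
} ToyEntry;

void toy_pre_getpid(ToyLog *log) { log->pre_calls++; }
void toy_pre_time(ToyLog *log)   { log->pre_calls++; }
void toy_post_time(ToyLog *log)  { log->post_calls++; }

/* Hypothetical analogues of the *X_ and *XY registration macros. */
#define TOYX_(nm) { #nm, toy_pre_##nm, NULL }
#define TOYXY(nm) { #nm, toy_pre_##nm, toy_post_##nm }

const ToyEntry toy_table[] = {
    TOYX_(getpid),   /* writes no user memory: PRE only */
    TOYXY(time),     /* may write *t: PRE and POST      */
};

#define TOY_TABLE_LEN (sizeof toy_table / sizeof toy_table[0])

/* Run the hooks around a pretend syscall.  Returns -1 when the number
   has no entry (the "unimplemented" case), 0 otherwise. */
int toy_dispatch(size_t sysno, int syscall_failed, ToyLog *log)
{
    if (sysno >= TOY_TABLE_LEN)
        return -1;
    const ToyEntry *e = &toy_table[sysno];
    assert(e->before != NULL);
    e->before(log);
    /* ... the real syscall would happen here ... */
    if (!syscall_failed && e->after != NULL)
        e->after(log);
    return 0;
}
```

Dispatching entry 0 (getpid) only ever bumps pre_calls; dispatching
entry 1 (time) bumps post_calls too, unless syscall_failed is set.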



Writing your own ioctl wrappers
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Writing ioctl wrappers is pretty much the same as writing syscall
wrappers, except that all the action happens within PRE(ioctl) and
POST(ioctl).

There's a default case, but sometimes it isn't correct, and you have to
write a more specific case to get the right behaviour.

As above, please create a bug report and attach the patch as described
on http://www.valgrind.org.

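Here is one way a default case can go wrong, as a stand-alone sketch.
It assumes (and this is an assumption, not a description of the real
code) that the fallback guesses how much memory an ioctl writes from
direction and size bits packed into the request number, in the style of
the Linux _IOC encoding.  All TOY_* names and both request codes are
invented.  An old-style request number with no such bits makes the guess
come out as zero bytes, so it needs its own case.

```c
#include <assert.h>
#include <stddef.h>

/* Toy request encoding: direction in bits 30-31, size in bits 16-29. */
#define TOY_IOC_WRITE 1UL   /* data flows from user to kernel */
#define TOY_IOC_READ  2UL   /* data flows from kernel to user */
#define TOY_IOC(dir, size, nr) \
    (((unsigned long)(dir) << 30) | ((unsigned long)(size) << 16) | (nr))
#define TOY_IOC_DIR(req)   (((req) >> 30) & 3UL)
#define TOY_IOC_SIZE(req)  (((req) >> 16) & 0x3fffUL)

/* Hypothetical old-style request: no direction or size bits at all,
   yet the (imaginary) driver fills in an int. */
#define TOY_LEGACY_GET 0x5401UL

/* Default guess: trust whatever the request number encodes. */
size_t toy_default_bytes_written(unsigned long req)
{
    return (TOY_IOC_DIR(req) & TOY_IOC_READ) ? TOY_IOC_SIZE(req) : 0;
}

/* How many bytes the POST step should mark as written on success. */
size_t toy_ioctl_bytes_written(unsigned long req)
{
    switch (req) {
    case TOY_LEGACY_GET:
        /* Specific case: the encoding carries no size, so spell it out. */
        return sizeof(int);
    default:
        return toy_default_bytes_written(req);
    }
}
```

For a well-formed request such as TOY_IOC(TOY_IOC_READ, 16, 7) the
default is good enough; for TOY_LEGACY_GET only the specific case gives
the right answer.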