Update README_MISSING_SYSCALL_OR_IOCTL. It obviously hadn't been updated
for a while, was badly out of date, and was undoubtedly confusing to
anyone who read it.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@1679 a5019735-40e9-0310-863c-91ae7b9d1cf9
diff --git a/README_MISSING_SYSCALL_OR_IOCTL b/README_MISSING_SYSCALL_OR_IOCTL
index 4545f83..7678bb7 100644
--- a/README_MISSING_SYSCALL_OR_IOCTL
+++ b/README_MISSING_SYSCALL_OR_IOCTL
@@ -12,74 +12,69 @@
there's not a lot of need to distinguish them (at least conceptually)
in the discussion that follows.
-All this machinery is in vg_syscall_mem.c.
+All this machinery is in coregrind/vg_syscalls.c.
What are syscall/ioctl wrappers? What do they do?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Valgrind does what it does, in part, by keeping track of the status of
-all bytes of memory accessible by your program. When a system call
-happens, for example a request to read part of a file, control passes
-to the Linux kernel, which fulfills the request, and returns control
-to your program. The problem is that the kernel will often change the
-status of some part of your program's memory as a result.
+Valgrind does what it does, in part, by keeping track of everything your
+program does. When a system call happens, for example a request to read
+part of a file, control passes to the Linux kernel, which fulfills the
+request, and returns control to your program. The problem is that the
+kernel will often change the status of some part of your program's memory
+as a result, and skins (instrumentation plug-ins) may need to know about
+this.
-The job of syscall and ioctl wrappers is to spot such system calls,
-and update Valgrind's memory status maps accordingly. This is
-essential, because not doing so would cause you to be flooded with
-errors later on, and, in general, because it's important that
-Valgrind's idea of accessible memory corresponds to that of the Linux
-kernel's. And for other reasons too.
+Syscall and ioctl wrappers have two jobs:
-In addition, Valgrind takes the opportunity to perform some sanity
-checks on the parameters you are presenting to system calls. This
-isn't essential for the correct operation of Valgrind, but it does
-allow it to warn you about various kinds of misuses which would
-otherwise mean your program just dies without warning, usually with a
-segmentation fault.
+1. Tell a skin what's about to happen, before the syscall takes place. A
+ skin could perform checks beforehand, eg. if memory about to be written
+ is actually writeable. This part is useful, but not strictly
+ essential.
+
+2. Tell a skin what just happened, after a syscall takes place. This is
+ so it can update its view of the program's state, eg. that memory has
+ just been written to. This step is essential.
+
+The "happenings" mostly involve reading/writing of memory.
So, let's look at an example of a wrapper for a system call which
should be familiar to many Unix programmers.
-The syscall wrapper for read()
+The syscall wrapper for time()
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Removing the debug printing clutter, it looks like this:
- case __NR_read: /* syscall 3 */
- /* size_t read(int fd, void *buf, size_t count); */
- must_be_writable( "read(buf)", arg2, arg3 );
- KERNEL_DO_SYSCALL(res);
- if (!VG_(is_kerror)(res) && res > 0) {
- make_readable( arg2, res );
- }
- break;
+ case __NR_time: /* syscall 13 */
+ /* time_t time(time_t *t); */
+ if (arg1 != (UInt)NULL) {
+ SYSCALL_TRACK( pre_mem_write, tst, "time", arg1, sizeof(time_t) );
+ }
+ KERNEL_DO_SYSCALL(tid,res);
+ if (!VG_(is_kerror)(res) && arg1 != (UInt)NULL) {
+ VG_TRACK( post_mem_write, arg1, sizeof(time_t) );
+ }
+ break;
-The first thing we do is check that the buffer, which you planned to
-have the result written to, really is addressible ("writable", here).
-Hence:
+First, if a non-NULL buffer is passed in as the argument, we tell the skin
+that the buffer is about to be written to:
- must_be_writable( "read(buf)", arg2, arg3 );
+ if (arg1 != (UInt)NULL) {
+ SYSCALL_TRACK( pre_mem_write, tst, "time", arg1, sizeof(time_t) );
+ }
-which causes Valgrind to issue a warning if the address range
-[arg2 .. arg2 + arg3 - 1] is not writable. This is one of those
-nice-to-have-but-not-essential checks mentioned above. Note that
-the syscall args are always called arg1, arg2, arg3, etc. Here,
-arg1 corresponds to "fd" in the prototype, arg2 to "buf", and arg3
-to "count".
+Now Valgrind asks the kernel to actually do the system call, for the thread
+identified by thread ID "tid", depositing the return value in "res":
-Now Valgrind asks the kernel to do the system call, depositing the
-return code in "res":
-
- KERNEL_DO_SYSCALL(res);
+ KERNEL_DO_SYSCALL(tid, res);
Finally, the really important bit. If, and only if, the system call
-was successful, mark the buffer as readable (ie, as having valid
-data), for as many bytes as were actually read:
+was successful, tell the skin that the memory was written:
- if (!VG_(is_kerror)(res) && res > 0) {
- make_readable( arg2, res );
- }
+ if (!VG_(is_kerror)(res) && arg1 != (UInt)NULL) {
+ VG_TRACK( post_mem_write, arg1, sizeof(time_t) );
+ }
The function VG_(is_kerror) tells you whether or not its argument
represents a Linux kernel return error code. Hence the test.
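
To make the pre/call/post shape of the time() wrapper concrete, here is a
small stand-alone model of it in ordinary C. It is only a sketch and not
Valgrind code: every model_* name is a hypothetical stand-in, the libc
time() call stands in for KERNEL_DO_SYSCALL, and model_is_kerror assumes
the usual Linux convention that raw syscall results in the range
-4095 .. -1 are error codes (the real VG_(is_kerror) may use different
bounds).

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-ins for a skin's callbacks.  They only count how
   often they are called and remember the last length reported. */
static int    n_pre_writes  = 0;
static int    n_post_writes = 0;
static size_t last_post_len = 0;

static void model_pre_mem_write(const char *what, void *addr, size_t len)
{
    (void)what; (void)addr; (void)len;
    n_pre_writes++;
}

static void model_post_mem_write(void *addr, size_t len)
{
    /* Models the built-in sanity check: address 0 means a wrapper bug. */
    assert(addr != NULL);
    n_post_writes++;
    last_post_len = len;
}

/* Stand-in for VG_(is_kerror).  Assumption: raw Linux syscall results
   in -4095 .. -1 denote errors. */
static int model_is_kerror(long res)
{
    return res >= -4095 && res <= -1;
}

/* Model of the wrapper: tell the skin what is about to happen, do the
   call, then tell the skin what happened, but only on success. */
static long model_wrap_time(time_t *t)
{
    long res;

    if (t != NULL)
        model_pre_mem_write("time", t, sizeof(time_t));

    res = (long)time(t);            /* stands in for KERNEL_DO_SYSCALL */

    if (!model_is_kerror(res) && t != NULL)
        model_post_mem_write(t, sizeof(time_t));

    return res;
}
```

Calling model_wrap_time(NULL) triggers neither callback, while calling it
with the address of a time_t triggers each callback exactly once. Both
the pre and the post step carry the NULL test, which is what keeps the
address-0 sanity check from firing.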
@@ -102,13 +97,17 @@
3. Add a case to the already-huge collection of wrappers in
- vg_syscall_mem.c. For each in-memory parameter which is read
- by the syscall, do a must_be_readable or must_be_readable_asciiz
- on that parameter. Then do the syscall. Then, if the syscall
- succeeds, issue suitable make_readable/writable/noaccess calls
- afterwards, so as to update Valgrind's memory maps to reflect
- the state change caused by the call.
-
+ coregrind/vg_syscalls.c. For each in-memory parameter which is read or
+ written by the syscall, do one of
+
+ SYSCALL_TRACK( pre_mem_read, ... )
+ SYSCALL_TRACK( pre_mem_read_asciiz, ... )
+ SYSCALL_TRACK( pre_mem_write, ... )
+
+ for that parameter. Then do the syscall. Then, if the syscall
+ succeeds, issue suitable VG_TRACK( post_mem_write, ... ) calls.
+ (There's no need for post_mem_read calls.)
+
If you find this difficult, read the wrappers for other syscalls
for ideas. A good tip is to look for the wrapper for a syscall
which has a similar behaviour to yours, and use it as a
@@ -119,12 +118,12 @@
Test it.
- Note that a common error is to call make_readable or make_writable
- with 0 (NULL) as the first (address) argument. This usually means your
- logic is slightly inadequate. It's a sufficiently common bug that
- there's a built-in check for it, and you'll get a "probably sanity
- check failure" for the syscall wrapper you just made, if this is
- the case.
+ Note that a common error is to call VG_TRACK( post_mem_write, ... )
+ with 0 (NULL) as the first (address) argument. This usually means
+ your logic is slightly inadequate. It's a sufficiently common bug
+ that there's a built-in check for it, and you'll get a "probably
+ sanity check failure" for the syscall wrapper you just made, if this
+ is the case.
Note that many syscalls are bracketed by #if defined(__NR_mysyscall)
... #endif, because they exist only in the 2.4 kernel and not
@@ -141,11 +140,8 @@
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Is pretty much the same as writing syscall wrappers.
-If you can't be bothered, do a cheap hack: add it (the ioctl number
-emitted in Valgrind's panic-message) to the long list of IOCTLs which
-are noted but not fully handled by Valgrind (search for the text
-"noted but unhandled ioctl" in vg_syscall_mem.c). This will get you
-going immediately, at the risk of giving you spurious value errors.
+There's a default case, but sometimes it isn't correct and you have to write
+a more specific case to get the right behaviour.
As above, please do send me the resulting patch.
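
Since an ioctl wrapper follows the same pattern, a specific case might be
modelled as below. This is a sketch in ordinary C and not Valgrind code:
the model_* names are hypothetical stand-ins, the libc ioctl() call stands
in for KERNEL_DO_SYSCALL, and its -1/errno result is converted to the raw
-errno form that the kernel itself would return. TIOCGWINSZ is used as
the example because the kernel writes a struct winsize through its third
argument.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/ioctl.h>

/* Hypothetical stand-ins for a skin's callbacks; they only count calls. */
static int ioctl_pre_writes  = 0;
static int ioctl_post_writes = 0;

static void model_pre_mem_write(const char *what, void *addr, size_t len)
{
    (void)what; (void)addr; (void)len;
    ioctl_pre_writes++;
}

static void model_post_mem_write(void *addr, size_t len)
{
    (void)len;
    assert(addr != NULL);   /* address 0 would indicate a wrapper bug */
    ioctl_post_writes++;
}

/* Assumption: raw Linux syscall results in -4095 .. -1 denote errors. */
static int model_is_kerror(long res)
{
    return res >= -4095 && res <= -1;
}

/* Do the call via libc and convert -1/errno back to raw -errno form. */
static long model_do_ioctl(int fd, unsigned long request, void *arg)
{
    long res = ioctl(fd, request, arg);
    if (res == -1)
        res = -errno;
    return res;
}

static long model_wrap_ioctl(int fd, unsigned long request, void *arg)
{
    long res;

    switch (request) {
    case TIOCGWINSZ:
        /* Specific case: the kernel fills in a struct winsize at arg. */
        model_pre_mem_write("ioctl(TIOCGWINSZ)", arg, sizeof(struct winsize));
        res = model_do_ioctl(fd, request, arg);
        if (!model_is_kerror(res) && res == 0)
            model_post_mem_write(arg, sizeof(struct winsize));
        break;
    default:
        /* No specific knowledge: this model just performs the call and
           tells the skin nothing. */
        res = model_do_ioctl(fd, request, arg);
        break;
    }
    return res;
}
```

With an invalid file descriptor the call fails, so the pre callback runs
but the post callback does not; the skin is never told about a write
that did not happen.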