Valgrind-developer notes, re the MacOSX port
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

JRS 22 Mar 09: re these comments in m_libc* and m_debuglog:

/* IMPORTANT: on Darwin it is essential to use the _nocancel versions
   of syscalls rather than the vanilla version, if a _nocancel version
   is available.  See docs/internals/Darwin-notes.txt for the reason
   why. */

When Valgrind does read/write/open/close etc. syscalls for its own
purposes (not on behalf of the client), it really is critical to use
the _nocancel versions of syscalls rather than the vanilla versions.
This holds throughout the entire code base: whenever V does a syscall
for its own purposes, we must use the _nocancel version if it exists.
This is of course most prevalent in m_libc*, since all of our
own-purpose (non-client) syscalls should get routed through there.

Why?  Because on Darwin, pthread cancellation is done within the
kernel (unlike on Linux, iiuc).  And read/write/open/close and a whole
bunch of other syscalls to do with stream I/O are cancellation points.
So what can happen is: the client informs the kernel that a given
thread is to be cancelled.  Then at the next (eg) VG_(printf) call by
that thread, which leads to a sys_write, the write syscall gets hit by
the cancellation request, and is duly nuked by the kernel.  Of course
from the outside it looks as if the thread had mysteriously
disappeared off the radar for no reason.

In short, we need to use _nocancel versions in order to ensure that
cancellation requests only take effect at the places where the client
does a syscall, and not the places where Valgrind does syscalls.
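
The rule above can be illustrated in miniature with portable POSIX
threads.  This is only a sketch and is not Valgrind code: it uses
pthread_setcancelstate(PTHREAD_CANCEL_DISABLE) as a stand-in for the
Darwin _nocancel syscall variants, and every name in it
(run_cancel_demo, worker, the three pipes) is hypothetical.  A thread
with a cancellation request already pending does an "internal" write
that must survive, and is then cancelled at the next "client" read.

```c
#include <errno.h>
#include <pthread.h>
#include <unistd.h>

/* Outcome of the demo (hypothetical type, for illustration only). */
struct cancel_demo_result {
    int internal_write_done;  /* 1 if the "tool" write completed        */
    int passed_client_point;  /* 1 if the thread survived "client" read */
    int was_cancelled;        /* 1 if the thread exited by cancellation */
};

static int ready_pipe[2];  /* worker -> main: cancellation is disabled  */
static int go_pipe[2];     /* main -> worker: cancel has been requested */
static int sink_pipe[2];   /* target of the worker's internal write     */
static struct cancel_demo_result result;

/* read() that retries if a signal interrupts it. */
static ssize_t read_retry(int fd, void *buf, size_t n)
{
    ssize_t r;
    do { r = read(fd, buf, n); } while (r == -1 && errno == EINTR);
    return r;
}

static void *worker(void *unused)
{
    int old_state;
    char c = 'r';
    ssize_t n;
    (void)unused;

    /* "Tool" side: stand-in for using _nocancel syscalls. */
    pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, &old_state);
    if (write(ready_pipe[1], &c, 1) != 1) return NULL;
    if (read_retry(go_pipe[0], &c, 1) != 1) return NULL;

    /* A cancel is pending now, but this internal write is immune. */
    if (write(sink_pipe[1], &c, 1) == 1)
        result.internal_write_done = 1;

    /* "Client" side: cancellation is honoured again. */
    pthread_setcancelstate(old_state, NULL);
    n = read(go_pipe[0], &c, 1);     /* cancellation point */
    (void)n;
    result.passed_client_point = 1;  /* not reached if cancelled above */
    return NULL;
}

struct cancel_demo_result run_cancel_demo(void)
{
    pthread_t t;
    void *retval = NULL;
    char buf[2] = { 'g', 'g' };

    result.internal_write_done = 0;
    result.passed_client_point = 0;
    result.was_cancelled = 0;
    if (pipe(ready_pipe) || pipe(go_pipe) || pipe(sink_pipe)) return result;
    if (pthread_create(&t, NULL, worker, NULL) != 0) return result;

    /* Wait until the worker has disabled cancellation, then cancel it. */
    if (read_retry(ready_pipe[0], buf, 1) != 1) return result;
    pthread_cancel(t);
    /* Two bytes: one for the immune read, one for the "client" read. */
    if (write(go_pipe[1], buf, 2) != 2) return result;

    pthread_join(t, &retval);
    result.was_cancelled = (retval == PTHREAD_CANCELED);

    close(ready_pipe[0]); close(ready_pipe[1]);
    close(go_pipe[0]);    close(go_pipe[1]);
    close(sink_pipe[0]);  close(sink_pipe[1]);
    return result;
}
```

Calling run_cancel_demo() from a main() should report that the
internal write completed and that the thread was cancelled at the
later read.  Without the disable/restore pair, the request could take
effect at one of the internal syscalls instead, which is the failure
mode described above.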

How observed: using the standard pipe-based implementation in
coregrind/m_scheduler/sema.c, none/tests/pth_cancel1 would hang
(whereas it succeeds using native Darwin semaphores).  And if the
"pause()" call in said test is turned into a spin ("while (1) ;") then
the entire Valgrind run mysteriously disappears, rather than spinning
as it does with native Darwin semaphores.

Because the pipe-based semaphore intensively uses sys_read/sys_write,
it is not surprising that it inadvertently was eating up cancellation
requests directed to client threads.  With the above-mentioned change
in force, the pipe-based semaphore appears to work correctly.
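
For readers unfamiliar with the idea, a pipe-based semaphore can be
sketched roughly as follows.  This is a minimal illustration and not
the code in sema.c; the type and function names (pipe_sema,
pipe_sema_init and so on) are hypothetical.  The point to notice is
that every acquire is a read() and every release is a write(), so each
operation passes through a cancellation point unless a non-cancellable
variant is used.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical type, for illustration; not the struct used in sema.c. */
typedef struct {
    int fds[2];   /* fds[0]: read end, fds[1]: write end */
} pipe_sema;

/* Release: put one token into the pipe.  Returns 0 on success. */
int pipe_sema_up(pipe_sema *s)
{
    char tok = 'T';
    ssize_t r;
    /* write() is a cancellation point; this is one of the places
       where, per the notes above, a _nocancel variant is needed. */
    do { r = write(s->fds[1], &tok, 1); } while (r == -1 && errno == EINTR);
    return r == 1 ? 0 : -1;
}

/* Acquire: block until one token can be read.  Returns 0 on success. */
int pipe_sema_down(pipe_sema *s)
{
    char tok;
    ssize_t r;
    /* read() is likewise a cancellation point. */
    do { r = read(s->fds[0], &tok, 1); } while (r == -1 && errno == EINTR);
    return r == 1 ? 0 : -1;
}

/* Create the semaphore with one token, i.e. initially available. */
int pipe_sema_init(pipe_sema *s)
{
    if (pipe(s->fds) != 0)
        return -1;
    if (pipe_sema_up(s) != 0) {
        close(s->fds[0]);
        close(s->fds[1]);
        return -1;
    }
    return 0;
}

/* Destroy the semaphore. */
void pipe_sema_deinit(pipe_sema *s)
{
    close(s->fds[0]);
    close(s->fds[1]);
}
```

A thread that wants the lock calls pipe_sema_down() and blocks in
read() until another thread calls pipe_sema_up().  If a cancellation
request directed at the client is delivered while the thread sits in
that read(), the request is consumed there, which matches the
behaviour observed with pth_cancel1.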



Valgrind-developer notes, things removed from the original MacOSX port
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
There was a broken debugstub implementation.  It was removed over several
commits: r9477, which removed most of it, and r9711, r9759, and r10012,
which cleaned up remaining bits.

There was machinery to read function names from Dwarf3 debug info.  But we
already read function names from the symbol tables, so this was duplicated
functionality.  Furthermore, a Darwin-specific hack was required in
storage.c to choose between symbol table names vs. Dwarf3 names.  So this
machinery was removed in r10155.


Valgrind-developer notes, todos re the MacOSX port
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* m_syswrap/syscall-amd64-darwin.S
  - correct signal mask is not applied during syscall
  - restart-labels are completely bogus

* m_syswrap/syswrap-darwin.c:
  - PRE(sys_posix_spawn) completely ignores signal issues, and
    also ignores the file_actions argument

* Cleanups: sort wrappers in syswrap-darwin.c and priv_syswrap-darwin.h
  alphabetically.  Also, some aren't properly implemented -- check and
  print warnings

* Cleanups: m_scheduler/sema.c: use pipe implementation
  (but this apparently causes none/tests/pth_cancel1 to hang.
  I have no idea why, despite quite some investigation).

* Cleanups: m_debugstub: move to attic

* syswrap-darwin.c: sys_{f,}chmod_extended: handling of ARG5 is way
  wrong

* Cleanups (Linux,AIX5): bogus launcher-path mangling logic in
  PRE(sys_execve)

* Cleanups (ALL PLATFORMS): m_signals.c: are the _MY_SIGRETURN
  assembly stubs actually necessary for anything?  I don't know.

* Cleanups: check that changes to VG_(stat) and VG_(stat64) have
  not broken 64-bit statting on 32-bit Linux

* Cleanups: #if !HAVE_PROC in m_main (to do with /proc/<pid>/cmdline)

--------

m_main doesn't read symbols for the valgrind exe itself, which is
annoying.  On minimal investigation it seems that the executable isn't
even listed by aspacem.  This is very strange and not in accordance
with the Linux or AIX ports.


m_main: relatedly, Darwin version does not collect/give out
initial debuginfo handles; hence ptrcheck won't work


m_main: Darwin port relies on blocking out big sections of address
space with mmap at startup.  We know from history that this is a bad
idea.  (It's also really slow on 64-bit builds, taking 3--4 seconds.)
Also, startup is not done on the interim startup stack -- why not?


VG_(di_notify_mmap): Linux version is also used for Darwin, and
contains some ifdeffery.  Clean up.


PRE(sys_fork), #ifdeffery


syswrap-generic.c: VG_(init_preopened_fds) is #ifdefd for Darwin


scheduler.c: #ifdeffery in VG_(get_thread_out_of_syscall)


look at notes in coregrind/Makefile.am re Mach RPC interface
definitions.  See if we can get rid of any more stuff now that
m_debugstub is gone.