Blame - docs/internals/Darwin-notes.txt - platform/external/valgrind

				2	Valgrind-developer notes, re the MacOSX port
				3	~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
				4
				5	JRS 22 Mar 09: re these comments in m_libc* and m_debuglog:
				6
				7	/* IMPORTANT: on Darwin it is essential to use the _nocancel versions
				8	of syscalls rather than the vanilla version, if a _nocancel version
				9	is available. See docs/internals/Darwin-notes.txt for the reason
				10	why. */
				11
				12	When Valgrind does (for its own purposes, not for the client)
				13	read/write/open/close etc syscalls, it really is critical to use the
				14	_nocancel versions of syscalls rather than the vanilla versions. This
				15	holds throughout the entire code base: whenever V does a syscall for
				16	its own purposes, we must use the _nocancel version if it exists.
				17	This is of course most prevalent in m_libc* since all of our
				18	own-purpose (non-client) syscalls should get routed through there.
				19
				20	Why? Because on Darwin, pthread cancellation is done within the
				21	kernel (unlike on Linux, iiuc). And read/write/open/close and a whole
				22	bunch of other syscalls to do with stream I/O are cancellation points.
				23	So what can happen is, client informs the kernel that a given thread
				24	is to be cancelled. Then at the next (eg) VG_(printf) call by that
				25	thread, which leads to a sys_write, the write syscall gets hit by the
				26	cancellation request, and is duly nuked by the kernel. Of course from
				27	the outside it looks as if the thread had mysteriously disappeared off
				28	the radar for no reason.
				29
				30	In short, we need to use _nocancel versions in order to ensure that
				31	cancellation requests only take effect at the places where the client
				32	does a syscall, and not the places where Valgrind does syscalls.
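
To make the rule concrete, here is a minimal sketch (not Valgrind's
actual code) of routing own-purpose I/O through the _nocancel syscall
numbers whenever the platform headers provide them. It assumes libc's
syscall() and the SYS_read_nocancel / SYS_write_nocancel constants from
Darwin's <sys/syscall.h>. The names own_read, own_write and OWN_SYS_*
are hypothetical. Valgrind issues its syscalls through its own
machinery in m_libc*, which this sketch does not reproduce.

```c
#define _DEFAULT_SOURCE 1   /* makes glibc declare syscall(); harmless elsewhere */
#include <assert.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Prefer the _nocancel syscall numbers when the headers define them
   (as Darwin's do); otherwise fall back to the vanilla ones so that
   the sketch still builds on other systems. */
#if defined(SYS_read_nocancel) && defined(SYS_write_nocancel)
#  define OWN_SYS_READ  SYS_read_nocancel
#  define OWN_SYS_WRITE SYS_write_nocancel
#else
#  define OWN_SYS_READ  SYS_read
#  define OWN_SYS_WRITE SYS_write
#endif

/* Hypothetical helper: a write done for the tool's own purposes.
   Returns the byte count, or -1 with errno set. */
static long own_write(int fd, const void *buf, size_t n)
{
    assert(buf != NULL || n == 0);
    return (long)syscall(OWN_SYS_WRITE, fd, buf, n);
}

/* Hypothetical helper: a read done for the tool's own purposes. */
static long own_read(int fd, void *buf, size_t n)
{
    assert(buf != NULL || n == 0);
    return (long)syscall(OWN_SYS_READ, fd, buf, n);
}
```

With helpers like these, a pending cancellation request stays pending
across the tool's own I/O and is only acted on when the client itself
reaches a cancellation point.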
				33
				34	How observed: using the standard pipe-based implementation in
				35	coregrind/m_scheduler/sema.c, none/tests/pth_cancel1 would hang
				36	(whereas it succeeds when using native Darwin semaphores). And if the
				37	"pause()" call in said test is turned into a spin ("while (1) ;") then
				38	the entire Valgrind run mysteriously disappears, rather than spinning
				39	as it does with native Darwin semaphores.
				40
				41	Because the pipe-based semaphore intensively uses sys_read/sys_write,
				42	it is not surprising that it was inadvertently eating up cancellation
				43	requests directed to client threads. With the abovementioned change in
				44	force, the pipe-based semaphore appears to work correctly.
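
For readers unfamiliar with the idea, a pipe-based semaphore can be
sketched roughly as follows. This is an illustration of the general
technique only, not the code in coregrind/m_scheduler/sema.c. The type
and function names (pipe_sema_t, pipe_sema_init, pipe_sema_down,
pipe_sema_up) are hypothetical, and plain read()/write() are used to
keep the sketch portable. Every down and up is a read or a write on
the pipe, which is why such a semaphore hits cancellation points so
often.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical type: the pipe holds one byte while the semaphore is free. */
typedef struct { int pipe[2]; } pipe_sema_t;

/* Create the pipe and deposit one token, so the semaphore starts free. */
static int pipe_sema_init(pipe_sema_t *s)
{
    char token = 'T';
    assert(s != NULL);
    if (pipe(s->pipe) != 0)
        return -1;
    return write(s->pipe[1], &token, 1) == 1 ? 0 : -1;
}

/* Acquire: block in read() until a token is available, retrying on EINTR.
   On Darwin, per the notes above, this would need to be the _nocancel
   variant of read. */
static int pipe_sema_down(pipe_sema_t *s)
{
    char token;
    ssize_t r;
    do {
        r = read(s->pipe[0], &token, 1);
    } while (r == -1 && errno == EINTR);
    return r == 1 ? 0 : -1;
}

/* Release: put the token back, waking one reader blocked in down. */
static int pipe_sema_up(pipe_sema_t *s)
{
    char token = 'T';
    ssize_t r;
    do {
        r = write(s->pipe[1], &token, 1);
    } while (r == -1 && errno == EINTR);
    return r == 1 ? 0 : -1;
}
```

A thread would call pipe_sema_down() before running and pipe_sema_up()
when yielding, so at most one thread holds the token at a time.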
				45
				46
				47
				48	Valgrind-developer notes, things removed from the original MacOSX port
				49	~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
				50	There was a broken debugstub implementation. It was removed over several
				51	commits: r9477, which removed most of it, and r9711, r9759, and r10012,
				52	which cleaned up remaining bits.
				53
				54	There was machinery to read function names from Dwarf3 debug info. But we
				55	already read function names from the symbol tables, so this was duplicated
				56	functionality. Furthermore, a Darwin-specific hack was required in
				57	storage.c to choose between symbol table names vs. Dwarf3 names. So this
				58	machinery was removed in r10155.
				59
				60
				61	Valgrind-developer notes, todos re the MacOSX port
				62	~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
				63
				64	* m_syswrap/syscall-amd64-darwin.S
				65	- correct signal mask is not applied during syscall
				66	- restart-labels are completely bogus
				67
				68	* m_syswrap/syswrap-darwin.c:
				69	- PRE(sys_posix_spawn) completely ignores signal issues, and
				70	also ignores the file_actions argument
				71
				72	* Cleanups: sort wrappers in syswrap-darwin.c and priv_syswrap-darwin.h
				73	alphabetically. Also, some aren't properly implemented -- check and
				74	print warnings
				75
				76	* Cleanups: m_scheduler/sema.c: use pipe implementation
				77	(but this apparently causes none/tests/pth_cancel1 to hang.
				78	I have no idea why, despite quite some investigation).
				79
				80	* Cleanups: m_debugstub: move to attic
				81
				82	* syswrap-darwin.c: sys_{f,}chmod_extended: handling of ARG5 is way
				83	wrong
				84
				85	* Cleanups (Linux,AIX5): bogus launcher-path mangling logic in
				86	PRE(sys_execve)
				87
				88	* Cleanups (ALL PLATFORMS): m_signals.c: are the _MY_SIGRETURN
				89	assembly stubs actually necessary for anything? I don't know.
				90
				91	* Cleanups: check that changes to VG_(stat) and VG_(stat64) have
				92	not broken 64-bit statting on 32-bit Linux
				93
				94	* Cleanups: #if !HAVE_PROC in m_main (to do with /proc/<pid>/cmdline)
				95
				96	--------
				97
				98	m_main doesn't read symbols for the valgrind exe itself, which is
				99	annoying. On minimal investigation it seems that the executable isn't
				100	even listed by aspacem. This is very strange and not in accordance
				101	with the Linux or AIX ports.
				102
				103
				104	m_main: relatedly, Darwin version does not collect/give out
				105	initial debuginfo handles; hence ptrcheck won't work
				106
				107
				108	m_main: Darwin port relies on blocking out big sections of address
				109	space with mmap at startup. We know from history that this is a bad
				110	idea. (It's also really slow on 64-bit builds, taking 3--4 seconds.)
				111	Also, startup is not done on the interim startup stack -- why not?
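
As a rough illustration of what "blocking out" address space with mmap
means, the sketch below reserves a range with no access permissions so
that nothing else can be mapped there. It is not the Darwin port's
startup code; the flags actually used there are not stated in these
notes, and reserve_range / release_range are hypothetical names.

```c
#define _DEFAULT_SOURCE 1   /* makes glibc expose MAP_ANON; harmless elsewhere */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Reserve len bytes of address space without making them usable:
   PROT_NONE means any access faults, but the range is now occupied.
   Returns NULL on failure. */
static void *reserve_range(size_t len)
{
    void *p;
    assert(len > 0);
    p = mmap(NULL, len, PROT_NONE, MAP_PRIVATE | MAP_ANON, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Give a previously reserved range back to the system. */
static int release_range(void *p, size_t len)
{
    return munmap(p, len);
}
```

Doing this for large ranges costs a syscall per range plus kernel
bookkeeping for each mapping, which is presumably part of why the
startup delay noted above is visible on 64-bit builds.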
				112
				113
				114	VG_(di_notify_mmap): Linux version is also used for Darwin, and
				115	contains some ifdeffery. Clean up.
				116
				117
				118	PRE(sys_fork), #ifdeffery
				119
				120
				121	syswrap-generic.c: VG_(init_preopened_fds) is #ifdefd for Darwin
				122
				123
				124	scheduler.c: #ifdeffery in VG_(get_thread_out_of_syscall)
				125
				126
				127	look at notes in coregrind/Makefile.am re Mach RPC interface
				128	definitions. See if we can get rid of any more stuff now that
				129	m_debugstub is gone.