Blame - THREADS_SYSCALLS_SIGNALS.txt - fp2-dev/platform/external/valgrind

sewardj	79f76f2	2005-03-14 13:35:15 +0000	[diff] [blame^]	1
				2	/* Make a thread the running thread. The thread must previously have been
				3	sleeping, and not holding the CPU semaphore. This will set the
				4	thread state to VgTs_Runnable, and the thread will attempt to take
				5	the CPU semaphore. By the time it returns, tid will be the running
				6	thread. */
				7	extern void VG_(set_running) ( ThreadId tid );
				8
				9	/* Set a thread into a sleeping state. Before the call, the thread
				10	must be runnable, and holding the CPU semaphore. When this call
				11	returns, the thread will be set to the specified sleeping state,
				12	and will not be holding the CPU semaphore. Note that another
				13	thread could be running by the time this call returns, so the
				14	caller must be careful not to touch any shared state. It is also
				15	the caller's responsibility to actually block until the thread is
				16	ready to run again. */
				17	extern void VG_(set_sleeping) ( ThreadId tid, ThreadStatus state );
				18
				19
				20	The master semaphore is run_sema in vg_scheduler.c.
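
The contract of the two calls above can be modelled in a few lines. This is only a toy sketch, not Valgrind's code: a pthread mutex stands in for run_sema, and every toy_* name and type is hypothetical. Unlike the description above, the toy marks the thread runnable only after it owns the lock, which keeps the model simple.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins; these are not Valgrind's ThreadStatus values. */
typedef enum { TOY_SLEEPING, TOY_RUNNABLE } ToyStatus;
enum { TOY_MAX_THREADS = 4 };   /* assumed table size */

static pthread_mutex_t toy_run_lock = PTHREAD_MUTEX_INITIALIZER; /* plays run_sema */
static ToyStatus toy_status[TOY_MAX_THREADS]; /* zero-initialised: all sleeping */
static int toy_running_tid = -1;

/* Wait for the lock, then record tid as the running thread. */
void toy_set_running(int tid)
{
    assert(tid >= 0 && tid < TOY_MAX_THREADS);
    assert(toy_status[tid] == TOY_SLEEPING);
    pthread_mutex_lock(&toy_run_lock);
    toy_status[tid] = TOY_RUNNABLE;
    toy_running_tid = tid;
}

/* Record that tid is asleep, then give up the lock. */
void toy_set_sleeping(int tid)
{
    assert(tid >= 0 && tid < TOY_MAX_THREADS);
    assert(toy_status[tid] == TOY_RUNNABLE && toy_running_tid == tid);
    toy_status[tid] = TOY_SLEEPING;
    toy_running_tid = -1;
    pthread_mutex_unlock(&toy_run_lock);
    /* Another thread may be running from here on: touch no shared state. */
}
```

A caller would bracket a blocking operation with toy_set_sleeping(tid) before it and toy_set_running(tid) after it.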
				21
				22	--------------------------------------------------------------------
				23
				24	Re: New World signal handling
				25	From: Jeremy Fitzhardinge <jeremy@goop.org>
				26	To: Julian Seward <jseward@acm.org>
				27	Date: Mon Mar 14 09:03:51 2005
				28
				29	Well, the big-picture things to be clear about are:
				30
				31	1. signal handlers are process-wide global state
				32	2. signal masks are per-thread (there's no notion of a process-wide
				33	signal mask)
				34	3. a signal can be targeted to either
				35	1. the whole process (any eligible thread is picked for
				36	delivery), or
				37	2. a specific thread
				38
				39	1 is why it is always a bug to temporarily reset a signal handler (say,
				40	for SIGSEGV), because if any other thread happens to be sent one in that
				41	window it will cause havoc (I think there's still one instance of this
				42	in the symtab stuff).
				43	2 is the meat of your questions; more below.
				44	3 is responsible for some of the nitty detail in the signal stuff, so
				45	it's worth bearing in mind to understand it all. (Note that even if a
				46	signal is targeting the whole process, it's only ever delivered to one
				47	particular thread; there's no such thing as a broadcast signal.)
				48
				49	While a thread is running core code or generated code, it has almost
				50	all its signals blocked (all but the fault signals: SEGV, BUS, ILL, etc).
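
As a rough illustration of such a mask, the sketch below builds "everything blocked except the fault signals" with the standard sigset calls. The function name is hypothetical, and the exact membership of the fault set is an assumption: the text names SEGV, BUS and ILL, and SIGFPE is added here as a guess at the "etc".

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Fill *mask with every signal except the fault signals.
   Returns 0 on success, -1 on failure. */
int make_core_mask(sigset_t *mask)
{
    /* SIGFPE is an assumed member of the "etc" in the text. */
    static const int fault_sigs[] = { SIGSEGV, SIGBUS, SIGILL, SIGFPE };
    size_t i;

    assert(mask != NULL);
    if (sigfillset(mask) != 0)
        return -1;
    for (i = 0; i < sizeof fault_sigs / sizeof fault_sigs[0]; i++) {
        if (sigdelset(mask, fault_sigs[i]) != 0)
            return -1;
    }
    return 0;
}
```

A thread would install the result with pthread_sigmask(SIG_SETMASK, &mask, NULL). The kernel silently refuses to block SIGKILL and SIGSTOP whatever the mask says.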
				51
				52	Every N basic blocks, each thread calls VG_(poll_signals) to see what
				53	signals are pending for it. poll_signals grabs the next pending signal
				54	which the client signal mask doesn't block, and sets it up for delivery;
				55	it uses the sigtimedwait() syscall to fetch blocked pending signals
				56	rather than have them delivered to a signal handler. This means that
				57	we avoid the complexity of having signals delivered asynchronously via
				58	the signal handlers; we can just poll for them synchronously when
				59	they're easy to deal with.
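
The polling idea can be sketched with plain POSIX calls. This is not VG_(poll_signals) itself: the toy_* names are hypothetical, and the sketch only shows a blocked signal being fetched synchronously with a zero-timeout sigtimedwait() instead of being delivered to a handler. In the scheme described above, the set passed in would be the signals that the client's mask does not block.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <time.h>

/* Fetch one pending signal from *wanted without blocking.
   Returns the signal number, or 0 if none is pending. */
int toy_poll_signal(const sigset_t *wanted, siginfo_t *info)
{
    const struct timespec no_wait = { 0, 0 };
    int signo;

    assert(wanted != NULL);
    signo = sigtimedwait(wanted, info, &no_wait);
    return signo > 0 ? signo : 0;   /* -1 (e.g. EAGAIN) means nothing to fetch */
}

/* Block sig so that it stays pending until it is polled for.
   A threaded program would use pthread_sigmask() instead. */
int toy_block_signal(int sig)
{
    sigset_t set;

    sigemptyset(&set);
    sigaddset(&set, sig);
    return sigprocmask(SIG_BLOCK, &set, NULL);
}
```

After toy_block_signal(SIGUSR1) and a raise(SIGUSR1), one call to toy_poll_signal() returns SIGUSR1 and the next returns 0.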
				60
				61	Fault signals, being caused by a specific instruction, are the exception
				62	because they can't be held off; if they're blocked when an instruction
				63	raises one, the kernel will just summarily kill the process. Therefore,
				64	they need to be always unblocked, and the signal handler is called when
				65	an instruction raises one of these exceptions. (It's also necessary to
				66	call poll_signals after any syscall which may raise a signal, since
				67	signal-raising syscalls are considered to be synchronous with respect to
				68	their signal; ie, calling kill(getpid(), SIGUSR1) will call the handler
				69	for SIGUSR1 before kill is seen to complete.)
				70
				71	The one time when the thread's real signal mask actually matches the
				72	client's requested signal mask is while running a blocking syscall. We
				73	have to set things up to accept signals during a syscall so that we get
				74	the right signal-interrupts-syscall semantics. The tricky part about
				75	this is that there's no general atomic
				76	set-signal-mask-and-block-in-syscall mechanism, so we need to fake it
				77	with the stuff in VGA_(_client_syscall)/VGA_(interrupted_syscall).
				78	These two basically form an explicit state machine, where the state
				79	variable is the instruction pointer, which allows it to determine what
				80	point the syscall got to when the async signal happens. By keeping the
				81	window where signals are actually unblocked very narrow, the number of
				82	possible states is pretty small.
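
To make "the state variable is the instruction pointer" concrete, here is a toy classifier. The layout fields, phase names and function are all hypothetical; the real stub and its labels are architecture-specific assembly and may distinguish more states than this.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical code addresses marking the stages of the syscall stub. */
typedef struct {
    uintptr_t setup;     /* first instruction of the stub              */
    uintptr_t restart;   /* the syscall instruction itself             */
    uintptr_t complete;  /* first instruction after the syscall        */
    uintptr_t finished;  /* first instruction after signals are        */
                         /* blocked again, i.e. the end of the stub    */
} ToyStubLayout;

typedef enum {
    TOY_OUTSIDE_STUB,    /* ip is not inside the stub                  */
    TOY_BEFORE_SYSCALL,  /* in the stub, syscall instruction not reached */
    TOY_AT_SYSCALL,      /* ip is within the syscall instruction       */
    TOY_AFTER_SYSCALL    /* past the syscall, stub not yet finished    */
} ToyPhase;

/* Map the interrupted instruction pointer to a phase of the stub. */
ToyPhase toy_classify_ip(const ToyStubLayout *l, uintptr_t ip)
{
    assert(l->setup <= l->restart && l->restart < l->complete
           && l->complete <= l->finished);
    if (ip < l->setup || ip >= l->finished)
        return TOY_OUTSIDE_STUB;
    if (ip < l->restart)
        return TOY_BEFORE_SYSCALL;
    if (ip < l->complete)
        return TOY_AT_SYSCALL;
    return TOY_AFTER_SYSCALL;
}
```

The signal handler would switch on the returned phase to decide how to fix up the interrupted syscall. Keeping the stub short is what keeps this enum small.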
				83
				84	This is all quite nice because the kernel does almost all the work of
				85	determining which thread should get a signal, what the correct action
				86	for a syscall when it has been interrupted is, etc. Particularly nice
				87	is that we don't need to worry about all the queuing semantics, and the
				88	per-signal special cases (which is, roughly, signals 1-32 are not queued
				89	except when they are, and signals 33-64 are queued except when they aren't).
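
The basic queuing difference is easy to observe from user space. The sketch below (hypothetical function name, behaviour as I understand it on Linux) blocks a signal, sends it to the process several times, and counts how many instances can then be fetched. A standard signal such as SIGUSR1 should collapse to one pending instance, while a realtime signal such as SIGRTMIN should be queued once per send.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <time.h>
#include <unistd.h>

/* Block sig, send it to this process `sends` times, then count how
   many instances can be fetched. Returns -1 on error. */
int toy_count_delivered(int sig, int sends)
{
    const struct timespec no_wait = { 0, 0 };
    sigset_t set, old;
    union sigval val;
    int i, count = 0;

    assert(sends >= 0);
    sigemptyset(&set);
    sigaddset(&set, sig);
    if (sigprocmask(SIG_BLOCK, &set, &old) != 0)
        return -1;

    val.sival_int = 0;
    for (i = 0; i < sends; i++) {
        if (sigqueue(getpid(), sig, val) != 0) {
            count = -1;
            break;
        }
    }

    /* Drain whatever is pending, counting it unless a send failed. */
    while (sigtimedwait(&set, NULL, &no_wait) == sig) {
        if (count >= 0)
            count++;
    }

    sigprocmask(SIG_SETMASK, &old, NULL);
    return count;
}
```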
				90
				91	BUT, there's another complexity: because the Unix signal mechanism has
				92	been overloaded to deal with two separate kinds of events (asynchronous
				93	signals raised by kill(), and synchronous faults raised by an
				94	instruction), we can't block a signal for one form and not the other.
				95	That is, because we have to leave SIGSEGV unblocked for faulting
				96	instructions, it also leaves us open to getting an async SIGSEGV sent
				97	with kill(pid, SIGSEGV).
				98
				99	To handle this, there's a small per-thread signal queue (I'm using
				100	tid 0's queue for "signals sent to the whole process" - a hack, I'll
				101	admit). If an async SIGSEGV (etc) signal appears, then it is pushed
				102	onto the appropriate queue.
				103	VG_(poll_signals) also checks these queues for pending signals to decide
				104	what signal to deliver next. These queues are only manipulated with
				105	all signals blocked, so there's no risk of two concurrent async signal
				106	handlers modifying the queues at once. Also, because the likelihood of
				107	actually being sent an async SIGSEGV is pretty low, the queues are only
				108	allocated on demand.
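
A minimal sketch of such on-demand queues follows. All names and sizes are hypothetical, and the order in which a thread's own queue and the tid 0 queue are consulted is an assumption, not something the text specifies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum { TOY_QUEUE_CAP = 8, TOY_MAX_TID = 4 };   /* assumed sizes */

typedef struct {
    int sigs[TOY_QUEUE_CAP];   /* ring buffer of signal numbers */
    int head, count;
} ToySigQueue;

/* Slot 0 stands for "sent to the whole process", as in the text.
   Slots start out NULL and are allocated on first push. */
static ToySigQueue *toy_queues[TOY_MAX_TID];

/* Append signo to tid's queue. Returns 0 on success, -1 if the queue
   is full or cannot be allocated. */
int toy_queue_push(int tid, int signo)
{
    ToySigQueue *q;

    assert(tid >= 0 && tid < TOY_MAX_TID);
    q = toy_queues[tid];
    if (q == NULL) {
        q = calloc(1, sizeof *q);
        if (q == NULL)
            return -1;
        toy_queues[tid] = q;
    }
    if (q->count == TOY_QUEUE_CAP)
        return -1;
    q->sigs[(q->head + q->count) % TOY_QUEUE_CAP] = signo;
    q->count++;
    return 0;
}

/* Next queued signal for tid: its own queue first (assumed order),
   then the process-wide queue. Returns 0 if both are empty. */
int toy_queue_pop(int tid)
{
    int which[2];
    int k;

    assert(tid >= 0 && tid < TOY_MAX_TID);
    which[0] = tid;
    which[1] = 0;
    for (k = 0; k < 2; k++) {
        ToySigQueue *q = toy_queues[which[k]];
        if (q != NULL && q->count > 0) {
            int signo = q->sigs[q->head];
            q->head = (q->head + 1) % TOY_QUEUE_CAP;
            q->count--;
            return signo;
        }
    }
    return 0;
}
```

The sketch leaves out the "all signals blocked" discipline that makes the real queues safe to touch. It also uses calloc(), which is not async-signal-safe, so a real handler would need its own allocator.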
				109
				110
				111
				112	There are two mechanisms to prevent disaster if multiple threads get
				113	signals concurrently. One is that a signal handler is set up to block a
				114	set of signals while the signal is being delivered. Valgrind's handlers
				115	block all signals, so there's no risk of a new signal being delivered to
				116	the same thread until the old handler has finished.
				117
				118	The other is that if the thread which receives the signal is not running
				119	(ie, doesn't hold the run_sema, which implies it must be waiting for a
				120	syscall to complete), then the signal handler will grab the run_sema
				121	before making any global state changes. Since the only time we can get
				122	an async signal asynchronously is during a blocking syscall, this should
				123	always be the case. (And since synchronous signals are always the result
				124	of running an instruction, we should already be holding run_sema.)
				125
				126
				127	Valgrind will occasionally generate signals for itself. These are always
				128	synchronous faults as a result of an instruction fetch or something an
				129	instruction did. The two mechanisms are the synth_fault_* functions,
				130	which are used to signal a problem while fetching an instruction, and
				131	getting generated code to call a helper which contains a fault-raising
				132	instruction (used to deal with illegal/unimplemented instructions and
				133	for instructions whose only job is to raise exceptions).
				134
				135	That all explains how signals come in, but the second part is how they
				136	get delivered.
				137
				138	The main function for this is VG_(deliver_signal). There are three cases:
				139
				140	1. the process is ignoring the signal (SIG_IGN)
				141	2. the process is using the default handler (SIG_DFL)
				142	3. the process has a handler for the signal
				143
				144	In general, VG_(deliver_signal) shouldn't be called for ignored signals;
				145	if it has been called, it assumes the ignore is being overridden (if an
				146	instruction gets a SEGV etc, SIG_IGN is ignored and treated as SIG_DFL).
				147
				148	VG_(deliver_signal) handles the default handler case, and the
				149	client-specified signal handler case.
				150
				151	The default handler case is relatively easy: the signal's default action
				152	is either Terminate, or Ignore. We can ignore Ignore.
				153
				154	Terminate always kills the entire process; there's no such thing as a
				155	thread-specific signal death. Terminate comes in two forms: with
				156	coredump, or without. vg_default_action() will write a core file, and
				157	then will tell all the threads to start terminating; it then longjmps
				158	back to the current thread's scheduler loop. The scheduler loop will
				159	terminate immediately, and the master_tid thread will wait for all the
				160	others to exit before shutting down the process (this is the same
				161	mechanism as exit_group).
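
The decision described so far can be summarised as a small pure function. This is a sketch with hypothetical names, not the logic of VG_(deliver_signal) itself. In particular it returns "do nothing" for an ignored non-fault signal, whereas the real function is simply not expected to be called in that case.

```c
#include <assert.h>

typedef enum { TOY_SIG_IGN, TOY_SIG_DFL, TOY_SIG_HANDLER } ToyDisposition;
typedef enum { TOY_DFL_IGNORE, TOY_DFL_TERM, TOY_DFL_TERM_CORE } ToyDefault;
typedef enum {
    TOY_DO_NOTHING,
    TOY_DO_TERMINATE,
    TOY_DO_TERMINATE_CORE,
    TOY_DO_RUN_HANDLER
} ToyAction;

/* Pick what to do with a signal, given the client's disposition, the
   signal's default action, and whether an instruction raised it. */
ToyAction toy_choose_action(ToyDisposition disp, ToyDefault dfl, int is_fault)
{
    if (disp == TOY_SIG_HANDLER)
        return TOY_DO_RUN_HANDLER;

    /* An ignored signal stays ignored unless an instruction raised it;
       in that case fall through and treat it as SIG_DFL. */
    if (disp == TOY_SIG_IGN && !is_fault)
        return TOY_DO_NOTHING;

    switch (dfl) {
    case TOY_DFL_IGNORE:    return TOY_DO_NOTHING;
    case TOY_DFL_TERM:      return TOY_DO_TERMINATE;
    case TOY_DFL_TERM_CORE: return TOY_DO_TERMINATE_CORE;
    }
    assert(!"unknown default action");
    return TOY_DO_NOTHING;
}
```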
				162
				163	Delivering a signal to a client-side handler modifies the thread state so
				164	that there's a signal frame on the stack, and the instruction pointer is
				165	pointing to the handler. The fiddly bit is that there are two
				166	completely different signal frame formats: old and RT. While in theory
				167	the exact shape of these frames on stack is abstracted, there are real
				168	programs which know exactly where various parts of the structures are on
				169	stack (most notably, g++'s exception throwing code), which is why there
				170	have to be separate pieces of code, one for each frame format. Another
				171	tricky case is dealing with the client stack running out/overflowing
				172	while setting up the signal frame.
				173
				174	Signal return is also interesting. There are two syscalls, sigreturn
				175	and rt_sigreturn, which a signal handler will use to resume execution.
				176	The client will call the right one for the frame it was passed, so the
				177	core doesn't need to track that state. The tricky part is moving the
				178	frame's register state back into the thread's state, particularly all
				179	the FPU state reformatting gunk. Also, *sigreturn checks for new
				180	pending signals after the old frame has been cleaned up, since there's a
				181	requirement that all deliverable pending signals are delivered before
				182	the mainline code makes progress. This means that a program could
				183	live-lock on signals, but that's what would happen running natively...
				184
				185	Another thing to watch for: programs which unwind the stack (like gdb,
				186	or exception throwers) recognize the existence of a signal frame by
				187	looking at the code the return address points to: if it is one of the
				188	two specific signal return sequences, it knows it's a signal frame.
				189	That's why the signal handler return address must point to a very
				190	specific set of instructions.
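
For illustration, this is roughly what such a check looks like on 32-bit x86 Linux. The byte sequences are, to the best of my knowledge, the sigreturn and rt_sigreturn trampolines used there (syscall numbers 119 and 173). Other architectures use different sequences, and the toy_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* popl %eax; movl $119,%eax; int $0x80   (119 = sigreturn on i386) */
static const unsigned char toy_sigreturn_code[] = {
    0x58, 0xb8, 0x77, 0x00, 0x00, 0x00, 0xcd, 0x80
};
/* movl $173,%eax; int $0x80              (173 = rt_sigreturn on i386) */
static const unsigned char toy_rt_sigreturn_code[] = {
    0xb8, 0xad, 0x00, 0x00, 0x00, 0xcd, 0x80
};

typedef enum {
    TOY_FRAME_NORMAL,
    TOY_FRAME_SIGNAL,
    TOY_FRAME_RT_SIGNAL
} ToyFrameKind;

/* Classify a frame by the code its return address points at.
   `avail` is the number of readable bytes at `code`. */
ToyFrameKind toy_classify_return(const unsigned char *code, size_t avail)
{
    assert(code != NULL);
    if (avail >= sizeof toy_sigreturn_code &&
        memcmp(code, toy_sigreturn_code, sizeof toy_sigreturn_code) == 0)
        return TOY_FRAME_SIGNAL;
    if (avail >= sizeof toy_rt_sigreturn_code &&
        memcmp(code, toy_rt_sigreturn_code, sizeof toy_rt_sigreturn_code) == 0)
        return TOY_FRAME_RT_SIGNAL;
    return TOY_FRAME_NORMAL;
}
```

An unwinder doing this kind of match fails to see the frame if the return address points at anything else, even code with the same effect.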
				191
				192
				193	What else. Ah, the two internal signals.
				194
				195	SIGVGKILL is pretty straightforward: it's just used to dislodge a thread
				196	from being blocked in a syscall, so that we can get the thread to
				197	terminate in a timely fashion.
				198
				199	SIGVGCHLD is used by a thread to tell the master_tid that it has
				200	exited. However, the only time the master_tid cares about this is when
				201	it has already exited, and it's waiting for everyone else to exit. If
				202	the master_tid hasn't exited, then this signal is ignored. It isn't
				203	enough to simply block it, because that will cause a pile of queued
				204	SIGVGCHLDs to build up, eventually clogging the kernel's signal delivery
				205	mechanism. If it's unblocked and ignored, it doesn't interrupt syscalls
				206	and it doesn't accumulate.
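
The difference between blocking and ignoring can be checked from user space. The sketch below uses SIGUSR1 in place of SIGVGCHLD (whose actual signal number is not given here) and a hypothetical function name. It raises the signal either while blocked or while unblocked and ignored, then reports whether an instance was left pending.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Raise sig once and return 1 if an instance is left pending, else 0.
   With ignore != 0 the signal is unblocked and set to SIG_IGN;
   otherwise it is blocked. Mask and disposition are restored. */
int toy_left_pending(int sig, int ignore)
{
    const struct timespec no_wait = { 0, 0 };
    struct sigaction ign, old_act;
    sigset_t set, old_mask, pending;
    int result;

    assert(sig > 0);
    sigemptyset(&set);
    sigaddset(&set, sig);

    if (ignore) {
        memset(&ign, 0, sizeof ign);
        ign.sa_handler = SIG_IGN;
        sigemptyset(&ign.sa_mask);
        sigaction(sig, &ign, &old_act);
        sigprocmask(SIG_UNBLOCK, &set, &old_mask);
    } else {
        sigprocmask(SIG_BLOCK, &set, &old_mask);
    }

    raise(sig);
    sigpending(&pending);
    result = sigismember(&pending, sig) == 1;

    if (ignore) {
        sigprocmask(SIG_SETMASK, &old_mask, NULL);
        sigaction(sig, &old_act, NULL);
    } else {
        /* Drain the pending instance before restoring the old mask. */
        while (sigtimedwait(&set, NULL, &no_wait) == sig)
            ;
        sigprocmask(SIG_SETMASK, &old_mask, NULL);
    }
    return result;
}
```

On Linux the blocked case should leave the signal pending, and the ignored case should discard it as soon as it is generated.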
				207
				208
				209	I hope that helps clarify things. And explain why there's so much stuff
				210	in there: it's tracking a very complex and arcane underlying set of
				211	machinery.
				212
				213	J