/* Make a thread the running thread. The thread must previously have
   been sleeping, and not holding the CPU semaphore. This will set the
   thread state to VgTs_Runnable, and the thread will attempt to take
   the CPU semaphore. By the time it returns, tid will be the running
   thread. */
extern void VG_(set_running) ( ThreadId tid );

/* Set a thread into a sleeping state. Before the call, the thread
   must be runnable, and holding the CPU semaphore. When this call
   returns, the thread will be set to the specified sleeping state,
   and will not be holding the CPU semaphore. Note that another
   thread could be running by the time this call returns, so the
   caller must be careful not to touch any shared state. It is also
   the caller's responsibility to actually block until the thread is
   ready to run again. */
extern void VG_(set_sleeping) ( ThreadId tid, ThreadStatus state );


The master semaphore is run_sema in vg_scheduler.c.

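As a rough illustration of the calling discipline described in the two
comments above, here is a toy model in C. It is only a sketch: every
toy_* name is made up for this note, and the semaphore is faked with a
pipe holding a single token (an assumption of the sketch, not a
statement about how run_sema is implemented).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>
#include <unistd.h>

/* All toy_* names are hypothetical; none of this is Valgrind code. */
typedef enum { TOY_RUNNABLE, TOY_WAIT_SYS } ToyStatus;
typedef struct { ToyStatus status; } ToyThread;

static int toy_sema[2];   /* pipe holding one token = the "CPU semaphore" */

static int toy_sema_init(void)
{
    char token = 'T';
    if (pipe(toy_sema) != 0)
        return -1;
    return write(toy_sema[1], &token, 1) == 1 ? 0 : -1;   /* starts free */
}

/* Sleeping -> running: mark the thread runnable, then wait for the token. */
static void toy_set_running(ToyThread *t)
{
    char token;
    assert(t->status != TOY_RUNNABLE);
    t->status = TOY_RUNNABLE;
    while (read(toy_sema[0], &token, 1) != 1)
        ;   /* retry if the read was interrupted */
}

/* Running -> sleeping: record the new state, then hand the token back. */
static void toy_set_sleeping(ToyThread *t, ToyStatus state)
{
    char token = 'T';
    assert(t->status == TOY_RUNNABLE);
    t->status = state;
    while (write(toy_sema[1], &token, 1) != 1)
        ;   /* retry if the write was interrupted */
}

/* The caller does the real blocking itself, and touches no shared state
   between giving the semaphore up and getting it back. */
static void toy_blocking_call(ToyThread *t)
{
    const struct timespec nap = { 0, 1000000 };   /* 1ms stand-in for a syscall */
    toy_set_sleeping(t, TOY_WAIT_SYS);
    nanosleep(&nap, NULL);
    toy_set_running(t);
}
```

The point of toy_blocking_call() is the ordering: give up the semaphore,
block, then take it back before touching anything shared.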
--------------------------------------------------------------------

Re: New World signal handling
From: Jeremy Fitzhardinge <jeremy@goop.org>
To: Julian Seward <jseward@acm.org>
Date: Mon Mar 14 09:03:51 2005

Well, the big-picture things to be clear about are:

   1. signal handlers are process-wide global state
   2. signal masks are per-thread (there's no notion of a process-wide
      signal mask)
   3. a signal can be targeted to either
         1. the whole process (any eligible thread is picked for
            delivery), or
         2. a specific thread

1 is why it is always a bug to temporarily reset a signal handler (say,
for SIGSEGV), because if any other thread happens to be sent one in that
window it will cause havoc (I think there's still one instance of this
in the symtab stuff).
2 is the meat of your questions; more below.
3 is responsible for some of the nitty detail in the signal stuff, so
it's worth bearing in mind to understand it all. (Note that even if a
signal is targeting the whole process, it's only ever delivered to one
particular thread; there's no such thing as a broadcast signal.)

While a thread is running core code or generated code, it has almost
all its signals blocked (all but the fault signals: SEGV, BUS, ILL, etc).

Every N basic blocks, each thread calls VG_(poll_signals) to see what
signals are pending for it. poll_signals grabs the next pending signal
which the client signal mask doesn't block, and sets it up for delivery;
it uses the sigtimedwait() syscall to fetch blocked pending signals
rather than have them delivered to a signal handler. This means that
we avoid the complexity of having signals delivered asynchronously via
the signal handlers; we can just poll for them synchronously when
they're easy to deal with.

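The polling idea can be sketched in a few lines of C. This is not the
body of VG_(poll_signals); the toy_* helpers are hypothetical and only
show the underlying trick of keeping a signal blocked and fetching it
with a zero-timeout sigtimedwait().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <time.h>

/* Block every signal in `set` so that it stays pending instead of
   running a handler. Returns 0 on success. */
static int toy_hold_signals(const sigset_t *set)
{
    return sigprocmask(SIG_BLOCK, set, NULL);
}

/* Fetch one pending signal from `set` without waiting.
   Returns the signal number, or 0 if nothing is pending. */
static int toy_poll_signal(const sigset_t *set, siginfo_t *info)
{
    const struct timespec no_wait = { 0, 0 };
    int sig = sigtimedwait(set, info, &no_wait);
    return sig > 0 ? sig : 0;
}
```

A caller would block the signals once with toy_hold_signals() and then
call toy_poll_signal() at whatever points are convenient, which is the
synchronous style the paragraph above describes.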
Fault signals, being caused by a specific instruction, are the exception
because they can't be held off; if they're blocked when an instruction
raises one, the kernel will just summarily kill the process. Therefore,
they need to be always unblocked, and the signal handler is called when
an instruction raises one of these exceptions. (It's also necessary to
call poll_signals after any syscall which may raise a signal, since
signal-raising syscalls are considered to be synchronous with respect to
their signal; ie, calling kill(getpid(), SIGUSR1) will call the handler
for SIGUSR1 before kill is seen to complete.)

The one time when the thread's real signal mask actually matches the
client's requested signal mask is while running a blocking syscall. We
have to set things up to accept signals during a syscall so that we get
the right signal-interrupts-syscall semantics. The tricky part about
this is that there's no general atomic
set-signal-mask-and-block-in-syscall mechanism, so we need to fake it
with the stuff in VGA_(_client_syscall)/VGA_(interrupted_syscall).
These two basically form an explicit state machine, where the state
variable is the instruction pointer, which allows it to determine what
point the syscall got to when the async signal happens. By keeping the
window where signals are actually unblocked very narrow, the number of
possible states is pretty small.

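To make "the state variable is the instruction pointer" a bit more
concrete, here is a hypothetical classifier in C. The label names, the
number of regions and the state names are all assumptions made for this
sketch; the real wrapper is written in assembly and may divide things up
differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical code addresses marking stages of the syscall wrapper. */
typedef struct {
    uintptr_t setup;          /* client signal mask is being installed */
    uintptr_t syscall_insn;   /* the syscall instruction itself */
    uintptr_t after_syscall;  /* first instruction after the syscall */
    uintptr_t result_saved;   /* result has been written to thread state */
    uintptr_t end;            /* signals are blocked again */
} ToySyscallLabels;

typedef enum {
    TOY_OUTSIDE,          /* ip is not inside the wrapper at all */
    TOY_NOT_COMPLETED,    /* syscall has no result yet: restart it or fail it */
    TOY_RESULT_UNSAVED,   /* syscall finished, result still only in registers */
    TOY_RESULT_SAVED      /* nothing left to fix up */
} ToySyscallState;

/* Decide how far the syscall got, given the interrupted ip. */
static ToySyscallState toy_classify_ip(const ToySyscallLabels *l, uintptr_t ip)
{
    if (ip < l->setup || ip >= l->end)
        return TOY_OUTSIDE;
    if (ip < l->after_syscall)
        return TOY_NOT_COMPLETED;
    if (ip < l->result_saved)
        return TOY_RESULT_UNSAVED;
    return TOY_RESULT_SAVED;
}
```

The narrower the unblocked window, the fewer regions a classifier like
this has to distinguish.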
This is all quite nice because the kernel does almost all the work of
determining which thread should get a signal, what the correct action
is for a syscall that has been interrupted, etc. Particularly nice
is that we don't need to worry about all the queuing semantics, and the
per-signal special cases (which are, roughly: signals 1-32 are not queued
except when they are, and signals 33-64 are queued except when they aren't).

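The basic queuing difference is easy to observe from user space. The
following C sketch (a hypothetical helper, assuming Linux behaviour)
blocks a signal, sends it several times, and counts how many instances
can be fetched back. It only shows the simple rule, not the "except
when" cases.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <time.h>
#include <unistd.h>

/* Block `sig`, send it to this process `sends` times, then count how
   many instances can be fetched back. Returns -1 on setup failure.
   The signal is left blocked afterwards. */
static int toy_count_queued(int sig, int sends)
{
    const struct timespec no_wait = { 0, 0 };
    sigset_t set;
    int i, fetched = 0;

    sigemptyset(&set);
    sigaddset(&set, sig);
    if (sigprocmask(SIG_BLOCK, &set, NULL) != 0)
        return -1;

    for (i = 0; i < sends; i++)
        kill(getpid(), sig);

    /* Drain whatever the kernel kept pending for us. */
    while (sigtimedwait(&set, NULL, &no_wait) == sig)
        fetched++;
    return fetched;
}
```

With a standard signal such as SIGUSR1 the repeated sends collapse into
one pending instance, while a real-time signal such as SIGRTMIN keeps
one queue entry per send.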
BUT, there's another complexity: because the Unix signal mechanism has
been overloaded to deal with two separate kinds of events (asynchronous
signals raised by kill(), and synchronous faults raised by an
instruction), we can't block a signal for one form and not the other.
That is, because we have to leave SIGSEGV unblocked for faulting
instructions, it also leaves us open to getting an async SIGSEGV sent
with kill(pid, SIGSEGV).

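One way a handler can tell the two kinds apart on Linux is the si_code
field of siginfo_t: it is zero or negative for signals sent from user
space and positive for kernel-raised faults. The sketch below assumes
that convention; the toy_* names are hypothetical and this is not
claimed to be the exact test the core uses.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t toy_async_seen = 0;
static volatile sig_atomic_t toy_fault_seen = 0;

/* si_code <= 0: sent by kill()/tgkill()/sigqueue().
   si_code >  0: raised by the kernel for a faulting instruction. */
static void toy_segv_handler(int sig, siginfo_t *info, void *uctx)
{
    (void)sig;
    (void)uctx;
    if (info->si_code <= 0)
        toy_async_seen = 1;   /* async: could be queued; safe to return */
    else
        toy_fault_seen = 1;   /* real fault: a real handler must not just
                                 return, or the instruction faults again */
}

static int toy_install_segv_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = toy_segv_handler;
    sa.sa_flags = SA_SIGINFO;
    sigfillset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}
```

After toy_install_segv_handler(), a kill(getpid(), SIGSEGV) ends up in
the async branch even though no instruction faulted.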
To handle this case, there's a small per-thread signal queue (I'm using
tid 0's queue for "signals sent to the whole process" - a hack, I'll
admit). If an async SIGSEGV (etc) signal appears, then it is pushed
onto the appropriate queue.
VG_(poll_signals) also checks these queues for pending signals to decide
what signal to deliver next. These queues are only manipulated with
*all* signals blocked, so there's no risk of two concurrent async signal
handlers modifying the queues at once. Also, because the likelihood of
actually being sent an async SIGSEGV is pretty low, the queues are only
allocated on demand.


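A minimal sketch of such a queue in C might look like the following.
The sizes, names and ring-buffer layout are all assumptions made for
illustration. Note also that this toy allocates with calloc(), which is
not async-signal-safe, so it is only meant to be called from ordinary
code rather than from inside a handler.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_MAX_THREADS 8   /* slot 0 stands for "the whole process" */
#define TOY_QUEUE_LEN   8

typedef struct {
    int sigs[TOY_QUEUE_LEN];
    int head, count;
} ToySigQueue;

static ToySigQueue *toy_queues[TOY_MAX_THREADS];   /* NULL until needed */

/* Queue `sig` for slot `tid`. Returns 0 on success, -1 if the queue is
   full or cannot be allocated. */
static int toy_queue_push(int tid, int sig)
{
    sigset_t all, saved;
    int rc = -1;

    sigfillset(&all);
    sigprocmask(SIG_SETMASK, &all, &saved);   /* nothing can interleave */
    if (toy_queues[tid] == NULL)
        toy_queues[tid] = calloc(1, sizeof(ToySigQueue));   /* on demand */
    if (toy_queues[tid] != NULL && toy_queues[tid]->count < TOY_QUEUE_LEN) {
        ToySigQueue *q = toy_queues[tid];
        q->sigs[(q->head + q->count) % TOY_QUEUE_LEN] = sig;
        q->count++;
        rc = 0;
    }
    sigprocmask(SIG_SETMASK, &saved, NULL);
    return rc;
}

/* Remove and return the oldest queued signal for `tid`, or 0 if none. */
static int toy_queue_pop(int tid)
{
    sigset_t all, saved;
    ToySigQueue *q;
    int sig = 0;

    sigfillset(&all);
    sigprocmask(SIG_SETMASK, &all, &saved);
    q = toy_queues[tid];
    if (q != NULL && q->count > 0) {
        sig = q->sigs[q->head];
        q->head = (q->head + 1) % TOY_QUEUE_LEN;
        q->count--;
    }
    sigprocmask(SIG_SETMASK, &saved, NULL);
    return sig;
}
```

A poll routine would call toy_queue_pop() for slot 0 and for its own
thread slot before asking the kernel for anything else.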
There are two mechanisms to prevent disaster if multiple threads get
signals concurrently. One is that a signal handler is set up to block a
set of signals while the signal is being delivered. Valgrind's handlers
block all signals, so there's no risk of a new signal being delivered to
the same thread until the old handler has finished.

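The first mechanism is just the sa_mask field of struct sigaction. A
small C sketch (hypothetical names, not the core's handler setup)
installs a handler with a full mask and records, from inside the
handler, whether an unrelated signal is blocked while it runs.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

static volatile sig_atomic_t toy_usr2_blocked_in_handler = -1;

static void toy_handler(int sig)
{
    sigset_t cur;
    (void)sig;
    /* Query (without changing) the mask in force while the handler runs. */
    sigprocmask(SIG_BLOCK, NULL, &cur);
    toy_usr2_blocked_in_handler = sigismember(&cur, SIGUSR2);
}

static int toy_install_blocking_handler(int sig)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = toy_handler;
    sigfillset(&sa.sa_mask);   /* block everything during delivery */
    return sigaction(sig, &sa, NULL);
}
```

The kernel restores the previous mask when the handler returns, so the
full mask only covers the handler itself.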
The other is that if the thread which receives the signal is not running
(ie, doesn't hold the run_sema, which implies it must be waiting for a
syscall to complete), then the signal handler will grab the run_sema
before making any global state changes. Since the only time we can get
a signal asynchronously is during a blocking syscall, this should be
the case for every async signal. (And since synchronous signals are
always the result of running an instruction, we should already be
holding run_sema.)


Valgrind will occasionally generate signals for itself. These are always
synchronous faults as a result of an instruction fetch or something an
instruction did. There are two mechanisms: the synth_fault_* functions,
which are used to signal a problem while fetching an instruction, and
getting generated code to call a helper which contains a fault-raising
instruction (used to deal with illegal/unimplemented instructions and
for instructions whose only job is to raise exceptions).

That all explains how signals come in, but the second part is how they
get delivered.

The main function for this is VG_(deliver_signal). There are three cases:

   1. the process is ignoring the signal (SIG_IGN)
   2. the process is using the default handler (SIG_DFL)
   3. the process has a handler for the signal

In general, VG_(deliver_signal) shouldn't be called for ignored signals;
if it has been called, it assumes the ignore is being overridden (if an
instruction gets a SEGV etc, SIG_IGN is ignored and treated as SIG_DFL).

VG_(deliver_signal) handles the default handler case, and the
client-specified signal handler case.

The default handler case is relatively easy: the signal's default action
is either Terminate, or Ignore. We can ignore Ignore.

Terminate always kills the entire process; there's no such thing as a
thread-specific signal death. Terminate comes in two forms: with
coredump, or without. vg_default_action() will write a core file, and
then will tell all the threads to start terminating; it then longjmps
back to the current thread's scheduler loop. The scheduler loop will
terminate immediately, and the master_tid thread will wait for all the
others to exit before shutting down the process (this is the same
mechanism as exit_group).

Delivering a signal to a client-side handler modifies the thread state so
that there's a signal frame on the stack, and the instruction pointer is
pointing to the handler. The fiddly bit is that there are two
completely different signal frame formats: old and RT. While in theory
the exact shape of these frames on stack is abstracted, there are real
programs which know exactly where various parts of the structures are on
stack (most notably, g++'s exception throwing code), which is why there
have to be two separate pieces of code, one for each frame format.
Another tricky case is dealing with the client stack running
out/overflowing while setting up the signal frame.

Signal return is also interesting. There are two syscalls, sigreturn
and rt_sigreturn, which a signal handler will use to resume execution.
The client will call the right one for the frame it was passed, so the
core doesn't need to track that state. The tricky part is moving the
frame's register state back into the thread's state, particularly all
the FPU state reformatting gunk. Also, *sigreturn checks for new
pending signals after the old frame has been cleaned up, since there's a
requirement that all deliverable pending signals are delivered before
the mainline code makes progress. This means that a program could
live-lock on signals, but that's what would happen running natively...

Another thing to watch for: programs which unwind the stack (like gdb,
or exception throwers) recognize the existence of a signal frame by
looking at the code the return address points to: if it is one of the
two specific signal return sequences, it knows it's a signal frame.
That's why the signal handler return address must point to a very
specific set of instructions.


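For 32-bit x86 Linux, the two sequences an unwinder looks for are the
classic sigreturn and rt_sigreturn trampolines. The C sketch below
assumes that platform and simply compares bytes; the function and type
names are hypothetical, and other architectures use different sequences.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum {
    TOY_NOT_SIGFRAME,
    TOY_OLD_SIGFRAME,
    TOY_RT_SIGFRAME
} ToyFrameKind;

/* popl %eax; movl $119, %eax; int $0x80   (119 = sigreturn on i386) */
static const unsigned char toy_sigreturn_code[] =
    { 0x58, 0xb8, 0x77, 0x00, 0x00, 0x00, 0xcd, 0x80 };

/* movl $173, %eax; int $0x80              (173 = rt_sigreturn on i386) */
static const unsigned char toy_rt_sigreturn_code[] =
    { 0xb8, 0xad, 0x00, 0x00, 0x00, 0xcd, 0x80 };

/* Classify the `len` bytes found at a frame's return address. */
static ToyFrameKind toy_classify_return_code(const unsigned char *code,
                                             size_t len)
{
    if (len >= sizeof toy_sigreturn_code &&
        memcmp(code, toy_sigreturn_code, sizeof toy_sigreturn_code) == 0)
        return TOY_OLD_SIGFRAME;
    if (len >= sizeof toy_rt_sigreturn_code &&
        memcmp(code, toy_rt_sigreturn_code, sizeof toy_rt_sigreturn_code) == 0)
        return TOY_RT_SIGFRAME;
    return TOY_NOT_SIGFRAME;
}
```

Because the match is byte-for-byte, a functionally equivalent but
differently encoded return stub would not be recognised.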
What else. Ah, the two internal signals.

SIGVGKILL is pretty straightforward: it's just used to dislodge a thread
from being blocked in a syscall, so that we can get the thread to
terminate in a timely fashion.

SIGVGCHLD is used by a thread to tell the master_tid that it has
exited. However, the only time the master_tid cares about this is when
it has already exited, and it's waiting for everyone else to exit. If
the master_tid hasn't exited, then this signal is ignored. It isn't
enough to simply block it, because that will cause a pile of queued
SIGVGCHLDs to build up, eventually clogging the kernel's signal delivery
mechanism. If it's unblocked and ignored, it doesn't interrupt syscalls
and it doesn't accumulate.


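The difference between "blocked" and "unblocked but ignored" can be
measured with a small C experiment. This is a sketch assuming Linux
behaviour, with an ordinary real-time signal number standing in for
SIGVGCHLD; toy_backlog() is a hypothetical helper.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <time.h>
#include <unistd.h>

/* Send `sig` to this process `sends` times and return how many
   instances are left queued. If `ignore_unblocked` is nonzero the
   signal is unblocked and set to SIG_IGN during the sends; otherwise
   it is simply blocked. The signal is left blocked afterwards. */
static int toy_backlog(int sig, int sends, int ignore_unblocked)
{
    const struct timespec no_wait = { 0, 0 };
    sigset_t set;
    int i, backlog = 0;

    sigemptyset(&set);
    sigaddset(&set, sig);
    if (ignore_unblocked) {
        signal(sig, SIG_IGN);
        sigprocmask(SIG_UNBLOCK, &set, NULL);
    } else {
        sigprocmask(SIG_BLOCK, &set, NULL);
    }

    for (i = 0; i < sends; i++)
        kill(getpid(), sig);

    /* Block it so that any leftovers can be counted with sigtimedwait(). */
    sigprocmask(SIG_BLOCK, &set, NULL);
    while (sigtimedwait(&set, NULL, &no_wait) == sig)
        backlog++;
    return backlog;
}
```

Blocking alone leaves one queued entry per send, while ignoring the
unblocked signal leaves none.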
I hope that helps clarify things. And explain why there's so much stuff
in there: it's tracking a very complex and arcane underlying set of
machinery.

    J