The code in this module governs when worker threads should go to sleep. The system used in this code was introduced in Rayon RFC #5. There is also a video walkthrough available. Both of those may be valuable resources to understanding the code, though naturally they will also grow stale over time. The comments in this file are extracted from the RFC and meant to be kept up to date.
# The `Sleep` struct

The `Sleep` struct is embedded into each registry. It performs several functions:

* It tracks which worker threads are awake and which are sleeping.
* It decides when an idle worker should give up searching and go to sleep.
* It is notified when new work is posted so that sleeping workers can be woken.
There are three main thread states:

* An *active* thread is one that is executing a job.
* An *idle* thread is one that is searching for work but has not yet found any.
* A *sleeping* thread is one that is blocked on a condition variable, waiting to be awoken.

We sometimes refer to the final two states collectively as *inactive*. Threads begin as active but transition to idle and finally sleeping when they're unable to find work to do.
There is one other special state worth mentioning. During the idle state, threads can get sleepy. A sleepy thread is still idle, in that it is still searching for work, but it is about to go to sleep after it does one more search (or some other number, potentially). When a thread enters the sleepy state, it signals (via the jobs event counter, described below) that it is about to go to sleep. If new work is published, this will lead to the counter being adjusted. When the thread actually goes to sleep, it will (hopefully, but not guaranteed) see that the counter has changed and elect not to sleep, but instead to search again. See the section on the jobs event counter for more details.
One of the key structs in the sleep module is `AtomicCounters`, found in `counters.rs`. It packs three counters into one atomically managed value:

* Two thread counters, one tracking inactive threads and one tracking sleeping threads.
* The jobs event counter (JEC), which is used to detect when new work is posted.
There are two thread counters, one that tracks inactive threads and one that tracks sleeping threads. From this, one can deduce the number of threads that are idle by subtracting sleeping threads from inactive threads. We track the counters in this way because it permits simpler atomic operations. One can increment the number of sleeping threads (and thus decrease the number of idle threads) simply by doing one atomic increment, for example. Similarly, one can decrease the number of sleeping threads (and increase the number of idle threads) through one atomic decrement.
These counters are adjusted as follows:

* When a thread enters the idle state: increment the inactive thread counter.
* When a thread goes to sleep: increment the sleeping thread counter.
* When a sleeping thread is awoken: decrement the sleeping thread counter.
* When an idle thread finds work and becomes active again: decrement the inactive thread counter.
The final counter is the jobs event counter. The role of this counter is to help sleepy threads detect when new work is posted in a lightweight fashion. In its simplest form, we would simply have a counter that gets incremented each time a new job is posted. This way, when a thread gets sleepy, it could read the counter, and then compare to see if the value has changed before it actually goes to sleep. But this turns out to be too expensive in practice, so we use a somewhat more complex scheme.
The idea is that the counter toggles between two states, depending on whether its value is even or odd (or, equivalently, on the value of its low bit):

* When new work is posted, we check the value of the counter: if it is even, then we increment it by one, so that it becomes odd.
* When a worker thread gets sleepy, it will read the value of the counter. If the counter is odd, it will increment the counter so that it is even. Either way, it remembers the final value of the counter. The final value will be used later, when the thread is going to sleep. If at that time the counter has not changed, then we can assume no new jobs have been posted (though note the remote possibility of rollover, discussed in detail below).
The full protocol for a thread to post work is as follows:

* Post the work into a queue (for external jobs, the injection queue).
* If the work was posted into the injection queue, execute a seq-cst fence (described below).
* Check the jobs event counter, incrementing it if it is even so that it becomes odd.
* Check whether any threads are sleeping; if so, wake one of them.
The full protocol for a thread to fall asleep is as follows:
* The worker goes idle and searches for work. If work is found, it becomes active again and restarts this protocol from the top.
* After searching for some time without success, the thread gets sleepy: it executes `get_sleepy` above, remembering the `final_value` of the JEC. It does one more search for work.
* If no work is found, the thread checks that the JEC still equals `final_value`. If it does not, new work may have been posted, and the thread goes back to searching instead of sleeping.
* Otherwise, the thread increments the sleeping-thread counter, executes a seq-cst fence (described below), does one final check of the injection queue, and then blocks on a condition variable.

As described in the section on the JEC, the main concern around going to sleep is avoiding a race condition wherein:

* Thread A goes idle, searches, and finds no work.
* Thread B posts new work but, reading the counters, sees no sleeping threads and so wakes no one.
* Thread A goes to sleep.
The JEC protocol largely prevents this, but due to rollover, this prevention is not complete. It is possible -- if unlikely -- that enough activity occurs for Thread A to observe the same JEC value that it saw when getting sleepy. If the new work being published came from inside the thread-pool, then this race condition isn't too harmful. It means that we have fewer workers processing the work than we should, but we won't deadlock. This seems like an acceptable risk given that this is unlikely in practice.
However, if the work was posted as an external job, that is a problem. In that case, it's possible that all of our workers could go to sleep, and the external job would never get processed. To prevent that, the sleeping protocol includes one final check to see if the injector queue is empty before fully falling asleep. Note that this final check occurs after the number of sleeping threads has been incremented. We are not concerned therefore with races against injections that occur after that increment, only before.
Unfortunately, there is one rather subtle point concerning this final check: we wish to avoid the possibility that:

* an outside thread X pushes a job into the injection queue;
* the sleepy thread S reads the JEC and, due to rollover, sees the same value it remembered when getting sleepy;
* S performs its final check of the injection queue, fails to observe the job pushed by X, and goes to sleep anyway.
This is possible because the C++ memory model typically offers guarantees of the form "if you see the access A, then you must see those other accesses" -- but it doesn't guarantee that you will see the access A (i.e., if you think of processors with independent caches, you may be operating on very out of date cache state).
To overcome this problem, we have inserted two sequentially consistent fence operations into the protocols above:

* The posting thread executes a fence after pushing a job into the injection queue but before reading the counters.
* The sleepy thread executes a fence after incrementing the number of sleeping threads but before doing its final read of the injection queue.
What follows is a "proof sketch" that the protocol is deadlock free. We model two relevant bits of memory, the job injector queue J and the atomic counters C.
Consider the actions of the injecting thread:

* PushJob: store into J (the injection queue is now non-empty).
* PushFence: execute a seq-cst fence.
* ReadSleepers: load C to check for sleeping threads.

Meanwhile, the sleepy thread does the following:

* IncSleepers: increment the sleeping-thread count in C.
* SleepFence: execute a seq-cst fence.
* ReadJob: load J as the final check of the injection queue.

Either PushFence or SleepFence must come first:

* If PushFence comes first in the total order of seq-cst operations, then ReadJob (which occurs after SleepFence, and hence after PushFence) must observe PushJob. The sleepy thread sees the job and does not sleep.
* If SleepFence comes first, then ReadSleepers (which occurs after PushFence, and hence after SleepFence) must observe IncSleepers. The injecting thread sees the sleeping thread and wakes it.

Either way, a wakeup cannot be lost, and so the protocol does not deadlock.