| |
| MSMProp2, a simplified but functionally equivalent version of MSMProp1 |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| Julian Seward, OpenWorks Ltd, 19 August 2008 |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| Note that this file does NOT describe the state machine used in the |
| svn://svn.valgrind.org/branches/YARD version of Helgrind. That state |
| machine is different again from any previously described machine. |
| |
| See the file README_YARD.txt for more details on YARD. |
| |
| ---------------------- |
| |
| In early 2008 Konstantin Serebryany proposed "MSMProp1", a memory |
| state machine for data race detection. It is described at |
| http://code.google.com/p/data-race-test/wiki/MSMProp1 |
| |
| Implementation experiences show MSMProp1 is useful, but difficult to |
| implement efficiently. In particular keeping the memory usage under |
| control is complex and difficult. |
| |
| This note points out a key simplification of MSMProp1, which makes it |
| easier to implement without changing the functionality. |
| |
| |
| The idea |
| ~~~~~~~~ |
| |
| The core of the idea pertains to the "Condition" entry for MSMProp1 |
| state machine rules E5 and E6(r). These are, respectively: |
| |
| HB(SS, currS) and its negation |
| ! HB(SS, currS). |
| |
| Here, SS is a set of segments, and currS is a single segment. Each |
| segment contains a vector timestamp. The expression "HB(SS, currS)" |
| is intended to denote |
| |
| for each segment S in SS . happens_before(S,currS) |
| |
| where happens_before(S,T) means that S's vector timestamp is ordered |
| before-or-equal to T's vector timestamp. |
| |
| In words, the expression |
| |
| for each segment S in SS . happens_before(S,currS) |
| |
| is equivalent to saying that currS has a timestamp which is |
| greater-than-equal to the timestamps of all the segments in SS. |
| |
| The key observation is that this is equivalent to |
| |
| happens_before( JOIN(SS), currS ) |
| |
| where JOIN is the lattice-theoretic "max" or "least upper bound" |
| operation on vector clocks. Given the definition of HB, |
| happens_before and (binary) JOIN, this is easy to prove. |
| |
| |
| The consequences |
| ~~~~~~~~~~~~~~~~ |
| |
| With that observation in place, it is a short step to observe that |
| storing segment sets in MSMProp1 is unnecessary. Instead of |
| storing a segment set in each shadow value, just store and |
| update a single vector timestamp. The following two equivalences |
| hold: |
| |
| MSMProp1 MSMProp2 |
| |
| adding a segment S join-ing S's vector timestamp |
| to the segment-set to the current vector timestamp |
| |
| HB(SS,currS) happens_before( |
| currS's timestamp, |
| current vector timestamp ) |
| |
| Once it is no longer necessary to represent segment sets, it then |
| also becomes unnecessary to represent segments. This constitutes |
| a significant simplication to the implementation. |
| |
| |
| The resulting state machine, MSMProp2 |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| MSMProp2 is isomorphic to MSMProp1, with the following changes: |
| |
| States are New, Read(VTS,LS), Write(VTS,LS) |
| |
| where LS is a lockset (as before) and VTS is a vector timestamp. |
| |
| For a thread T with current lockset 'currLS' and current VTS 'currVTS' |
| making a memory access, the new rules are |
| |
| Name Old-State Op Guard New-State Race-If |
| |
| E1 New rd True Read(currVTS,currLS) False |
| |
| E2 New wr True Write(currVTS,currLS) False |
| |
| E3 Read(oldVTS,oldLS) rd True Read(newVTS,newLS) False |
| |
| E4 Read(oldVTS,oldLS) wr True Write(newVTS,newLS) #newLS == 0 |
| && !hb(oldVTS,currVTS) |
| |
| E5 Write(oldVTS,oldLS) rd hb(oldVTS, Read(currVTS,currLS) False |
| currVTS) |
| |
| E6r Write(oldVTS,oldLS) rd !hb(oldVTS, Write(newVTS,newLS) #newLS == 0 |
| currVTS) && !hb(oldVTS,currVTS) |
| |
| E6w Write(oldVTS,oldLS) wr True Write(newVTS,newLS) #newLS == 0 |
| && !hb(oldVTS,currVTS) |
| |
| where newVTS = join2(oldVTS,currVTS) |
| |
| newLS = if hb(oldVTS,currVTS) |
| then currLS |
| else intersect(oldLS,currLS) |
| |
| hb(vts1, vts2) = vts1 happens before or is equal to vts2 |
| |
| |
| Interpretation of the states |
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| |
| I always found the state names in MSMProp1 confusing. Both MSMProp1 |
| and MSMProp2 are easier to understand if the states Read and Write are |
| renamed, like this: |
| |
| old name new name |
| |
| Read WriteConstraint |
| Write AllConstraint |
| |
| The effect of a state Read(VTS,LS) is to constrain all later-observed |
| writes so that either (1) the writing thread holds at least one lock |
| in common with LS, or (2) those writes must happen-after VTS. If |
| neither of those two conditions hold, a race is reported. |
| |
| Hence a Read state places a constraint on writes. |
| |
| The effect of a state Write(VTS,LS) is similar, but it applies to all |
| later-observed accesses: either (1) the accessing thread holds at |
| least one lock in common with LS, or (2) those accesses must |
| happen-after VTS. If neither of those two conditions hold, a race is |
| reported. |
| |
| Hence a Write state places a constraint on all accesses. |
| |
| If we ignore the LS component of these states, the intuitive |
| interpretation of the VTS component is that it states the earliest |
| vector-time that the next write / access may safely happen. |
| |