Blame - helgrind/README_MSMProp2.txt - platform/external/valgrind

blob: 6b4ac5f43cf5a5b744dd371c64fd2fff35602502 [file] [log] [blame]

sewardj	f98e1c0	2008-10-25 16:22:41 +0000	[diff] [blame]	1
				2	MSMProp2, a simplified but functionally equivalent version of MSMProp1
				3	~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
				4
				5	Julian Seward, OpenWorks Ltd, 19 August 2008
				6	~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
				7
				8	Note that this file does NOT describe the state machine used in the
				9	svn://svn.valgrind.org/branches/YARD version of Helgrind. That state
				10	machine is different again from any previously described machine.
				11
				12	See the file README_YARD.txt for more details on YARD.
				13
				14	----------------------
				15
				16	In early 2008 Konstantin Serebryany proposed "MSMProp1", a memory
				17	state machine for data race detection. It is described at
				18	http://code.google.com/p/data-race-test/wiki/MSMProp1
				19
				20	Implementation experiences show MSMProp1 is useful, but difficult to
				21	implement efficiently. In particular keeping the memory usage under
				22	control is complex and difficult.
				23
				24	This note points out a key simplification of MSMProp1, which makes it
				25	easier to implement without changing the functionality.
				26
				27
				28	The idea
				29	~~~~~~~~
				30
				31	The core of the idea pertains to the "Condition" entry for MSMProp1
				32	state machine rules E5 and E6(r). These are, respectively:
				33
				34	HB(SS, currS) and its negation
				35	! HB(SS, currS).
				36
				37	Here, SS is a set of segments, and currS is a single segment. Each
				38	segment contains a vector timestamp. The expression "HB(SS, currS)"
				39	is intended to denote
				40
				41	for each segment S in SS . happens_before(S,currS)
				42
				43	where happens_before(S,T) means that S's vector timestamp is ordered
				44	before-or-equal to T's vector timestamp.
				45
				46	In words, the expression
				47
				48	for each segment S in SS . happens_before(S,currS)
				49
				50	is equivalent to saying that currS has a timestamp which is
				51	greater-than-equal to the timestamps of all the segments in SS.
				52
				53	The key observation is that this is equivalent to
				54
				55	happens_before( JOIN(SS), currS )
				56
				57	where JOIN is the lattice-theoretic "max" or "least upper bound"
				58	operation on vector clocks. Given the definition of HB,
				59	happens_before and (binary) JOIN, this is easy to prove.
				60
				61
				62	The consequences
				63	~~~~~~~~~~~~~~~~
				64
				65	With that observation in place, it is a short step to observe that
				66	storing segment sets in MSMProp1 is unnecessary. Instead of
				67	storing a segment set in each shadow value, just store and
				68	update a single vector timestamp. The following two equivalences
				69	hold:
				70
				71	MSMProp1 MSMProp2
				72
				73	adding a segment S join-ing S's vector timestamp
				74	to the segment-set to the current vector timestamp
				75
				76	HB(SS,currS) happens_before(
				77	currS's timestamp,
				78	current vector timestamp )
				79
				80	Once it is no longer necessary to represent segment sets, it then
				81	also becomes unnecessary to represent segments. This constitutes
				82	a significant simplication to the implementation.
				83
				84
				85	The resulting state machine, MSMProp2
				86	~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
				87
				88	MSMProp2 is isomorphic to MSMProp1, with the following changes:
				89
				90	States are New, Read(VTS,LS), Write(VTS,LS)
				91
				92	where LS is a lockset (as before) and VTS is a vector timestamp.
				93
				94	For a thread T with current lockset 'currLS' and current VTS 'currVTS'
				95	making a memory access, the new rules are
				96
				97	Name Old-State Op Guard New-State Race-If
				98
				99	E1 New rd True Read(currVTS,currLS) False
				100
				101	E2 New wr True Write(currVTS,currLS) False
				102
				103	E3 Read(oldVTS,oldLS) rd True Read(newVTS,newLS) False
				104
				105	E4 Read(oldVTS,oldLS) wr True Write(newVTS,newLS) #newLS == 0
				106	&& !hb(oldVTS,currVTS)
				107
				108	E5 Write(oldVTS,oldLS) rd hb(oldVTS, Read(currVTS,currLS) False
				109	currVTS)
				110
				111	E6r Write(oldVTS,oldLS) rd !hb(oldVTS, Write(newVTS,newLS) #newLS == 0
				112	currVTS) && !hb(oldVTS,currVTS)
				113
				114	E6w Write(oldVTS,oldLS) wr True Write(newVTS,newLS) #newLS == 0
				115	&& !hb(oldVTS,currVTS)
				116
				117	where newVTS = join2(oldVTS,currVTS)
				118
				119	newLS = if hb(oldVTS,currVTS)
				120	then currLS
				121	else intersect(oldLS,currLS)
				122
				123	hb(vts1, vts2) = vts1 happens before or is equal to vts2
				124
				125
				126	Interpretation of the states
				127	~~~~~~~~~~~~~~~~~~~~~~~~~~~~
				128
				129	I always found the state names in MSMProp1 confusing. Both MSMProp1
				130	and MSMProp2 are easier to understand if the states Read and Write are
				131	renamed, like this:
				132
				133	old name new name
				134
				135	Read WriteConstraint
				136	Write AllConstraint
				137
				138	The effect of a state Read(VTS,LS) is to constrain all later-observed
				139	writes so that either (1) the writing thread holds at least one lock
				140	in common with LS, or (2) those writes must happen-after VTS. If
				141	neither of those two conditions hold, a race is reported.
				142
				143	Hence a Read state places a constraint on writes.
				144
				145	The effect of a state Write(VTS,LS) is similar, but it applies to all
				146	later-observed accesses: either (1) the accessing thread holds at
				147	least one lock in common with LS, or (2) those accesses must
				148	happen-after VTS. If neither of those two conditions hold, a race is
				149	reported.
				150
				151	Hence a Write state places a constraint on all accesses.
				152
				153	If we ignore the LS component of these states, the intuitive
				154	interpretation of the VTS component is that it states the earliest
				155	vector-time that the next write / access may safely happen.
				156