[GFS2] delay glock demote for a minimum hold time
When a lot of IO, with some distributed mmap IO, is run on a GFS2 filesystem in
a cluster, it will deadlock. The reason is that do_no_page() will repeatedly
call gfs2_sharewrite_nopage(), because each node keeps giving up the glock
too early, and is forced to call unmap_mapping_range(). This bumps the
mapping->truncate_count sequence count, forcing do_no_page() to retry. This
patch institutes a minimum glock hold time a tenth a second. This insures
that even in heavy contention cases, the node has enough time to get some
useful work done before it gives up the glock.
A second issue is that when gfs2_glock_dq() is called from within a page fault
to demote a lock, and the associated page needs to be written out, it will
try to acqire a lock on it, but it has already been locked at a higher level.
This patch puts makes gfs2_glock_dq() use the work queue as well, to avoid this
issue. This is the same patch as Steve Whitehouse originally proposed to fix
this issue, execpt that gfs2_glock_dq() now grabs a reference to the glock
before it queues up the work on it.
Signed-off-by: Benjamin E. Marzinski <bmarzins@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
diff --git a/fs/gfs2/glops.c b/fs/gfs2/glops.c
index 88342e0..7ef6b23 100644
--- a/fs/gfs2/glops.c
+++ b/fs/gfs2/glops.c
@@ -454,6 +454,7 @@
.go_lock = inode_go_lock,
.go_unlock = inode_go_unlock,
.go_type = LM_TYPE_INODE,
+ .go_min_hold_time = HZ / 10,
};
const struct gfs2_glock_operations gfs2_rgrp_glops = {
@@ -464,6 +465,7 @@
.go_lock = rgrp_go_lock,
.go_unlock = rgrp_go_unlock,
.go_type = LM_TYPE_RGRP,
+ .go_min_hold_time = HZ / 10,
};
const struct gfs2_glock_operations gfs2_trans_glops = {