Diff - 4875647a08e35f77274838d97ca8fa44158d50e2^! - kernel/msm-4.9

commit	4875647a08e35f77274838d97ca8fa44158d50e2	[log] [tgz]
author	David Teigland <teigland@redhat.com>	Thu Apr 26 15:54:29 2012 -0500
committer	David Teigland <teigland@redhat.com>	Wed May 02 14:15:27 2012 -0500
tree	bf8a39eaf3219af5d661ed3e347545306fd84bda
parent	6d40c4a708e0e996fd9c60d4093aebba5fe1f749 [diff] [blame]

dlm: fixes for nodir mode

The "nodir" mode (statically assign master nodes instead
of using the resource directory) has always been highly
experimental, and never seriously used.  This commit
fixes a number of problems, making nodir much more usable.

- Major change to recovery: recover all locks and restart
  all in-progress operations after recovery.  In some
  cases it's not possible to know which in-progess locks
  to recover, so recover all.  (Most require recovery
  in nodir mode anyway since rehashing changes most
  master nodes.)

- Change the way nodir mode is enabled, from a command
  line mount arg passed through gfs2, into a sysfs
  file managed by dlm_controld, consistent with the
  other config settings.

- Allow recovering MSTCPY locks on an rsb that has not
  yet been turned into a master copy.

- Ignore RCOM_LOCK and RCOM_LOCK_REPLY recovery messages
  from a previous, aborted recovery cycle.  Base this
  on the local recovery status not being in the state
  where any nodes should be sending LOCK messages for the
  current recovery cycle.

- Hold rsb lock around dlm_purge_mstcpy_locks() because it
  may run concurrently with dlm_recover_master_copy().

- Maintain highbast on process-copy lkb's (in addition to
  the master as is usual), because the lkb can switch
  back and forth between being a master and being a
  process copy as the master node changes in recovery.

- When recovering MSTCPY locks, flag rsb's that have
  non-empty convert or waiting queues for granting
  at the end of recovery.  (Rename flag from LOCKS_PURGED
  to RECOVER_GRANT and similar for the recovery function,
  because it's not only resources with purged locks
  that need grant a grant attempt.)

- Replace a couple of unnecessary assertion panics with
  error messages.

Signed-off-by: David Teigland <teigland@redhat.com>

diff --git a/fs/dlm/recoverd.c b/fs/dlm/recoverd.c
index 11351b5..f1a9073 100644
--- a/fs/dlm/recoverd.c
+++ b/fs/dlm/recoverd.c

@@ -84,6 +84,8 @@
 		goto fail;
 	}
 
+	ls->ls_recover_locks_in = 0;
+
 	dlm_set_recover_status(ls, DLM_RS_NODES);
 
 	error = dlm_recover_members_wait(ls);
@@ -130,7 +132,7 @@
 		 * Clear lkb's for departed nodes.
 		 */
 
-		dlm_purge_locks(ls);
+		dlm_recover_purge(ls);
 
 		/*
 		 * Get new master nodeid's for rsb's that were mastered on
@@ -161,6 +163,9 @@
 			goto fail;
 		}
 
+		log_debug(ls, "dlm_recover_locks %u in",
+			  ls->ls_recover_locks_in);
+
 		/*
 		 * Finalize state in master rsb's now that all locks can be
 		 * checked.  This includes conversion resolution and lvb
@@ -225,7 +230,7 @@
 		goto fail;
 	}
 
-	dlm_grant_after_purge(ls);
+	dlm_recover_grant(ls);
 
 	log_debug(ls, "dlm_recover %llu generation %u done: %u ms",
 		  (unsigned long long)rv->seq, ls->ls_generation,