target: Convert se_node_acl->device_list[] to RCU hlist

This patch converts the se_node_acl->device_list[] table for mappedluns
to modern RCU hlist_head usage in order to support an arbitrary number
of node_acl lun mappings.

It converts transport_lookup_*_lun() fast-path code to use RCU read path
primitives when looking up se_dev_entry.  It adds a new hlist_head at
se_node_acl->lun_entry_hlist for this purpose.
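
As a rough illustration of the lookup pattern described above, here is a
userspace sketch, not the kernel code. It assumes a single updater and
uses C11 release/acquire atomics in place of rcu_assign_pointer() and
rcu_dereference(). The names dev_entry, node_acl, acl_add_entry() and
acl_lookup() are hypothetical. Entry removal and grace periods are not
modelled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for struct se_dev_entry. */
struct dev_entry {
	uint64_t mapped_lun;
	int read_only;
	struct dev_entry *_Atomic next;
};

/* Hypothetical stand-in for se_node_acl with its lun_entry_hlist. */
struct node_acl {
	struct dev_entry *_Atomic head;
};

/*
 * Updater side (assumed to be serialized by the caller): initialise the
 * entry fully, then publish it with a release store, which is the role
 * rcu_assign_pointer() plays in the kernel.
 */
static void acl_add_entry(struct node_acl *nacl, struct dev_entry *new)
{
	struct dev_entry *first;

	assert(new != NULL);
	first = atomic_load_explicit(&nacl->head, memory_order_relaxed);
	atomic_store_explicit(&new->next, first, memory_order_relaxed);
	atomic_store_explicit(&nacl->head, new, memory_order_release);
}

/*
 * Reader side: walk the list with acquire loads and no lock, which is
 * the role rcu_dereference() plays inside rcu_read_lock().
 */
static struct dev_entry *acl_lookup(struct node_acl *nacl, uint64_t lun)
{
	struct dev_entry *d;

	for (d = atomic_load_explicit(&nacl->head, memory_order_acquire);
	     d != NULL;
	     d = atomic_load_explicit(&d->next, memory_order_acquire)) {
		if (d->mapped_lun == lun)
			return d;
	}
	return NULL;
}
```

In the kernel, the pointer returned by such a lookup is only valid inside
the RCU read-side critical section unless a reference is taken first.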

For transport_lookup_cmd_lun() code, it works with existing per-cpu
se_lun->lun_ref when associating se_cmd with se_lun + se_device.
Also, go ahead and update core_create_device_list_for_node() +
core_free_device_list_for_node() to use ->lun_entry_hlist.

It also converts se_dev_entry->pr_ref_count access to use modern
struct kref counting, and updates core_disable_device_list_for_node()
to kref_put() and block on se_deve->pr_comp waiting for outstanding
special-case PR references to drop, then invoke kfree_rcu() to wait
for the RCU grace period to complete before releasing memory.
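
The teardown ordering described above can be sketched in userspace as
follows. This is an illustration under stated assumptions, not the
kernel implementation: an atomic counter stands in for struct kref, an
atomic flag stands in for the pr_comp completion, and the pr_entry_*()
names are hypothetical. The kernel would block in wait_for_completion()
and then call kfree_rcu(); here the caller polls the flag instead.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the PR-related part of se_dev_entry. */
struct pr_entry {
	atomic_int refs;	/* plays the role of the struct kref */
	atomic_bool drained;	/* plays the role of the pr_comp completion */
};

/* Like kref_init(): the entry starts with one reference held. */
static void pr_entry_init(struct pr_entry *e)
{
	atomic_init(&e->refs, 1);
	atomic_init(&e->drained, false);
}

/* Like kref_get(): take an extra reference for a PR special case. */
static void pr_entry_get(struct pr_entry *e)
{
	int old = atomic_fetch_add(&e->refs, 1);

	assert(old > 0);
}

/*
 * Like kref_put() with a release callback that signals the completion:
 * the final put marks the entry as drained.
 */
static void pr_entry_put(struct pr_entry *e)
{
	int old = atomic_fetch_sub(&e->refs, 1);

	assert(old > 0);
	if (old == 1)
		atomic_store(&e->drained, true);
}

/* The condition the teardown path waits on before freeing the entry. */
static bool pr_entry_drained(struct pr_entry *e)
{
	return atomic_load(&e->drained);
}
```

The teardown path drops the initial reference with pr_entry_put() and may
release the memory only once pr_entry_drained() reports true.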

Now that se_node_acl->lun_entry_hlist fast-path access uses RCU
protected pointers, convert the remaining non-fast-path RCU updater
code using ->lun_entry_lock to a struct mutex, allowing callers to
block while walking se_node_acl->lun_entry_hlist.

Finally, drop the left-over core_clear_initiator_node_from_tpg() that
originally cleared lun_access during se_node_acl shutdown, as after the
RCU conversion it is duplicated logic.

Reviewed-by: Hannes Reinecke <hare@suse.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Sagi Grimberg <sagig@mellanox.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
diff --git a/drivers/target/target_core_spc.c b/drivers/target/target_core_spc.c
index 78c0b40..9f995b87 100644
--- a/drivers/target/target_core_spc.c
+++ b/drivers/target/target_core_spc.c
@@ -981,6 +981,7 @@
 	int length = 0;
 	int ret;
 	int i;
+	bool read_only = target_lun_is_rdonly(cmd);
 
 	memset(buf, 0, SE_MODE_PAGE_BUF);
 
@@ -991,9 +992,7 @@
 	length = ten ? 3 : 2;
 
 	/* DEVICE-SPECIFIC PARAMETER */
-	if ((cmd->se_lun->lun_access & TRANSPORT_LUNFLAGS_READ_ONLY) ||
-	    (cmd->se_deve &&
-	     (cmd->se_deve->lun_flags & TRANSPORT_LUNFLAGS_READ_ONLY)))
+	if ((cmd->se_lun->lun_access & TRANSPORT_LUNFLAGS_READ_ONLY) || read_only)
 		spc_modesense_write_protect(&buf[length], type);
 
 	/*
@@ -1211,8 +1210,9 @@
 {
 	struct se_dev_entry *deve;
 	struct se_session *sess = cmd->se_sess;
+	struct se_node_acl *nacl;
 	unsigned char *buf;
-	u32 lun_count = 0, offset = 8, i;
+	u32 lun_count = 0, offset = 8;
 
 	if (cmd->data_length < 16) {
 		pr_warn("REPORT LUNS allocation length %u too small\n",
@@ -1234,12 +1234,10 @@
 		lun_count = 1;
 		goto done;
 	}
+	nacl = sess->se_node_acl;
 
-	spin_lock_irq(&sess->se_node_acl->device_list_lock);
-	for (i = 0; i < TRANSPORT_MAX_LUNS_PER_TPG; i++) {
-		deve = sess->se_node_acl->device_list[i];
-		if (!(deve->lun_flags & TRANSPORT_LUNFLAGS_INITIATOR_ACCESS))
-			continue;
+	rcu_read_lock();
+	hlist_for_each_entry_rcu(deve, &nacl->lun_entry_hlist, link) {
 		/*
 		 * We determine the correct LUN LIST LENGTH even once we
 		 * have reached the initial allocation length.
@@ -1252,7 +1250,7 @@
 		int_to_scsilun(deve->mapped_lun, (struct scsi_lun *)&buf[offset]);
 		offset += 8;
 	}
-	spin_unlock_irq(&sess->se_node_acl->device_list_lock);
+	rcu_read_unlock();
 
 	/*
 	 * See SPC3 r07, page 159.