target: encapsulate smp_mb__after_atomic()

The target code has a rather generous helping of smp_mb__after_atomic()
throughout the code base.  Most atomic operations were followed by one
and none were preceded by smp_mb__before_atomic(), nor accompanied by a
comment explaining the need for a barrier.

Instead of trying to prove for every case whether or not it is needed,
this patch introduces atomic_inc_mb() and atomic_dec_mb(), which
explicitly include the memory barriers before and after the atomic
operation.  For now they are defined in a target header, although they
could be of general use.

Most of the existing atomic/mb combinations were replaced by the new
helpers.  In a few cases the atomic was sandwiched in
spin_lock/spin_unlock and I simply removed the barrier.

I suspect that in most cases the correct conversion would have been to
drop the barrier.  I also suspect that a few cases exist where a) the
barrier was necessary and b) a second barrier before the atomic would
have been necessary and got added by this patch.

Signed-off-by: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
diff --git a/drivers/target/target_core_device.c b/drivers/target/target_core_device.c
index e784284..f5057a2 100644
--- a/drivers/target/target_core_device.c
+++ b/drivers/target/target_core_device.c
@@ -224,8 +224,7 @@
 		if (port->sep_rtpi != rtpi)
 			continue;
 
-		atomic_inc(&deve->pr_ref_count);
-		smp_mb__after_atomic();
+		atomic_inc_mb(&deve->pr_ref_count);
 		spin_unlock_irq(&nacl->device_list_lock);
 
 		return deve;
@@ -1388,8 +1387,7 @@
 
 	spin_lock(&lun->lun_acl_lock);
 	list_add_tail(&lacl->lacl_list, &lun->lun_acl_list);
-	atomic_inc(&lun->lun_acl_count);
-	smp_mb__after_atomic();
+	atomic_inc_mb(&lun->lun_acl_count);
 	spin_unlock(&lun->lun_acl_lock);
 
 	pr_debug("%s_TPG[%hu]_LUN[%u->%u] - Added %s ACL for "
@@ -1422,8 +1420,7 @@
 
 	spin_lock(&lun->lun_acl_lock);
 	list_del(&lacl->lacl_list);
-	atomic_dec(&lun->lun_acl_count);
-	smp_mb__after_atomic();
+	atomic_dec_mb(&lun->lun_acl_count);
 	spin_unlock(&lun->lun_acl_lock);
 
 	core_disable_device_list_for_node(lun, NULL, lacl->mapped_lun,