atomic: remove all traces of READ_ONCE_CTRL() and atomic*_read_ctrl()

These helpers seem to be based on a misreading of how alpha memory
ordering works, and are not backed up by the alpha architecture manual.
The helper functions
don't do anything special on any other architectures, and the arguments
that support them being safe on other architectures also argue that they
are safe on alpha.

Basically, the "control dependency" is between a previous read and a
subsequent write that is dependent on the value read.  Even if the
subsequent write is actually done speculatively, there is no way that
such a speculative write could be made visible to other CPUs until it
has been committed, which requires validating the speculation.

Note that most weakly ordered architectures (very much including alpha)
do not guarantee any ordering relationship between two loads that depend
on each other on a control dependency:

    read A
    if (A == 1)
        read B

because the conditional may be predicted, and the "read B" may be
speculatively moved up to before reading the value A.  So we require the
user to insert a smp_rmb() between the two accesses to be correct:

    read A
    if (A == 1) {
        smp_rmb()
        read B
    }
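
As a rough userspace illustration of that load-load case, here is a
minimal sketch using C11 atomics. The names publish() and consume() are
made up for this example, and the acquire fence is only an approximate
stand-in for smp_rmb() (it is somewhat stronger); this is not kernel code.

```c
#include <stdatomic.h>

static atomic_int flag;	/* plays the role of "A" */
static atomic_int data;	/* plays the role of "B" */

/* Writer side: store the data, then set the flag, ordered by a fence. */
void publish(int v)
{
	atomic_store_explicit(&data, v, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);	/* roughly smp_wmb() */
	atomic_store_explicit(&flag, 1, memory_order_relaxed);
}

/* Reader side: returns 1 and fills *out only if the flag was seen set. */
int consume(int *out)
{
	if (atomic_load_explicit(&flag, memory_order_relaxed) == 1) {
		/*
		 * Without this fence the load of 'data' could be satisfied
		 * before the load of 'flag'; the branch alone orders nothing
		 * between two loads.
		 */
		atomic_thread_fence(memory_order_acquire);	/* roughly smp_rmb() */
		*out = atomic_load_explicit(&data, memory_order_relaxed);
		return 1;
	}
	return 0;
}
```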

Alpha is further special in that it can break that ordering even if the
*address* of B depends on the read of A, because the cacheline that is
read later may be stale unless you have a memory barrier in between the
pointer read and the read of the value behind a pointer:

    read ptr
    read offset(ptr)

whereas all other weakly ordered architectures guarantee that the data
dependency (as opposed to just a control dependency) will order the two
accesses.  As a result, alpha needs a "smp_read_barrier_depends()" in
between those two reads for them to be ordered.
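
The pointer case can be sketched in userspace C11 as well.  This is an
assumption-laden analogue, not the kernel primitive: memory_order_consume
is the closest C11 notion to a data dependency (compilers generally
treat it as acquire), and 'struct item', publish_item() and read_item()
are hypothetical names.

```c
#include <stdatomic.h>
#include <stddef.h>

struct item {
	int value;
};

static struct item slot;
static _Atomic(struct item *) published;	/* NULL until published */

/* Writer: initialize the item, then publish the pointer to it. */
void publish_item(int v)
{
	slot.value = v;
	atomic_store_explicit(&published, &slot, memory_order_release);
}

/* Reader: "read ptr" followed by "read offset(ptr)". */
int read_item(int *out)
{
	/*
	 * The consume load stands in for READ_ONCE(ptr) followed by
	 * smp_read_barrier_depends(); only alpha needs a real barrier
	 * here, everything else orders the dependent load for free.
	 */
	struct item *p = atomic_load_explicit(&published, memory_order_consume);

	if (p == NULL)
		return 0;
	*out = p->value;
	return 1;
}
```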

The control dependency that "READ_ONCE_CTRL()" and "atomic_read_ctrl()"
had was a control dependency to a subsequent *write*, however, and
nobody can finalize such a subsequent write without having actually done
the read.  And were you to write such a value to a "stale" cacheline
(the way the unordered reads came to be), that would seem to lose the
write entirely.

So the things that make alpha able to re-order reads even more
aggressively than other weak architectures do not seem to be relevant
for a subsequent write.  Alpha memory ordering may be strange, but
there's no real indication that it is *that* strange.

Also, the alpha architecture reference manual very explicitly talks
about the definition of "Dependence Constraints" in section 5.6.1.7,
where a preceding read dominates a subsequent write.

Such a dependence constraint admittedly does not impose a BEFORE (alpha
architecture term for globally visible ordering), but it does guarantee
that there can be no "causal loop".  I don't see how you could avoid
such a loop if another cpu could see the stored value and then impact
the value of the first read.  Put another way: the read and the write
could not be seen as being out of order wrt other cpus.
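
The shape of that argument is the two-CPU pattern from
memory-barriers.txt, sketched here in userspace C11 as an illustration
only.  The relaxed atomics are stand-ins for READ_ONCE() and
WRITE_ONCE(), cpu0() and cpu1() are made-up names, and C11 itself gives
weaker formal guarantees than the hardware argument above.

```c
#include <stdatomic.h>

static atomic_int x, y;	/* both start out as zero */

/* Each "CPU" stores only if its preceding load saw a positive value. */
int cpu0(void)
{
	int r1 = atomic_load_explicit(&x, memory_order_relaxed);

	if (r1 > 0)
		atomic_store_explicit(&y, 1, memory_order_relaxed);
	return r1;
}

int cpu1(void)
{
	int r2 = atomic_load_explicit(&y, memory_order_relaxed);

	if (r2 > 0)
		atomic_store_explicit(&x, 1, memory_order_relaxed);
	return r2;
}
```

For r1 and r2 to both end up as 1, each store would have to become
visible before the load it depends on had completed, which is exactly
the causal loop that the dependence constraint rules out.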

So I do not see how these "x_ctrl()" functions can currently be necessary.

I may have to eat my words at some point, but in the absence of clear
proof that alpha actually needs this, or indeed even an explanation of
how alpha could _possibly_ need it, I do not believe these functions are
called for.

And if it turns out that alpha really _does_ need a barrier for this
case, that barrier still should not be "smp_read_barrier_depends()".
We'd have to make up some new speciality barrier just for alpha, along
with the documentation for why it really is necessary.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul E McKenney <paulmck@us.ibm.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index b5fe765..aef9487 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -617,16 +617,16 @@
 However, stores are not speculated.  This means that ordering -is- provided
 for load-store control dependencies, as in the following example:
 
-	q = READ_ONCE_CTRL(a);
+	q = READ_ONCE(a);
 	if (q) {
 		WRITE_ONCE(b, p);
 	}
 
 Control dependencies pair normally with other types of barriers.  That
-said, please note that READ_ONCE_CTRL() is not optional!  Without the
-READ_ONCE_CTRL(), the compiler might combine the load from 'a' with
-other loads from 'a', and the store to 'b' with other stores to 'b',
-with possible highly counterintuitive effects on ordering.
+said, please note that READ_ONCE() is not optional! Without the
+READ_ONCE(), the compiler might combine the load from 'a' with other
+loads from 'a', and the store to 'b' with other stores to 'b', with
+possible highly counterintuitive effects on ordering.
 
 Worse yet, if the compiler is able to prove (say) that the value of
 variable 'a' is always non-zero, it would be well within its rights
@@ -636,16 +636,12 @@
 	q = a;
 	b = p;  /* BUG: Compiler and CPU can both reorder!!! */
 
-Finally, the READ_ONCE_CTRL() includes an smp_read_barrier_depends()
-that DEC Alpha needs in order to respect control depedencies. Alternatively
-use one of atomic{,64}_read_ctrl().
-
-So don't leave out the READ_ONCE_CTRL().
+So don't leave out the READ_ONCE().
 
 It is tempting to try to enforce ordering on identical stores on both
 branches of the "if" statement as follows:
 
-	q = READ_ONCE_CTRL(a);
+	q = READ_ONCE(a);
 	if (q) {
 		barrier();
 		WRITE_ONCE(b, p);
@@ -659,7 +655,7 @@
 Unfortunately, current compilers will transform this as follows at high
 optimization levels:
 
-	q = READ_ONCE_CTRL(a);
+	q = READ_ONCE(a);
 	barrier();
 	WRITE_ONCE(b, p);  /* BUG: No ordering vs. load from a!!! */
 	if (q) {
@@ -689,7 +685,7 @@
 In contrast, without explicit memory barriers, two-legged-if control
 ordering is guaranteed only when the stores differ, for example:
 
-	q = READ_ONCE_CTRL(a);
+	q = READ_ONCE(a);
 	if (q) {
 		WRITE_ONCE(b, p);
 		do_something();
@@ -698,14 +694,14 @@
 		do_something_else();
 	}
 
-The initial READ_ONCE_CTRL() is still required to prevent the compiler
-from proving the value of 'a'.
+The initial READ_ONCE() is still required to prevent the compiler from
+proving the value of 'a'.
 
 In addition, you need to be careful what you do with the local variable 'q',
 otherwise the compiler might be able to guess the value and again remove
 the needed conditional.  For example:
 
-	q = READ_ONCE_CTRL(a);
+	q = READ_ONCE(a);
 	if (q % MAX) {
 		WRITE_ONCE(b, p);
 		do_something();
@@ -718,7 +714,7 @@
 equal to zero, in which case the compiler is within its rights to
 transform the above code into the following:
 
-	q = READ_ONCE_CTRL(a);
+	q = READ_ONCE(a);
 	WRITE_ONCE(b, p);
 	do_something_else();
 
@@ -729,7 +725,7 @@
 relying on this ordering, you should make sure that MAX is greater than
 one, perhaps as follows:
 
-	q = READ_ONCE_CTRL(a);
+	q = READ_ONCE(a);
 	BUILD_BUG_ON(MAX <= 1); /* Order load from a with store to b. */
 	if (q % MAX) {
 		WRITE_ONCE(b, p);
@@ -746,7 +742,7 @@
 You must also be careful not to rely too much on boolean short-circuit
 evaluation.  Consider this example:
 
-	q = READ_ONCE_CTRL(a);
+	q = READ_ONCE(a);
 	if (q || 1 > 0)
 		WRITE_ONCE(b, 1);
 
@@ -754,7 +750,7 @@
 always true, the compiler can transform this example as following,
 defeating control dependency:
 
-	q = READ_ONCE_CTRL(a);
+	q = READ_ONCE(a);
 	WRITE_ONCE(b, 1);
 
 This example underscores the need to ensure that the compiler cannot
@@ -768,7 +764,7 @@
 
 	CPU 0                     CPU 1
 	=======================   =======================
-	r1 = READ_ONCE_CTRL(x);   r2 = READ_ONCE_CTRL(y);
+	r1 = READ_ONCE(x);        r2 = READ_ONCE(y);
 	if (r1 > 0)               if (r2 > 0)
 	  WRITE_ONCE(y, 1);         WRITE_ONCE(x, 1);
 
@@ -797,11 +793,6 @@
 
 In summary:
 
-  (*) Control dependencies must be headed by READ_ONCE_CTRL(),
-      atomic{,64}_read_ctrl(). Or, as a much less preferable alternative,
-      interpose smp_read_barrier_depends() between a READ_ONCE() and the
-      control-dependent write.
-
   (*) Control dependencies can order prior loads against later stores.
       However, they do -not- guarantee any other sort of ordering:
       Not prior loads against later loads, nor prior stores against
@@ -817,14 +808,13 @@
       between the prior load and the subsequent store, and this
       conditional must involve the prior load.  If the compiler is able
       to optimize the conditional away, it will have also optimized
-      away the ordering.  Careful use of READ_ONCE_CTRL() READ_ONCE(),
-      and WRITE_ONCE() can help to preserve the needed conditional.
+      away the ordering.  Careful use of READ_ONCE() and WRITE_ONCE()
+      can help to preserve the needed conditional.
 
   (*) Control dependencies require that the compiler avoid reordering the
-      dependency into nonexistence.  Careful use of READ_ONCE_CTRL(),
-      atomic{,64}_read_ctrl() or smp_read_barrier_depends() can help to
-      preserve your control dependency.  Please see the Compiler Barrier
-      section for more information.
+      dependency into nonexistence.  Careful use of READ_ONCE() or
+      atomic{,64}_read() can help to preserve your control dependency.
+      Please see the Compiler Barrier section for more information.
 
   (*) Control dependencies pair normally with other types of barriers.