Add TargetLowering::prepareVolatileOrAtomicLoad

One unusual feature of z/Architecture is that the result of a
previous load can be reused indefinitely for subsequent loads from
the same location, even if a cache-coherent store to that location
is performed by another CPU.  A special serializing instruction must
be used to force such a load to be reattempted.

Since volatile loads must not be elided in this way, we should insert
a serializing instruction before each such load.  The same goes for
atomic loads.

The patch implements this at the IR->DAG boundary via a new
TargetLowering::prepareVolatileOrAtomicLoad hook, in a similar way
to atomic fences.  The hook is a no-op for targets other than SystemZ.
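
For reference, a minimal sketch of how such a hook and its SystemZ
override could look.  The exact signature, the SystemZISD::SERIALIZE
node name, and the file placement are assumptions based on the
description above, not a verbatim copy of the patch:

  // Default hook (TargetLowering): thread the chain through unchanged,
  // so targets without this requirement see no change in the DAG.
  virtual SDValue prepareVolatileOrAtomicLoad(SDValue Chain, SDLoc DL,
                                              SelectionDAG &DAG) const {
    return Chain;
  }

  // SystemZ override: chain a serializing target node ahead of the
  // load, so that instruction selection emits a serialization
  // (e.g. "bcr 14,0") before the volatile or atomic load is performed.
  // SystemZISD::SERIALIZE is an assumed name for that node.
  SDValue SystemZTargetLowering::prepareVolatileOrAtomicLoad(
      SDValue Chain, SDLoc DL, SelectionDAG &DAG) const {
    return DAG.getNode(SystemZISD::SERIALIZE, DL, MVT::Other, Chain);
  }

SelectionDAGBuilder would then presumably call the hook on the chain
when visiting volatile and atomic loads, mirroring how fences are
lowered.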

llvm-svn: 196905
diff --git a/llvm/test/CodeGen/SystemZ/Large/branch-range-12.py b/llvm/test/CodeGen/SystemZ/Large/branch-range-12.py
index 007d477..626c899 100644
--- a/llvm/test/CodeGen/SystemZ/Large/branch-range-12.py
+++ b/llvm/test/CodeGen/SystemZ/Large/branch-range-12.py
@@ -98,8 +98,8 @@
 for i in xrange(branch_blocks):
     next = 'before%d' % (i + 1) if i + 1 < branch_blocks else 'main'
     print 'before%d:' % i
-    print '  %%bcur%da = load volatile i64 *%%stopa' % i
-    print '  %%bcur%db = load volatile i64 *%%stopb' % i
+    print '  %%bcur%da = load i64 *%%stopa' % i
+    print '  %%bcur%db = load i64 *%%stopb' % i
     print '  %%bsub%d = sub i64 %%bcur%da, %%bcur%db' % (i, i, i)
     print '  %%btest%d = icmp ult i64 %%bsub%d, %d' % (i, i, i + 50)
     print '  br i1 %%btest%d, label %%after0, label %%%s' % (i, next)
@@ -115,8 +115,8 @@
     print '  store volatile i8 %d, i8 *%%ptr%d' % (value, i)
 
 for i in xrange(branch_blocks):
-    print '  %%acur%da = load volatile i64 *%%stopa' % i
-    print '  %%acur%db = load volatile i64 *%%stopb' % i
+    print '  %%acur%da = load i64 *%%stopa' % i
+    print '  %%acur%db = load i64 *%%stopb' % i
     print '  %%asub%d = sub i64 %%acur%da, %%acur%db' % (i, i, i)
     print '  %%atest%d = icmp ult i64 %%asub%d, %d' % (i, i, i + 100)
     print '  br i1 %%atest%d, label %%main, label %%after%d' % (i, i)