Add TargetLowering::prepareVolatileOrAtomicLoad

One unusual feature of z/Architecture is that the result of a
previous load can be reused indefinitely for subsequent loads, even if
a cache-coherent store to that location is performed by another CPU.
A special serializing instruction must be used if you want to force
a load to be reattempted.

Since volatile loads are not supposed to be omitted in this way,
we should insert a serializing instruction before each such load.
The same goes for atomic loads.

The patch implements this at the IR->DAG boundary, in a similar way
to atomic fences.  It is a no-op for targets other than SystemZ.

llvm-svn: 196905
diff --git a/llvm/test/CodeGen/SystemZ/Large/branch-range-10.py b/llvm/test/CodeGen/SystemZ/Large/branch-range-10.py
index 3aeea3e..8c483c3 100644
--- a/llvm/test/CodeGen/SystemZ/Large/branch-range-10.py
+++ b/llvm/test/CodeGen/SystemZ/Large/branch-range-10.py
@@ -83,7 +83,7 @@
     next = 'before%d' % (i + 1) if i + 1 < branch_blocks else 'main'
     print 'before%d:' % i
     print '  %%bstop%d = getelementptr i8 *%%stop, i64 %d' % (i, i)
-    print '  %%bcur%d = load volatile i8 *%%bstop%d' % (i, i)
+    print '  %%bcur%d = load i8 *%%bstop%d' % (i, i)
     print '  %%bext%d = sext i8 %%bcur%d to i64' % (i, i)
     print '  %%btest%d = icmp ult i64 %%limit, %%bext%d' % (i, i)
     print '  br i1 %%btest%d, label %%after0, label %%%s' % (i, next)
@@ -100,7 +100,7 @@
 
 for i in xrange(branch_blocks):
     print '  %%astop%d = getelementptr i8 *%%stop, i64 %d' % (i, i + 25)
-    print '  %%acur%d = load volatile i8 *%%astop%d' % (i, i)
+    print '  %%acur%d = load i8 *%%astop%d' % (i, i)
     print '  %%aext%d = sext i8 %%acur%d to i64' % (i, i)
     print '  %%atest%d = icmp ult i64 %%limit, %%aext%d' % (i, i)
     print '  br i1 %%atest%d, label %%main, label %%after%d' % (i, i)