[AArch64] Improve v8.1-A code-gen for atomic load-and

Armv8.1-A added an atomic load-clear instruction (which performs bitwise
and with the complement of it's operand), but not a load-and
instruction. Our current code-generation for atomic load-and always
inserts an MVN instruction to invert its argument, even if it could be
folded into a constant or another instruction.

This adds lowering early in selection DAG to convert a load-and
operation into an xor with -1 and a load-clear, allowing the normal DAG
optimisations to work on it.

To do this, I've had to add a new ISD opcode, ATOMIC_LOAD_CLR. I don't
see any easy way to do this with an AArch64-specific ISD node, because
the code-generation for atomic operations assumes the SDNodes are of
type AtomicSDNode.

I've left the old tablegen patterns in because they are still needed for
global isel.

Differential revision: https://reviews.llvm.org/D42478

llvm-svn: 324908
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
index b3aa462..a7db5da 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
@@ -85,6 +85,7 @@
   case ISD::ATOMIC_LOAD_ADD:            return "AtomicLoadAdd";
   case ISD::ATOMIC_LOAD_SUB:            return "AtomicLoadSub";
   case ISD::ATOMIC_LOAD_AND:            return "AtomicLoadAnd";
+  case ISD::ATOMIC_LOAD_CLR:            return "AtomicLoadClr";
   case ISD::ATOMIC_LOAD_OR:             return "AtomicLoadOr";
   case ISD::ATOMIC_LOAD_XOR:            return "AtomicLoadXor";
   case ISD::ATOMIC_LOAD_NAND:           return "AtomicLoadNand";