This patch introduces initial-exec model support for thread-local storage
on 64-bit PowerPC ELF.

The patch includes code to handle external assembly and MC output with the
integrated assembler.  It intentionally does not support the "old" JIT.

For the initial-exec TLS model, the ABI requires the following to calculate
the address of external thread-local variable x:

 Code sequence            Relocation                  Symbol
  ld 9,x@got@tprel(2)      R_PPC64_GOT_TPREL16_DS      x
  add 9,9,x@tls            R_PPC64_TLS                 x

The register 9 is arbitrary here.  The linker will replace x@got@tprel
with the offset relative to the thread pointer to the generated GOT
entry for symbol x.  It will replace x@tls with the thread-pointer
register (13).

The two test cases verify correct assembly output and relocation output
as just described.

PowerPC-specific selection node variants are added for the two
instructions above:  LD_GOT_TPREL and ADD_TLS.  These are inserted
when an initial-exec global variable is encountered by
PPCTargetLowering::LowerGlobalTLSAddress(), and later lowered to
machine instructions LDgotTPREL and ADD8TLS.  LDgotTPREL is a pseudo
that uses the same LDrs support added for medium code model's LDtocL,
with a different relocation type.

The rest of the processing is straightforward.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169281 91177308-0d34-0410-b5e6-96231b3b80d8
diff --git a/test/CodeGen/PowerPC/tls-ie.ll b/test/CodeGen/PowerPC/tls-ie.ll
new file mode 100644
index 0000000..cc6f084
--- /dev/null
+++ b/test/CodeGen/PowerPC/tls-ie.ll
@@ -0,0 +1,21 @@
+; RUN: llc -mcpu=pwr7 -O0 <%s | FileCheck %s
+
+; Test correct assembly code generation for thread-local storage
+; using the initial-exec model.
+
+target datalayout = "E-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-f128:128:128-v128:128:128-n32:64"
+target triple = "powerpc64-unknown-linux-gnu"
+
+@a = external thread_local global i32
+
+define signext i32 @main() nounwind {
+entry:
+  %retval = alloca i32, align 4
+  store i32 0, i32* %retval
+  %0 = load i32* @a, align 4
+  ret i32 %0
+}
+
+; CHECK: ld [[REG:[0-9]+]], a@got@tprel(2)
+; CHECK: add {{[0-9]+}}, [[REG]], a@tls
+