Per-thread -fstack-protector guards for x86.

Based on a pair of patches from Intel:

  https://android-review.googlesource.com/#/c/43909/
  https://android-review.googlesource.com/#/c/44903/

For x86, this patch supports _both_ the global (__stack_chk_guard)
that ARM/MIPS use and the per-thread TLS entry (%gs:20) that GCC uses
by default. This lets us support binaries built with any x86
toolchain (right now, the NDK emits x86 code that uses the global).
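
To illustrate the difference between the two guard locations, here is
a minimal portable sketch. It is a model only, not the code in this
patch: g_global_guard, t_thread_guard, GuardView and
ReadGuardsOnNewThread are hypothetical names, and a C++ thread_local
variable stands in for the %gs:20 slot. The point is that a global is
shared by every thread, while a per-thread slot has to be populated
separately for each new thread.

```cpp
#include <cstdint>
#include <thread>

// Hypothetical stand-in for the single process-wide guard
// (the role the global plays on ARM/MIPS).
static uintptr_t g_global_guard = 0;

// Hypothetical stand-in for the per-thread slot (the role %gs:20
// plays on x86). Every thread gets its own zero-initialized copy.
static thread_local uintptr_t t_thread_guard = 0;

// What a thread observes when it reads both guard locations.
struct GuardView {
  uintptr_t global_guard;
  uintptr_t thread_guard;
};

// Starts a fresh thread and reports the guard values it sees.
static GuardView ReadGuardsOnNewThread() {
  GuardView view{};
  std::thread t([&view] {
    view.global_guard = g_global_guard;  // shared with the creating thread
    view.thread_guard = t_thread_guard;  // this thread's own copy
  });
  t.join();
  return view;
}
```

In this model, a value stored in g_global_guard by the creating thread
is visible in the new thread, but a value stored in t_thread_guard is
not, which is why a per-thread guard has to be set up as part of
thread creation.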

I've also extended the original tests to cover ARM/MIPS, and made
them a little more thorough for x86.

Change-Id: I02f279a80c6b626aecad449771dec91df235ad01
diff --git a/tests/dlopen_test.cpp b/tests/dlopen_test.cpp
index 5b5c7f6..d38d8c5 100644
--- a/tests/dlopen_test.cpp
+++ b/tests/dlopen_test.cpp
@@ -58,7 +58,7 @@
 #endif
 }
 
-static void* ConcurrentDlErrorFn(void* arg) {
+static void* ConcurrentDlErrorFn(void*) {
   dlopen("/child/thread", RTLD_NOW);
   return reinterpret_cast<void*>(strdup(dlerror()));
 }