Add thread pool class

Added a thread pool class loosely based on google3 code.

Modified the compiler to use a single thread pool instead of creating new threads in ForAll.

Moved barrier to the top-level directory, as it is not GC-specific code.

Performance Timings:

Reference:
boot.oat: 14.306596s
time mm oat-target:
real    2m33.748s
user    10m23.190s
sys     5m54.140s

Thread pool:
boot.oat: 13.111049s
time mm oat-target:
real    2m29.372s
user    10m3.130s
sys     5m46.290s

The speed increase (about 8% on boot.oat, about 3% on wall-clock time for mm oat-target) is probably just noise.

Change-Id: If3c1280cbaa4c7e4361127d064ac744ea12cdf49
diff --git a/src/atomic_integer.h b/src/atomic_integer.h
index 54d5fd8..adf3e77 100644
--- a/src/atomic_integer.h
+++ b/src/atomic_integer.h
@@ -17,7 +17,8 @@
 #ifndef ART_SRC_ATOMIC_INTEGER_H_
 #define ART_SRC_ATOMIC_INTEGER_H_
 
-#include "atomic.h"
+#include "cutils/atomic.h"
+#include "cutils/atomic-inline.h"
 
 namespace art {
 
@@ -62,6 +63,14 @@
   int32_t operator -- (int32_t) {
     return android_atomic_dec(&value_);
   }
+
+  int32_t operator ++ () {
+    return android_atomic_inc(&value_) + 1;
+  }
+
+  int32_t operator -- () {
+    return android_atomic_dec(&value_) - 1;
+  }
  private:
   int32_t value_;
 };