add pthread_attr_setstack and pthread_attr_getstack interfaces

i originally omitted these (optional, per POSIX) interfaces because i
considered them backwards implementation details. however, someone
later brought to my attention a fairly legitimate use case: allocating
thread stacks in memory that's set up for sharing and/or fast transfer
between CPU and GPU so that the thread can move data to a GPU directly
from automatic-storage buffers without having to go through additional
buffer copies.

perhaps there are other situations in which these interfaces are
useful too.
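
for illustration only, a minimal sketch of how a caller might use the
new interface. it is not part of this patch. posix_memalign is just a
stand-in for whatever allocator provides the specially mapped (e.g.
GPU-shared) memory, and the names run_on_custom_stack, worker and
struct stack_range are hypothetical. the thread simply checks that one
of its automatic-storage buffers lies inside the caller-provided range.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* hypothetical helper type: describes the caller-provided stack */
struct stack_range {
	char *base;    /* lowest address of the stack memory */
	size_t size;   /* size of the stack in bytes */
	int on_stack;  /* set by the thread: 1 if its locals are in range */
};

static void *worker(void *arg)
{
	struct stack_range *r = arg;
	char buf[64];  /* automatic storage, so it lives on this thread's stack */
	uintptr_t p = (uintptr_t)buf;
	uintptr_t lo = (uintptr_t)r->base;

	r->on_stack = (p >= lo && p < lo + r->size);
	return NULL;
}

/* returns 1 if the thread ran on the provided stack, 0 if it did not,
 * and -1 on any setup error. size should be a multiple of the page
 * size and at least PTHREAD_STACK_MIN. */
int run_on_custom_stack(size_t size)
{
	struct stack_range r = { NULL, size, 0 };
	pthread_attr_t attr;
	pthread_t td;
	void *mem;
	int err;
	long page = sysconf(_SC_PAGESIZE);

	if (page <= 0) return -1;
	/* assumption: ordinary page-aligned heap memory stands in for
	 * the shared/fast-transfer memory described above */
	if (posix_memalign(&mem, (size_t)page, size)) return -1;
	r.base = mem;

	if (pthread_attr_init(&attr)) {
		free(mem);
		return -1;
	}
	/* the stack address passed here is the lowest address of the region */
	err = pthread_attr_setstack(&attr, mem, size);
	if (!err) err = pthread_create(&td, &attr, worker, &r);
	if (!err) err = pthread_join(td, NULL);

	pthread_attr_destroy(&attr);
	free(mem);  /* safe: the thread was either never created or already joined */
	return err ? -1 : r.on_stack;
}
```

a caller would invoke it as, say, run_on_custom_stack(256 * 1024). note
that the application owns the stack memory and must keep it alive until
the thread has exited.
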
diff --git a/src/internal/pthread_impl.h b/src/internal/pthread_impl.h
index d67edf2..0ce3c1e 100644
--- a/src/internal/pthread_impl.h
+++ b/src/internal/pthread_impl.h
@@ -59,7 +59,8 @@
 
 #define _a_stacksize __u.__s[0]
 #define _a_guardsize __u.__s[1]
-#define _a_detach __u.__i[2*__SU+0]
+#define _a_stackaddr __u.__s[2]
+#define _a_detach __u.__i[3*__SU+0]
 #define _m_type __u.__i[0]
 #define _m_lock __u.__i[1]
 #define _m_waiters __u.__i[2]