overhaul pthread cancellation

this patch improves the correctness and simplicity of
cancellation-related code and reduces its size. modulo any small
errors, it should now be completely conformant, safe, and free of
resource leaks.

the notion of entering and exiting cancellation-point context has been
completely eliminated and replaced with alternative syscall assembly
code for cancellable syscalls. the assembly is responsible for setting
up execution context information (stack pointer and address of the
syscall instruction) which the cancellation signal handler can use to
determine whether the interrupted code was in a cancellable state.
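
as a rough illustration of the check described above, here is a minimal
sketch in c. it is not the code from this patch: the struct, the function
names, and the exact comparisons are assumptions made for illustration.
in particular it assumes the handler can read the interrupted stack
pointer and instruction pointer from the signal context, and that the
cancellable-syscall asm is laid out so that "at or before the syscall
instruction" can be tested with a simple address comparison.

```c
#include <assert.h>
#include <stdint.h>

/* hypothetical stand-in for the per-thread fields touched by the patch */
struct thread_state {
	uintptr_t cp_sp, cp_ip; /* assumed to be recorded by the syscall asm */
	int cancel, canceldisable, cancelasync;
};

/* returns 1 if a cancellation signal arriving with the given interrupted
 * sp/ip should be acted on immediately, 0 if it should be left pending.
 * this is an assumed decision rule, not the author's handler. */
int should_act_on_cancel(const struct thread_state *t,
                         uintptr_t sp, uintptr_t ip)
{
	/* no request pending, or cancellation disabled: nothing to do */
	if (!t->cancel || t->canceldisable) return 0;
	/* asynchronous cancellation may be acted on anywhere */
	if (t->cancelasync) return 1;
	/* stack pointer differs: not inside the cancellable syscall code,
	 * for example a signal handler running on top of it */
	if (sp != t->cp_sp) return 0;
	/* at or before the syscall instruction: the syscall has not
	 * completed, so cancelling here loses no side effects */
	return ip <= t->cp_ip;
}

/* usage example with made-up addresses */
void example_usage(void)
{
	struct thread_state t = { .cp_sp = 0x7000, .cp_ip = 0x4010, .cancel = 1 };
	assert(should_act_on_cancel(&t, 0x7000, 0x4010));  /* at the syscall */
	assert(!should_act_on_cancel(&t, 0x7000, 0x4012)); /* already returned */
	assert(!should_act_on_cancel(&t, 0x6f00, 0x4010)); /* other stack frame */
}
```

comparing the stack pointer as well as the instruction address is what
lets a sketch like this tell a cancellation point apart from a syscall
made by a signal handler that interrupted one.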

these changes eliminate race conditions in the previous generation of
cancellation handling code, whereby a cancellation request received
just prior to the syscall would not be processed, leaving the syscall
to block, potentially indefinitely. they also remedy an issue where
non-cancellable syscalls made from signal handlers became cancellable
if the signal handler interrupted a cancellation point.

x86_64 asm is untested and may need a second try to get it right.
diff --git a/src/internal/pthread_impl.h b/src/internal/pthread_impl.h
index a6d90e9..304bf98 100644
--- a/src/internal/pthread_impl.h
+++ b/src/internal/pthread_impl.h
@@ -24,7 +24,8 @@
 	unsigned long tlsdesc[4];
 	pid_t tid, pid;
 	int tsd_used, errno_val, *errno_ptr;
-	volatile int canceldisable, cancelasync, cancelpoint, cancel;
+	volatile uintptr_t cp_sp, cp_ip;
+	volatile int cancel, canceldisable, cancelasync;
 	unsigned char *map_base;
 	size_t map_size;
 	void *start_arg;
@@ -85,6 +86,7 @@
 void __unmapself(void *, size_t);
 
 int __timedwait(volatile int *, int, clockid_t, const struct timespec *, int);
+int __timedwait_cp(volatile int *, int, clockid_t, const struct timespec *, int);
 void __wait(volatile int *, volatile int *, int, int);
 void __wake(volatile int *, int, int);