arm: optimized current_pt_regs()

... no need to read current_thread_info()->task only to
feed it to task_stack_page() immediately afterwards.
Moreover, not using current_thread_info() at all ends
up with better assembler - we need a location very close
to the top of kernel stack page and it's actually better
to do OR with 0x1fff, followed by subtracting a small
constant, than AND with ~0x1fff, followed by adding a large
one.  Both & and | would be a couple of insns (mvn lsr/mvn lsl
for |, a pair of bic for &), but the following addition
would cost a pair of add while the subtraction ends up
as a single sub.
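
For illustration only, a userspace sketch of the arithmetic described
above. It assumes 8K kernel stacks (hence the 0x1fff mask) and that the
target location sits 8 bytes below the top of the stack, i.e. an offset
of THREAD_SIZE - 8; the helper names are hypothetical and not part of
this patch. Both forms should yield the same address for any sp within
one stack; the macro then subtracts one struct pt_regs from it.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed for illustration: 8K stacks, 8-byte gap below the stack top. */
#define THREAD_SIZE     8192u
#define THREAD_START_SP (THREAD_SIZE - 8)

/* Old shape: round down to the stack base, then add a large offset. */
static inline uintptr_t top_via_and(uintptr_t sp)
{
	return (sp & ~(uintptr_t)(THREAD_SIZE - 1)) + THREAD_START_SP;
}

/* New shape: round up to the last byte of the stack, then subtract 7. */
static inline uintptr_t top_via_or(uintptr_t sp)
{
	return (sp | (THREAD_SIZE - 1)) - 7;
}

/* The two forms agree for any sp inside a THREAD_SIZE-aligned stack. */
static inline void check_equivalence(uintptr_t sp)
{
	assert(top_via_and(sp) == top_via_or(sp));
}
```

The small constant 7 is THREAD_SIZE - 1 minus the assumed offset of
THREAD_SIZE - 8, which is why it fits in a single sub.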

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
diff --git a/arch/arm/include/asm/ptrace.h b/arch/arm/include/asm/ptrace.h
index 355ece5..44fe998 100644
--- a/arch/arm/include/asm/ptrace.h
+++ b/arch/arm/include/asm/ptrace.h
@@ -254,6 +254,11 @@
 	return regs->ARM_sp;
 }
 
+#define current_pt_regs(void) ({				\
+	register unsigned long sp asm ("sp");			\
+	(struct pt_regs *)((sp | (THREAD_SIZE - 1)) - 7) - 1;	\
+})
+
 #endif /* __KERNEL__ */
 
 #endif /* __ASSEMBLY__ */