x86, 64-bit: Clean up user address masking
The discussion about using "access_ok()" in get_user_pages_fast() (see
commit 7f8189068726492950bf1a2dcfd9b51314560abf: "x86: don't use
'access_ok()' as a range check in get_user_pages_fast()" for details and
end result), made us notice that x86-64 was really being very sloppy
about virtual address checking.
So be way more careful and straightforward about masking x86-64 virtual
addresses:
- All the VIRTUAL_MASK* variants now cover half of the address
space, it's not like we can use the full mask on a signed
integer, and the larger mask just invites mistakes when
applying it to either half of the 48-bit address space.
- /proc/kcore's kc_offset_to_vaddr() becomes a lot more
obvious when it transforms a file offset into a
(kernel-half) virtual address.
- Unify/simplify the 32-bit and 64-bit USER_DS definition to
be based on TASK_SIZE_MAX.
This cleanup and more careful/obvious user virtual address checking also
uncovered a buglet in the x86-64 implementation of strnlen_user(): it
would do an "access_ok()" check on the whole potential area, even if the
string itself was much shorter, and thus return an error even for valid
strings. Our sloppy checking had hidden this.
So this fixes 'strnlen_user()' to do this properly, the same way we
already handled user strings in 'strncpy_from_user()'. Namely by just
checking the first byte, and then relying on fault handling for the
rest. That always works, since we impose a guard page that cannot be
mapped at the end of the user space address space (and even if we
didn't, we'd have the address space hole).
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Nick Piggin <npiggin@suse.de>
Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h
index 8d382d3..7639dbf 100644
--- a/arch/x86/include/asm/page_64_types.h
+++ b/arch/x86/include/asm/page_64_types.h
@@ -41,7 +41,7 @@
/* See Documentation/x86/x86_64/mm.txt for a description of the memory map. */
#define __PHYSICAL_MASK_SHIFT 46
-#define __VIRTUAL_MASK_SHIFT 48
+#define __VIRTUAL_MASK_SHIFT 47
/*
* Kernel image size is limited to 512 MB (see level2_kernel_pgt in
diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h
index abde308..c57a301 100644
--- a/arch/x86/include/asm/pgtable_64.h
+++ b/arch/x86/include/asm/pgtable_64.h
@@ -165,10 +165,7 @@
/* fs/proc/kcore.c */
#define kc_vaddr_to_offset(v) ((v) & __VIRTUAL_MASK)
-#define kc_offset_to_vaddr(o) \
- (((o) & (1UL << (__VIRTUAL_MASK_SHIFT - 1))) \
- ? ((o) | ~__VIRTUAL_MASK) \
- : (o))
+#define kc_offset_to_vaddr(o) ((o) | ~__VIRTUAL_MASK)
#define __HAVE_ARCH_PTE_SAME
#endif /* !__ASSEMBLY__ */
diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index 512ee87..20e6a79 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -25,12 +25,7 @@
#define MAKE_MM_SEG(s) ((mm_segment_t) { (s) })
#define KERNEL_DS MAKE_MM_SEG(-1UL)
-
-#ifdef CONFIG_X86_32
-# define USER_DS MAKE_MM_SEG(PAGE_OFFSET)
-#else
-# define USER_DS MAKE_MM_SEG(__VIRTUAL_MASK)
-#endif
+#define USER_DS MAKE_MM_SEG(TASK_SIZE_MAX)
#define get_ds() (KERNEL_DS)
#define get_fs() (current_thread_info()->addr_limit)