mm: gup persist for write permission
do_wp_page()'s VM_FAULT_WRITE return value tells __get_user_pages() that
COW has been done if necessary, though it may be leaving the pte without
write permission - for the odd case of forced writing to a readonly vma
for ptrace. At present GUP then retries the follow_page() without asking
for write permission, to escape an endless loop when forced.
But an application may be relying on GUP to guarantee a writable page
which won't be COWed again when written from userspace, whereas a race
here might leave a readonly pte in place? Change the VM_FAULT_WRITE
handling to ask follow_page() for write permission again, except in that
odd case of forced writing to a readonly vma.
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Robin Holt <holt@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
diff --git a/mm/memory.c b/mm/memory.c
index 122d965..f594bb6 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1264,9 +1264,15 @@
* do_wp_page has broken COW when necessary,
* even if maybe_mkwrite decided not to set
* pte_write. We can thus safely do subsequent
- * page lookups as if they were reads.
+ * page lookups as if they were reads. But only
+ * do so when looping for pte_write is futile:
+ * in some cases userspace may also be wanting
+ * to write to the gotten user page, which a
+ * read fault here might prevent (a readonly
+ * page might get reCOWed by userspace write).
*/
- if (ret & VM_FAULT_WRITE)
+ if ((ret & VM_FAULT_WRITE) &&
+ !(vma->vm_flags & VM_WRITE))
foll_flags &= ~FOLL_WRITE;
cond_resched();