powerpc/mm: Fix _PAGE_PTE breaking swapoff
Core kernel expects swp_entry_t to consist of only swap type and swap
offset. We should not leak pte bits into swp_entry_t. This breaks
swapoff which use the swap type and offset to build a swp_entry_t and
later compare that to the swp_entry_t obtained from linux page table
pte. Leaking pte bits into swp_entry_t breaks that comparison and
results in us looping in try_to_unuse.
The stack trace can be anywhere below try_to_unuse() in mm/swapfile.c,
since swapoff is circling around and around that function, reading from
each used swap block into a page, then trying to find where that page
belongs, looking at every non-file pte of every mm that ever swapped.
Fixes: 6a119eae942c ("powerpc/mm: Add a _PAGE_PTE bit")
Reported-by: Hugh Dickins <hughd@google.com>
Suggested-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 03c1a5a..8e040c4 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -158,9 +158,14 @@
#define __swp_entry(type, offset) ((swp_entry_t) { \
((type) << _PAGE_BIT_SWAP_TYPE) \
| ((offset) << PTE_RPN_SHIFT) })
-
-#define __pte_to_swp_entry(pte) ((swp_entry_t) { pte_val((pte)) })
-#define __swp_entry_to_pte(x) __pte((x).val)
+/*
+ * swp_entry_t must be independent of pte bits. We build a swp_entry_t from
+ * swap type and offset we get from swap and convert that to pte to find a
+ * matching pte in linux page table.
+ * Clear bits not found in swap entries here.
+ */
+#define __pte_to_swp_entry(pte) ((swp_entry_t) { pte_val((pte)) & ~_PAGE_PTE })
+#define __swp_entry_to_pte(x) __pte((x).val | _PAGE_PTE)
#ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY
#define _PAGE_SWP_SOFT_DIRTY (1UL << (SWP_TYPE_BITS + _PAGE_BIT_SWAP_TYPE))