[PATCH] fix NUMA interleaving for huge pages

Since vma->vm_pgoff is in units of small pages, VMAs for huge pages have the
lower HPAGE_SHIFT - PAGE_SHIFT bits always cleared, which results in bad
offsets to the interleave functions. Take this difference from small pages
into account when calculating the offset. This does add a 0-bit shift into
the small-page path (via alloc_page_vma()), but I think that is negligible.
Also add a BUG_ON to prevent the offset from growing due to a negative
right-shift, which probably shouldn't be allowed anyways.

Tested on an 8-memory node ppc64 NUMA box and got the interleaving I
expected.

Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Adam Litke <agl@us.ibm.com>
Cc: Andi Kleen <ak@muc.de>
Acked-by: Christoph Lameter <clameter@engr.sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

commit: 3b98b087fc2daab67518d2baa8aef19a6ad82723
author: Nishanth Aravamudan <nacc@us.ibm.com> Thu Aug 31 21:27:53 2006 -0700
committer: Linus Torvalds <torvalds@g5.osdl.org> Fri Sep 01 11:39:10 2006 -0700
tree: a7defc8fa53b2023affc072cd20c4f4734e4395d
parent: 1678df37be8abbb381becdc40242ed915e775550
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index e07e27e..a9963ce 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c

@@ -1176,7 +1176,15 @@
 	if (vma) {
 		unsigned long off;
 
-		off = vma->vm_pgoff;
+		/*
+		 * for small pages, there is no difference between
+		 * shift and PAGE_SHIFT, so the bit-shift is safe.
+		 * for huge pages, since vm_pgoff is in units of small
+		 * pages, we need to shift off the always 0 bits to get
+		 * a useful offset.
+		 */
+		BUG_ON(shift < PAGE_SHIFT);
+		off = vma->vm_pgoff >> (shift - PAGE_SHIFT);
 		off += (addr - vma->vm_start) >> shift;
 		return offset_il_node(pol, vma, off);
 	} else
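
The effect of the change can be illustrated outside the kernel with a small user-space sketch. Everything here is an assumption for illustration only: the `SKETCH_*` shift values (4 KiB small pages, 16 MiB huge pages), the `il_offset_old()`/`il_offset_new()` helpers, and `il_node()`, which stands in for offset_il_node() as a plain modulo over the node count rather than the real policy nodemask walk.

```c
#include <assert.h>

/* Hypothetical values for illustration; real shifts are arch-specific. */
#define SKETCH_PAGE_SHIFT  12UL  /* assumed 4 KiB small pages */
#define SKETCH_HPAGE_SHIFT 24UL  /* assumed 16 MiB huge pages */

/* Pre-patch arithmetic: vm_pgoff is used as-is, in small-page units. */
static unsigned long il_offset_old(unsigned long vm_pgoff,
                                   unsigned long vm_start,
                                   unsigned long addr,
                                   unsigned long shift)
{
	return vm_pgoff + ((addr - vm_start) >> shift);
}

/* Post-patch arithmetic: vm_pgoff is first converted to units of
 * (1 << shift) bytes. The assert mirrors the BUG_ON in the patch. */
static unsigned long il_offset_new(unsigned long vm_pgoff,
                                   unsigned long vm_start,
                                   unsigned long addr,
                                   unsigned long shift)
{
	assert(shift >= SKETCH_PAGE_SHIFT);
	return (vm_pgoff >> (shift - SKETCH_PAGE_SHIFT)) +
	       ((addr - vm_start) >> shift);
}

/* Simplified stand-in for node selection: offset modulo node count. */
static unsigned int il_node(unsigned long off, unsigned int nr_nodes)
{
	return (unsigned int)(off % nr_nodes);
}
```

Under these assumptions, a huge-page VMA that maps a file starting at its third huge page has vm_pgoff = 3 << 12. With 8 nodes, the old arithmetic puts the first page of every such VMA on node 0 regardless of the file offset, because the low 12 bits of vm_pgoff are always zero; the new arithmetic yields node 3. For small pages the shift is by zero bits and both versions agree.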