[PATCH] x86_64: Optimize bitmap_weight for small bitmaps

Use inline code bitmaps <= BITS_PER_LONG in bitmap_weight. This
gives _much_ better code.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index d9ed279..dcc5de7 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -24,6 +24,9 @@
  * The available bitmap operations and their rough meaning in the
  * case that the bitmap is a single unsigned long are thus:
  *
+ * Note that nbits should be always a compile time evaluable constant.
+ * Otherwise many inlines will generate horrible code.
+ *
  * bitmap_zero(dst, nbits)			*dst = 0UL
  * bitmap_fill(dst, nbits)			*dst = ~0UL
  * bitmap_copy(dst, src, nbits)			*dst = *src
@@ -244,6 +247,8 @@
 
 static inline int bitmap_weight(const unsigned long *src, int nbits)
 {
+	if (nbits <= BITS_PER_LONG)
+		return hweight_long(*src & BITMAP_LAST_WORD_MASK(nbits));
 	return __bitmap_weight(src, nbits);
 }