mm: kmemleak: optimise kmemleak_lock acquiring during kmemleak_scan

The kmemleak memory scanning uses finer grained object->lock spinlocks
primarily to avoid races with the memory block freeing.  However, the
pointer lookup in the rb tree requires the kmemleak_lock to be held.
This is currently done in the find_and_get_object() function for each
pointer-like location read during scanning.  While this allows a low
latency on kmemleak_*() callbacks on other CPUs, the memory scanning is
slower.

This patch moves the kmemleak_lock outside the scan_block() loop,
acquiring/releasing it only once per scanned memory block.  The
allow_resched logic is moved outside scan_block() and a new
scan_large_block() function is implemented which splits large blocks in
MAX_SCAN_SIZE chunks with cond_resched() calls in-between.  A redundant
(object->flags & OBJECT_NO_SCAN) check is also removed from
scan_object().

With this patch, the kmemleak scanning performance is significantly
improved: at least 50% with lock debugging disabled and over an order of
magnitude with lock proving enabled (on an arm64 system).

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
1 file changed