mm,vmacache: optimize overflow system-wide flushing

For single threaded workloads, we can avoid flushing and iterating
through the entire list of tasks, making the whole function a lot
faster, requiring only a single atomic read for the mm_users.

Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
Suggested-by: Oleg Nesterov <oleg@redhat.com>
Cc: Aswin Chandramouleeswaran <aswin@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit: 6b4ebc3a9078c5b7b8c4cf495a0b1d2d0e0bfe7a
author: Davidlohr Bueso <davidlohr@hp.com> Wed Jun 04 16:06:47 2014 -0700
committer: Linus Torvalds <torvalds@linux-foundation.org> Wed Jun 04 16:53:57 2014 -0700
tree: b72fad03149fb8e21284558b636bb2a8faa88cb6
parent: 4f115147ff802267d0aa41e361c5aa5bd933d896
diff --git a/mm/vmacache.c b/mm/vmacache.c
index 658ed3b..9f25af8 100644
--- a/mm/vmacache.c
+++ b/mm/vmacache.c

@@ -17,6 +17,16 @@
 {
 	struct task_struct *g, *p;
 
+	/*
+	 * Single threaded tasks need not iterate the entire
+	 * list of process. We can avoid the flushing as well
+	 * since the mm's seqnum was increased and don't have
+	 * to worry about other threads' seqnum. Current's
+	 * flush will occur upon the next lookup.
+	 */
+	if (atomic_read(&mm->mm_users) == 1)
+		return;
+
 	rcu_read_lock();
 	for_each_process_thread(g, p) {
 		/*
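
The early return above relies on sequence-number based lazy invalidation. The following is a minimal userspace sketch of that idea, not the kernel code: every type and function name in it (mm_model, task_model, invalidate, flush_all_tasks, cache_valid, full_walks) is hypothetical, and it assumes the overflow flush is only triggered when the sequence number wraps to zero, as the commit subject suggests.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CACHE_SLOTS 4

/* Hypothetical stand-in for the shared address space. */
struct mm_model {
	atomic_int mm_users;	/* number of tasks sharing this mm */
	uint32_t seqnum;	/* bumped to invalidate every per-task cache */
};

/* Hypothetical stand-in for a task with its private lookup cache. */
struct task_model {
	struct mm_model *mm;
	uint32_t cache_seqnum;	/* mm->seqnum value the cache was filled under */
	const void *cache[CACHE_SLOTS];
};

/* Counts how often the expensive walk over all tasks actually ran. */
int full_walks;

/* Overflow path: clear the cache of every task that shares this mm. */
void flush_all_tasks(struct task_model *tasks, size_t n, struct mm_model *mm)
{
	/*
	 * The optimization from the patch: with a single user there is no
	 * other task whose stale seqnum could match the wrapped value, so
	 * the walk is skipped. The lone task flushes on its next lookup.
	 */
	if (atomic_load(&mm->mm_users) == 1)
		return;

	full_walks++;
	for (size_t i = 0; i < n; i++)
		if (tasks[i].mm == mm)
			memset(tasks[i].cache, 0, sizeof(tasks[i].cache));
}

/* Invalidate all caches lazily; only a wrap to zero forces a full flush. */
void invalidate(struct task_model *tasks, size_t n, struct mm_model *mm)
{
	mm->seqnum++;
	if (mm->seqnum == 0)
		flush_all_tasks(tasks, n, mm);
}

/* Lookup-side check: a seqnum mismatch means the cache is stale. */
int cache_valid(struct task_model *t)
{
	if (t->cache_seqnum != t->mm->seqnum) {
		t->cache_seqnum = t->mm->seqnum;
		memset(t->cache, 0, sizeof(t->cache));
		return 0;
	}
	return 1;
}
```

In this model, a task whose mm_users count is 1 pays only one atomic load when the sequence number wraps. Its stale entries are dropped later, when cache_valid() notices the mismatch between its own seqnum and the mm's.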