[patch] notifiers: fix blocking_notifier_call_chain() scalability

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Subject: [patch] notifiers: fix blocking_notifier_call_chain() scalability
From: Ingo Molnar <[email protected]>

while lock-profiling the -rt kernel i noticed weird contention during 
mmap-intense workloads, and the tracer showed the following gem, in one 
of our MM hotpaths:

 threaded-2771  1....   65us : sys_munmap (sysenter_do_call)
 threaded-2771  1....   66us : profile_munmap (sys_munmap)
 threaded-2771  1....   66us : blocking_notifier_call_chain (profile_munmap)
 threaded-2771  1....   66us : rt_down_read (blocking_notifier_call_chain)

ouch! a global rw-semaphore taken in one of the most 
performance-sensitive codepaths of the kernel. And i dont even have 
oprofile enabled! All distro kernels have CONFIG_PROFILING enabled, so 
this scalability problem affects the majority of Linux users.

The fix is to enhance blocking_notifier_call_chain() to only take the 
lock if there appears to be work on the call-chain.

With this patch applied i get nicely saturated system, and much higher 
munmap performance, on SMP systems.

And as a bonus this also fixes a similar scalability bottleneck in the 
thread-exit codepath: profile_task_exit() ...

Signed-off-by: Ingo Molnar <[email protected]>
---
 kernel/sys.c |   15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

Index: linux/kernel/sys.c
===================================================================
--- linux.orig/kernel/sys.c
+++ linux/kernel/sys.c
@@ -325,11 +325,18 @@ EXPORT_SYMBOL_GPL(blocking_notifier_chai
 int blocking_notifier_call_chain(struct blocking_notifier_head *nh,
 		unsigned long val, void *v)
 {
-	int ret;
+	int ret = NOTIFY_DONE;
 
-	down_read(&nh->rwsem);
-	ret = notifier_call_chain(&nh->head, val, v);
-	up_read(&nh->rwsem);
+	/*
+	 * We check the head outside the lock, but if this access is
+	 * racy then it does not matter what the result of the test
+	 * is, we re-check the list after having taken the lock anyway:
+	 */
+	if (rcu_dereference(nh->head)) {
+		down_read(&nh->rwsem);
+		ret = notifier_call_chain(&nh->head, val, v);
+		up_read(&nh->rwsem);
+	}
 	return ret;
 }
 
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux