Re: PowerPC fastpaths for mutex subsystem

* Joel Schopp <[email protected]> wrote:

> >interesting. Could you try two things? Firstly, could you add some 
> >minimal delays to the lock/unlock path, of at least 1 usec? E.g.  
> >"synchro-test.ko load=1 interval=1". [but you could try longer delays 
> >too, 10 usecs is still realistic.]
> 
> Graphs attached.  The summary for those who don't like to look at 
> attachments is that the mutex fastpath (threads = 1), for which I sent 
> the optimized patch, is comparable within the margin of error to 
> semaphores.  The mutex common path (threads > 1) gets embarrassed by 
> semaphores, so mutex common paths are not yet ready as far as ppc64 
> is concerned.

ok. I'll really need to look at "vmstat" output from these. We could 
easily make the mutex slowpath behave like ppc64 semaphores, via the 
attached (untested) patch, but i really think it's the wrong thing to 
do, because it overloads the system with runnable tasks in an 
essentially unlimited fashion [== overscheduling] - they'll all contend 
for the same single mutex.

in synthetic workloads on idle systems such overscheduling can help, 
because the 'luck factor' of the 'thundering herd' of tasks can generate 
a higher total throughput - at the expense of system efficiency. At 8 
CPUs i already measured a net performance loss at 3 tasks! So i think 
the current 'at most 2 tasks runnable' approach of mutexes is the right 
one on a broad range of hardware.
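
[ to make the 'thundering herd' point concrete, here is a minimal
  userspace sketch - plain pthreads, not kernel code, and not part of
  the patch at the bottom of this mail. With WAKE_ALL set, a broadcast
  makes every sleeping waiter runnable, but only one of them can take
  the resource; the rest just re-check and go back to sleep, and those
  wasted wakeups are the overscheduling. NTHREADS, worker() etc. are
  made up for the illustration. ]

#include <pthread.h>
#include <stdlib.h>

#define NTHREADS        8
#define WAKE_ALL        0       /* set to 1 for the "aggressive" herd behaviour */

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static int resource_free = 1;

static void *worker(void *arg)
{
        (void)arg;

        /* take the shared resource */
        pthread_mutex_lock(&lock);
        while (!resource_free)
                pthread_cond_wait(&cond, &lock);
        resource_free = 0;
        pthread_mutex_unlock(&lock);

        /* ... critical work would happen here ... */

        /* release it again and wake waiter(s) */
        pthread_mutex_lock(&lock);
        resource_free = 1;
        if (WAKE_ALL) {
                /* every sleeper becomes runnable, but only one will find
                   resource_free set; the rest re-check and sleep again */
                pthread_cond_broadcast(&cond);
        } else {
                /* wake at most one waiter */
                pthread_cond_signal(&cond);
        }
        pthread_mutex_unlock(&lock);
        return NULL;
}

int main(void)
{
        pthread_t tid[NTHREADS];
        int i;

        for (i = 0; i < NTHREADS; i++)
                pthread_create(&tid[i], NULL, worker, NULL);
        for (i = 0; i < NTHREADS; i++)
                pthread_join(tid[i], NULL);
        return 0;
}

[ builds with something like 'gcc -O2 -pthread wake.c' - the file name
  is arbitrary. ]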

still, i'll try a different patch tomorrow, to keep the number of 'in 
flight' tasks within a certain limit (say at 2) - i suspect that would 
close the performance gap too, on this test.
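
[ purely as an untested sketch of that direction - this is not the
  follow-up patch, just an illustration against the same kernel/mutex.c
  internals as the hunk below: a hypothetical atomic_t in_flight counter
  in struct mutex is bumped before a waiter is woken, the wakeup is
  skipped once the limit is reached, and the counter drops again when a
  woken waiter actually acquires the mutex. Apart from wait_list, list
  and task (visible in the hunk below), every name here is invented. ]

#define MUTEX_MAX_IN_FLIGHT     2

/*
 * unlock side: only generate a new wakeup while fewer than
 * MUTEX_MAX_IN_FLIGHT woken waiters are still on their way.
 * lock->in_flight is a made-up atomic_t field for this sketch.
 */
static void __mutex_wake_limited(struct mutex *lock)
{
        struct mutex_waiter *waiter;

        if (list_empty(&lock->wait_list))
                return;
        if (atomic_read(&lock->in_flight) >= MUTEX_MAX_IN_FLIGHT)
                return;

        waiter = list_entry(lock->wait_list.next, struct mutex_waiter, list);

        atomic_inc(&lock->in_flight);
        debug_mutex_wake_waiter(lock, waiter);
        wake_up_process(waiter->task);
}

/*
 * lock side: a woken waiter leaves the "in flight" set once it has
 * actually acquired the mutex, allowing the next wakeup.
 */
static void __mutex_waiter_acquired(struct mutex *lock)
{
        atomic_dec(&lock->in_flight);
}

[ net effect: per mutex there are never more than MUTEX_MAX_IN_FLIGHT
  woken-but-not-yet-running tasks, i.e. roughly the 'at most 2' limit
  mentioned above. ]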

but i really think the current 'at most one task in flight' logic is the 
correct approach. I'm also curious about the VFS-test numbers (already 
on your todo).

> >thirdly, could you run 'vmstat 1' during the tests, and post those lines 
> >too? Here i'm curious about two things: the average runqueue length 
> >(whether we have overscheduling), and CPU utilization and idle time left 
> >(how efficiently cycles are preserved in contention). [btw., does ppc 
> >have an idle=poll equivalent mode of idling?]
> 
> Also queued in my todo list.

thanks!

> >also, there seems to be some fluctuation in the numbers - could you try 
> >to run a few more to see how stable the numbers are?
> 
> For the graphs the line is the average of 5 runs, and the 5 runs are 
> scatter plotted as well.

ok, that should be more than enough.

	Ingo

--- kernel/mutex.c.orig
+++ kernel/mutex.c
@@ -226,6 +226,9 @@ __mutex_unlock_slowpath(atomic_t *lock_c
 
 		debug_mutex_wake_waiter(lock, waiter);
 
+		/* be (much) more aggressive about wakeups: */
+		list_move_tail(&waiter->list, &lock->wait_list);
+
 		wake_up_process(waiter->task);
 	}
 
