Re: PowerPC fastpaths for mutex subsystem

ok. I'll really need to look at "vmstat" output from these. We could easily make the mutex slowpath behave like ppc64 semaphores, via the attached (untested) patch, but i really think it's the wrong thing to do, because it overloads the system with runnable tasks in an essentially unlimited fashion [== overscheduling] - they'll all contend for the same single mutex.
in synthetic workloads on idle systems such overscheduling can help, because the 'luck factor' of the 'thundering herd' of tasks can generate a higher total throughput - at the expense of system efficiency. At 8 CPUs i already measured a net performance loss at 3 tasks! So i think the current 'at most 2 tasks runnable' approach of mutexes is the right one on a broad range of hardware.
still, i'll try a different patch tomorrow, to keep the number of 'in flight' tasks within a certain limit (say at 2) - i suspect that would close the performance gap too, on this test.

The fundamental problem is that there is a relatively large latency between waking somebody up and them actually running so they can acquire the lock. In an ideal world there would always be a single waiter that starts running and trying to acquire the lock at the moment it is unlocked, and is not running until then.

There are better solutions than just madly throwing more waiters in flight on an unlock. Here are four possibilities off the top of my head:

1) It is possible to have a hybrid lock that spins a single waiting thread and sleeps waiters 2..n, so there is always a waiter running and trying to acquire the lock. It solves the latency problem if the lock is held for at least as long as it takes to wake up the next waiter. But the spinning waiter burns some cpu to buy the decreased latency.

2) You could also do the classic spin for a while and then sleep method. This essentially turns low latency locks into spinlocks but still sleeps on locks which are held longer and/or are much more contested.

3) There is the option to look at cpu idleness of the current cpu and spin or sleep based on that.

4) Accept that we have a cpu efficient, high latency lock and use it appropriately.
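
To make options 1 and 2 a bit more concrete, here is a rough user-space sketch using POSIX threads and C11 atomics. It is not kernel code and not taken from any patch in this thread; SPIN_LIMIT, the acq_path enum, struct hybrid_lock and the function names are all made up for illustration, and the spin budget is an arbitrary guess that would need tuning.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define SPIN_LIMIT 1000 /* hypothetical tuning knob, not from the thread */

/* Which path an acquire took; only here so callers can observe behaviour. */
enum acq_path { ACQ_FAST, ACQ_SPUN, ACQ_SLEPT };

/* Option 2: spin for a bounded number of attempts, then block. */
static enum acq_path spin_then_sleep_lock(pthread_mutex_t *m)
{
    assert(m != NULL);
    if (pthread_mutex_trylock(m) == 0)
        return ACQ_FAST;            /* uncontended */
    for (int i = 0; i < SPIN_LIMIT; i++) {
        if (pthread_mutex_trylock(m) == 0)
            return ACQ_SPUN;        /* short hold: behaved like a spinlock */
    }
    pthread_mutex_lock(m);          /* long hold or heavy contention: block */
    return ACQ_SLEPT;
}

/* Option 1: at most one waiter spins, waiters 2..n block. */
struct hybrid_lock {
    atomic_flag spinner_taken;      /* set while some waiter holds the spinner role */
    pthread_mutex_t inner;          /* provides the actual mutual exclusion */
};

static enum acq_path hybrid_lock_acquire(struct hybrid_lock *l)
{
    assert(l != NULL);
    if (pthread_mutex_trylock(&l->inner) == 0)
        return ACQ_FAST;
    if (!atomic_flag_test_and_set(&l->spinner_taken)) {
        /* We won the single spinner slot: burn cpu until the lock frees up. */
        while (pthread_mutex_trylock(&l->inner) != 0)
            ;
        atomic_flag_clear(&l->spinner_taken);
        return ACQ_SPUN;
    }
    /* Somebody else is already spinning, so block in the inner mutex. */
    pthread_mutex_lock(&l->inner);
    return ACQ_SLEPT;
}

static void hybrid_lock_release(struct hybrid_lock *l)
{
    pthread_mutex_unlock(&l->inner);
}
```

Note that this sketch only limits how many waiters spin. It does not guarantee that the spinner beats a freshly woken sleeper to the lock, and a real implementation would want a cpu-relax hint in the spin loops.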

I'm not saying any of these 4 is what we should do. I'm just trying to say there are options out there that don't involve thundering herds and luck to address the problem.


