Linus Torvalds wrote:
>
> > > [ Oh. I'm looking at the semaphore code, and I realize that we have a
> > > "wake_up(&sem->wait)" in the __down() path because we had some race long
> > > ago that we fixed by band-aiding over it. Which means that we wake up
> > > sleepers that shouldn't be woken up. THAT may well be part of the
> > > performance problem.. The semaphores are really meant to wake up just
> > > one at a time, but because of that race hack they'll wake up _two_ at a
> > > time - once by up(), once by down().
> > >
> > > That also destroys the fairness. Does anybody remember why it's that
> > > way? ]
> >
> > History?
>
> Oh, absolutely, I already checked the old BK history too, and that extra
> wake_up() has been there at least since before we even started using BK.
> So it's very much historical, I'm just wondering if somebody remembers far
> enough back that we'd know.
>
> I don't see why it's needed (since we re-try the "atomic_add_negative()"
> inside the semaphore wait lock, and any up() that saw contention should
> have always been guaranteed to do a wakeup that should fill the race in
> between that atomic_add_negative() and the thing going to sleep).
>
> It may be that it is _purely_ historical, and simply isn't needed. That
> would be funny/sad, in the sense that we've had it there for years and
> years ;)
This does not look like it was added to fix a race or historical
to me. I think without that "wake_up" a waiter can miss wakeup
if it is not the only one sleeper.
sem->count == 0, ->sleepers == 0. down() decrements ->count,
__down:
// ->count == -1
++sleepers; // == 1
for (;;) {
->count += (->sleepers - 1); // does nothing
if (->count >= 0) // NO
break;
->sleepers = 1;
schedule();
...
Another process calls down(), ->count == -2
__down:
++sleepers; // == 2;
for (;;) {
->count += (->sleepers - 1) // ->count == -1;
->sleepers = 1;
schedule();
...
up() makes ++->count == 0, and wakes one of them. It will see
->sleepers == 1, so atomic_add_negative(sleepers - 1) again
has no effect, sets ->sleepers == 0 and takes the semaphore.
Note that subsequent up() will not call wakeup(): ->count == 0,
it just increment it. That is why we are waking the next waiter
in advance. When it gets cpu, it will decrement ->count by 1,
because ->sleepers == 0. If up() (++->count) was already called,
it takes semaphore. If not - goes to sleep again.
Or my understanding is completely broken?
Oleg.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]