Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

Hi Eric,

[...]
> >> the ramp up slows down after 700-800 processes, but something very 
> >> strange happens. If I'm under X, I can switch the focus to all xterms 
> >> (the WM is still alive) but all xterms are frozen. On the console, 
> >> after one moment I simply cannot switch to another VT anymore while I 
> >> can still start commands locally. But "chvt 2" simply blocks. SysRq-K 
> >> killed everything and restored full control. Dmesg shows lots of :
> >
> >> SAK: killed process xxxx (scheddos2): process_session(p)==tty->session.
> 
> This.  Yes. SAK is noisy and tells you everything it kills.

OK, that's what I suspected, but I did not know whether the mention of the
session was systematic or depended on some particular state of the task
when it was killed.

> >> I wonder if part of the problem would be too many processes bound to 
> >> the same tty :-/
> >
> > hm, that's really weird. I've Cc:-ed the tty experts (Erik, Jiri, Alan), 
> > maybe this description rings a bell with them?
> 
> Is there any swapping going on?

Not at all.

> I'm inclined to suspect that it is a problem that has more to do with the
> number of processes and has nothing to do with ttys.

That is quite possible. What I found strange is that I could still fork
processes (e.g. ls, dmesg|tail) but could no longer switch to another VT.
It first happened under X with frozen xterms but a perfectly usable WM,
then I reproduced it on the pure console to rule out any potential X problem.

> Anyway you can easily rule out ttys by having your startup program
> detach from a controlling tty before you start everything.
> 
> I'm more inclined to guess something is reading /proc a lot, or doing
> something that holds the tasklist lock, a lot or something like that,
> if the problem isn't that you are being kicked into swap.

Oh, I'm sorry you were pulled into the discussion without a description
of the context first. I was trying out Ingo's new scheduler and trying to
reach corner cases with lots of processes competing for the CPU.

I simply used a "for" loop in bash to fork 1000 processes, and this problem
appeared somewhere between 700 and 800 children. The program only uses a
busy loop and a pause. I then changed my program to close fds 0, 1 and 2
and perform the fork itself, and the problem vanished. So there are two
differences here:

  - bash not forking anymore
  - far fewer fds on /dev/tty1

At first, I had around 2200 fds on /dev/tty1, which is why I suspected
something in this area.

I agree that this is not normal usage at all; I'm just trying to attack
Ingo's scheduler to make sure it is more robust than the stock one. But
sometimes brute-force methods make other dormant problems pop up.

Thinking about it, I don't know whether there are calls to schedule() on
the path that switches from tty1 to tty2. Alt-F2 no longer had any effect,
and "chvt 2" simply blocked. It is possible that a schedule() call
somewhere got starved under the load, I don't know.

Thanks,
Willy

