Re: sched_yield: delete sysctl_sched_compat_yield

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



David Schwartz wrote:
Chris Friesen wrote:


If CFS really can't support sched_yield's semantics, then it should just
not, and that's that. Return ENOSYS and admit that the behavior sched_yield
is documented to have simply can't be supported by the scheduler.

That's just it though...sched_yield() with SCHED_OTHER doesn't have well defined semantics, so we can do just about anything we want.

The issue is mostly how to work around existing apps that (invalidly) expect certain behaviour from sched_yield().

The problem is where do we insert the task that is yielding?  CFS is
based around a tree structure ordered by time.

We put it exactly where we would have when its timeslice ran out. If we can
reward it a little bit, that's great. But if not, we can live with that.
Just imagine that the timer interrupt fired to indicate the end of the
thread's run time when the thread called 'sched_yield'.

CFS doesn't really do "timeslice". But in essence what you are describing is the default behaviour currently...it simply removes the task from the tree and reinserts it based on how much cpu time it used up.

Then what does he do when the task runs out of run time? It's hard to
imagine we can't do that when the task calls sched_yield.

It gets reinserted into the tree at a position based on how much cpu time it used. This is exactly the current sched_yield() behaviour.

It just says to delay the task,
without specifying how long or what the task is waiting *for*.

That is not true. The task is waiting for something that will be done by
another thread that is ready-to-run and at the same priority level. The task
does not need to wait until the thing is guaranteed done but wishes to wait
until it is more likely to be done. This is an often-misused but sometimes
sensible thing to do.

The scheduler still doesn't know specifically what the task is waiting for.

Sure, if there is more information. But if all you really want to do is wait
until other threads at the same static priority level have had a chance to
run, then sched_yield is the right API.

Technically, all of SCHED_OTHER has static priority level zero. Thus the "right" thing to do is to allow all SCHED_OTHER tasks to run, including the ones with the highst possible nice level.

This is the alternate implementation in the current code, but it has latency implications that may be unexpected by applications written for the previous 2.6 behaviour.

Chris

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux