Re: [Announce] [patch] Modular Scheduler Core and Completely Fair Scheduler [CFS]

Ingo Molnar wrote:

* Nick Piggin <npiggin@suse.de> wrote:
Maybe the progress is that more key people are becoming open to the idea of changing the scheduler.
Could be. All was quiet for quite a while, but when RSDL showed up, it aroused enough interest to show that scheduling woes are on folks' radar.
Well I know people have had woes with the scheduler for ever (I guess that isn't going to change :P). [...]
yes, that part isn't going to change, because the CPU is a _scarce resource_ that is perhaps the most frequently overcommitted physical computer resource in existence, and because the kernel does not (yet) track eye movements of humans to figure out which tasks are more important to them. So critical human constraints are unknown to the scheduler and thus complaints will always come.
The upstream scheduler thought it had enough information: the sleep average. So now the attempt is to go back and _simplify_ the scheduler and remove that information, and concentrate on getting fairness precisely right. The magic thing about 'fairness' is that it's a pretty good default policy if we decide that we simply do not have enough information to make an intelligent choice.
( Let's be cautious though: the jury is still out on whether people actually like this more than the current approach. While CFS feedback looks promising after a whopping 3 days of it being released [ ;-) ], the test coverage of all 'fairness centric' schedulers, even considering years of availability, is less than 1% I'm afraid, and that < 1% was mostly self-selecting. )

At this point I'd like to make the observation that spa_ebs is a very fair scheduler if you consider "nice" to be an indication of the relative entitlement of tasks to CPU bandwidth.

It works by mapping nice to shares using a function very similar to the one for calculating p->load_weight, except it's not offset by the RT priorities, as RT is handled separately. In theory, a runnable task's entitlement to CPU bandwidth at any time is the ratio of its shares to the total shares held by runnable tasks on the same CPU (in reality, a smoothed average of this sum is used to make scheduling smoother). The dynamic priorities of the runnable tasks are then fiddled to try to keep each task's CPU bandwidth usage in proportion to its entitlement.

That's the theory anyway.
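
A minimal sketch of that theory in Python, for illustration only. The linear mapping in nice_to_shares() is an assumption made up for this example (the real spa_ebs function is only described above as being similar to the p->load_weight calculation), and the smoothing of the total is left out.

```python
# Illustrative only: nice_to_shares() is a hypothetical linear mapping,
# not the function spa_ebs actually uses.

def nice_to_shares(nice):
    """Map a nice value (-20..19) to a positive number of shares."""
    if not -20 <= nice <= 19:
        raise ValueError("nice must be in the range -20..19")
    return 20 - nice  # nice 19 -> 1 share, nice 0 -> 20, nice -20 -> 40


def entitlements(nice_values):
    """Return each runnable task's entitlement to CPU bandwidth.

    The entitlement is the task's shares divided by the total shares
    held by all runnable tasks on the same CPU (unsmoothed here).
    """
    shares = [nice_to_shares(n) for n in nice_values]
    total = sum(shares)
    return [s / total for s in shares]


if __name__ == "__main__":
    # Two nice-0 tasks and one nice-19 task sharing a CPU.
    for nice, ent in zip([0, 0, 19], entitlements([0, 0, 19])):
        print(f"nice {nice:3d}: entitled to {ent:.3f} of the CPU")
```

With this made-up mapping, two tasks at the same nice level always get equal entitlements, and the entitlements of all runnable tasks on a CPU sum to one.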

The actual implementation looks a bit different due to efficiency considerations. The modifications to the above theory boil down to keeping a running measure of the (recent) highest CPU bandwidth use per share for tasks running on the CPU -- I call this the yardstick for this CPU. When it's time to put a task on the run queue, its dynamic priority is determined by comparing its CPU bandwidth per share value with the yardstick for its CPU. If it's greater than the yardstick, this value becomes the new yardstick and the task is given the lowest possible dynamic priority (for its scheduling class). If the value is zero, it gets the highest possible priority (for its scheduling class), which would be MAX_RT_PRIO for a SCHED_OTHER task. Otherwise it is given a priority between these two extremes, proportional to the ratio of its CPU bandwidth per share value to the yardstick. Quite simple really.
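
The yardstick comparison can be sketched roughly as follows. This is not spa_ebs code: the Cpu class, the function name and the integer rounding are assumptions for illustration, and the "recent" aspect of the yardstick (letting it decay over time) is left out. MAX_RT_PRIO = 100 and MAX_PRIO = 140 are the kernel's values, where a numerically lower priority is a better one.

```python
MAX_RT_PRIO = 100  # best (numerically lowest) priority for SCHED_OTHER
MAX_PRIO = 140     # one past the worst priority


class Cpu:
    """Hypothetical per-CPU state: just the yardstick."""

    def __init__(self):
        self.yardstick = 0.0  # highest recent usage per share seen


def dynamic_priority(cpu, usage_per_share):
    """Pick a priority for a SCHED_OTHER task being put on the run queue."""
    best = MAX_RT_PRIO
    worst = MAX_PRIO - 1
    if usage_per_share > cpu.yardstick:
        # This task sets the new yardstick and gets the worst priority.
        cpu.yardstick = usage_per_share
        return worst
    if usage_per_share == 0:
        return best
    # Otherwise scale between the extremes by the ratio to the yardstick.
    ratio = usage_per_share / cpu.yardstick
    return best + int(ratio * (worst - best))


if __name__ == "__main__":
    cpu = Cpu()
    for usage in (0.5, 0.0, 0.25, 0.5):
        print(usage, dynamic_priority(cpu, usage))
```

A task using half as much bandwidth per share as the yardstick lands about halfway between the two extremes.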

The other way in which the code deviates from the original is that (for a few years now) I no longer calculate CPU bandwidth usage directly. I've found that the overhead is less if I keep a running average of the size of a task's CPU bursts and the length of its scheduling cycle (i.e. from on CPU one time to on CPU next time) and use the ratio of these values as a measure of bandwidth usage.
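
One way to picture this estimate is with two exponentially weighted running averages, as in the sketch below. The class name, the smoothing factor alpha and the choice of an exponential average are assumptions for illustration; the message does not say which kind of running average spa_ebs uses.

```python
class UsageEstimator:
    """Hypothetical per-task estimator of CPU bandwidth usage."""

    def __init__(self, alpha=0.125):
        self.alpha = alpha     # weight given to the newest sample
        self.avg_burst = None  # running average of CPU burst size
        self.avg_cycle = None  # running average of scheduling cycle length

    def _smooth(self, avg, sample):
        if avg is None:
            return float(sample)  # first sample seeds the average
        return avg + self.alpha * (sample - avg)

    def record(self, burst, cycle):
        """Record one scheduling cycle.

        burst: time spent on the CPU in this cycle
        cycle: time from going on CPU one time to going on CPU the next
        """
        self.avg_burst = self._smooth(self.avg_burst, burst)
        self.avg_cycle = self._smooth(self.avg_cycle, cycle)

    def usage(self):
        """Estimated fraction of the CPU used: avg burst / avg cycle."""
        if not self.avg_cycle:
            return 0.0
        return self.avg_burst / self.avg_cycle


if __name__ == "__main__":
    est = UsageEstimator()
    for _ in range(5):
        est.record(burst=2, cycle=10)
    print(est.usage())
```

Only two additions and a division per update are needed, which fits the point about lower overhead, although the real savings depend on details not given here.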

Anyway it works and gives very predictable allocations of CPU bandwidth based on nice.

Another good feature is that (in this pure form) it's starvation free. However, if you fiddle with it and do things like giving bonus priority boosts to interactive tasks, it becomes susceptible to starvation. This can be fixed by using an anti-starvation mechanism such as SPA's promotion scheme, and that's what spa_ebs does.

Peter
--
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
 -- Ambrose Bierce
