Re: [patch] CFS scheduler, -v8

Hi,

As encouraged by some of you, I have started implementing EEVDF. However, I am quite new to this area and may not be experienced enough to get it done quickly. The main problem I am facing now is how to treat the semantics of yield() and yield_to(). I will probably ask a lot of questions along the way.

I also found that my previous email was not clear enough in describing the properties of CFS and EEVDF, which caused some confusion, and it contained some mistakes as well. In this email I will try to make up for that.


*** Let's start from CFS:

For simplicity, let's assume that CFS preempts the current task p1 in favor of another task p2 when p1->key - p2->key > 1, and that the virtual time rq->fair_clock is initialized to 0. Suppose that at time t = 0 we start n+1 tasks that run long enough. Task p1 has weight n and all other tasks have weight 1. It is clear that at time t = 0, p_1->key = p_2->key = ... = p_(n+1)->key = rq->fair_clock = 0.

Since all tasks have the same key, CFS breaks the tie arbitrarily, which leads to many possibilities. Let's consider two of them:

_Case One:_ p1, which has weight n, executes first:

         t = 1: rq->fair_clock = 1/2n,  p1->key = 1/n   // others are not changed
         t = 2: rq->fair_clock = 2/2n,  p1->key = 2/n
                                  ...
         t = n: rq->fair_clock = n/2n,  p1->key = n/n = 1

Only after p1 has executed for n ticks will the scheduler pick another task for execution. Over the interval [0, n) the amount of actual work done by p1 is n. The amount of work that should be done in an ideal fluid-flow system is n * n/2n = n/2. Therefore the lag is n/2 - n = -n/2; negative means p1 runs ahead of the ideal case. As we can see, this lag is O(n).

_Case Two:_ the scheduler executes the tasks in the order p2, p3, ..., p_(n+1), p1:

         t = 1: rq->fair_clock = 1/2n,  p2->key = 1   // others are not changed
         t = 2: rq->fair_clock = 2/2n,  p3->key = 1
                                  ...
         t = n: rq->fair_clock = n/2n,  p_(n+1)->key = 1

Then the scheduler picks p1 (weight n) for execution. Over the interval [0, n) the amount of actual work done by p1 is 0, and the ideal amount is n/2. Therefore the lag is n/2 - 0 = n/2; positive means p1 falls behind the ideal case. The lag here for p1 is also O(n).

As I said in the previous email, p->fair_key only carries information about the past execution of a task and reflects a fair starting point. It carries no information about weight.
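The two cases can be replayed with a few lines of Python. This is only a sketch of the simplified model used in this email (one task runs per tick, its key advances by 1/weight, and the clock advances by 1/total weight), not the actual CFS code; the helper names run_ticks and lag and the value n = 4 are made up for illustration.

```python
from fractions import Fraction

def run_ticks(weights, order):
    """Run one task per tick in the given order under the simplified model."""
    total = sum(weights.values())
    key = {name: Fraction(0) for name in weights}
    work = {name: 0 for name in weights}
    fair_clock = Fraction(0)
    for name in order:
        key[name] += Fraction(1, weights[name])  # key advances by 1/weight
        fair_clock += Fraction(1, total)         # clock advances by 1/(total weight)
        work[name] += 1
    return key, fair_clock, work

def lag(weights, work, name, elapsed):
    """Ideal fluid-flow work minus actual work; negative means ahead of ideal."""
    ideal = Fraction(elapsed * weights[name], sum(weights.values()))
    return ideal - work[name]

n = 4  # illustrative value, not from the email
weights = {"p1": n}
weights.update({f"p{i}": 1 for i in range(2, n + 2)})

# Case One: p1 runs for the first n ticks.
key, clock, work = run_ticks(weights, ["p1"] * n)
print(key["p1"], clock, lag(weights, work, "p1", n))

# Case Two: p2 .. p_(n+1) each run one tick before p1 gets the CPU.
key, clock, work = run_ticks(weights, [f"p{i}" for i in range(2, n + 2)])
print(key["p1"], clock, lag(weights, work, "p1", n))
```

For n = 4 the lag of p1 comes out as -2 in Case One and 2 in Case Two, i.e. -n/2 and n/2.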


*** Now, let's look at EEVDF.

I have to say that I missed a very important concept in EEVDF, which led to confusion here. EEVDF stands for Earliest _Eligible_ Virtual Deadline First, and I did not explain what _eligible_ means.

EEVDF maintains a virtual start time ve_i and a virtual deadline vd_i for each task p_i, as well as a virtual time vt. A newly started or woken task has its ve_i initialized to the current virtual time. Once a timeslice's worth of work l_i is done, the new virtual start time is set to the previous virtual deadline, and the virtual deadline vd_i is then recalculated.

A task is eligible if and only if ve_i <= current virtual time vt.

At every tick, EEVDF picks the eligible task with the earliest virtual deadline for execution.
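The selection rule can be written down as a small Python sketch. The Task class and the helpers new_request and pick are hypothetical names chosen for illustration, and the deadline formula vd_i = ve_i + l_i/w_i is the one used in the worked example in this email; none of this is kernel code.

```python
from dataclasses import dataclass
from fractions import Fraction

@dataclass
class Task:
    name: str
    weight: int
    timeslice: int
    ve: Fraction = Fraction(0)  # virtual start (eligible) time
    vd: Fraction = Fraction(0)  # virtual deadline

def new_request(task, start):
    """Start a new timeslice: ve_i = start, vd_i = ve_i + l_i / w_i."""
    task.ve = start
    task.vd = start + Fraction(task.timeslice, task.weight)

def pick(tasks, vt):
    """Among tasks with ve_i <= vt, return the one with the earliest deadline."""
    eligible = [t for t in tasks if t.ve <= vt]
    if not eligible:
        return None
    return min(eligible, key=lambda t: t.vd)
```

A newly started or woken task would call new_request(task, vt); a task that has just finished its timeslice would call new_request(task, task.vd).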


Let's see how it works using a similar example as for CFS above.

Suppose that at time t = 0 we start n+1 tasks. p1 has weight n and all others have weight 1. For simplicity, we assume all tasks use timeslice l_i = 1, and the virtual time vt is initialized to 0.

  - at time t = 0, we have
            vt = 0;
            ve_1 = 0, vd_1 = ve_1 + l_1/w_1 = 1/n
            ve_2 = 0, vd_2 = ve_2 + l_2/w_2 = 1
                          ...
            ve_(n+1) = 0, vd_(n+1) = ve_(n+1) + l_(n+1)/w_(n+1) = 1;

Since p1 is eligible and has the earliest deadline 1/n, the scheduler will execute it first. (Here the weight, which is encoded in the deadline, plays an important role and allows higher-weight tasks to be executed first.)

  - at time t = 1:
            vt = 1/2n,
            ve_1 = 1/n (previous vd_1), vd_1 = ve_1 + 1/n = 2/n

Since ve_1 > vt, p1 is _not_ eligible. EEVDF picks another task for execution by breaking the tie; say it executes p2.

  - at time t = 2:
            vt = 2/2n = 1/n,  ve_1 = 1/n, vd_1 = 2/n
            ve_2 = 1, vd_2 = ve_2 + 1/1 = 2   // this makes p2 not eligible

Since vt = ve_1, p1 becomes eligible again and, having the earliest deadline 2/n, will be scheduled for execution. As EEVDF repeats, it gives a schedule like p1, p2, p1, p3, p1, p4, p1, ... (one entry per tick). As you can see, p1 now never falls behind or runs ahead of the ideal case by more than 1.

Now let's check how the timeslice l_i impacts the system. Suppose we change the timeslice of p1 from 1 to 2 and keep the others unchanged. EEVDF gives a schedule like:

    p1, p1, p2, p3, p1, p1, p4, p5, p1, p1, ... (one entry per tick)

Similarly, if the timeslice of p1 is set to 3, EEVDF gives:

    p1, p1, p1, p2, p3, p4, p1, p1, p1, ... (one entry per tick)

As the timeslice of p1 increases, the system checks for rescheduling less frequently, which makes the lag of p1 larger, but the lag will not exceed the maximum timeslice used, as long as that is a fixed constant. On the other hand, increasing the timeslice of the other tasks has no effect on when p1 is scheduled. (You can try playing with the algorithm yourself :-))
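For anyone who wants to play with the algorithm, here is a minimal tick-by-tick simulation in Python. It is a sketch under the assumptions of this email only: all tasks start at t = 0 and never sleep, vt advances by 1/(total weight) per tick, and ties between equal deadlines go to whichever task comes first in the dictionary (one arbitrary choice among many). The function name eevdf_schedule and the value n = 6 are made up for illustration.

```python
from fractions import Fraction

def eevdf_schedule(weights, slices, ticks):
    """Return the task picked at each tick under the simplified EEVDF rules."""
    total = sum(weights.values())
    ve = {t: Fraction(0) for t in weights}                      # virtual start times
    vd = {t: Fraction(slices[t], weights[t]) for t in weights}  # virtual deadlines
    used = {t: 0 for t in weights}  # ticks consumed in the current timeslice
    vt = Fraction(0)
    order = []
    for _ in range(ticks):
        eligible = [t for t in weights if ve[t] <= vt]
        # min() returns the first minimal item, so ties follow dict order.
        current = min(eligible, key=lambda t: vd[t])
        order.append(current)
        used[current] += 1
        vt += Fraction(1, total)
        if used[current] == slices[current]:
            # Timeslice finished: new start = old deadline, then a new deadline.
            used[current] = 0
            ve[current] = vd[current]
            vd[current] = ve[current] + Fraction(slices[current], weights[current])
    return order

n = 6  # illustrative value, not from the email
weights = {"p1": n}
weights.update({f"p{i}": 1 for i in range(2, n + 2)})

for l1 in (1, 2, 3):
    slices = dict.fromkeys(weights, 1)
    slices["p1"] = l1  # only p1's timeslice changes
    print(l1, eevdf_schedule(weights, slices, 10))
```

Changing slices for the weight-1 tasks instead of p1 is a quick way to check the claim that their timeslice does not affect when p1 runs.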

In CFS, a task has to increase its p->fair_key by a certain amount before the scheduler will consider preempting it. A higher weight leads to slower progress in p->fair_key, and thus effectively a larger timeslice. Suppose the preemption granularity is 5 virtual ticks; then a task of weight 1 needs to run 5 ticks, weight 2 needs 10 ticks, and weight 10 needs 50 ticks. The effect is that the timeslice increases linearly with weight, which causes the O(n) lag. In fact, if we use timeslice n for p1 in the EEVDF example above, it behaves exactly like CFS.
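The arithmetic behind the 5, 10 and 50 ticks can be checked with a short Python sketch. It assumes the simplified model of this email, where the key advances by 1/weight per tick; ticks_until_preemptible is a hypothetical helper name, not a kernel function.

```python
from fractions import Fraction

def ticks_until_preemptible(weight, granularity=5):
    """Count ticks until the key has advanced by `granularity` virtual ticks."""
    key = Fraction(0)
    ticks = 0
    while key < granularity:
        key += Fraction(1, weight)  # key advances by 1/weight per tick
        ticks += 1
    return ticks

for weight in (1, 2, 10):
    print(weight, ticks_until_preemptible(weight))
```

The effective timeslice is granularity * weight, which is the linear growth described above.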

Now I have to fix an incorrect statement about EEVDF's lag bound in my previous email:

Under EEVDF, a task's lag (the difference between the amount of work that should be done in an ideal fluid-flow system and the actual amount of work done) is bounded by the maximum timeslice used (not occasionally, as I said in my previous email). This means the maximum timeslice used controls the overall responsiveness of the system. Also, the combination of a high weight and a small timeslice gives a better response guarantee for bursty, response-time-sensitive tasks.
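The bound can be spot-checked on the schedules from this email. The sketch below measures each task's lag at tick boundaries only, for a schedule given as a list of task names; the helper max_abs_lag and the choice n = 6 are assumptions made for illustration, and a check on two schedules is of course not a proof.

```python
from fractions import Fraction

def max_abs_lag(weights, schedule):
    """Largest |ideal work - actual work| per task, sampled after every tick."""
    total = sum(weights.values())
    work = {name: 0 for name in weights}
    worst = {name: Fraction(0) for name in weights}
    for tick, running in enumerate(schedule, start=1):
        work[running] += 1
        for name, weight in weights.items():
            lag = Fraction(tick * weight, total) - work[name]
            worst[name] = max(worst[name], abs(lag))
    return worst

n = 6  # illustrative value, not from the email
weights = {"p1": n}
weights.update({f"p{i}": 1 for i in range(2, n + 2)})

# EEVDF with timeslice 3 for p1 (schedule from the example above, one period).
eevdf = ["p1"] * 3 + ["p2", "p3", "p4"] + ["p1"] * 3 + ["p5", "p6", "p7"]
# CFS Case One: p1 runs n ticks, then everyone else.
cfs = ["p1"] * n + [f"p{i}" for i in range(2, n + 2)]

print(max_abs_lag(weights, eevdf)["p1"])
print(max_abs_lag(weights, cfs)["p1"])
```

For n = 6 the worst lag of p1 is 3/2 under the EEVDF schedule (below the maximum timeslice of 3) and 3 = n/2 under the CFS Case One schedule, so only the latter grows with n.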

Sorry for the late reply. I had an eye exam today and got my eyes dilated, which forced me to stay away from my computer for a while :-)


Ting

