Re: [patch] CFS scheduler, -v7

Hi, Ingo

This is the test case that I think worth discuss and it leads me tofind 2 things.

[...] but there are still some nice issues.
Try running 3 chew.c's, then renicing one to -10, starves others forsome seconds while switching prio-level. Now renice it back to 10, itstarves for up to 45sec.
ok - to make sure i understood you correctly: does this starvation onlyoccur right when you renice it (when switching prio levels), and it getsrectified quickly once they get over a few reschedules?

The main problem of what Al Boldi saw might come from this piece of codein sched_fair.c, which scales of fair_key difference needed to preemptthe current task:

+rescale_load(struct task_struct *p, u64 value)
+{
+    int load_shift = p->load_shift;
+
+    if (load_shift == SCHED_LOAD_SHIFT)
+        return value;
+
+    return (value << load_shift) >> SCHED_LOAD_SHIFT;
+}
+
+static u64
+niced_granularity(struct rq *rq, struct task_struct *curr,
+          unsigned long granularity)
+{
+    return rescale_load(curr, granularity);
+}

Here is the checking for pre-emption:

+static inline void
+__check_preempt_curr_fair(struct rq *rq, struct task_struct *p,
+              struct task_struct *curr, unsigned long granularity)
+{
+    s64 __delta = curr->fair_key - p->fair_key;
+
+    /*
+     * Take scheduling granularity into account - do not
+     * preempt the current task unless the best task has
+     * a larger than sched_granularity fairness advantage:
+     */
+    if (__delta > niced_granularity(rq, curr, granularity))
+        resched_task(curr);
+}

This code actually now says, the difference of fair_key needed topreempt the current task is amplified by a facto of its weigh (in AlBoldi's example 32). However, the weighted task already advance itsp->fair_key by its weight, (also 32 here). The combination of thembecomes quadratic!

Let's starting from three nice 0 tasks p1, p2, p3, at time t=0, withniced_granularity set to be 5ms:

    Originally each task executes 5 ms in turn:

5ms for p1, 5ms for p2, 5ms p3, 5ms for p1, 5ms forp2, 5ms for p3 ...If somehow p3 is re-niced to -10 _right before_ the 16th ms, we runinto the worst case after p3 gets the cpu:

       p1->fair_key = p2 ->fair_key = 10,   p3->fair_key = 5.

Now, in order for p3 to be preempted, it has to make its fair_key 5* 32 larger than p1 and p2's fair_key. Furthermore, p3 now is higherweight, push its fair_key to increase by 1 now needs 32ms,

thus p3 will stay one cpu for 5 * 32 *32ms, which is about 5 second!

Besides this quadratic effect, another minor issue amplified this alittle bit further: p->wait_runtime accumulated before. During renice itis not adjusted to match the new nice value. The p->wait_runtime earnedusing previous weight has to be paid off using the current weight. Ifrenice to larger weight you pay more than you need, otherwise you paidless, which introduces unfairness.

Ingo, now, partially solved this problem by clearingp->wait_runtime when a task is reniced, but the quadratic effect ofscaling is still there.

Thanks

Ting

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Follow-Ups:
- Re: [patch] CFS scheduler, -v7
  - From: Ingo Molnar <mingo@elte.hu>

References:
- Re: [patch] CFS scheduler, -v7
  - From: Al Boldi <a1426z@gawab.com>
- Re: [patch] CFS scheduler, -v7
  - From: Ingo Molnar <mingo@elte.hu>

Prev by Date: Re: [PATCH] synclink_gt use dynamic tty device registration
Next by Date: Re: [patch] CFS scheduler, -v8
Previous by thread: Re: [patch] CFS scheduler, -v7
Next by thread: Re: [patch] CFS scheduler, -v7
Index(es):
- Date
- Thread

[Index of Archives] [Kernel Newbies] [Netfilter] [Bugtraq] [Photo] [Stuff] [Gimp] [Yosemite News] [MIPS Linux] [ARM Linux] [Linux Security] [Linux RAID] [Video 4 Linux] [Linux for the blind] [Linux Resources]