Re: CFS review

* Ingo Molnar <[email protected]> wrote:

> * Roman Zippel <[email protected]> wrote:
> 
> > Well, I've sent him the stuff now...
> 
> received it - thanks a lot, looking at it!

Everything looks good in your debug output and the TSC dump data, 
except for the wait_runtime values: they are quite out of balance, and 
that imbalance cannot be explained by jiffies granularity or by any 
sort of sched_clock() artifact. So this clearly looks like a CFS 
regression that should be fixed.

The only relevant thing that comes to mind at the moment is that last 
week Peter noticed a buggy aspect of sleeper bonuses (we do not 
rate-limit how they are deducted, hence we 'waste' them instead of 
redistributing them), and I've got the small patch below in my queue 
to fix that - could you give it a try?

This is just a stab in the dark - I couldn't see any real impact from 
that patch in various workloads (and it's not upstream yet), so it 
might not make a big difference. The trace you did (could you send the 
source for that?) does seem to implicate sleeper bonuses, though.

If this patch doesn't help, could you check the general theory that 
it's related to sleeper fairness, by turning it off:

   echo 30 > /proc/sys/kernel/sched_features

Does the bug go away if you do that? If sleeper bonuses are showing 
too many artifacts then we could turn the feature off for the final .23.
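
(For reference: sched_features is a bitmask over the SCHED_FEAT_* 
flags at the top of kernel/sched_fair.c. A minimal sketch of why "30" 
turns sleeper fairness off, assuming the flags are power-of-two values 
in declaration order - an assumption, not checked against your tree:

	/*
	 * sketch only: bit values assumed from the enum layout in
	 * kernel/sched_fair.c, not copied from a specific tree
	 */
	enum {
		SCHED_FEAT_FAIR_SLEEPERS	= 1,	/* the bit "30" clears */
		SCHED_FEAT_SLEEPER_AVG		= 2,
		SCHED_FEAT_SLEEPER_LOAD_AVG	= 4,
		SCHED_FEAT_PRECISE_CPU_LOAD	= 8,
		SCHED_FEAT_START_DEBIT		= 16,
	};
	/* 30 == 2+4+8+16: every flag above except FAIR_SLEEPERS */

i.e. the echo should disable sleeper fairness while leaving the other 
features enabled.)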

	Ingo

--------------------->
Subject: sched: fix sleeper bonus
From: Ingo Molnar <[email protected]>

Peter Zijlstra noticed that the sleeper bonus deduction code was not 
properly rate-limited: a task that scheduled more frequently would get 
a disproportionately large deduction. So limit the per-update 
deduction to delta_exec, and cap the accumulated bonus at 
sysctl_sched_runtime_limit.

Not-Yet-Signed-off-by: Ingo Molnar <[email protected]>
---
 kernel/sched_fair.c |   12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

Index: linux/kernel/sched_fair.c
===================================================================
--- linux.orig/kernel/sched_fair.c
+++ linux/kernel/sched_fair.c
@@ -75,7 +75,7 @@ enum {
 
 unsigned int sysctl_sched_features __read_mostly =
 		SCHED_FEAT_FAIR_SLEEPERS	*1 |
-		SCHED_FEAT_SLEEPER_AVG		*1 |
+		SCHED_FEAT_SLEEPER_AVG		*0 |
 		SCHED_FEAT_SLEEPER_LOAD_AVG	*1 |
 		SCHED_FEAT_PRECISE_CPU_LOAD	*1 |
 		SCHED_FEAT_START_DEBIT		*1 |
@@ -304,11 +304,9 @@ __update_curr(struct cfs_rq *cfs_rq, str
 	delta_mine = calc_delta_mine(delta_exec, curr->load.weight, lw);
 
 	if (cfs_rq->sleeper_bonus > sysctl_sched_granularity) {
-		delta = calc_delta_mine(cfs_rq->sleeper_bonus,
-					curr->load.weight, lw);
-		if (unlikely(delta > cfs_rq->sleeper_bonus))
-			delta = cfs_rq->sleeper_bonus;
-
+		delta = min(cfs_rq->sleeper_bonus, (u64)delta_exec);
+		delta = calc_delta_mine(delta, curr->load.weight, lw);
+		delta = min((u64)delta, cfs_rq->sleeper_bonus);
 		cfs_rq->sleeper_bonus -= delta;
 		delta_mine -= delta;
 	}
@@ -521,6 +519,8 @@ static void __enqueue_sleeper(struct cfs
 	 * Track the amount of bonus we've given to sleepers:
 	 */
 	cfs_rq->sleeper_bonus += delta_fair;
+	if (unlikely(cfs_rq->sleeper_bonus > sysctl_sched_runtime_limit))
+		cfs_rq->sleeper_bonus = sysctl_sched_runtime_limit;
 
 	schedstat_add(cfs_rq, wait_runtime, se->wait_runtime);
 }
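
(To illustrate the capping, a minimal user-space sketch with made-up 
numbers - it assumes a nice-0 task so that calc_delta_mine() can be 
treated as an identity mapping, which is a simplification of the real 
code:

	#include <stdio.h>

	int main(void)
	{
		/* made-up values, in nanoseconds */
		unsigned long long sleeper_bonus = 10000000;	/* 10 ms of pending bonus */
		unsigned long long delta_exec    =  1000000;	/*  1 ms actually executed */

		/* old behaviour: up to the whole bonus could be deducted at once */
		unsigned long long old_delta = sleeper_bonus;

		/* new behaviour: deduct at most delta_exec per update */
		unsigned long long new_delta =
			sleeper_bonus < delta_exec ? sleeper_bonus : delta_exec;

		printf("old deduction: %llu ns, new deduction: %llu ns\n",
		       old_delta, new_delta);
		return 0;
	}

so a task that gets scheduled very often no longer burns the entire 
bonus in one go - the remainder stays in cfs_rq->sleeper_bonus for 
later updates.)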