Re: -mm seems significantly slower than mainline on kernbench

Peter Williams wrote:

Con Kolivas wrote:

On Wednesday 11 January 2006 23:24, Peter Williams wrote:

Martin J. Bligh wrote:

That seems broken to me ?


But, yes, given that the problem goes away when the patch is removed
(which we're still waiting to see) it's broken.  I think the problem is
probably due to the changed metric (i.e. biased load instead of simple
load) causing idle_balance() to fail more often (i.e. it decides to not
bother moving any tasks more often than it otherwise would) which would
explain the increased idle time being seen.  This means that the fix
would be to review the criteria for deciding whether to move tasks in
idle_balance().

Look back on my implementation. The problem as I saw it was that one
task alone with a biased load would suddenly make a runqueue look much
busier than it was supposed to so I special cased the runqueue that
had precisely one task.


OK.  I'll look at that.

Addressed in a separate e-mail.

But I was thinking more about the code that (in the original) handled
the case where the number of tasks to be moved was less than 1 but more
than 0 (i.e. the cases where "imbalance" would have been reduced to zero
when divided by SCHED_LOAD_SCALE). I think that I got that part wrong
and you can end up with a bias load to be moved which is less than any
of the bias_prio values for any queued tasks (in circumstances where the
original code would have rounded up to 1 and caused a move). I think
that the way to handle this problem is to replace 1 with "average bias
prio" within that logic. This would guarantee at least one task with a
bias_prio small enough to be moved.

I think that this analysis is a strong argument for my original patch
being the cause of the problem so I'll go ahead and generate a fix. I'll
try to have a patch available later this morning.
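
To make the failure mode above concrete, here is a minimal user-space
sketch, not kernel code. It assumes NICE_TO_BIAS_PRIO(nice) is defined as
(20 - (nice)), so a nice 0 task weighs 20, and it uses a hypothetical
helper, count_movable_tasks(), that greedily moves every task whose
bias_prio still fits in the remaining allowance. The real task-moving
logic in sched.c is more involved.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: mirrors the -mm definition; not taken from this thread. */
#define NICE_TO_BIAS_PRIO(nice) (20 - (nice))

/*
 * Hypothetical helper: count how many queued tasks could be moved when
 * the balancer may shift at most max_bias_move units of biased load.
 * A task is moved only if its bias_prio fits in what is left.
 */
static size_t count_movable_tasks(const unsigned long *bias_prio, size_t n,
				  unsigned long max_bias_move)
{
	size_t moved = 0;

	assert(bias_prio != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (bias_prio[i] <= max_bias_move) {
			max_bias_move -= bias_prio[i];
			moved++;
		}
	}
	return moved;
}
```

With three nice 0 tasks queued, an allowance of 1 (what the small
imbalance case used to produce) moves nothing, while an allowance of
NICE_TO_BIAS_PRIO(0) lets exactly one task through.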

Attached is a patch that addresses this problem. Unlike the description
above, it does not use "average bias prio" as that solution would be very
complicated. Instead it makes the assumption that NICE_TO_BIAS_PRIO(0)
is "good enough" for this purpose as this is highly likely to be the
median bias prio and the median is probably better for this purpose than
the average.
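
As a rough illustration of the changed thresholds, the following
stand-alone sketch isolates the early-return path of the small imbalance
branch. The function name small_imbalance_shortcut() is hypothetical, and
the values SCHED_LOAD_SCALE = 128 and NICE_TO_BIAS_PRIO(nice) =
(20 - (nice)) are assumptions made for the example; the pwr_now/pwr_move
comparison that follows in find_busiest_group() is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for illustration only. */
#define SCHED_LOAD_SCALE	128UL
#define NICE_TO_BIAS_PRIO(nice)	(20 - (nice))

/*
 * Hypothetical stand-alone version of the patched check. Returns 1 and
 * sets *imbalance to one nice 0 task's bias when the scaled imbalance is
 * below one such task but the load gap is at least two of them;
 * otherwise returns 0 and leaves *imbalance untouched.
 */
static int small_imbalance_shortcut(unsigned long max_load,
				    unsigned long this_load,
				    unsigned long *imbalance)
{
	const unsigned long one_task = NICE_TO_BIAS_PRIO(0) * SCHED_LOAD_SCALE;

	assert(imbalance != NULL);

	if (*imbalance >= one_task)
		return 0;	/* not the small imbalance case */

	if (max_load >= this_load && max_load - this_load >= 2 * one_task) {
		*imbalance = NICE_TO_BIAS_PRIO(0);
		return 1;	/* ask for one nice 0 task's worth of load */
	}
	return 0;
}
```

Under these assumed values one nice 0 task corresponds to 2560 scaled
units, so a scaled imbalance of 1000 with a load gap of 6000 takes the
shortcut and requests a bias load of 20, whereas the same imbalance with
a gap of 3000 does not.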

Signed-off-by: Peter Williams <pwil3058@bigpond.com.au>

Peter
--
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
 -- Ambrose Bierce

Index: MM-2.6.X/kernel/sched.c
===================================================================
--- MM-2.6.X.orig/kernel/sched.c	2006-01-12 09:23:38.000000000 +1100
+++ MM-2.6.X/kernel/sched.c	2006-01-12 10:44:50.000000000 +1100
@@ -2116,11 +2116,11 @@ find_busiest_group(struct sched_domain *
 				(avg_load - this_load) * this->cpu_power)
 			/ SCHED_LOAD_SCALE;
 
-	if (*imbalance < SCHED_LOAD_SCALE) {
+	if (*imbalance < NICE_TO_BIAS_PRIO(0) * SCHED_LOAD_SCALE) {
 		unsigned long pwr_now = 0, pwr_move = 0;
 		unsigned long tmp;
 
-		if (max_load - this_load >= SCHED_LOAD_SCALE*2) {
+		if (max_load - this_load >= NICE_TO_BIAS_PRIO(0) * SCHED_LOAD_SCALE*2) {
 			*imbalance = NICE_TO_BIAS_PRIO(0);
 			return busiest;
 		}
