[PATCH] sched: smpnice prevent integer arithmetic wrap problems

Peter Williams wrote:
Siddha, Suresh B wrote:
More issues with the smpnice patch...

a) Consider a 4-way system (a simple SMP system, no HT or multi-core) where a high-priority task (nice -20) is running on P0 and two normal-priority tasks are running on P1. Load balancing with the smpnice code will never detect an imbalance, and hence will never move one of the normal-priority tasks on P1 to the idle CPUs P2 or P3.

Fix already sent.


b) smpnice seems to break this patch:

[PATCH] sched: allow the load to grow upto its cpu_power
http://www.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=0c117f1b4d14380baeed9c883f765ee023da8761

Example scenario for this case: consider a NUMA system with two nodes, each containing four processors. If there are two processes on node-0 and node-1 is completely idle, your patch will move one of those processes to node-1, whereas the previous behavior would retain both processes on node-0. (In this case, in your code, max_load will be less than busiest_load_per_task.)
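
To put rough numbers on this scenario (my assumptions, not figures from the original report: each nice-0 task weighs SCHED_LOAD_SCALE = 128, each CPU contributes SCHED_LOAD_SCALE of cpu_power, the two tasks sit on different CPUs of node-0, and busiest_load_per_task starts out as the busiest group's total weighted load):

	node-0 group load      = 2 * 128           = 256
	node-0 cpu_power       = 4 * 128           = 512
	max_load               = 128 * 256 / 512   = 64
	busiest_load_per_task  = 256 / 2           = 128
	avg_load (domain-wide) = 128 * 256 / 1024  = 32

so max_load (64) does indeed end up below busiest_load_per_task (128), just as described.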

I think that the patch I sent to address a) above will also fix this problem, as find_busiest_group() will no longer select node-0 as the busiest group unless both of the processes on node-0 are on the same CPU. This is because it now only considers groups that have at least one CPU with more than one running task as candidates for being the busiest group.
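
For reference, here is a toy userspace model of that selection rule (a sketch of the idea only; the struct, the group layout and all the names are invented for illustration, this is not the actual hunk):

	#include <stdio.h>

	struct cpu {
		unsigned long load;	/* sum of runnable task weights */
		unsigned int nr_running;
	};

	int main(void)
	{
		/* node-0: two nice-0 tasks on different CPUs; node-1 idle */
		struct cpu node0[4] = { {128, 1}, {128, 1}, {0, 0}, {0, 0} };
		struct cpu node1[4] = { {0, 0}, {0, 0}, {0, 0}, {0, 0} };
		struct cpu *groups[2] = { node0, node1 };
		unsigned long max_load = 0;
		int g, i, busiest = -1;

		for (g = 0; g < 2; g++) {
			unsigned long load = 0;
			int has_pullable = 0;

			for (i = 0; i < 4; i++) {
				load += groups[g][i].load;
				if (groups[g][i].nr_running > 1)
					has_pullable = 1;
			}
			/* only groups with a doubly loaded CPU may be busiest */
			if (has_pullable && load > max_load) {
				max_load = load;
				busiest = g;
			}
		}

		if (busiest < 0)
			printf("balanced: no CPU has nr_running > 1\n");
		else
			printf("busiest group: node-%d\n", busiest);
		return 0;
	}

As it stands this prints "balanced"; change node0 so that both tasks share one CPU ({256, 2}) and node-0 gets selected.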

Implicit in this is the assumption that it's OK to move one of the tasks from node-0 to node-1 if they're both on the same CPU within node-0.

Could you confirm this is OK?

It looks like my coffee was slow kicking in this morning :-)

When I looked at the code more carefully, I realized that your suggestion of comparing avg_load and busiest_load_per_task is needed to protect the calculation of max_pull from integer arithmetic wrapping problems. There was a big clue to this in the comment above the calculation of max_pull, which I failed to read :-(
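
To make the wrap concrete, here's a minimal userspace demonstration (not the kernel code; the values are the invented ones from the NUMA example above, and I'm assuming max_pull is derived from a subtraction like max_load - busiest_load_per_task):

	#include <stdio.h>

	int main(void)
	{
		unsigned long max_load = 64;
		unsigned long avg_load = 32;
		unsigned long busiest_load_per_task = 128;

		/* unsigned subtraction wraps instead of going negative */
		printf("max_load - busiest_load_per_task = %lu\n",
		       max_load - busiest_load_per_task);

		/* the added check bails out before any such subtraction */
		if (avg_load <= busiest_load_per_task)
			printf("out_balanced: nothing worth pulling\n");
		return 0;
	}

On a 64-bit box the first line prints 18446744073709551552 -- a "pull target" large enough to try to empty every queue in the system.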

Anyway, the attached patch should fix the problem. It should be applied on top of the other patch.

Signed-off-by: Peter Williams <[email protected]>

Peter
--
Peter Williams                                   [email protected]

"Learning, n. The kind of ignorance distinguishing the studious."
 -- Ambrose Bierce
Index: MM-2.6.X/kernel/sched.c
===================================================================
--- MM-2.6.X.orig/kernel/sched.c	2006-03-25 13:56:37.000000000 +1100
+++ MM-2.6.X/kernel/sched.c	2006-03-27 10:15:38.000000000 +1100
@@ -2161,7 +2161,7 @@ find_busiest_group(struct sched_domain *
 		group = group->next;
 	} while (group != sd->groups);
 
-	if (!busiest || this_load >= max_load || busiest_nr_running <= 1)
+	if (!busiest || this_load >= max_load)
 		goto out_balanced;
 
 	avg_load = (SCHED_LOAD_SCALE * total_load) / total_pwr;
@@ -2171,6 +2171,9 @@ find_busiest_group(struct sched_domain *
 		goto out_balanced;
 
 	busiest_load_per_task /= busiest_nr_running;
+
+	if (avg_load <= busiest_load_per_task)
+		goto out_balanced;
 	/*
 	 * We're trying to get all the cpus to the average_load, so we don't
 	 * want to push ourselves above the average load, nor do we wish to
