[PATCH] sched: make sure busiest group and run queue are pullable

Peter Williams wrote:
Peter Williams wrote:
Siddha, Suresh B wrote:
more issues with smpnice patch...

a) consider a 4-way system (a simple SMP system, no HT or multiple cores) scenario
where a high priority task (nice -20) is running on P0 and two normal
priority tasks are running on P1. Load balancing with the smp nice code
will never detect an imbalance and hence will never move one of the normal priority tasks on P1 to the idle CPUs P2 or P3.

Why?

OK, I think I know why. The load balancing code will always decide that P0 is the busiest CPU, right?
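
For illustration, here is a tiny user-space sketch of the arithmetic (the weights are made-up stand-ins for the smpnice per-task load weights, not the real kernel values; the point is only that one nice -20 task can outweigh two nice 0 tasks):

#include <stdio.h>

#define NICE_0_WEIGHT   128	/* hypothetical weight of a nice 0 task */
#define NICE_M20_WEIGHT 1024	/* hypothetical weight of a nice -20 task */

int main(void)
{
	unsigned long p0_load = 1 * NICE_M20_WEIGHT;	/* one nice -20 task */
	unsigned long p1_load = 2 * NICE_0_WEIGHT;	/* two nice 0 tasks */

	printf("P0 weighted load: %lu (1 runnable task)\n", p0_load);
	printf("P1 weighted load: %lu (2 runnable tasks)\n", p1_load);

	/* choosing "busiest" by weighted load alone picks P0, which has
	 * nothing that can be pulled, so the two tasks on P1 never get
	 * spread to the idle CPUs P2 and P3 */
	printf("busiest by load alone: %s\n",
	       p0_load > p1_load ? "P0" : "P1");
	return 0;
}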

Attached is a patch that addresses this problem. The strategies employed are:

1. in find_busiest_group(), only consider groups that have at least one CPU with more than one running task as candidates for "busiest", and

2. in find_busiest_queue(), only consider queues with more than one running task as candidates for "busiest" (a simplified sketch of this test follows).
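
Stripped of the runqueue_t/sched_group context that the diff below operates on, the extra test amounts to something like this (a simplified sketch, not the actual kernel code):

struct rq_stub {
	unsigned long nr_running;		/* number of runnable tasks */
	unsigned long raw_weighted_load;	/* nice-weighted load */
};

/*
 * A runqueue can only give a task away if it has more than one runnable
 * task.  find_busiest_queue() now requires this before it considers the
 * queue's weighted load at all, and find_busiest_group() requires at
 * least one such CPU in a group before the group can become "busiest".
 */
static int rq_pullable(const struct rq_stub *rq)
{
	return rq->nr_running > 1;
}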

I think that the savings from abandoning load balancing attempts earlier -- attempts that would (most probably; see the next paragraph) have been abandoned later anyway -- will compensate for the extra overhead these checks introduce in the two functions.

I think that the only likely behavioural change on a system where all tasks have nice==0 is this: without these checks, a "busiest" candidate that has only one runnable task at the time these tests would be made (and from which move_tasks() would therefore end up moving nothing) has a small chance of acquiring extra runnable tasks before the locks are taken in preparation for calling move_tasks(), so load balancing may actually take place. I think that this effect can be safely ignored.

Signed-off-by: Peter Williams <[email protected]>

Peter
--
Peter Williams                                   [email protected]

"Learning, n. The kind of ignorance distinguishing the studious."
 -- Ambrose Bierce
Index: MM-2.6.X/kernel/sched.c
===================================================================
--- MM-2.6.X.orig/kernel/sched.c	2006-03-25 13:43:06.000000000 +1100
+++ MM-2.6.X/kernel/sched.c	2006-03-25 13:56:37.000000000 +1100
@@ -2115,6 +2115,7 @@ find_busiest_group(struct sched_domain *
 		int local_group;
 		int i;
 		unsigned long sum_nr_running, sum_weighted_load;
+		unsigned int nr_loaded_cpus = 0; /* where nr_running > 1 */
 
 		local_group = cpu_isset(this_cpu, group->cpumask);
 
@@ -2135,6 +2136,8 @@ find_busiest_group(struct sched_domain *
 
 			avg_load += load;
 			sum_nr_running += rq->nr_running;
+			if (rq->nr_running > 1)
+				++nr_loaded_cpus;
 			sum_weighted_load += rq->raw_weighted_load;
 		}
 
@@ -2149,7 +2152,7 @@ find_busiest_group(struct sched_domain *
 			this = group;
 			this_nr_running = sum_nr_running;
 			this_load_per_task = sum_weighted_load;
-		} else if (avg_load > max_load) {
+		} else if (nr_loaded_cpus && avg_load > max_load) {
 			max_load = avg_load;
 			busiest = group;
 			busiest_nr_running = sum_nr_running;
@@ -2258,16 +2261,16 @@ out_balanced:
 static runqueue_t *find_busiest_queue(struct sched_group *group,
 	enum idle_type idle)
 {
-	unsigned long load, max_load = 0;
-	runqueue_t *busiest = NULL;
+	unsigned long max_load = 0;
+	runqueue_t *busiest = NULL, *rqi;
 	int i;
 
 	for_each_cpu_mask(i, group->cpumask) {
-		load = weighted_cpuload(i);
+		rqi = cpu_rq(i);
 
-		if (load > max_load) {
-			max_load = load;
-			busiest = cpu_rq(i);
+		if (rqi->nr_running > 1 && rqi->raw_weighted_load > max_load) {
+			max_load = rqi->raw_weighted_load;
+			busiest = rqi;
 		}
 	}
 
