Re: [rfc][patch] sched: remove smpnice

Andrew Morton wrote:
> Peter Williams <pwil3058@bigpond.net.au> wrote:
>
>> I don't think either of these issues warrants abandoning smpnice.  The
>> first is highly unlikely to occur on real systems and the second is
>> just an example of the patch doing its job (maybe too officiously).  I
>> don't think users would notice either on real systems.
>>
>> Even if you pull it from 2.6.16 rather than upgrading it with my patch
>> can you please leave both in -mm?
>
> Yes, I have done that.  I currently have:
>
> sched-restore-smpnice.patch
> sched-modified-nice-support-for-smp-load-balancing.patch
> sched-cleanup_task_activated.patch
> sched-alter_uninterruptible_sleep_interactivity.patch
> sched-make_task_noninteractive_use_sleep_type.patch
> sched-dont_decrease_idle_sleep_avg.patch
> sched-include_noninteractive_sleep_in_idle_detect.patch
> sched-new-sched-domain-for-representing-multi-core.patch
> sched-fix-group-power-for-allnodes_domains.patch

OK.  Having slept on these problems, I am now of the opinion that they
are caused by the use of NICE_TO_BIAS_PRIO(0) to set *imbalance inside
the (*imbalance < SCHED_LOAD_SCALE) if statement in
find_busiest_group().  What happens here is that, even though the
imbalance is less than one (average) task, the decision is sometimes
made to move a task anyway; but with the current version that decision
can be subverted in two ways: 1) if the task to be moved has a nice
value less than zero, the value of *imbalance that is set will be too
small for move_tasks() to move it; and 2) if there are a number of
tasks with nice values greater than zero on "busiest", more than one
of them may be moved, because the value of *imbalance that is set may
be big enough to cover several of them.
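
To make this concrete, here is a rough user-space sketch (not kernel
code) using assumed numbers: nice_to_bias() is only a stand-in for
NICE_TO_BIAS_PRIO() (its exact mapping here is an assumption), and
tasks_moved() is a deliberately simplified model of move_tasks() that
keeps pulling tasks while the next candidate's bias fits within the
remaining imbalance.

/*
 * Rough user-space illustration only -- not kernel code.  nice_to_bias()
 * stands in for NICE_TO_BIAS_PRIO() with an assumed mapping, and
 * tasks_moved() is a much simplified model of move_tasks() that keeps
 * pulling tasks while the candidate's bias fits in the remaining imbalance.
 */
#include <stdio.h>

static unsigned long nice_to_bias(int nice)
{
	return 20 - nice;	/* assumed: nice -20 -> 40, 0 -> 20, 19 -> 1 */
}

static int tasks_moved(const int *nice, int n, unsigned long imbalance)
{
	int i, moved = 0;

	for (i = 0; i < n && nice_to_bias(nice[i]) <= imbalance; i++) {
		imbalance -= nice_to_bias(nice[i]);
		moved++;
	}
	return moved;
}

int main(void)
{
	int case1[] = { -10 };			/* one nice -10 task on "busiest" */
	int case2[] = { 15, 15, 15, 15 };	/* several nice 15 tasks on "busiest" */
	unsigned long nice0 = nice_to_bias(0);

	/* case 1: imbalance (20) < bias of the nice -10 task (30) => nothing moves */
	printf("case 1: %d task(s) moved\n", tasks_moved(case1, 1, nice0));
	/* case 2: imbalance (20) covers four nice 15 tasks (bias 5 each) => 4 move */
	printf("case 2: %d task(s) moved\n", tasks_moved(case2, 4, nice0));
	return 0;
}

I.e. with *imbalance pinned at the nice-0 bias, case 1 moves nothing and
case 2 moves four tasks where only one was intended.
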
The fix for this problem is to replace NICE_TO_BIAS_PRIO(0) with the
"average bias prio per runnable task" on "busiest".  This will
(generally) result in a larger value for *imbalance in case 1 above
and a smaller one in case 2, alleviating both problems.  A patch to
apply this fix is attached.
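
For comparison, here is the same toy model (same assumed nice_to_bias()
mapping and the same simplified move loop as the sketch above) with
*imbalance set to the average bias per runnable task on "busiest",
which is what avg_biased_load() in the attached patch derives from
rq->prio_bias and rq->nr_running:

#include <stdio.h>

static unsigned long nice_to_bias(int nice)
{
	return 20 - nice;	/* same assumed mapping as above */
}

/* average bias per runnable task, mirroring the avg_biased_load() idea */
static unsigned long avg_bias(const int *nice, int n)
{
	unsigned long sum = 0;
	int i;

	for (i = 0; i < n; i++)
		sum += nice_to_bias(nice[i]);
	return n ? sum / n : 0;
}

static int tasks_moved(const int *nice, int n, unsigned long imbalance)
{
	int i, moved = 0;

	for (i = 0; i < n && nice_to_bias(nice[i]) <= imbalance; i++) {
		imbalance -= nice_to_bias(nice[i]);
		moved++;
	}
	return moved;
}

int main(void)
{
	int case1[] = { -10 };
	int case2[] = { 15, 15, 15, 15 };

	/* case 1: imbalance becomes 30, so the nice -10 task can now be moved */
	printf("case 1: %d task(s) moved\n",
	       tasks_moved(case1, 1, avg_bias(case1, 1)));
	/* case 2: imbalance becomes 5, so only one nice 15 task is moved */
	printf("case 2: %d task(s) moved\n",
	       tasks_moved(case2, 4, avg_bias(case2, 4)));
	return 0;
}

With these made-up numbers each case now moves exactly one task, which
is what the balancing decision intended.
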
Signed-off-by: Peter Williams <pwil3058@bigpond.com.au>

Could you please add this patch to -mm so that it can be tested?

Thanks
Peter
--
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
 -- Ambrose Bierce
Index: MM-2.6.X/kernel/sched.c
===================================================================
--- MM-2.6.X.orig/kernel/sched.c	2006-02-12 11:24:48.000000000 +1100
+++ MM-2.6.X/kernel/sched.c	2006-02-12 11:35:40.000000000 +1100
@@ -735,6 +735,19 @@ static inline unsigned long biased_load(
 {
 	return (wload * NICE_TO_BIAS_PRIO(0)) / SCHED_LOAD_SCALE;
 }
+
+/* get the average biased load per runnable task for a run queue */
+static inline unsigned long avg_biased_load(runqueue_t *rq)
+{
+	/*
+	 * When I'm convinced that this won't be called with a zero nr_running
+	 * and that it can't change during the call, this can be simplified.
+	 * For the time being, and for proof of concept, let's play it safe.
+	 */
+	unsigned long n = rq->nr_running;
+
+	return n ? rq->prio_bias / n : 0;
+}
 #else
 static inline void set_bias_prio(task_t *p)
 {
@@ -2116,7 +2129,7 @@ find_busiest_group(struct sched_domain *
 		unsigned long tmp;
 
 		if (max_load - this_load >= SCHED_LOAD_SCALE*2) {
-			*imbalance = NICE_TO_BIAS_PRIO(0);
+			*imbalance = avg_biased_load(busiest);
 			return busiest;
 		}
 
@@ -2149,7 +2162,7 @@ find_busiest_group(struct sched_domain *
 		if (pwr_move <= pwr_now)
 			goto out_balanced;
 
-		*imbalance = NICE_TO_BIAS_PRIO(0);
+		*imbalance = avg_biased_load(busiest);
 		return busiest;
 	}
 
