Re: [rfc][patch] sched: remove smpnice

Andrew Morton wrote:

"Siddha, Suresh B" <suresh.b.siddha@intel.com> wrote:
On Tue, Feb 07, 2006 at 03:36:17PM -0800, Andrew Morton wrote:
Suresh, Martin, Ingo, Nick and Con: please drop everything, triple-check
and test this:

From: Peter Williams <pwil3058@bigpond.net.au>

This is a modified version of Con Kolivas's patch to add "nice" support to
load balancing across physical CPUs on SMP systems.
I have a couple of issues with this patch.
a) On a lightly loaded system, this will result in a higher priority job hopping around from one processor to another. This is because of the code in find_busiest_group() which assumes that SCHED_LOAD_SCALE represents a unit process load, and with the nice_to_bias calculations this is no longer true (in the presence of non nice-0 tasks).
My testing showed that 178.galgel in SPECfp2000 is down by ~10% when run with nice -20 on a 4P (8-way with HT) system compared to a nice-0 run.
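
To make the concern in (a) concrete, here is a toy sketch in Python, not kernel code. The mapping toy_bias() (nice 0 -> 20, nice -20 -> 40) and the helper names are assumptions made up for illustration; only SCHED_LOAD_SCALE and the idea of a nice-derived bias come from the discussion. It shows how a single nice -20 task can look like more than one "unit" of load to logic that treats SCHED_LOAD_SCALE as one task.

```python
SCHED_LOAD_SCALE = 128  # toy value: load of one nice-0 task in this model


def toy_bias(nice):
    """Hypothetical nice-to-weight mapping (assumption): nice 0 -> 20, nice -20 -> 40."""
    return 20 - nice


def biased_load(nices):
    """Biased load of a run queue, scaled so one nice-0 task equals SCHED_LOAD_SCALE."""
    return sum(toy_bias(n) * SCHED_LOAD_SCALE // toy_bias(0) for n in nices)


def apparent_tasks(load):
    """Task count inferred by logic that assumes SCHED_LOAD_SCALE == one task."""
    return load // SCHED_LOAD_SCALE


# One nice -20 task alone on a CPU looks like two "unit" tasks in this model,
# so logic built on the unit-load assumption may see something worth moving.
lone_high_prio = biased_load([-20])
print(lone_high_prio, apparent_tasks(lone_high_prio))
```

Whether the real balancer actually moves the task depends on checks that this sketch does not model.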

This is a bit of a surprise. Surely, even with this mod, a task shouldn't be moved if it's the only runnable one on its CPU. If it isn't the only runnable one on its CPU, it's not actually on the CPU and it's not cache hot then moving it to another (presumably) idle CPU should be a gain?

Presumably the delay waiting for the current task to exit the CPU is less than the time taken to move the task to the new CPU? I'd guess that this means that the task about to be moved is either: a) higher priority than the current task on the CPU and is waiting for it to be preempted off or b) it's equal priority (or at least next one due to be scheduled) to the current task, waiting for the current task to surrender the CPU and that surrender is going to happen pretty quickly due to the current task's natural behaviour?

Is it normal to run enough -20 tasks to cause this problem to manifest?

b) On a lightly loaded system, this can result in HT scheduler optimizations
being disabled in presence of low priority tasks... in this case, they (low
priority ones) can end up running on the same package, even in the presence of other idle packages. Though this is not as serious as "a" above...

I think that this issue comes under the heading of "Result of better nice enforcement" which is the purpose of the patch :-). I wouldn't call this HT disablement, or do I misunderstand the issue?

The only way that I can see load balancing subverting the HT scheduling mechanisms is if (say) there are 2 CPUs with 2 HT channels each and all of the high priority tasks end up sharing the 2 channels of one CPU while all of the low priority tasks share the 2 channels of the other one. This scenario is far more likely to happen without the smpnice patches than with them.
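
A toy comparison of that scenario, in Python rather than kernel code. The weight function toy_weight() and the nice values used are assumptions chosen for illustration only. It contrasts a balancer that only counts tasks with one that sums nice-derived weights, for two packages of two HT channels each.

```python
def toy_weight(nice):
    """Hypothetical nice-to-weight mapping (assumption): lower nice -> heavier."""
    return 20 - nice


def package_load(nices, weighted):
    """Load of one package: summed weights if weighted, else a plain task count."""
    return sum(toy_weight(n) for n in nices) if weighted else len(nices)


def looks_balanced(pkg_a, pkg_b, weighted):
    """True if the two packages appear equally loaded under the chosen metric."""
    return package_load(pkg_a, weighted) == package_load(pkg_b, weighted)


high_prio_pkg = [-10, -10]  # both high priority tasks share one package
low_prio_pkg = [10, 10]     # both low priority tasks share the other

# Counting tasks sees 2 vs 2 and leaves the lopsided placement alone.
print(looks_balanced(high_prio_pkg, low_prio_pkg, weighted=False))
# Summing weights sees 60 vs 20 and flags an imbalance.
print(looks_balanced(high_prio_pkg, low_prio_pkg, weighted=True))
```

In this model only the weighted metric notices that the priorities are badly distributed, which is the sense in which the scenario is more likely without nice-aware balancing.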


Thanks very much for discovering those things.

That rather leaves us in a pickle wrt 2.6.16.

It looks like we back out smpnice after all?

Whatever we do, time is pressing.

I don't think either of these issues warrant abandoning smpnice. The first is highly unlikely to occur on real systems and the second is just an example of the patch doing its job (maybe too officiously). I don't think users would notice either on real systems.

Even if you pull it from 2.6.16 rather than upgrading it with my patch can you please leave both in -mm?

I think that there are a few inexpensive things that can be tried before we go as far as sophisticated solutions (such as guesstimating how long a task will have to wait for CPU access if we don't move it). E.g. with this patch move_tasks() takes two arguments: a) maximum number of tasks to be moved and b) maximum amount of biased load to be moved; and for normal use is just passed max(nr_running - 1, 0) as the first of these arguments and the move is controlled by the second but we could modify find_busiest_group() to give us values for both arguments. Other options include modifying the function that maps nice to bias_prio so that the weights aren't quite so heavy. Leaving the patches in -mm would allow some of these options to be tested.
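
A rough sketch of the two-limit idea, as a Python toy rather than the kernel's move_tasks(). The function toy_move_tasks(), its skip-if-too-heavy rule and the load values are assumptions for illustration; the real function's selection logic is not reproduced here.

```python
def toy_move_tasks(task_loads, max_nr_move, max_load_move):
    """Pick tasks to migrate, honouring both a task-count and a biased-load limit.

    task_loads: biased load of each queued task (toy values).
    Returns (moved, remaining) as two lists of loads.
    """
    moved, remaining = [], []
    moved_load = 0
    for load in task_loads:
        fits_count = len(moved) < max_nr_move
        fits_load = moved_load + load <= max_load_move
        if fits_count and fits_load:
            moved.append(load)
            moved_load += load
        else:
            # Assumption: a task that would exceed either limit is skipped.
            remaining.append(load)
    return moved, remaining


# A lone heavy task: the count limit max(nr_running - 1, 0) is 0, so it stays put.
queue = [256]
print(toy_move_tasks(queue, max(len(queue) - 1, 0), 256))

# Three tasks with a load limit of 200: only one 128-unit task is moved.
queue = [256, 128, 128]
print(toy_move_tasks(queue, max(len(queue) - 1, 0), 200))
```

Having find_busiest_group() supply both limits, as suggested above, would correspond to passing a tighter max_nr_move than max(nr_running - 1, 0) in this model.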

Peter
--
Peter Williams                                   pwil3058@bigpond.net.au

"Learning, n. The kind of ignorance distinguishing the studious."
 -- Ambrose Bierce
