Andrew Morton wrote:
> "Siddha, Suresh B" <[email protected]> wrote:
>> On Tue, Feb 07, 2006 at 03:36:17PM -0800, Andrew Morton wrote:
>>> Suresh, Martin, Ingo, Nick and Con: please drop everything, triple-check
>>> and test this:
>>>
>>> From: Peter Williams <[email protected]>
>>>
>>> This is a modified version of Con Kolivas's patch to add "nice" support to
>>> load balancing across physical CPUs on SMP systems.
>> I have a couple of issues with this patch.
>>
>> a) On a lightly loaded system, this will result in a higher priority job
>> hopping around from one processor to another. This is because the code in
>> find_busiest_group() assumes that SCHED_LOAD_SCALE represents a unit
>> process load, and with the nice_to_bias calculations this is no longer
>> true (in the presence of non nice-0 tasks).
>>
>> My testing showed that 178.galgel in SPECfp2000 is down by ~10% when run
>> with nice -20 on a 4P (8-way with HT) system, compared to a nice-0 run.
This is a bit of a surprise.  Surely, even with this mod, a task shouldn't
be moved if it's the only runnable one on its CPU.  If it isn't the only
runnable one on its CPU, isn't actually on the CPU, and isn't cache hot,
then moving it to another (presumably) idle CPU should be a gain?

Presumably the delay waiting for the current task to exit the CPU is less
than the time taken to move the task to the new CPU?  I'd guess this means
that the task about to be moved is either: a) higher priority than the
current task on the CPU and waiting for it to be preempted off, or b) equal
in priority (or at least the next one due to be scheduled) to the current
task and waiting for it to surrender the CPU, with that surrender due to
happen fairly quickly because of the current task's natural behaviour?

Is it normal to run enough nice -20 tasks for this problem to manifest?
>> b) On a lightly loaded system, this can result in the HT scheduler
>> optimizations being disabled in the presence of low priority tasks... in
>> this case, they (the low priority ones) can end up running on the same
>> package, even in the presence of other idle packages.  Though this is not
>> as serious as "a" above...
I think this issue comes under the heading of "result of better nice
enforcement", which is the purpose of the patch :-).  I wouldn't call this
HT disablement, or do I misunderstand the issue?

The only way that I can see load balancing subverting the HT scheduling
mechanisms is if (say) there are 2 CPUs with 2 HT channels each and all of
the high priority tasks end up sharing the 2 channels of one CPU while all
of the low priority tasks share the 2 channels of the other one.  This
scenario is far more likely to happen without the smpnice patches than
with them.
> Thanks very much for discovering those things.
>
> That rather leaves us in a pickle wrt 2.6.16.  It looks like we back out
> smpnice after all?
>
> Whatever we do, time is pressing.
I don't think either of these issues warrants abandoning smpnice.  The
first is highly unlikely to occur on real systems and the second is just
an example of the patch doing its job (maybe too officiously).  I don't
think users would notice either on real systems.

Even if you pull it from 2.6.16 rather than upgrading it with my patch,
can you please leave both in -mm?
I think there are a few inexpensive things that can be tried before we go
as far as sophisticated solutions (such as guesstimating how long a task
will have to wait for CPU access if we don't move it).  E.g. with this
patch move_tasks() takes two arguments: a) the maximum number of tasks to
be moved and b) the maximum amount of biased load to be moved.  For normal
use it is just passed max(nr_running - 1, 0) as the first of these
arguments and the move is controlled by the second, but we could modify
find_busiest_group() to give us values for both arguments.  Other options
include modifying the function that maps nice to bias_prio so that the
weights aren't quite so heavy.  Leaving the patches in -mm would allow
some of these options to be tested.
Peter
--
Peter Williams [email protected]
"Learning, n. The kind of ignorance distinguishing the studious."
-- Ambrose Bierce
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/