Re: allow the load to grow upto its cpu_power (was Re: [Patch] don't kick ALB in the presence of pinned task)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Siddha, Suresh B wrote:

On Fri, Aug 12, 2005 at 09:49:36AM +1000, Nick Piggin wrote:

Well, it is a departure from our current idea of balancing.


That idea is already changing from the first line of the patch. And the change is "allowing the load to grow upto the sched group's cpu_power"



But this is the intended behaviour of the scheduler. IMO
it was a bug in the implementation.

I would prefer to use my patch initially to fix the _bug_
you found, then we can think about changing policy for
power savings.


Second part of the patch is just a logical extension of the
first one.


Main things I'm worried about:

Idle time regressions that pop up any time we put
restrictions on balancing.


We are talking about lightly loaded case anyway. We are not changing
the behavior of a heavily loaded system.



But if the system is going idle, it _looks_ like it is lightly
loaded to the CPU scheduler. However, it is imperitive that
any spare tasks get moved to idle CPUs in order to keep things
like IO going.

It seems to be a problem with database workloads.

This can tend to unbalance memory controllers (eg. on POWER5,
CMP Opteron) which can be a performance problem on those
systems.


We will do that already with the first line in the patch.


But that's because the tasks are already running on that CPU, and
affinity is more important in that case (especially for NUMA systems).

If we want to distribute uniformly among the memory controllers, cpu_power of the group should reflect it (shared resources bottle neck).



I don't mean that we want to completely distribute load evenly over
memory controllers, but I think it is better not to introduce this kind
of bias to the scheduler. At least not without some justification.

Lastly, complexity in the calculation.


My first patch is a simple straight fwd one.



It is simpler than the second, but it introduces (yet another) special
case in find_busiest_group.

I'm not saying that I would reject any patch that did this or changed
behaviour in the way that you would propose, however I would like
to merge the version I sent as a bug fix first.

Thanks,
Nick


Send instant messages to your online friends http://au.messenger.yahoo.com -
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]
  Powered by Linux