Christoph Lameter wrote on Sunday, November 12, 2006 9:45 PM
> > (2) we should initiate load balance within a domain only from least
> > loaded group.
>
> This would mean we would have to determine the least loaded group first.
Well, find_busiest_group() scans every single bloody CPU in the system at
the highest sched_domain level. Since this function is capable of finding the
busiest group within a domain, it should be able to determine the least
loaded group for free, because it has already scanned every group within the domain.
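
To illustrate the "for free" point, here is a minimal user-space sketch, not the kernel's find_busiest_group(). It assumes each group has already been reduced to a single aggregate load value; struct group_load, struct scan_result and scan_groups() are hypothetical names made up for this example. The same pass that tracks the maximum can track the minimum at the cost of one extra comparison per group.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a sched_group: only its aggregate load. */
struct group_load {
    unsigned long load;
};

/* Indices of the two extremes found by a single scan. */
struct scan_result {
    size_t busiest;
    size_t least;
};

/* One pass over the groups records both the busiest and the least loaded. */
static struct scan_result scan_groups(const struct group_load *g, size_t n)
{
    struct scan_result r = { 0, 0 };
    size_t i;

    assert(n >= 1); /* an empty domain has no extremes */

    for (i = 1; i < n; i++) {
        if (g[i].load > g[r.busiest].load)
            r.busiest = i;
        if (g[i].load < g[r.least].load)
            r.least = i;
    }
    return r;
}
```

On ties this keeps the first group seen, which is an arbitrary choice made for the sketch.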
> > Part of all this problem probably stemmed from "load balance" is incapable
> > of performing l-d between arbitrary pair of CPUs, and tightly tied load scan
> > and actual l-d action. And on top of that l-d is really a pull operation
> > to current running CPU. All these limitations dictate that every CPU somehow
> > has to scan and pull. It is extremely inefficient on large system.
>
> Right. However, if we follow this line of thought then we will be
> redesigning the load balancing logic.
It wouldn't be a bad idea to redesign it ;-)
There are a number of other oddities in its design besides what has already been identified:
(1) Several sched_groups are statically declared, so they will reside on the
boot node. I would expect cross-node memory access to be expensive, and
every CPU will access these data structures repeatedly.
static struct sched_group sched_group_cpus[NR_CPUS];
static struct sched_group sched_group_core[NR_CPUS];
static struct sched_group sched_group_phys[NR_CPUS];
(2) Load balance staggering. A number of people have pointed out that it is
overdone.
(3) The for_each_domain() loop in rebalance_tick() behaves differently from
the one in idle_balance(): it traverses all the sched domains even if a
lower-level domain succeeded in moving some tasks. I would expect us either
to break out of the for loop as idle_balance() does, or to somehow update
the load for the current CPU so that it gets an accurate load value when
doing l-d at the next level. Currently, it is doing neither.
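
The difference between the three behaviours in (3) can be shown with a toy user-space model. This is only a sketch under loud assumptions: it is not the rebalance_tick() code, each domain level is reduced to a single average load, and a balance attempt is assumed to pull exactly the gap between that average and the load value the balancer believes the CPU has. The names walk_policy, walk_domains() and level_avg are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum walk_policy {
    WALK_STALE,   /* keep walking with the load value read at the start */
    WALK_BREAK,   /* stop at the first level that moved tasks */
    WALK_REFRESH  /* keep walking, but re-read the load after each level */
};

/*
 * Walk the levels from lowest to highest and return the CPU's final load.
 * level_avg[i] is the assumed average load at level i.
 */
static unsigned long walk_domains(const unsigned long *level_avg, size_t n,
                                  unsigned long cpu_load,
                                  enum walk_policy policy)
{
    unsigned long seen = cpu_load;   /* load the balancer believes */
    unsigned long actual = cpu_load; /* load the CPU really has */
    size_t i;

    assert(level_avg != NULL);

    for (i = 0; i < n; i++) {
        /* Pull only when this level looks busier than we think we are. */
        unsigned long pulled =
            level_avg[i] > seen ? level_avg[i] - seen : 0;

        actual += pulled;
        if (pulled && policy == WALK_BREAK)
            break;
        if (policy == WALK_REFRESH)
            seen = actual;
    }
    return actual;
}
```

In this model, with level averages of 4 and 6 and a starting load of 1, the stale walk pulls at both levels against the old value and overshoots, while the other two policies end at or below the higher-level average.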