On Wed, 2007-08-08 at 10:36 -0700, Christoph Lameter wrote:
> On Wed, 8 Aug 2007, Mel Gorman wrote:
>
> > These are the range of performance losses/gains I found when running against
> > 2.6.23-rc1-mm2. The set and these machines are a mix of i386, x86_64 and
> > ppc64 both NUMA and non-NUMA.
> >
> > Total CPU time on Kernbench: -0.20% to 3.70%
> > Elapsed time on Kernbench: -0.32% to 3.62%
> > page_test from aim9: -2.17% to 12.42%
> > brk_test from aim9: -6.03% to 11.49%
> > fork_test from aim9: -2.30% to 5.42%
> > exec_test from aim9: -0.68% to 3.39%
> > Size reduction of pg_data_t: 0 to 7808 bytes (depends on alignment)
>
> Looks good.
>
> > o Remove bind_zonelist() (Patch in progress, very messy right now)
>
> Will this also allow us to avoid always hitting the first node of an
> MPOL_BIND first?

An idea:

Apologies if someone already suggested this and I missed it. Too much
traffic...

Instead of passing a zonelist for BIND policy, how about passing [to
__alloc_pages(), I think] a starting node, a nodemask, and gfp flags for
zone and modifiers? For the various policies, the arguments would look
like this:

Policy      start node       nodemask
default     local node       cpuset_current_mems_allowed
preferred   preferred_node   cpuset_current_mems_allowed
interleave  computed node    cpuset_current_mems_allowed
bind        local node       policy nodemask [replaces bind
                             zonelist in mempolicy]
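
To make the table concrete, here is a rough userspace sketch of how each
policy might be turned into the (start node, nodemask) pair. All of the
names here (policy_sketch, alloc_args, policy_to_args) are made up for
illustration and are not kernel interfaces; the interleave node is assumed
to be computed by the caller, and the bind nodemask is assumed to have been
restricted to the cpuset when the policy was installed.

```c
/* Hypothetical, simplified types; not the kernel's struct mempolicy. */
enum policy_mode { POL_DEFAULT, POL_PREFERRED, POL_INTERLEAVE, POL_BIND };

struct policy_sketch {
    enum policy_mode mode;
    int preferred_node;      /* used only by POL_PREFERRED */
    unsigned long nodes;     /* policy nodemask, used only by POL_BIND */
};

struct alloc_args {
    int start_node;          /* node whose zonelist would be walked */
    unsigned long nodemask;  /* nodes the allocation may come from */
};

/*
 * Map a policy to the arguments from the table above.
 * local_node:      node of the CPU doing the allocation
 * interleave_node: next interleave node, computed by the caller (assumption)
 * cpuset_allowed:  stands in for cpuset_current_mems_allowed
 */
static struct alloc_args policy_to_args(const struct policy_sketch *pol,
                                        int local_node,
                                        int interleave_node,
                                        unsigned long cpuset_allowed)
{
    /* default policy: local node, cpuset-wide nodemask */
    struct alloc_args a = { local_node, cpuset_allowed };

    switch (pol->mode) {
    case POL_PREFERRED:
        a.start_node = pol->preferred_node;
        break;
    case POL_INTERLEAVE:
        a.start_node = interleave_node;
        break;
    case POL_BIND:
        /* start node stays local; only the nodemask changes */
        a.nodemask = pol->nodes;
        break;
    case POL_DEFAULT:
        break;
    }
    return a;
}
```

Only bind changes the nodemask; the other policies differ just in where
the walk starts.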

Then, just walk the zonelist for the starting node--already ordered by
distance--filtering by gfp_zone() and nodemask. Done "right", this
should always return memory from the closest allowed node [based on the
nodemask argument] to the starting node. It would also eliminate the
custom zonelists for bind policy, and could eliminate the cpuset checks
in the allocation loop, because that constraint would already be applied
to the nodemask argument.
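
A minimal userspace sketch of that walk, assuming toy stand-ins for the
kernel structures: zone_ref, zonelist, zonelist_add() and
first_allowed_zone() are all hypothetical names, the nodemask is a plain
bitmask, and highest_zone_idx plays the role of the gfp_zone() result.
Watermarks and the real allocator paths are ignored; this only shows the
filtering.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ZONEREFS 16   /* arbitrary bound for this toy model */

/* Hypothetical, simplified stand-in for a zonelist entry. */
struct zone_ref {
    int node;         /* node this zone belongs to */
    int zone_idx;     /* zone type index; higher = less restricted */
    long free_pages;  /* free pages in this toy model */
};

/* Entries are assumed to be ordered by distance from the start node. */
struct zonelist {
    int nr;
    struct zone_ref zones[MAX_ZONEREFS];
};

/* Append one entry; the caller is responsible for distance ordering. */
static void zonelist_add(struct zonelist *zl, int node, int zone_idx,
                         long free_pages)
{
    assert(zl->nr < MAX_ZONEREFS);
    zl->zones[zl->nr].node = node;
    zl->zones[zl->nr].zone_idx = zone_idx;
    zl->zones[zl->nr].free_pages = free_pages;
    zl->nr++;
}

/*
 * Walk the start node's zonelist, skipping zones above highest_zone_idx
 * and zones on nodes not set in nodemask. Returns the first remaining
 * zone with free pages, or NULL if nothing qualifies.
 */
static struct zone_ref *first_allowed_zone(struct zonelist *zl,
                                           unsigned long nodemask,
                                           int highest_zone_idx)
{
    for (int i = 0; i < zl->nr; i++) {
        struct zone_ref *z = &zl->zones[i];

        if (z->zone_idx > highest_zone_idx)
            continue;               /* zone type not allowed by gfp flags */
        if (!(nodemask & (1UL << z->node)))
            continue;               /* node not allowed by policy/cpuset */
        if (z->free_pages <= 0)
            continue;               /* nothing to allocate here */
        return z;
    }
    return NULL;
}
```

Because the list is already in distance order, the first hit is the
closest allowed node, and there is no separate cpuset check inside the
loop.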

The fast path--when we hit in the target zone on the starting
node--might be faster. Once we have to start falling back to other
nodes/zones, we've pretty much fallen off the fast path anyway, I think.

Bind policy would take a hit when the nodemask does not include the
local node from which the allocation occurs: the walk would have to skip
the local node's zones every time, so this would always be a fallback
case.

Too backed up to investigate further right now.

I will add Mel's patches to my test tree, though.

Lee