pj wrote:
> On a different point, we could, if it was worth the extra bit of code,
> improve the current code's handling of mempolicy rebinding when the
> cpuset adds memory nodes. If we kept both the original cpusets
> mems_allowed, and the original MPOL_INTERLEAVE nodemask requested by
> the user in a call to set_mempolicy, then we could rebind (nodes_remap)
> the currently active policy v.nodes using that pair of saved masks to
> guide the rebinding. This way, if say a cpuset shrunk, then regrew back
> to its original size (original number of nodes) we would end up
> replicating the original MPOL_INTERLEAVE request, cpuset relative.
> This would provide a more accurate cpuset relative translation of such
> memory policies with-out- changing the set_mempolicy API. Hmmm ... this
> might meet your needs entirely, so that we did not need -any- added
> flags to the API.
Thinking about this some more ... there's daylight here!
I'll see if I can code up a patch for this now, but the idea is
to allow user code to specify any nodemask to a set_mempolicy
MPOL_INTERLEAVE call, even including nodes not in their cpuset,
and then (1) use nodes_remap() to fold that mask down to whatever
is their current cpuset (2) remember what they passed in and use
it again with nodes_remap() to re-fold that mask down, anytime
the cpuset changes.
For example, if they pass in a mask with all bits sets, then they
get interleave over all the nodes in their current cpuset, even as
that cpuset changes. If they pass in a mask with say just two
bits set, then they will get interleave over just two nodes anytime
they are in a cpuset with two or more nodes (when in a single node
cpuset, they will of course get no interleave, for lack of anything
to interleave over.)
This should replace the patches that David is proposing here. It should
replace what Lee is proposing. It should work with libnuma and be
fully upward compatible with current code (except perhaps code that
depends on getting an error from requesting MPOL_INTERLEAVE on a node
not allowed.)
And instead of just covering the special case of "interleave over all
available nodes" it should cover the more general case of interleaving
over any subset of nodes, folded or replicated to handle being in any
cpuset.
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.925.600.0401
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]