David wrote:
> It all comes down to the decision of whether we want to permit
> set_mempolicy() calls for tasks and respect the nodemask passed in a
> memory_spread_user cpuset.
Well said. My current inclination is to say:
1) If you want the current behaviour, where set_mempolicy(MPOL_INTERLEAVE)
calls mean what they say, and cpusets tries as best it can (imperfectly)
to honor those memory policy calls, even in the face of changing cpusets,
then leave memory_spread_user turned off (the default, right?)
2) If you want MPOL_INTERLEAVE tasks to interleave user memory across all
nodes in the cpuset, whatever that might be, then enable memory_spread_user.
This is admittedly less flexible than your patch provided, but I am
attracted to the simpler API - easier to explain.
This does beg yet another question: shouldn't memory_spread_user force
interleaving of user memory -regardless- of mempolicy.
And yet another question: what about the MPOL_BIND mempolicy? It too,
to a lesser extent, has the same problems with cpusets that shrink and
then expand. Several tasks in a cpuset with multiple nodes could carefully
bind to a separate node each, but then end up collapsed all onto the same
node if the cpuset was shrunk to one node and then expanded again.
I should sleep on this, and hopefully respond again, within this day.
On a different point, we could, if it was worth the extra bit of code,
improve the current code's handling of mempolicy rebinding when the
cpuset adds memory nodes. If we kept both the original cpusets
mems_allowed, and the original MPOL_INTERLEAVE nodemask requested by
the user in a call to set_mempolicy, then we could rebind (nodes_remap)
the currently active policy v.nodes using that pair of saved masks to
guide the rebinding. This way, if say a cpuset shrunk, then regrew back
to its original size (original number of nodes) we would end up
replicating the original MPOL_INTERLEAVE request, cpuset relative.
This would provide a more accurate cpuset relative translation of such
memory policies with-out- changing the set_mempolicy API. Hmmm ... this
might meet your needs entirely, so that we did not need -any- added
flags to the API.
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <[email protected]> 1.925.600.0401
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]