Re: [RFC PATCH] Dynamic sched domains aka Isolated cpusets

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Simon wrote:
> I guess we hit a limit of the filesystem-interface approach here.
> Are the possible failure reasons really that complex ?

Given the amount of head scratching my proposal has provoked
so far, they might be that complex, yes.  Failure reasons
include:
 * The cpuset Foo whose domain_cpu_rebuild file we wrote does
   not align with the current partition of CPUs on the system
   (align: every subset of the partition is either within or
   outside the CPUs of Foo)
 * The cpusets Foo and its descendents which are marked with
   a true domain_cpu_pending do not form a partition of the
   CPUs in Foo.  This could be either because two of these
   cpusets have overlapping CPUs, or because the union of all
   the CPUs in these cpusets doesn't cover.
 * The usual other reasons such as lacking write permission.

> If this is only to get a hint, OK.

Yes - it would be a hint.  The official explanation would be the
errno setting on the failed write.  The hint, written to the
domain_cpu_error file, could actually state which two cpusets
had overlapping CPUs, or which CPUs in Foo were not covered by
the union of the CPUs in the marked descendent cpusets.

Yes - it pushing the limits of available mechanisms.  Though I don't
offhand see where the filesystem-interface approach is to blame here.
Can you describe any other approach that would provide such a similarly
useful error explanation in a less unusual fashion?

> Is such an error reporting scheme already in use in the kernel ?

I don't think so.

> On the other hand, there's also no guarantee that what we are triggering 
> by writing in domain_cpu_rebuild is what we have set up by writing in 
> domain_cpu_pending. User applications will need a bit of self-discipline.

True.

To preserve the invariant that the CPUs in the selected cpusets form a
partition (disjoint cover) of the systems CPUs, we either need to
provide an atomic operation that allows passing in a selection of
cpusets, or we need to provide a sequence of operations that essentially
drive a little finite state machine, building up a description of the
new state while the old state remains in place, until the final trigger
is fired.

This suggests what the primary alternative to my proposed API would be,
and that would be an interface that let one pass in a list of cpusets,
requesting that the partition below the specified cpuset subtree Foo be
completely and atomically rebuilt, to be that defined by the list of
cpusets, with the set of CPUS in each of these cpusets defining one
subset of the partition.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <[email protected]> 1.650.933.1373, 1.925.600.0401
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux