Re: [patch] sched: fix sched-domains partitioning by cpusets

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wednesday 03 October 2007 19:21, Paul Jackson wrote:
> Nick wrote:
> > Sorry for the confusion: I only meant the sched.c part of that
> > patch, not the full thing.
>
> Ah - ok.  We're getting closer then.  Good.
>
> Let me be sure I've got this right then.
>
> You prefer the interface from your proposed patch, by which the
> cpuset code passes sched domain requests to the scheduler code a single
> cpumask that will define a sched domain:
>
>     int partition_sched_domains(cpumask_t *partition)
>
> and I am suggesting instead a new and different interface:

Yep.

>
>     void partition_sched_domains(int ndoms_new, cpumask_t *doms_new)
>
> In the first API, one cpumask is passed in, and a single sched
> domain is formed, taking those CPUs from any sched domain they
> might have already been a member of, into this new sched domain.
>
> In the second API, the entire flat partitioning is passed in,
> giving an array of masks, one mask for each desired sched domain.
> The passed in masks do not overlap, but might not cover all CPUs.
>
> Question -- how does one turn off load balancing on some CPUs
> using the first API?
>
>     Does one do this by forming singleton sched domains of one
>     CPU each?  Is there any downside to doing this?

Yes, and no (it does the same thing as your version internally, so
no downside).


>     The simplest cpuset code to work with this would end up exposing
>     this method of disabling load balancing to user space, forcing
>     users to create cpusets with one CPU each to be able do disable
>     load balancing.
>
>     However a little bit of additional kernel cpuset code could hide
>     this detail from user space, by recognizing when the user had
>     asked to turn off load balancing on some larger cpuset, and by
>     then calling partition_sched_domains() multiple times, once for
>     each CPU in that cpuset.

Yeah: do all that in cpusets. It's already information you would have
to derive in order to make it work properly anyway. If you are not
passing in the singleton domains ATM, then they will not get properly
detached and isolated.


> There might be an even simpler way.  If the kernel/sched.c routines
> detach_destroy_domains() and build_sched_domains() were exposed as
> external routines, then the cpuset code could call them directly,
> removing the partition_sched_domains() routine from sched.c entirely.
> Would this be worth persuing?

detach_destroy and build are things that may get reimplemented to
suit different capabilities (eg. we might want to have multiple trees of
domains for each CPU). So I think it is important to expose the simple
partition API which should be unambiguous and stable.

It's not a huge deal, but I'd like to keep partition_sched_domains. After
my patch, it's really simple.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux