Re: [RFC][PATCH 0/4] Generic container system

Hi Paul,

Thanks for doing the exercise of removing the container part of cpuset
to provide some process aggregation.

With this model, I think I agree with you that RG can be split into
individual controllers (need to look at it closely).

I have a few questions/concerns w.r.t. this implementation:

- Since we are re-implementing anyway, why not use configfs instead of
  having our own filesystem?
- I am a little nervous about notify_on_release, as RG would want
  classes/RGs to remain available even when they have no tasks or
  sub-classes. (The documentation says that the user level program can
  rmdir the container, which would be a problem.) Can the user level
  program _not_ be called when there are other subsystems registered?
  Also, shouldn't it be cpuset specific, instead of global?
- Export of the locks: These locks protect the container data structures.
  But most of their uses in cpuset.c are to protect the cpuset data
  structure itself. Shouldn't the cpuset subsystem have its own locks?
  IMO, these locks should be used by a subsystem only when it wants data
  integrity in the container data structure itself (like walking through
  the sibling list).
- Tight coupling of subsystems: I like your idea (you mentioned it in a
  reply to the previous thread) of having an array of containers in the
  task structure rather than the current implementation.
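
To make the last point concrete, here is a minimal userspace sketch of a
per-task array of subsystem state pointers, indexed by subsystem id. All
names (struct task, struct subsys_state, MAX_SUBSYS, task_attach,
task_state) are hypothetical and not taken from the patch set; a real
version would sit in task_struct and would need proper locking.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SUBSYS 4	/* hypothetical bound on registered subsystems */

/* Hypothetical per-subsystem state shared by a group of tasks. */
struct subsys_state {
	int subsys_id;	/* index into task->state[] */
	int refcount;	/* number of tasks currently attached */
};

/* Hypothetical stand-in for the relevant part of task_struct. */
struct task {
	struct subsys_state *state[MAX_SUBSYS];
};

/* Attach a task to a subsystem's state, dropping any previous one. */
static void task_attach(struct task *t, struct subsys_state *s)
{
	struct subsys_state *old;

	assert(s->subsys_id >= 0 && s->subsys_id < MAX_SUBSYS);
	old = t->state[s->subsys_id];
	if (old)
		old->refcount--;
	t->state[s->subsys_id] = s;
	s->refcount++;
}

/* Look up a task's state for one subsystem; the others are not touched. */
static struct subsys_state *task_state(const struct task *t, int id)
{
	assert(id >= 0 && id < MAX_SUBSYS);
	return t->state[id];
}
```

With this layout each subsystem only ever looks at its own slot, so
moving a task between groups of one subsystem does not involve the
others.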

regards,

chandra

On Mon, 2006-10-02 at 02:53 -0700, Paul Menage wrote:
> This is essentially the same as the patch set that I posted last week,
> with the following fixes/changes:
> 
> - CONFIG_CONTAINERS is no longer a user-selectable option - subsystems
>   such as cpusets that require it should select it in Kconfig.
> 
> - Each container subsystem type now has a name, and a <name>_enabled
>   file in the top container directory. This file contains 0 or 1 to
>   indicate whether the container subsystem is enabled, and can only be
>   modified when there are no subcontainers; disabled container subsystems
>   don't get new instances created when a subcontainer is created; the
>   subsystem-specific state is simply inherited from the parent
>   container.
> 
> - include a config option to default to enabled, for backwards
>   compatibility
> 
> - Documentation tweaks
> 
> - builds properly without CONFIG_CONTAINER_CPUACCT configured on
> 
> - should build properly with newer gccs. (I've not actually had a
>   chance to try building it with anything newer than gcc 3.2.2, but I've
>   fixed all the potential warnings/errors that PaulJ pointed out when
>   compiling with some unspecified newer gcc).
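
The <name>_enabled semantics described above can be sketched in
userspace C roughly as follows. This is only an illustration under
stated assumptions: struct container, struct subsys, set_enabled and
create_child are hypothetical names, and returning -EBUSY / -EINVAL is
my guess at reasonable error codes, not something the patch set states.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical subsystem-specific state (e.g. what a cpuset would hold). */
struct subsys_state {
	int value;
};

/* Hypothetical subsystem descriptor with its 0/1 "enabled" flag. */
struct subsys {
	int enabled;
};

/* Hypothetical container node in the hierarchy. */
struct container {
	struct container *parent;
	int nr_children;
	struct subsys_state *state;
	int owns_state;	/* 1 if state was allocated for this container */
};

/* Write handler for <name>_enabled on the top container. */
static int set_enabled(struct subsys *ss, struct container *top, int val)
{
	if (val != 0 && val != 1)
		return -EINVAL;		/* assumed error code */
	if (top->nr_children > 0)
		return -EBUSY;		/* assumed: refuse once subcontainers exist */
	ss->enabled = val;
	return 0;
}

/* mkdir: a new instance if enabled, otherwise inherit the parent's state. */
static struct container *create_child(struct subsys *ss,
				      struct container *parent)
{
	struct container *c;

	assert(parent != NULL);
	c = calloc(1, sizeof(*c));
	if (!c)
		return NULL;
	c->parent = parent;
	if (ss->enabled) {
		c->state = calloc(1, sizeof(*c->state));
		if (!c->state) {
			free(c);
			return NULL;
		}
		c->owns_state = 1;
	} else {
		c->state = parent->state;	/* shared, not copied */
	}
	parent->nr_children++;
	return c;
}
```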
> 
> I've also looked at converting ResGroups to be a client of the
> container system. This isn't yet complete; my thoughts so far include:
> 
> - each resource controller can be implemented as an independent
>   container subsystem; rather than a single "shares" and "stats" file
>   in each directory there will be e.g. "numtasks_shares",
>   "cpurc_shares", etc
> 
> - the ResGroups core will basically end up as a library that provides
>   the common parsing/displaying for the shares and stats file for each
>   controller, and the logic for propagating resources up and down the
>   parent/child tree.
> 
> - for some of the resource controllers we will probably require a few
>   extra callbacks from the container system, e.g. at fork/exit time.
>   I might make these a config option that the controller must "select"
>   in Kconfig, to avoid extra locking/overhead for subsystems such as
>   cpusets that don't require such callbacks.
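
One way to picture the optional fork/exit callbacks is a per-subsystem
ops table in which the hooks may be left NULL. Note that this sketch
checks for NULL at run time, which is a swapped-in simplification; the
text above proposes a Kconfig "select" instead. Everything here
(struct subsys_ops, register_subsys, container_fork, container_exit, the
numtasks counter) is hypothetical and written as plain userspace C.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SUBSYS 4	/* hypothetical bound on registered subsystems */

struct task {
	int pid;
};

/* Hypothetical ops table; fork/exit may be NULL if not needed. */
struct subsys_ops {
	const char *name;
	void (*fork)(struct task *child);
	void (*exit)(struct task *t);
};

static const struct subsys_ops *subsys[MAX_SUBSYS];
static int nr_subsys;

static int register_subsys(const struct subsys_ops *ops)
{
	assert(ops != NULL);
	if (nr_subsys == MAX_SUBSYS)
		return -1;
	subsys[nr_subsys++] = ops;
	return 0;
}

/* Called by the (hypothetical) core at fork time. */
static void container_fork(struct task *child)
{
	int i;

	for (i = 0; i < nr_subsys; i++)
		if (subsys[i]->fork)
			subsys[i]->fork(child);
}

/* Called by the (hypothetical) core at exit time. */
static void container_exit(struct task *t)
{
	int i;

	for (i = 0; i < nr_subsys; i++)
		if (subsys[i]->exit)
			subsys[i]->exit(t);
}

/* A controller that needs the hooks: counts live tasks. */
static int numtasks;
static void numtasks_fork(struct task *t) { (void)t; numtasks++; }
static void numtasks_exit(struct task *t) { (void)t; numtasks--; }

static const struct subsys_ops numtasks_ops = {
	"numtasks", numtasks_fork, numtasks_exit
};

/* A subsystem that does not need them, like cpusets in the text above. */
static const struct subsys_ops cpuset_ops = { "cpuset", NULL, NULL };
```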
> 
> -------------------------------------
> 
> There have recently been various proposals floating around for
> resource management/accounting subsystems in the kernel, including
> Res Groups, User BeanCounters and others.  These all need the basic
> abstraction of being able to group together multiple processes in an
> aggregate, in order to track/limit the resources permitted to those
> processes, and all implement this grouping in different ways.
> 
> Already existing in the kernel is the cpuset subsystem; this has a
> process grouping mechanism that is mature, tested, and well documented
> (particularly with regard to synchronization rules).
> 
> This patchset extracts the process grouping code from cpusets into a
> generic container system, and makes the cpusets code a client of
> the container system.
> 
> It also provides a very simple additional container subsystem to do
> per-container CPU usage accounting; this is primarily to demonstrate
> use of the container subsystem API, but is useful in its own right.
> 
> The change is implemented in four stages:
> 
> 1) extract the process grouping code from cpusets into a standalone system
> 
> 2) remove the process grouping code from cpusets and hook into the
>   container system
> 
> 3) convert the container system to present a generic API, and make
>   cpusets a client of that API
> 
> 4) add a simple CPU accounting container subsystem as an example
> 
> The intention is that the various resource management efforts can also
> become container clients, with the result that:
> 
> - the userspace APIs are (somewhat) normalised
> 
> - it's easier to test out e.g. the ResGroups CPU controller in
>  conjunction with the UBC memory controller
> 
> - the additional kernel footprint of any of the competing resource
>  management systems is substantially reduced, since it doesn't need
>  to provide process grouping/containment, hence improving their
>  chances of getting into the kernel
> 
> Possible TODOs include:
> 
> - define a convention for populating the per-container directories so
>  that different subsystems don't clash with one another
> 
> - provide higher-level primitives (e.g. an easy interface to seq_file)
>  for files registered by subsystems.
> 
> - support subsystem deregistering
> 
> Signed-off-by: Paul Menage <[email protected]>
> 
> ---
-- 

----------------------------------------------------------------------
    Chandra Seetharaman               | Be careful what you choose....
              - [email protected]   |      .......you may get it.
----------------------------------------------------------------------

