This is an update to my generic containers patch. The major change is
support for multiple hierarchies of containers (up to a limit
specified at build time).

- The mount options passed when mounting a container filesystem
  indicate the set of controllers/subsystems that are wanted in the
  hierarchy - e.g. "mount -t container -o cpuset,numtasks container /foo"
- The default is to try to mount all subsystems
- If a hierarchy with the requested set of subsystems already exists
  then its superblock is reused
- Otherwise (as long as none of the requested subsystems is currently
  in use in any hierarchy) a new hierarchy is created
- Hierarchies with more than one container (i.e. with any children of
  the root container) persist even when unmounted
- /proc/containers shows current hierarchy/subsystem details
- /proc/<pid>/container shows one line for each active hierarchy
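
The mount-time rules above can be modelled in a few lines of userspace
Python. This is only a sketch of the decision logic, not the kernel
code: the class and method names, the "cpu_acct" subsystem name and the
limit of 4 are assumptions made for illustration. It treats the default
mount as a request for every subsystem, and it does not model unmounting
or hierarchy persistence.

```python
MAX_HIERARCHIES = 4  # assumed stand-in for the build-time limit


class MountError(Exception):
    """Raised when a mount request cannot be satisfied."""


class HierarchyTable:
    """Toy model of which subsystems are bound to which hierarchy."""

    def __init__(self, all_subsystems, max_hierarchies=MAX_HIERARCHIES):
        self.all_subsystems = frozenset(all_subsystems)
        self.max_hierarchies = max_hierarchies
        self.hierarchies = []  # one frozenset of subsystem names each

    def mount(self, options=""):
        """Return (subsystem_set, reused) for a mount request."""
        names = [opt for opt in options.split(",") if opt]
        # No options: ask for every known subsystem.
        wanted = frozenset(names) if names else self.all_subsystems
        unknown = wanted - self.all_subsystems
        if unknown:
            raise MountError("unknown subsystems: %s" % sorted(unknown))
        # An existing hierarchy with exactly this set is reused.
        for existing in self.hierarchies:
            if existing == wanted:
                return existing, True
        # Otherwise none of the requested subsystems may be in use.
        in_use = frozenset().union(*self.hierarchies)
        busy = wanted & in_use
        if busy:
            raise MountError("subsystems in use: %s" % sorted(busy))
        if len(self.hierarchies) >= self.max_hierarchies:
            raise MountError("hierarchy limit reached")
        self.hierarchies.append(wanted)
        return wanted, False
```

With this model, mounting "cpuset,numtasks" twice yields the same
hierarchy the second time, while a later request for "cpuset" alone is
refused because cpuset is already bound elsewhere.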
Other changes include:

- ported to 2.6.19-rc5
- per-subsystem/per-container state is no longer just a void * - it
  has some state maintained by the container framework (to handle
  moving subsystems in and out of hierarchies when they are created/released)

Note that this hasn't yet undergone intensive testing following the
multi-hierarchy introduction, but I wanted to get the basic idea out
for comments.

TODOs include:

- figuring out a nice way to handle release notifications now that
  there are multiple hierarchies
-------------------------------------
There have recently been various proposals floating around for
resource management/accounting subsystems in the kernel, including
ResGroups, User BeanCounters and others. These all need the basic
abstraction of being able to group together multiple processes in an
aggregate, in order to track/limit the resources permitted to those
processes, and all implement this grouping in different ways.

Already existing in the kernel is the cpuset subsystem; this has a
process grouping mechanism that is mature, tested, and well documented
(particularly with regard to synchronization rules).

This patchset extracts the process grouping code from cpusets into a
generic container system, and makes the cpusets code a client of
the container system.

It also provides a very simple additional container subsystem to do
per-container CPU usage accounting; this is primarily to demonstrate
use of the container subsystem API, but is useful in its own right.
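
To give a feel for what per-container CPU accounting means, here is a
toy userspace model in Python. It is not the API from the patch: the
Container and CpuAccounting classes, the attach() and charge() methods
and the "ticks" unit are all hypothetical, and the sketch assumes that
time is charged only to the container a task is directly attached to.

```python
from collections import defaultdict


class Container:
    """A node in a container hierarchy (hypothetical model)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


class CpuAccounting:
    """Accumulates CPU ticks per container rather than per task."""

    def __init__(self):
        self.usage = defaultdict(int)  # Container -> total ticks charged
        self.container_of = {}         # pid -> Container

    def attach(self, pid, container):
        # Moving a task only changes where its future ticks are charged.
        self.container_of[pid] = container

    def charge(self, pid, ticks):
        # Charge the ticks to whichever container the task is in now.
        self.usage[self.container_of[pid]] += ticks
```

Two tasks attached to the same container then contribute to a single
usage total, which is the aggregate that the resource controllers
mentioned above would track or limit.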
The change is implemented in five stages plus an additional example patch:

1) extract the process grouping code from cpusets into a standalone system
2) remove the process grouping code from cpusets and hook into the
   container system
3) convert the container system to present a generic multi-hierarchy
   API, and make cpusets a client of that API
4) add a simple CPU accounting container subsystem as an example
5) add support for fork/exit callbacks iff some subsystem is interested in them
6) example of implementing ResGroups and its numtasks controller over
   generic containers - not intended to be applied with this patch set
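
The idea behind stage 5, paying for fork/exit callbacks only when some
subsystem wants them, can be sketched as follows. This is a userspace
Python model under assumed names (Subsystem, Framework, need_forkexit,
task_forked, task_exited); none of them are taken from the patch.

```python
class Subsystem:
    """A container subsystem with optional fork/exit hooks."""

    def __init__(self, name, on_fork=None, on_exit=None):
        self.name = name
        self.on_fork = on_fork
        self.on_exit = on_exit


class Framework:
    """Dispatches fork/exit events to interested subsystems only."""

    def __init__(self):
        self.subsystems = []
        self.need_forkexit = False  # stays False until a hook is registered

    def register(self, subsys):
        self.subsystems.append(subsys)
        if subsys.on_fork is not None or subsys.on_exit is not None:
            self.need_forkexit = True

    def task_forked(self, pid):
        """Run fork hooks; return how many were called."""
        if not self.need_forkexit:
            return 0  # fast path: no subsystem is interested
        called = 0
        for subsys in self.subsystems:
            if subsys.on_fork is not None:
                subsys.on_fork(pid)
                called += 1
        return called

    def task_exited(self, pid):
        """Run exit hooks; return how many were called."""
        if not self.need_forkexit:
            return 0
        called = 0
        for subsys in self.subsystems:
            if subsys.on_exit is not None:
                subsys.on_exit(pid)
                called += 1
        return called
```

A subsystem like cpuset that registers no hooks leaves the fast path in
place, whereas a task-counting controller in the style of numtasks
would register both hooks and so switch the dispatch on.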
The intention is that the various resource management efforts can also
become container clients, with the result that:

- the userspace APIs are (somewhat) normalised
- it's easier to test out e.g. the ResGroups CPU controller in
  conjunction with the UBC memory controller
- the additional kernel footprint of any of the competing resource
  management systems is substantially reduced, since it doesn't need
  to provide process grouping/containment, hence improving their
  chances of getting into the kernel
Signed-off-by: Paul Menage <[email protected]>