Re: [ckrm-tech] [RFC] Resource Management - Infrastructure choices

On 10/31/06, Srivatsa Vaddagiri <vatsa@in.ibm.com> wrote:

For the case where resource node hierarchy is different from process
container hierarchy, I am trying to make sense of "why do we need to
maintain two hierarchies" - one the actual hierarchy used for resource
control purpose, another the process container hierarchy. What purpose
does maintaining the process container hierarchy (in addition to the
resource controller hierarchy) serve?

The idea is that in general, people aren't going to want to have
separate hierarchies for different resources - they're going to have
the hierarchies be the same for all resources. So in general when they
move a process from one container to another, they're going to want
that task to switch to all the new resource limits/guarantees
simultaneously.

Having completely independent hierarchies makes this more difficult -
you have to manually maintain multiple different hierarchies from
userspace. Suppose a task forks while you're moving it from one
container to another? With the approach that each process is in one
container, and each container is in a set of resource nodes, at least
the child task is either entirely in the new resource limits or
entirely in the old limits - if userspace has to update several
hierarchies at once non-atomically then a freshly forked child could
end up with a mixture of resource nodes.
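
To make the fork argument concrete, here is a minimal userspace sketch in C, not kernel code. Every name in it (struct container, struct res_node, task_move, task_fork, task_node, and the two-entry subsystem enum) is hypothetical and not taken from any posted patch. It only illustrates that when a task holds a single container pointer, a move is one pointer store and a fork copies one pointer, so a child cannot end up with a mixture of old and new resource nodes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subsystem indices; not taken from any real patch. */
enum { SUBSYS_CPU, SUBSYS_MEM, SUBSYS_COUNT };

/* One node in a resource controller's hierarchy (limits omitted). */
struct res_node {
    const char *name;
};

/* A container maps to exactly one resource node per subsystem. */
struct container {
    struct res_node *nodes[SUBSYS_COUNT];
};

/* A task carries a single pointer to its container. */
struct task {
    struct container *cont;
};

/* Moving a task is one pointer store, so every subsystem switches at once. */
static void task_move(struct task *t, struct container *to)
{
    t->cont = to;
}

/* fork copies the one pointer: the child sees all-old or all-new nodes. */
static void task_fork(const struct task *parent, struct task *child)
{
    child->cont = parent->cont;
}

/* Look up the resource node a task is charged to for one subsystem. */
static struct res_node *task_node(const struct task *t, int subsys)
{
    return t->cont->nodes[subsys];
}
```

A real implementation would still need locking or RCU around the pointer update, which this sketch leaves out. With fully independent hierarchies the equivalent of task_move would be one store per hierarchy, and a fork landing between two of those stores is what produces the mixed state.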

I am thinking we can avoid maintaining these two hierarchies, by
something on these lines:

        mkdir /dev/cpu
        mount -t container -ocpu container /dev/cpu

                -> Represents a hierarchy for cpu control purpose.

                   tsk->cpurc   = represent the node in the cpu
                                  controller hierarchy. Also maintains
                                  resource allocation information for
                                  this node.

If we were going to do something like this, hopefully it would look
more like an array of generic container subsystems, rather than a
separate named pointer for each subsystem.
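
As a rough illustration of the array idea, the following C sketch replaces per-controller fields such as tsk->cpurc and tsk->memrc with one array indexed by a subsystem id. The names (MAX_SUBSYS, struct subsys, struct subsys_state, subsys_register, task_state) are assumptions made up for this example, not an existing interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical maximum; a real design would pick its own bound. */
#define MAX_SUBSYS 8

/* Generic per-subsystem state that each controller would embed or extend. */
struct subsys_state {
    struct subsys_state *parent;   /* parent node in this subsystem's hierarchy */
};

/* Descriptor a controller registers; its slot index is assigned at registration. */
struct subsys {
    const char *name;
    int id;
};

static struct subsys *registered[MAX_SUBSYS];
static int nr_subsys;

/* Register a controller and hand it the next free slot; -1 if the table is full. */
static int subsys_register(struct subsys *ss)
{
    if (nr_subsys >= MAX_SUBSYS)
        return -1;
    ss->id = nr_subsys;
    registered[nr_subsys++] = ss;
    return ss->id;
}

/* The task has one array instead of a named pointer per controller. */
struct task {
    struct subsys_state *state[MAX_SUBSYS];
};

/* Generic accessor: core code never needs to know a controller's name. */
static struct subsys_state *task_state(const struct task *t,
                                       const struct subsys *ss)
{
    return t->state[ss->id];
}
```

The point of the array is that adding a new controller means registering another struct subsys, with no new field in the task structure and no controller-specific code in the generic paths.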

        mkdir /dev/mem
        mount -t container -omem container /dev/mem

                -> Represents a hierarchy for mem control purpose.

                   tsk->memrc   = represent the node in the mem
                                  controller hierarchy. Also maintains
                                  resource allocation information for
                                  this node.

                   tsk->memrc->parent = parent node.


        mkdir /dev/containers
        mount -t container -ocontainer container /dev/containers

                -> Represents a (mostly flat?) hierarchy for the real
                   container (virtualization) purpose.

I think we have an overloading of terminology here. By "container" I
just mean "group of processes tracked for resource control and other
purposes". Can we use a term like "virtual server" if you're doing
virtualization? I.e. a virtual server would be a specialization of a
container (effectively analogous to a resource controller).

I suspect this may simplify the "container" filesystem, since it doesn't
have to track multiple hierarchies at the same time, and improve lock
contention too (modifying the cpu controller hierarchy can take a different
lock than the mem controller hierarchy).

Do you think that lock contention when modifying hierarchies is
generally going to be an issue? How often do tasks get moved around
in the hierarchy, compared to the other operations going on in the
system?

Paul
