Well, I at least got to the point of seeing how the sysctls interact
when I tried to containerize them. Eric, I think the idea of the sysv
code being nicely and completely isolated is pretty much gone, due to
their connection to sysctls. I think I'll go back and just isolate the
"struct ipc_ids" portion. We can do the accounting bits later.
The patches I have will isolate the IDs, but I'm not sure how much sense
that makes without doing the things like the shm_tot variable. Does
anybody think we need to go after sysctls first, perhaps? Or, is this a
problem graph with cycles in it? :)
I don't see an immediately clear solution on how to containerize sysctls
properly. The entire construct seems to be built around getting data
in and out of global variables and into /proc files.
We obviously want to be rid of many of these global variables. So, does
it make sense to introduce different classes of sysctls, at least
internally? There are probably just two types: global (writable only
from the root container) and container-private. Does it make sense to
have _both_? Perhaps a sysadmin
Eric, can you think of how you would represent these in the hierarchical
container model? How would they work?
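To make the two classes above concrete, here is a minimal userspace sketch in C. It is not existing kernel code: struct container, struct sysctl_entry, sysctl_read() and sysctl_write() are hypothetical names invented for illustration, and private_limit is a placeholder variable. It only assumes that a "global" sysctl has one shared value writable from the root container, while a "container-private" sysctl has one copy per container.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* All names below are hypothetical; none of this is existing kernel API. */
enum sysctl_kind {
    SYSCTL_GLOBAL,            /* one shared value, writable only from the root container */
    SYSCTL_CONTAINER_PRIVATE  /* every container carries its own copy */
};

struct container {
    bool is_root;
    int private_limit;        /* example container-private value */
};

struct sysctl_entry {
    const char *name;
    enum sysctl_kind kind;
    int *global_value;        /* used when kind == SYSCTL_GLOBAL */
    size_t private_offset;    /* offset into struct container otherwise */
};

/* Find the storage an entry refers to when accessed from container c. */
static int *sysctl_slot(const struct sysctl_entry *e, struct container *c)
{
    assert(e != NULL && c != NULL);
    if (e->kind == SYSCTL_GLOBAL)
        return e->global_value;
    return (int *)((char *)c + e->private_offset);
}

/* Reads are allowed from any container. */
int sysctl_read(const struct sysctl_entry *e, struct container *c)
{
    return *sysctl_slot(e, c);
}

/* Writes to a global entry are refused outside the root container. */
int sysctl_write(const struct sysctl_entry *e, struct container *c, int value)
{
    if (e->kind == SYSCTL_GLOBAL && !c->is_root)
        return -EPERM;
    *sysctl_slot(e, c) = value;
    return 0;
}
```

In this sketch a non-root container can still read a global value but gets -EPERM on write, while a write to a container-private entry only changes the caller's own copy.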
On another note, after messing with putting data in the init_task for
these things, I'm a little more convinced that we aren't going to want
to clutter up the task_struct with all kinds of containerized resources,
_plus_ make all of the interfaces to share or unshare each of those.
That global 'struct container' is looking a bit more attractive.
After checking the solutions proposed by you, Eric and vserver, I must
say that they are all hacks.
If we want to virtualize sysctl we need to do it in an honest way:
multiple sysctl trees, which can be different in different namespaces.
For example, one namespace can see /proc/sys/net/route and the other one
not. Introducing helpers/handlers etc. doesn't fully solve the problem
of visibility of different parts of the sysctl tree and its access rights.
Another example: the same network device can be present in 2 namespaces,
and these are dynamically(!) created entries in sysctl.
So we actually need to address 2 issues:
- ability to limit parts of sysctl tree visibility to namespace
- ability to limit/change sysctl access rights in namespace
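As a rough illustration of these two points, here is a small userspace sketch in C of a per-namespace view of the sysctl tree. It is not the OpenVZ code and not kernel API: struct sysctl_ns, struct sysctl_view and the sysctl_ns_*() helpers are hypothetical names, and the fixed-size array stands in for a real tree. Each namespace keeps its own visibility flag and permission bits for a given path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NS_MAX_VIEWS 8

/* Hypothetical: how one sysctl path looks from inside one namespace. */
struct sysctl_view {
    const char *path;    /* relative to /proc/sys, e.g. "net/route" */
    bool visible;        /* issue 1: is the entry visible here at all? */
    unsigned int mode;   /* issue 2: access rights as seen in this namespace */
};

/* Hypothetical: each namespace owns its own (possibly different) tree. */
struct sysctl_ns {
    struct sysctl_view views[NS_MAX_VIEWS];
    size_t nr_views;
};

/* Add an entry to one namespace's tree; entries may be added dynamically.
 * Returns 0 on success, -1 if the table is full. */
int sysctl_ns_add(struct sysctl_ns *ns, const char *path,
                  bool visible, unsigned int mode)
{
    assert(ns != NULL && path != NULL);
    if (ns->nr_views >= NS_MAX_VIEWS)
        return -1;
    ns->views[ns->nr_views].path = path;
    ns->views[ns->nr_views].visible = visible;
    ns->views[ns->nr_views].mode = mode;
    ns->nr_views++;
    return 0;
}

/* Look up a path; hidden entries behave as if they did not exist. */
const struct sysctl_view *sysctl_ns_lookup(const struct sysctl_ns *ns,
                                           const char *path)
{
    for (size_t i = 0; i < ns->nr_views; i++) {
        const struct sysctl_view *v = &ns->views[i];
        if (strcmp(v->path, path) == 0)
            return v->visible ? v : NULL;
    }
    return NULL;
}

/* Writable only if the entry is visible and has a write bit set. */
bool sysctl_ns_may_write(const struct sysctl_ns *ns, const char *path)
{
    const struct sysctl_view *v = sysctl_ns_lookup(ns, path);
    return v != NULL && (v->mode & 0222) != 0;
}
```

With this layout, one namespace can hide net/route entirely while another sees it, and the same device entry can be writable in one namespace and read-only in the other.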
You can check OpenVZ for the sysctl tree cloning code. It is neither
clean nor elegant, but it can be cleaned up.
Thanks,
Kirill