Re: [PATCH -mm 0/7] execns syscall and user namespace

Cedric Le Goater <[email protected]> writes:

> Hello eric,
>
> Eric W. Biederman wrote:
>
>>> The following patchset adds the user namespace and a new syscall execns.
>>>
>>> The user namespace will allow a process to unshare its user_struct table,
>>> resetting at the same time its own user_struct and all the associated
>>> accounting.
>>>
>>> The purpose of execns is to make sure that a process unsharing a namespace
>>> is free from any reference in the previous namespace. the execve() semantic
>>> seems to be the best candidate as it already flushes the previous process
>>> context.
>>>
>>> Thanks for reviewing, sharing, flaming !
>> 
>> 
>> I haven't had a chance to do a thorough review yet but why is
>> this needed?
>> 
>> What can be left shared by switching to a new namespace and then
>> execing an executable?
>> 
>> Is it not possible to ensure what you are trying to ensure with
>> a good user space executable?
>
> unshare() is unsafe for some namespaces because namespaces can reference
> each other. For the ipc namespace, examples are shm ids vs. vma, sem ids vs.
> semundos, msq vs. netlink sockets. For the user namespace, open files. So
> it seems reasonable to provide a way to unshare namespaces from a clean
> process context.

It is perfectly legitimate to have a shared memory region memory mapped
from another namespace.  Yes, sem ids versus semundos is an issue, but it
just requires you to unshare one at the same time you unshare the other,
or to simply clone a new namespace.  I'm not familiar with the msq vs netlink
socket issue.  As for the user namespace vs open files: if we have
any issues with open files in any namespace, that sounds like an
implementation bug to me.

I'm not convinced the problems you are seeing are not implementation bugs.
For some things clone is still more general than unshare, and clone should
be considered the primary user interface, not unshare.
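
As a rough illustration of the clone-first approach, here is a minimal
user-space sketch, assuming glibc's clone() wrapper. The helper name
run_in_new_namespaces() and the stack size are hypothetical and not part
of any patch in this thread. The child is created directly with the
requested namespace flags and then execs, so the parent never changes
its own namespaces.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define CHILD_STACK_SIZE (256 * 1024)   /* arbitrary size for this sketch */

/* Runs in the child: replace the process image with argv[0]. */
static int child_exec(void *arg)
{
    char *const *argv = arg;

    execvp(argv[0], argv);
    return 127;                         /* only reached if execvp() failed */
}

/*
 * Hypothetical helper: start argv[] in a child created by clone() with
 * ns_flags (for example CLONE_NEWIPC).  Returns the child's exit status,
 * or -1 if the child could not be created or did not exit normally.
 */
static int run_in_new_namespaces(int ns_flags, char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    char *stack = malloc(CHILD_STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* The stack grows down on most architectures, so pass its top. */
    pid_t pid = clone(child_exec, stack + CHILD_STACK_SIZE,
                      ns_flags | SIGCHLD, (void *)argv);

    int status = -1;
    if (pid > 0 && waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        status = WEXITSTATUS(status);
    else
        status = -1;

    free(stack);                        /* no CLONE_VM: the child has its own copy */
    return status;
}
```

Passing 0 for ns_flags just runs the command in an ordinary child, which
is handy for testing without privileges; creating a new namespace with a
flag such as CLONE_NEWIPC normally requires CAP_SYS_ADMIN.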

> Now, if you try to do that from user space, you will call unshare() then
> execve(), which leaves plenty of room and time for nasty things to happen
> in between the 2 calls.
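
For reference, the two-step user-space sequence described above might look
like the following minimal sketch. The helper name unshare_then_exec() is
hypothetical; unshare(), fork(), execvp() and waitpid() are the standard
calls. The comment marks the window between the two calls that the quoted
text is worried about.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child, unshare ns_flags in it, then exec
 * argv[].  Returns the child's exit status, or -1 on failure to fork or
 * if the child did not exit normally.
 */
static int unshare_then_exec(int ns_flags, char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Step 1: leave the old namespace(s). */
        if (unshare(ns_flags) != 0)
            _exit(126);

        /*
         * Window: the process is in the new namespace(s) but still holds
         * whatever it inherited (open files, mappings, and so on) until
         * the exec below replaces its image.
         */

        /* Step 2: start the new program. */
        execvp(argv[0], argv);
        _exit(127);                     /* only reached if execvp() failed */
    }

    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

With ns_flags set to 0 the unshare() call is a no-op, so the sketch can be
exercised unprivileged; real namespace flags generally need CAP_SYS_ADMIN.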

I will look more closely but I think there is an important point being missed
somewhere.  Pieces of the kernel interact in all sorts of weird and unexpected
ways.  If we rely on ourselves always being in the right magic namespace for
things to work correctly we are setting ourselves up for trouble.  If we know
a namespace implementation will work even when a process has access to entities
in multiple instances of that namespace we are in much better shape.

Eric

