Re: RFC [patch 13/34] PID Virtualization Define new task_pid api

On Tue, Jan 31, 2006 at 08:39:19PM -0800, Linus Torvalds wrote:
> 
> 
> On Tue, 31 Jan 2006, Eric W. Biederman wrote:
> > 
> > Yes. Although there are a few container lifetime problems with
> > that approach. Do you want your container alive for a long time
> > after every process using it has exited just because someone has
> > squirrelled away their pid? While container lifetime issues crop up
> > elsewhere as well, PIDs are by far the worst, because it is currently
> > safe to store a PID indefinitely with nothing worse than PID
> > wraparound.
>
> Are people really expecting to have a huge turn-over on containers? It
> sounds like this shouldn't be a problem in any normal circumstance:
> especially if you don't even do the "big hash-table per container"
> approach, who really cares if a container lives on after the last
> process exited?
>
> I'd have expected that the major user for this would end up being     
> ISP's and the like, and I would not expect the virtual machines to be 
> brought up all the time.                                              

well, it really depends. as far as I can tell, the
number of guest (container) (re)starts can be as
high as one per second (in extreme cases), while
the entire setup has no more than 50-100
containers at the same time and usually runs
for more than a few months without a reboot ...

but agreed, the typical number of container
creations and deletions will be around one per
hour or day ...

> If it's a problem, you can do the same thing that the "struct
> mm_struct" does: it has life-time issues because a mm_struct actually
> has to live for potentially a _long_ time (zombies) but at the same
> time we want to free the data structures allocated to the mm_struct as
> soon as possible, notably the VMA's and the page tables.
>
> So a mm_struct uses a two-level counter, with the "real" users
> (who need the page tables etc) incrementing one ("mm_users"), and
> the "secondary" ones (who just need to have an mm_struct pinned,
> but are ok with an empty VM being attached) incrementing the other
> ("mm_count").

yes, we already do something very similar in 
Linux-VServer, basically differentiating between 
'active users' and 'passive references' ...

> The same approach might be valid for "containers": you can destroy most of 
> the associated container when the actual processes are gone, but keep just 
> the empty container around until all secondary references are finally also 
> gone.
> 
> It's pretty simple: the secondary reference starts at 1 - with the 
> "primary" counter being the single ref to the secondary. Then freeing a 
> primary does:
> 
> 	if (atomic_dec_and_test(&container->primary_counter)) {
> 		.. free the core resources here ..
> 
> 		/* then release the ref from the primary to secondary */
> 		secondary_free(container);
> 	}
> 
> (for "mm_struct", the primary is dropped with "mmput()" and the secondary
> is dropped with "mmdrop()", which is absolutely horrid naming. Please name
> things better than I did ;)
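
to make the scheme above concrete, here is a minimal userspace
sketch of the two-level counter. it uses C11 atomics instead of
the kernel's atomic_t, and every name in it (struct container,
container_get/put for the primary counter, container_hold/drop
for the secondary one, the 'core' pointer standing in for the
expensive state) is made up for illustration. none of it is taken
from the patch set or from Linux-VServer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical container; the names are assumptions for this sketch. */
struct container {
	atomic_int users;	/* "primary": processes running inside */
	atomic_int count;	/* "secondary": anything pinning the struct */
	void *core;		/* stand-in for the expensive resources */
};

int containers_freed;		/* instrumentation for the example only */

struct container *container_alloc(void)
{
	struct container *c = malloc(sizeof(*c));

	if (!c)
		return NULL;
	c->core = malloc(4096);
	if (!c->core) {
		free(c);
		return NULL;
	}
	atomic_init(&c->users, 1);	/* the creating process */
	atomic_init(&c->count, 1);	/* single ref owned by all users */
	return c;
}

/* Secondary reference: only keeps the (possibly empty) struct alive. */
void container_hold(struct container *c)
{
	atomic_fetch_add(&c->count, 1);
}

void container_drop(struct container *c)
{
	/* fetch_sub returns the old value, so 1 means we were the last. */
	if (atomic_fetch_sub(&c->count, 1) == 1) {
		assert(c->core == NULL);	/* core must already be gone */
		free(c);
		containers_freed++;
	}
}

/* Primary reference: a process that needs the core resources. */
void container_get(struct container *c)
{
	atomic_fetch_add(&c->users, 1);
}

void container_put(struct container *c)
{
	if (atomic_fetch_sub(&c->users, 1) == 1) {
		/* last real user: free the core resources here */
		free(c->core);
		c->core = NULL;
		/* then release the ref from the primary to secondary */
		container_drop(c);
	}
}

bool container_is_empty(struct container *c)
{
	return c->core == NULL;
}
```

with this, somebody who squirrels away a pid would call
container_hold() and later container_drop(). once the last process
does container_put(), the core resources go away immediately, and
only the small empty struct stays around until the stored pid is
released.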

best,
Herbert

> 			Linus
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/