On Sat, 27 Oct 2007, Willy Tarreau wrote:
> On Sat, Oct 27, 2007 at 09:59:07AM -0700, Davide Libenzi wrote:
> > On Sat, 27 Oct 2007, Marc Lehmann wrote:
> >
> > > > Please provide some code to illustrate one exact problem you have.
> > >
> > > // assume there is an open epoll set that listens for events on fd 5
> > > if (fork () == 0)
> > > {
> > > close (5);
> > > // fd 5 is now removed from the epoll set of the parent.
> > > _exit (0);
> > > }
> >
> > Hmmm ... what? I assume you know that:
> >
> > 1) A file descriptor is a userspace view/handle of a kernel object
> >
> > 2) The kernel object has a use-count of how many file descriptors
> > have been handed out to userspace
> >
> > 3) A close() decreases the internal counter by one
> >
> > 4) The kernel object gets effectively closed when the internal counter
> > goes to zero
> >
> > 5) A fork() acts as a dup() on the file descriptors, thereby bumping up
> > their internal counters
> >
> > 6) Epoll removes the file from the set, when the *kernel* object gets
> > closed (internal use-count goes to zero)
> >
> > With that in mind, how can the code snippet above trigger a removal from
> > the epoll set?
>
> Davide,
>
> from what I understand, Marc is not asking for the code above to remove
> the fd from the epoll set, but he's in fact complaining that he *observed*
> that the fd was removed from the epoll set in the *parent* process when
> the child closes it, which is of course not expected at all. As strange
> as it looks, this might need investigation. It is possible that there
> is some strange bug somewhere in some kernel versions.
That would be *really* strange, since epoll hooks into __fput() in order to
perform proper cleanup. A removal in the case above would mean that the file
got really closed in the parent too. That, I think, would trigger way more
serious problems in userspace.
> Marc, I think that if you indicate the last kernel version on which you
> observed this and provide a very short and easy reproducer, it would
> help everyone investigating this. Basically something which reports "OK"
> or "KO".
Of course. That'd be great.
- Davide