Re: [patch] stop inotify from sending random DELETE_SELF event under load

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 2005-09-21 at 02:01 +0100, Al Viro wrote:
> On Tue, Sep 20, 2005 at 06:53:34PM -0400, John McCutchan wrote:
> > Is there some reason we can't just do this from vfs_unlink
> > 
> > inode = dentry->inode;
> > iget (inode);
> > d_delete (dentry);
> > fsnotify_inoderemove (inode);
> > iput (inode);
> > 
> > This would allow us to have immediate event notification, and avoid a
> > race with the inode going away, right?
> 
> Playing with references to struct inode means playing dirty tricks
> behind the filesystem's back.  Doing that in a way that really changes
> inode lifetime means asking for trouble.  Combined with a dirty trick
> *already* pulled by sys_unlink() to postpone the final iput until after
> we unlock the parent, it means breakage (and aforementioned dirty trick
> took some rather interesting logics to compensate for in the first place).
> 
> Moreover, your suggestion would do that to _everyone_, whether they use
> inotify or not.  NAK.

Got it.

> 
> >  static inline void fsnotify_inoderemove(struct inode *inode)
> >  {
> > -	inotify_inode_queue_event(inode, IN_DELETE_SELF, 0, NULL);
> > -	inotify_inode_is_dead(inode);
> > +	inotify_inode_queue_event(inode, IN_DELETE_SELF, inode->i_nlink, NULL);
> > +	if (inode->i_nlink == 0)
> > +		inotify_inode_is_dead(inode);
> >  }
> 
> Assumes that filesystem treats ->i_nlink on final iput() in usual way.
> It doesn't have to.
> 

I grepped all the filesystems, and they all seem to use
generic_drop_inode, except for hugetlbfs, which seems to have the same
logic of (!inode->i_nlink).

> BTW, what happens if one uses inotify on procfs?  Or sysfs, for that matter?
> Fundamental problem with that sucker is that you are playing games with
> lifetime rules of inodes in a way that might be OK for some filesystems,
> but violates a lot of assumptions made by other...
> 

Honestly, I don't know. And I don't think I know enough to say with any
certainty how either of them would work. Would a black list of
filesystems that don't want inotify on them be acceptable?

> BTW^2, what guarantees that inotify_unmount_inodes() will not happen while we
> are in inotify_release()?  That would happily keep watch refcount bumped,
> so it would outlive inotify_unmount_inodes().  Sure, it would be dropped.
> And call iput() on a pinned inode that had outlived the umount().  Oops...

Good catch,

Index: linux/fs/inotify.c
===================================================================
--- linux.orig/fs/inotify.c	2005-08-31 15:41:11.000000000 -0400
+++ linux/fs/inotify.c	2005-09-20 21:18:35.000000000 -0400
@@ -756,6 +756,7 @@
 	 * do not know the inode until we iterate to the watch.  But we need to
 	 * hold inode->inotify_sem before dev->sem.  The following works.
 	 */
+	down(&iprune_sem);
 	while (1) {
 		struct inotify_watch *watch;
 		struct list_head *watches;
@@ -779,6 +780,7 @@
 		up(&inode->inotify_sem);
 		put_inotify_watch(watch);
 	}
+	up(&iprune_sem);
 
 	/* destroy all of the events on this device */
 	down(&dev->sem);


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]
  Powered by Linux