Re: [-mm patch] Fix inotify umount hangs.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 5 Jul 2005, Anton Altaparmakov wrote:
> On Tue, 5 Jul 2005, Anton Altaparmakov wrote:
> > On Tue, 5 Jul 2005, Robert Love wrote:
> > > On Mon, 2005-07-04 at 20:28 +0100, Anton Altaparmakov wrote:
> > > > The below patch against 2.6.13-rc1-mm1 fixes the umount hangs caused by 
> > > > inotify.
> > > 
> > > Thank you, very much, Anton, for hacking on this over the weekend.
> > 
> > You are welcome.  (-:
> > 
> > > It's definitely not the prettiest thing, but there may be no easier
> > > approach.  One thing, the messy code is working around the list
> > > changing, doesn't invalidate_inodes() have the same problem?  If so, it
> > > solves it differently.
> > 
> > It does.  It first goes over the i_sb_list and anything it finds that it 
> > is interested in (i.e. all inodes with zero i_count), it moves the inode 
> > (i_list) over to a private list (this is done in invalidate_list()).  
> > Then, when it is finished accessing the i_sb_list, and all inodes of 
> > interest (zero i_count) are on the private list, dispose_list() is called, 
> > and all inodes on the private list are exterminated.  This obviously no 
> > longer uses i_sb_list so it does not matter that that is changing now.
> > 
> > > I'm also curious if the I_WILL_FREE or i_count check fixed the bug.  I
> > > suspect the other fix did, but we probably still want this.  Or at least
> > > the I_WILL_FREE check.
> > 
> > The i_count check is at least sensible if not required (not sure) 
> > otherwise you do iget() on inode with zero i_count then waste your time 
> > looking for watches (which can't be there or i_count would not be zero), 
> > and then iput() kills off the inode and throws it out of the icache.  This 
> > would be done in invalidate_inodes() anyway, no need for you to do it 
> > first.  Even if this is not a required check it saves some cpu cycles.
> > 
> > The I_WILL_FREE is definitely required otherwise you will catch inodes 
> > that will suddenly become I_FREEING and then I_CLEAR under your feet once 
> > you drop the inode_lock and we know that is a Bad Thing (TM) as it causes 
> > the BUG_ON(i_state == I_CLEAR) in iput() to trigger that was reported 
> > before (when I got drawn into inotify in the first place).
> > 
> > Note that if you want to be really thorough you could wait on the inode 
> > to be destroyed for good:
> > 
> > if (inode->i_state & (I_FREEING|I_CLEAR|I_WILL_FREE))
> > 	__wait_on_freeing_inode(inode);
> > 
> > And then re-check in the i_sb_list if the inode is still there (e.g. via 
> > prev member of next_i->i_sb_list which will no longer be "inode" if the 
> > inode has been evicted).  If the inode is still there someone 
> > "reactivated" it while you were waiting and you need to redo the 
> > i_count and i_state checks and deal with the inode as appropriate.
> > 
> > However given this is umount we are talking about there doesn't seem to be 
> > much point in being that thorough.
> > 
> > I am not familiar enough with i_notify but _if_ it is possible for a user 
> > to get a watch on an inode which is I_FREEING|I_CLEAR|I_WILLFREE then you 
> > have to do the waiting otherwise you will miss that watch with I don't 
> > know what results but probably not good ones...
> 
> Actually given that watches increment i_count, the results would be quite 
> obvious if it were possible for this to happen.  The user would see my 
> favourite printk (see fs/super.c::generic_shutdown_super()) in dmesg! (-;
> 
> printk("VFS: Busy inodes after unmount. "
>    "Self-destruct in 5 seconds.  Have a nice day...\n");

I had a look at this and it is safe, i.e. to register a watch you need to 
open(2) a file first and you cannot do an open(2) at the point in time 
when invalidate_inodes() is called in the umount(2) code path.  And you 
cannot get that far into the umount(2) code path if a file is still open 
(you would get -EBUSY much earlier on and the umount would fail).  Thus it 
is not possible for a watch to be created when inotify_umount_inodes() is 
running.  Thus it is not necessary to wait_on_inode() wen an inodes is in 
i_state I_CLEAR|I_FREEING|I_WILL_FREE.  So my patch stands as being 
complete and sufficient for safety (AFAICS).  (-:

Best regards,

	Anton
-- 
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]
  Powered by Linux