On Thu, Mar 02, Neil Brown wrote:
> The core problem is that:
> prune_one_dentry can hold a reference to a dentry without any
> lock being held, and without any other reference to the
> filesystem (if it is being called from shrink_dcache_memory).
> It holds this reference while calling iput on an inode. This can
> take an arbitrarily long time to complete, especially if NFS
> needs to wait for some RPCs to complete or timeout.
>
> shrink_dcache_parent skips over dentries which have a reference,
> such as the one held by prune_one_dentry.
>
> Thus umount can find that an inode is still in use (by it's dentry
> which was skipped) and will complain. Worse, when the nfs request
> on some inode finally completes, it might find the superblock
> doesn't exist any more and... oops.
>
> My proposed solution to the problem is never to expose the reference
> held by prune_one_dentry. i.e. keep the spin_lock held.
This morning I wondered, why I was using a list to drop the dentry's inodes
after the dput() has visited all parents.
It is not enough to fix prune_one_dentry() to hold the lock until the parent
is dereferenced since prune_one_dentry() calls __dput_locked(). And in
__dput_locked() we have the same problem again:
from __dput_locked():
list_del(&dentry->d_u.d_child);
dentry_stat.nr_dentry--; /* For d_free, below */
/*drops the locks, at that point nobody can reach this dentry*/
-> dentry_iput(dentry);
parent = dentry->d_parent;
d_free(dentry);
if (dentry == parent)
return;
dentry = parent;
+
+ if (atomic_read(&dentry->d_count) == 1)
+ might_sleep();
+ if (!atomic_dec_and_lock(&dentry->d_count, &dcache_lock))
+ return;
+
+ spin_lock(&dentry->d_lock);
goto repeat;
}
}
Between -> and the atomic_dec_and_lock() the reference count on the parent is
wrong and no lock is held. I fixed that by using d_lru to keep track of all
dentry's which inodes still have to be dereferenced. This should happen after
all parents have been dereferenced.
Regards,
Jan
--
Jan Blunck [email protected]
SuSE LINUX AG - A Novell company
Maxfeldstr. 5 +49-911-74053-608
D-90409 Nürnberg http://www.suse.de
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]