Kirill Korotaev <[email protected]> discovered a race between shrink_dcache_parent()
and shrink_dcache_memory(). That one is based on dput() is calling
dentry_iput() too early and therefore is giving up the dcache_lock. This leads
to the situation that the parent dentry might be still referenced although all
childs are already dead. This parent is ignore by a concurrent select_parent()
call which might be the reason for busy inode after umount failures.
This is from Kirill's original patch:
CPU 1 CPU 2
~~~~~ ~~~~~
umount /dev/sda1
generic_shutdown_super shrink_dcache_memory
shrink_dcache_parent prune_one_dentry
select_parent dput <<<< child is dead, locks are released,
but parent is still referenced!!! >>>>
skip dentry->parent,
since it's d_count > 0
message: BUSY inodes after umount...
<<< parent is left on dentry_unused list,
referencing freed super block >>>
This patch is introducing dput_locked() which is doing all the dput work
except of freeing up the dentry's inode and memory itself. Therefore, when the
dcache_lock is given up, all the reference counts of the parents are correct.
prune_one_dentry() must also use the dput_locked version and free up the
inodes and the memory of the parents later. Otherwise we have an incorrect
reference count on the parents of the dentry to prune.
...
-void dput(struct dentry *dentry)
+static void dput_locked(struct dentry *dentry, struct list_head *list)
{
if (!dentry)
return;
-repeat:
- if (atomic_read(&dentry->d_count) == 1)
- might_sleep();
- if (!atomic_dec_and_lock(&dentry->d_count, &dcache_lock))
+ if (!atomic_dec_and_test(&dentry->d_count))
return;
+
...
+void dput(struct dentry *dentry)
+{
+ LIST_HEAD(free_list);
+
+ if (!dentry)
+ return;
+
+ if (atomic_add_unless(&dentry->d_count, -1, 1))
+ return;
+
+ spin_lock(&dcache_lock);
+ dput_locked(dentry, &free_list);
+ spin_unlock(&dcache_lock);
This seems to be an open-coded copy of atomic_dec_and_lock()?
Yeah, this is what I also didn't like...
Why do it this way, when it's _really_ _possible_ to use old-good
atomic_dec_and_test() keeping logic more clear?
Kirill
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]