The problem with lock_rename() is that it compares parent directory dentries
when trying to decide how many inode i_sem semaphores to acquire. That works
fine on a single node system, but not in a distributed environment, on a
distributed filesystem.
The problem with rename(2) in a cluster, is that there is no guarantee that
path_lookup()s will return a coherent path structure while at the same time
renames, on another node, are executing on that same path hierarchy. In our
case, in do_rename(), the parent directory path_lookup()s find/create unique
dentries for the old_dir and new_dir, however, because a rename(2) on another
node was executing, both dentries end up pointing to the same inode.
When lock_rename(new_dir, old_dir) is called, the dentries don't match, so we
end up in a code path that tries to acquire the inode i_sem of both the
old_dir and new_dir, but since they point to the same inode, the second
attempt to acquire the same i_sem results in a deadlock.
A fix would be to compare the dentries ->d_inode field instead. Patch for
kernel 2.6.12.1 attached.
Author of the patch is Michael Gaughen <[email protected]>
--
Zuzana Petrova
===================================================================
#
# This patch changes lock_rename()/unlock_rename() to compare the inodes
# referenced by the dentries. The problem with distributed filesystems is
# that there is no guarantee the path_lookup() will return valid path dentry
# components at the same time rename(2)s of that same path hierarchy are
# executing on another node. So there are cases where the dentries, passed
# to lock_rename()/unlock_rename(), are different, but refer to the same
# inode. In that case, the check for equivalent dentries will fail, and an
# attempt to acquire each of the dentries inode will be made, resulting in
# a deadlock since the inodes are the same.
#
--- linux-2.6.11.12.old/fs/namei.c 2005-03-11 13:56:30.877725438 +0100
+++ linux-2.6.11.12/fs/namei.c 2005-03-11 14:01:49.196080914 +0100
@@ -1197,7 +1197,7 @@ lock_rename(
{
struct dentry *p;
- if (p1 == p2) {
+ if (p1->d_inode == p2->d_inode) {
down(&p1->d_inode->i_sem);
return NULL;
}
@@ -1228,7 +1228,7 @@
void unlock_rename(struct dentry *p1, struct dentry *p2)
{
up(&p1->d_inode->i_sem);
- if (p1 != p2) {
+ if (p1->d_inode != p2->d_inode) {
up(&p2->d_inode->i_sem);
up(&p1->d_inode->i_sb->s_vfs_rename_sem);
}
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]