On Fri, 2 Jun 2006 13:43:46 -0700 Andrew Morton wrote: > Trond Myklebust <[email protected]> wrote: > > > > On Fri, 2006-06-02 at 16:24 -0400, Joe Korty wrote: > > > On Thu, Jun 01, 2006 at 04:13:39PM -0400, Trond Myklebust wrote: > > > > On Thu, 2006-06-01 at 15:55 -0400, Joe Korty wrote: > > > >> Tree 5fdccf2354269702f71beb8e0a2942e4167fd992 > > > >> > > > >> [PATCH] vfs: *at functions: core > > > >> > > > >> introduced a bug where lock_kernel() can be called from > > > >> under a spinlock. To trigger the bug one must have > > > >> CONFIG_PREEMPT_BKL=y and be using NFS heavily. It is > > > >> somewhat rare and, so far, haven't traced down the userland > > > >> sequence that causes the fatal path to be taken. > > > >> > > > >> The bug was caused by the insertion into do_path_lookup() > > > >> of a call to file_permission(). do_path_lookup() > > > >> read-locks current->fs->lock for most of its operation. > > > >> file_permission() calls permission() which calls > > > >> nfs_permission(), which has one path through it > > > >> that uses lock_kernel(). > > > > > > > Nowhere should anyone be calling file_permission() under a spinlock. > > > > > > > > Why would you need to read-protect current->fs in the case where you are > > > > starting from a file? The correct thing to do there would appear to be > > > > to read_protect only the cases where (*name=='/') and (dfd == AT_FDCWD). > > > > > > > > Something like the attached patch... > > > > > > > > > Hi Trond, > > > I've been running with the patch for the last few hours, on an nfs-rooted > > > system, and it has been working fine. Any plans to submit this for 2.6.17? > > > > It probably ought to be, given the nature of the sin. Andrew? > > > > OK. > > Just to confirm, this is final? > > > From: Trond Myklebust <[email protected]> > > We're presently running lock_kernel() under fs_lock via nfs's ->permission > handler. That's a ranking bug and sometimes a sleep-in-spinlock bug. This > problem was introduced in the openat() patchset. > > We should not need to hold the current->fs->lock for a codepath that doesn't > use current->fs. > > Signed-off-by: Trond Myklebust <[email protected]> > Cc: Al Viro <[email protected]> > Signed-off-by: Andrew Morton <[email protected]> > --- > > fs/namei.c | 6 ++++-- > 1 files changed, 4 insertions(+), 2 deletions(-) > > diff -puN fs/namei.c~fs-nameic-call-to-file_permission-under-a-spinlock-in-do_lookup_path fs/namei.c > --- 25/fs/namei.c~fs-nameic-call-to-file_permission-under-a-spinlock-in-do_lookup_path Fri Jun 2 13:39:52 2006 > +++ 25-akpm/fs/namei.c Fri Jun 2 13:39:52 2006 > @@ -1080,8 +1080,8 @@ static int fastcall do_path_lookup(int d > nd->flags = flags; > nd->depth = 0; > > - read_lock(¤t->fs->lock); > if (*name=='/') { > + read_lock(¤t->fs->lock); > if (current->fs->altroot && !(nd->flags & LOOKUP_NOALT)) { > nd->mnt = mntget(current->fs->altrootmnt); > nd->dentry = dget(current->fs->altroot); > @@ -1092,9 +1092,12 @@ static int fastcall do_path_lookup(int d > } > nd->mnt = mntget(current->fs->rootmnt); > nd->dentry = dget(current->fs->root); > + read_unlock(¤t->fs->lock); > } else if (dfd == AT_FDCWD) { > + read_lock(¤t->fs->lock); > nd->mnt = mntget(current->fs->pwdmnt); > nd->dentry = dget(current->fs->pwd); > + read_unlock(¤t->fs->lock); > } else { > struct dentry *dentry; > > @@ -1118,7 +1121,6 @@ static int fastcall do_path_lookup(int d > > fput_light(file, fput_needed); > } > - read_unlock(¤t->fs->lock); > current->total_link_count = 0; > retval = link_path_walk(name, nd); > out: 1) This bug is also present in the 2.6.16 tree. 2) The patch above is broken - it needs the fix below (or the fix should be folded into the patch directly): -------------------------------------------------------------------- Fix do_path_lookup() failure path after locking changes Signed-off-by: Sergey Vlasov <[email protected]> --- fs/namei.c | 13 ++++++------- 1 files changed, 6 insertions(+), 7 deletions(-) diff --git a/fs/namei.c b/fs/namei.c index a2f79d2..d6e2ee2 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -1104,17 +1104,17 @@ static int fastcall do_path_lookup(int d file = fget_light(dfd, &fput_needed); retval = -EBADF; if (!file) - goto unlock_fail; + goto out_fail; dentry = file->f_dentry; retval = -ENOTDIR; if (!S_ISDIR(dentry->d_inode->i_mode)) - goto fput_unlock_fail; + goto fput_fail; retval = file_permission(file, MAY_EXEC); if (retval) - goto fput_unlock_fail; + goto fput_fail; nd->mnt = mntget(file->f_vfsmnt); nd->dentry = dget(dentry); @@ -1129,13 +1129,12 @@ out: nd->dentry->d_inode)) audit_inode(name, nd->dentry->d_inode, flags); } +out_fail: return retval; -fput_unlock_fail: +fput_fail: fput_light(file, fput_needed); -unlock_fail: - read_unlock(¤t->fs->lock); - return retval; + goto out_fail; } int fastcall path_lookup(const char *name, unsigned int flags, -- 1.3.3.g3d95c
Attachment:
pgpsuxSvPjz0Y.pgp
Description: PGP signature
- Follow-Ups:
- Re: lock_kernel called under spinlock in NFS
- From: Trond Myklebust <[email protected]>
- Re: lock_kernel called under spinlock in NFS
- References:
- lock_kernel called under spinlock in NFS
- From: Joe Korty <[email protected]>
- Re: lock_kernel called under spinlock in NFS
- From: Trond Myklebust <[email protected]>
- Re: lock_kernel called under spinlock in NFS
- From: Joe Korty <[email protected]>
- Re: lock_kernel called under spinlock in NFS
- From: Trond Myklebust <[email protected]>
- Re: lock_kernel called under spinlock in NFS
- From: Andrew Morton <[email protected]>
- lock_kernel called under spinlock in NFS
- Prev by Date: Re: [patch 05/61] lock validator: introduce WARN_ON_ONCE(cond)
- Next by Date: Re: 2GB MMC/SD cards
- Previous by thread: Re: lock_kernel called under spinlock in NFS
- Next by thread: Re: lock_kernel called under spinlock in NFS
- Index(es):