Re: What protection does sysfs_readdir have with SMP/Preemption?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Nov 22, 2005 at 01:39:47PM -0800, Greg KH wrote:
> On Tue, Nov 22, 2005 at 04:33:22PM -0500, Steven Rostedt wrote:
> > Hi,
> > 
> > I'm developing a custom kernel on top of Ingo's -rt patch. My kernel
> > makes race conditions in the vanilla kernel show up very well :-)
> > 
> > I just hit a bug, actually a page fault in fs/sysfs/dir.c in
> > sysfs_readdir:
> > 
> > 
> > 
> > 			for (p=q->next; p!= &parent_sd->s_children; p=p->next) {
> > 				struct sysfs_dirent *next;
> > 				const char * name;
> > 				int len;
> > 
> > 				next = list_entry(p, struct sysfs_dirent,
> > 						   s_sibling);
> > 				if (!next->s_element)
> > 					continue;
> > 
> > 				name = sysfs_get_name(next);
> > 				len = strlen(name);
> > 				if (next->s_dentry)
> > 					ino = next->s_dentry->d_inode->i_ino;
> > 
> > ^^^^
> > This is where I had a bad pointer reference.
> > 
> > 				else
> > 					ino = iunique(sysfs_sb, 2);
> > 
> > 				if (filldir(dirent, name, len, filp->f_pos, ino,
> > 						 dt_type(next)) < 0)
> > 					return 0;
> > 
> > 
> > Looking at this code, I don't see anything protecting the s_dentry. For
> > example, couldn't the following happen:
> > 
> > sysfs_create_dir is called, which calls create_dir.  Now we create a
> > dentry with no d_inode. In sysfs_make_dirent which calls
> > sysfs_new_dirent which adds to the parents s_children. Then
> > sysfs_make_dirent sets s_dentry = dentry (the one that was just made
> > with no d_inode assigned yet).  Then create_dir calls sysfs_create which
> > finally assigns the d_inode.
> > 
> > So, either there is some hidden protection and my modification to the
> > kernel has caused this to bug, or we have just been lucky the whole time
> > in the vanilla kernel.
> 
> I think we've been lucky :(
> 
> Maneesh, any ideas?
> 

The dir operation sysfs_readdir() is called under directory inode's i_sem
taken in vfs_readdir() and create_dir() also takes parent directory inode's 
i_sem. So in this case I think following are the relevant steps happening
which look safe to me.

cpu 0
vfs_readdir()
  down(dir inode i_sem)
    sysfs_readdir(dir)
      parse through dir->s_dirent s_children list
  up(dir inode i_sem)
   

cpu 1
sysfs_create_dir()
  create_dir()
   down(parent dir inode i_sem)
   lookup_one_len (allocates & makes the new directory dentry visible)
   sysfs_make_diret()
     sysfs_new_dirent()
       attach the new directory s_dirent to parent's s_children list)
   up(parent dir inode i_sem)


Basically, sysfs_readdir for a directory is protected against any 
addition/deletion in the directory by directory inode's i_sem.

But the bad pointer reference seen in sysfs_readdir() has to be debugged.
Assumption here is that if there is a dentry attached to s_dirent, there
has to be a inode associated becuase negative dentries are not created
in sysfs. Is it possible to get some more information about the recreation
scenario. Could you enable DEBUG printks for lib/kobject.c and 
drivers/base/class.c to see the events happening.


Thanks
Maneesh
   
-- 
Maneesh Soni
Linux Technology Center, 
IBM India Software Labs,
Bangalore, India
email: [email protected]
Phone: 91-80-25044990
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux