Re: [PATCH] proc-readdir-race-fix-take-3-fix-3

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Oleg Nesterov <[email protected]> writes:

> On 09/07, Oleg Nesterov wrote:
>>
>> On 09/06, Jean Delvare wrote:
>> >
>> > On Wednesday 6 September 2006 11:01, Jean Delvare wrote:
>> > > Eric, Kame, thanks a lot for working on this. I'll be giving some good
>> > > testing to this patch today, and will return back to you when I'm done.
>> > 
>> > The original issue is indeed fixed, but there's a problem with the patch. 
>> > When stressing /proc (to verify the bug was fixed), my test machine ended 
>> > up crashing. Here are the 2 traces I found in the logs:
>> > 
>> > Sep  6 12:06:00 arrakis kernel: BUG: warning at 
>> > kernel/fork.c:113/__put_task_struct()
>> > Sep  6 12:06:00 arrakis kernel:  [<c0115f93>] __put_task_struct+0xf3/0x100
>> > Sep  6 12:06:00 arrakis kernel:  [<c019666a>] proc_pid_readdir+0x13a/0x150
>> > Sep  6 12:06:00 arrakis kernel:  [<c01745f0>] vfs_readdir+0x80/0xa0
>> > Sep  6 12:06:00 arrakis kernel:  [<c0174750>] filldir+0x0/0xd0
>> > Sep  6 12:06:00 arrakis kernel:  [<c017488c>] sys_getdents+0x6c/0xb0
>> > Sep  6 12:06:00 arrakis kernel:  [<c0174750>] filldir+0x0/0xd0
>> > Sep  6 12:06:00 arrakis kernel:  [<c0102fb7>] syscall_call+0x7/0xb
>> 
>> If the task found is not a group leader, we go to retry, but
>> the task != NULL.
>> 
>> Now, if find_ge_pid(tgid) returns NULL, we return that wrong
>> task, and it was not get_task_struct()'ed.

Yep.  That would do it.  And of course having written the
code it was very hard for me to step back far enough to see that.

The other two failure modes still don't make sense to me but
they may have been a side effect of this.

And it would have taken a thread being the highest pid in the
system that exits while we are calling filldir to trigger this.
Wow.  That was a good stress test.


Signed-off-by: Eric W. Biederman <[email protected]>

> Signed-off-by: Oleg Nesterov <[email protected]>
>
> --- t/fs/proc/base.c~	2006-09-07 02:33:26.000000000 +0400
> +++ t/fs/proc/base.c	2006-09-07 02:34:19.000000000 +0400
> @@ -2149,9 +2149,9 @@ static struct task_struct *next_tgid(uns
>  	struct task_struct *task;
>  	struct pid *pid;
>  
> -	task = NULL;
>  	rcu_read_lock();
>  retry:
> +	task = NULL;
>  	pid = find_ge_pid(tgid);
>  	if (pid) {
>  		tgid = pid->nr + 1;
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux