On running the Stress Test on machine for more than 72 hours following
error message was observed.
0:mon> e
cpu 0x0: Vector: 300 (Data Access) at [c00000007ce2f7f0]
pc: c000000000060d90: .dup_fd+0x240/0x39c
lr: c000000000060d6c: .dup_fd+0x21c/0x39c
sp: c00000007ce2fa70
msr: 800000000000b032
dar: ffffffff00000028
dsisr: 40000000
current = 0xc000000074950980
paca = 0xc000000000454500
pid = 27330, comm = bash
0:mon> t
[c00000007ce2fa70] c000000000060d28 .dup_fd+0x1d8/0x39c (unreliable)
[c00000007ce2fb30] c000000000060f48 .copy_files+0x5c/0x88
[c00000007ce2fbd0] c000000000061f5c .copy_process+0x574/0x1520
[c00000007ce2fcd0] c000000000062f88 .do_fork+0x80/0x1c4
[c00000007ce2fdc0] c000000000011790 .sys_clone+0x5c/0x74
[c00000007ce2fe30] c000000000008950 .ppc_clone+0x8/0xc
--- Exception: c00 (System Call) at 000000000fee9c60
SP (fcb2e770) is in userspace
---------------------------
The problem is because of race window. When if(expand) block is executed in dup_fd
unlocking of oldf->file_lock give a window for fdtable in oldf to be
modified. So actual open_files in oldf may not match with open_files
variable.
This is the debug patch to fix the problem
Please let me know of your opinion. It is generated on:2.6.19-rc1
--- kernel/fork.c.orig 2006-11-10 14:42:02.000000000 +0530
+++ kernel/fork.c 2006-11-10 14:42:30.000000000 +0530
@@ -687,6 +687,7 @@ static struct files_struct *dup_fd(struc
* the latest pointer.
*/
spin_lock(&oldf->file_lock);
+ open_files = count_open_files(old_fdt);
old_fdt = files_fdtable(oldf);
}
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]