Re: cpu hotplug oops on 2.6.15-rc5

On Thu, Dec 22, 2005 at 11:37:00AM -0600, Sonny Rao wrote:
> On Thu, Dec 22, 2005 at 01:27:43AM -0800, Ravikiran G Thirumalai wrote:
> > On Mon, Dec 19, 2005 at 12:16:59AM -0500, Sonny Rao wrote:
> > > (apologies if this is a dup)
> > > 
> > > Hi, I'm crashing 2.6.15-rc5 when I try to offline the last and only CPU in a node on a ppc64 Power5; SMT was disabled.
> > > 
> > > Here's the backtrace:
> > > 
> > > 0:mon> t
> > > [c0000001ad033820] c000000000096a7c .kfree+0x250/0x280
> > > [c0000001ad0338d0] c00000000009a544 .cpuup_callback+0x238/0x5fc
> > > [c0000001ad0339c0] c000000000068114 .notifier_call_chain+0x68/0x9c
> > > [c0000001ad033a50] c0000000000789fc .cpu_down+0x1fc/0x368
> > > [c0000001ad033b40] c0000000002ac658 .store_online+0x88/0xe8
> > > [c0000001ad033bd0] c0000000002a6f14 .sysdev_store+0x4c/0x68
> > > [c0000001ad033c50] c000000000110368 .sysfs_write_file+0x100/0x1a0
> > > [c0000001ad033cf0] c0000000000be854 .vfs_write+0x100/0x200
> > > [c0000001ad033d90] c0000000000bea64 .sys_write+0x54/0x9c
> > > [c0000001ad033e30] c000000000008600 syscall_exit+0x0/0x18
> > > --- Exception: c01 (System Call) at 000000000fe5ec10
> > > SP (ffc4c4f0) is in userspace
> > > 
> > > 0:mon> e
> > > cpu 0x0: Vector: 300 (Data Access) at [c0000001ad033520]
> > >     pc: c00000000048bd30: ._spin_lock+0x18/0x80
> > >     lr: c000000000096a7c: .kfree+0x250/0x280
> > >     sp: c0000001ad0337a0
> > >    msr: 8000000000001032
> > >    dar: 48
> > >  dsisr: 40000000
> > >   current = 0xc0000001aff12040
> > >   paca    = 0xc0000000005c1000
> > >     pid   = 17376, comm = bash
> > > 
> > > 
> > 
> > Sonny,
> > Does this patch fix the issue?   This one applies cleanly on 2.6.15-rc6
> > unlike the one that was sent to you earlier.
> 
> Hi, thanks. Now I'm getting a slightly different error,
> hitting a BUG in the slab debug code:
> 
> ihplus:~ # echo 0 > /sys/devices/system/cpu/cpu14/online 
> cpu 0x4: Vector: 700 (Program Check) at [c0000003a8c233f0]
>     pc: c00000000009bb2c: .check_slabp+0x130/0x188
>     lr: c00000000009bb28: .check_slabp+0x12c/0x188
>     sp: c0000003a8c23670
>    msr: 8000000000021032
>   current = 0xc0000001b95297f0
>   paca    = 0xc0000000005d7000
>     pid   = 11116, comm = bash
> kernel BUG in check_slabp at mm/slab.c:2368!
> enter ? for help
> 
> 
> 4:mon> t
> [c0000003a8c23700] c00000000009d918 .free_block+0x168/0x294
> [c0000003a8c237e0] c00000000009d1dc .kfree+0x2b8/0x2d4
> [c0000003a8c238a0] c0000000000a1644 .cpuup_callback+0x144/0x618
> [c0000003a8c239b0] c0000000004a0780 .notifier_call_chain+0x68/0x9c
> [c0000003a8c23a40] c00000000007d608 .cpu_down+0x1fc/0x358
> [c0000003a8c23b30] c0000000002bb4ec .store_online+0x88/0xe8
> [c0000003a8c23bc0] c0000000002b5c14 .sysdev_store+0x4c/0x68
> [c0000003a8c23c40] c000000000119c6c .sysfs_write_file+0x118/0x1bc
> [c0000003a8c23cf0] c0000000000c6078 .vfs_write+0x100/0x200
> [c0000003a8c23d90] c0000000000c6288 .sys_write+0x54/0x9c
> [c0000003a8c23e30] c000000000008600 syscall_exit+0x0/0x18
> --- Exception: c01 (System Call) at 000000000fe5ec10
> SP (ff865560) is in userspace

More details: 

The above crash was with SMT on, after I had already offlined the SMT
sibling thread.
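
For anyone reproducing this without a shell, the offlining step shown in the transcripts (`echo 0 > /sys/devices/system/cpu/cpu14/online`) can be sketched in C roughly as below. The helper names `cpu_online_path` and `write_cpu_state` are made up for illustration; only the sysfs path layout is taken from the commands in this thread. Writing to the real file needs root and, on an affected kernel, may trigger the crash.

```c
#include <stdio.h>

/*
 * Hypothetical helper: build the sysfs control path for a CPU,
 * e.g. cpu 14 -> /sys/devices/system/cpu/cpu14/online.
 * Returns 0 on success, -1 if the buffer is too small.
 */
int cpu_online_path(char *buf, size_t len, unsigned int cpu)
{
    int n = snprintf(buf, len, "/sys/devices/system/cpu/cpu%u/online", cpu);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: write "1" (online) or "0" (offline) to the
 * given control file. Returns 0 on success, -1 on any I/O failure.
 */
int write_cpu_state(const char *path, int online)
{
    FILE *f = fopen(path, "w");
    int ok;

    if (!f)
        return -1;

    ok = fputs(online ? "1" : "0", f) >= 0;

    /* The write may only be reported as failed when the stream is flushed. */
    if (fclose(f) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```

Calling `cpu_online_path(buf, sizeof(buf), 14)` followed by `write_cpu_state(buf, 0)` would correspond to the `echo 0` command used above.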

When I boot with SMT off, I get a slightly different crash:

ihplus:~ # echo 0 > /sys/devices/system/cpu/cpu14/online 
cpu 0x0: Vector: 700 (Program Check) at [c0000003afa13480]
    pc: c00000000009d960: .free_block+0x1b0/0x294
    lr: c00000000009d95c: .free_block+0x1ac/0x294
    sp: c0000003afa13700
   msr: 8000000000021032
  current = 0xc0000003afe04000
  paca    = 0xc0000000005d5000
    pid   = 10998, comm = bash
kernel BUG in free_block at mm/slab.c:2664!
enter ? for help

0:mon> t
[c0000003afa137e0] c00000000009d1dc .kfree+0x2b8/0x2d4
[c0000003afa138a0] c0000000000a1644 .cpuup_callback+0x144/0x618
[c0000003afa139b0] c0000000004a0780 .notifier_call_chain+0x68/0x9c
[c0000003afa13a40] c00000000007d608 .cpu_down+0x1fc/0x358
[c0000003afa13b30] c0000000002bb4ec .store_online+0x88/0xe8
[c0000003afa13bc0] c0000000002b5c14 .sysdev_store+0x4c/0x68
[c0000003afa13c40] c000000000119c6c .sysfs_write_file+0x118/0x1bc
[c0000003afa13cf0] c0000000000c6078 .vfs_write+0x100/0x200
[c0000003afa13d90] c0000000000c6288 .sys_write+0x54/0x9c
[c0000003afa13e30] c000000000008600 syscall_exit+0x0/0x18
--- Exception: c01 (System Call) at 000000000fe5ec10
SP (ff8b4560) is in userspace

This one points to a double free somewhere.
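
To illustrate what "double free" means in this context, here is a toy sketch of how a debugging allocator can catch one by tracking a per-object free flag. This is not the code in mm/slab.c; `struct toy_slab`, `toy_alloc` and `toy_free` are hypothetical names, and the real slab debug path calls BUG() where this sketch returns an error code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_SLAB_OBJS 8
#define TOY_OBJ_SIZE  32

/* Hypothetical toy slab: fixed object array plus one free flag per object. */
struct toy_slab {
    unsigned char objs[TOY_SLAB_OBJS][TOY_OBJ_SIZE];
    bool is_free[TOY_SLAB_OBJS];
    unsigned int inuse;
};

/* Mark every object as free and reset the in-use counter. */
void toy_slab_init(struct toy_slab *s)
{
    memset(s, 0, sizeof(*s));
    for (size_t i = 0; i < TOY_SLAB_OBJS; i++)
        s->is_free[i] = true;
}

/* Hand out the first free object, or NULL if the slab is full. */
void *toy_alloc(struct toy_slab *s)
{
    for (size_t i = 0; i < TOY_SLAB_OBJS; i++) {
        if (s->is_free[i]) {
            assert(s->inuse < TOY_SLAB_OBJS);
            s->is_free[i] = false;
            s->inuse++;
            return s->objs[i];
        }
    }
    return NULL;
}

/*
 * Return an object to the slab.
 * Returns 0 on success, -1 if the pointer is not an object of this
 * slab, and -2 if the object is already free (a double free).
 */
int toy_free(struct toy_slab *s, void *p)
{
    uintptr_t base = (uintptr_t)&s->objs[0][0];
    uintptr_t addr = (uintptr_t)p;
    size_t idx;

    if (addr < base || addr >= base + sizeof(s->objs))
        return -1;
    if ((addr - base) % TOY_OBJ_SIZE != 0)
        return -1;

    idx = (addr - base) / TOY_OBJ_SIZE;
    if (s->is_free[idx])
        return -2;              /* second free of the same object */

    s->is_free[idx] = true;
    s->inuse--;
    return 0;
}
```

Freeing the same pointer twice makes the second `toy_free` call return -2 instead of silently corrupting the bookkeeping, which is the kind of check a slab debug build performs before it stops the kernel.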

Sonny