Re: 2.6.14-rc1-git-now still dying in mm/slab - this time line 1849

Petr Vandrovec <[email protected]> wrote:
>
> Andrew Morton wrote:
> > Petr Vandrovec <[email protected]> wrote:
> > 
> >>Andrew Morton wrote:
> >> > Petr Vandrovec <[email protected]> wrote:
> >> > 
> >> >>   so now once crashes on UP system were sorted out, I tried to
> >> >> put new kernel on my SMP host - and sorry to say, but it does not
> >> >> seem to work as advertised :-(
> >> > 
> >> > .config (again), please.
> >>
> >> Any SMP with NUMA.  One which I'm trying to debug now is attached.
> >> It is available at http://vana.vc.cvut.cz/config as well.
> > 
> > I can get 2.6.14-rc1 to crash with your .config, but current -linus is OK.
> 
> It still dies for me, with current git (tree 7513cdadc661cfe0bd1625145a4876e54df191ca,
> commit 6c0741fbdee5bd0f8ed13ac287c4ab18e8ba7d83).  Config is available at
> http://platan.vc.cvut.cz/config-vana.txt.  Box is dual opteron Tyan K8W, S2885.
> 
> ...
>
>      ide0: BM-DMA at 0xffa0-0xffa7, BIOS settings: hda:DMA, hdb:pio
>      ide1: BM-DMA at 0xffa8-0xffaf, BIOS settings: hdc:DMA, hdd:pio
> ----------- [cut here ] --------- [please bite here ] ---------
> Kernel BUG at mm/slab.c:1849
> invalid operand: 0000 [1] SMP
> CPU 0
> Modules linked in:
> Pid: 8, comm: events/0 Not tainted 2.6.14-rc1-6c07 #1
> RIP: 0010:[<ffffffff8016d316>] <ffffffff8016d316>{free_block+294}
> RSP: 0000:ffff81007ff21d88  EFLAGS: 00010002
> RAX: 0000000000000001 RBX: 0000000000000001 RCX: 0000000000000310
> RDX: 0000000000000000 RSI: ffff81007ffddd10 RDI: ffff81007ffda080
> RBP: ffff81007ffde000 R08: ffff81003ffa0d50 R09: 0000000000000000
> R10: 00000000ffffffff R11: 0000000000000000 R12: ffff81007ffc9b50
> R13: ffff81007ffde048 R14: ffff81007ffda080 R15: ffff81007ffda080
> FS:  0000000000000000(0000) GS:ffffffff805f2800(0000) knlGS:0000000000000000
> CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 0000000000000000 CR3: 0000000000101000 CR4: 00000000000006e0
> Process events/0 (pid: 8, threadinfo ffff81007ff20000, task ffff81003ff8c790)
> Stack: 0000000000000000 0000000000000000 0000000000000292 hda: _NEC DVD_RW ND-3500AG, ATAPI CD/DVD-ROM drive
> 0000000200000000
>         ffff81007ffddd10 ffff81007ffddd10 ffff81007ffddce8 0000000000000002
>         0000000000000000 ffff81007ffda080
> Call Trace:<ffffffff8016e8b7>{drain_array_locked+167} <ffffffff8016e9f7>{cache_reap+231}
>         <ffffffff80131e23>{__wake_up+67} <ffffffff8016e910>{cache_reap+0}
>         <ffffffff8014930c>{worker_thread+476} <ffffffff80131d60>{default_wake_function+0}
>         <ffffffff80131d60>{default_wake_function+0} <ffffffff80149130>{worker_thread+0}
>         <ffffffff8014db82>{kthread+146} <ffffffff8010ec22>{child_rip+8}
>         <ffffffff80149130>{worker_thread+0} <ffffffff8014daf0>{kthread+0}
>         <ffffffff8010ec1a>{child_rip+0}
> 
> Code: 0f 0b 68 9d 26 3d 80 c2 39 07 48 89 ee 4c 89 ff 4c 8d 75 30
> RIP <ffffffff8016d316>{free_block+294} RSP <ffff81007ff21d88>
>   ide0 at 0x1f0-0x1f7,0x3f6 on irq 14

Well.  The CPU_UP_CANCELED locking in cpuup_callback() looks borked to me -
it takes cachep->nodelists[node]->list_lock and then calls
drain_alien_cache() which appears to take the same lock.  But that's not
the problem here.

The code in cache_reap() recalculates numa_node_id() multiple times, so if
the caller changes CPUs then this assertion will trigger.  However it's
running under keventd here, which is pinned to a single CPU.  Still, it
would be useful if you could try putting preempt_disable()s in
cache_reap(), or change cache_reap() to evaluate numa_node_id() just the
once, and cache that in a local variable.
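
To illustrate the second suggestion, here is a minimal userspace sketch of the difference between re-evaluating the node id and caching it once. This is a model only, not the mm/slab.c code: mock_numa_node_id(), set_node_script(), reap_recomputing() and reap_cached() are hypothetical names invented for this example, and a "migration" is simulated by scripting the values the mock returns.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only: none of this is kernel code. */

/* Scripted return values for the hypothetical mock below. */
static const int *node_script;
static size_t node_script_len;
static size_t node_script_pos;

/* Install the sequence of node ids that successive lookups will see. */
static void set_node_script(const int *script, size_t len)
{
    assert(script != NULL && len > 0);
    node_script = script;
    node_script_len = len;
    node_script_pos = 0;
}

/*
 * Hypothetical stand-in for numa_node_id(): returns the next scripted
 * value, repeating the last one once the script is exhausted. A script
 * of {0, 1} models a task that moved to another node between lookups.
 */
static int mock_numa_node_id(void)
{
    int node = node_script[node_script_pos];

    if (node_script_pos + 1 < node_script_len)
        node_script_pos++;
    return node;
}

/*
 * Looks the node up twice, e.g. once to pick a per-node list and once
 * more when freeing into it. Returns 1 if both lookups agree, 0 if
 * they differ (the inconsistency a sanity check would catch).
 */
static int reap_recomputing(void)
{
    int lock_node = mock_numa_node_id();
    int free_node = mock_numa_node_id();

    return lock_node == free_node;
}

/*
 * Looks the node up once and reuses the cached value, so every later
 * step sees the same node even if the task has since moved.
 */
static int reap_cached(void)
{
    const int node = mock_numa_node_id();
    int lock_node = node;
    int free_node = node;

    return lock_node == free_node;
}
```

With a script of {0, 1}, reap_recomputing() reports a mismatch while reap_cached() stays consistent. In the kernel the other option, preempt_disable()/preempt_enable() around the region, would keep the task from being preempted and moved between the lookups.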

I wonder why numa_node_id() uses raw_smp_processor_id()?  That's just
asking for preempt non-atomicity bugs.
