Petr Vandrovec <[email protected]> wrote:
>
> Andrew Morton wrote:
> > Petr Vandrovec <[email protected]> wrote:
> >
> >>Andrew Morton wrote:
> >> > Petr Vandrovec <[email protected]> wrote:
> >> >
> >> >> so now once crashes on UP system were sorted out, I tried to
> >> >> put new kernel on my SMP host - and sorry to say, but it does not
> >> >> seem to work as advertised :-(
> >> >
> >> > .config (again), please.
> >>
> >> Any SMP with NUMA. One which I'm trying to debug now is attached.
> >> It is available at http://vana.vc.cvut.cz/config as well.
> >
> > I can get 2.6.14-rc1 to crash with your .config, but current -linus is OK.
>
> It still dies for me, with current git (tree 7513cdadc661cfe0bd1625145a4876e54df191ca,
> commit 6c0741fbdee5bd0f8ed13ac287c4ab18e8ba7d83). Config is available at
> http://platan.vc.cvut.cz/config-vana.txt. Box is dual opteron Tyan K8W, S2885.
>
> ...
>
> ide0: BM-DMA at 0xffa0-0xffa7, BIOS settings: hda:DMA, hdb:pio
> ide1: BM-DMA at 0xffa8-0xffaf, BIOS settings: hdc:DMA, hdd:pio
> ----------- [cut here ] --------- [please bite here ] ---------
> Kernel BUG at mm/slab.c:1849
> invalid operand: 0000 [1] SMP
> CPU 0
> Modules linked in:
> Pid: 8, comm: events/0 Not tainted 2.6.14-rc1-6c07 #1
> RIP: 0010:[<ffffffff8016d316>] <ffffffff8016d316>{free_block+294}
> RSP: 0000:ffff81007ff21d88 EFLAGS: 00010002
> RAX: 0000000000000001 RBX: 0000000000000001 RCX: 0000000000000310
> RDX: 0000000000000000 RSI: ffff81007ffddd10 RDI: ffff81007ffda080
> RBP: ffff81007ffde000 R08: ffff81003ffa0d50 R09: 0000000000000000
> R10: 00000000ffffffff R11: 0000000000000000 R12: ffff81007ffc9b50
> R13: ffff81007ffde048 R14: ffff81007ffda080 R15: ffff81007ffda080
> FS: 0000000000000000(0000) GS:ffffffff805f2800(0000) knlGS:0000000000000000
> CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> CR2: 0000000000000000 CR3: 0000000000101000 CR4: 00000000000006e0
> Process events/0 (pid: 8, threadinfo ffff81007ff20000, task ffff81003ff8c790)
> Stack: 0000000000000000 0000000000000000 0000000000000292 hda: _NEC DVD_RW ND-3500AG, ATAPI CD/DVD-ROM drive
> 0000000200000000
> ffff81007ffddd10 ffff81007ffddd10 ffff81007ffddce8 0000000000000002
> 0000000000000000 ffff81007ffda080
> Call Trace:<ffffffff8016e8b7>{drain_array_locked+167} <ffffffff8016e9f7>{cache_reap+231}
> <ffffffff80131e23>{__wake_up+67} <ffffffff8016e910>{cache_reap+0}
> <ffffffff8014930c>{worker_thread+476} <ffffffff80131d60>{default_wake_function+0}
> <ffffffff80131d60>{default_wake_function+0} <ffffffff80149130>{worker_thread+0}
> <ffffffff8014db82>{kthread+146} <ffffffff8010ec22>{child_rip+8}
> <ffffffff80149130>{worker_thread+0} <ffffffff8014daf0>{kthread+0}
> <ffffffff8010ec1a>{child_rip+0}
>
> Code: 0f 0b 68 9d 26 3d 80 c2 39 07 48 89 ee 4c 89 ff 4c 8d 75 30
> RIP <ffffffff8016d316>{free_block+294} RSP <ffff81007ff21d88>
> ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Well. The CPU_UP_CANCELED locking in cpuup_callback() looks borked to me -
it takes cachep->nodelists[node]->list_lock and then calls
drain_alien_cache() which appears to take the same lock. But that's not
the problem here.
The code in cache_reap() recalculates numa_node_id() multiple times, so if
the caller changes CPUs then this assertion will trigger. However it's
running under keventd here, which is pinned to a single CPU. Still, it
would be useful if you could try putting preempt_disable()s in
cache_reap(), or change cache_reap() to evaluate numa_node_id() just the
once, and cache that in a local variable.
I wonder why numa_node_id() uses raw_smp_processor_id()? That's just
asking for preempt non-atomicity bugs.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
|
|