MCE/SMP problem

I'm seeing an odd problem on a dual Xeon server.  Since a reboot
following a power fault, it no longer boots with an SMP kernel,
although a uniprocessor kernel works fine.

It seems to be related to machine check exceptions (MCE), but there's
nothing in /var/log/mcelog.  Adding the 'nomce' kernel parameter
doesn't help; this is the Oops that appears in that case:

NMI Watchdog detected LOCKUP, CPU=3, registers:
CPU 3
Modules linked in:
Pid: 1,comm: swapper Tainted: G   M  2.6.9-34.106.unsupportedsmp
RIP: 0010:[<ffffffff8011be25>] <ffffffff8011be25>{__smp_call_function+100}
RSP: 0018:00000100bff19cb8  EFLAGS: 00000097
RAX: 0000000000000002 RBX: 0000000000000003 RCX: 0000000000000004
RDX: 0000ffff0000ffff RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000008 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000004 R12: ffffffff8011bece
R13: 0000000000000000 R14: 0000000774977492 R15: ffffffff8031d0af
FS:  0000000000000000(0000) GS: ffffffff804db880(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 00000000bff14000 CR4: 00000000000006e0
Process swapper (pid: 1, threadinfo 0000010037e20000, task 00000100bff9b7f0)
Stack: ffffffff8011bece 0000000000000000 0000000000000002 ffffffff00000000
       ffffffff8031d0c8 0000000000000000 0000000000000900 00000000ffffffff
       ffffffff803d5140 ffffffff8011bf0b
Call Trace:<ffffffff8011bece>{smp_really_stop_cpu+0}
<ffffffff8011bf0b>{smp_send_stop+52}
<ffffffff801367fe>{panic+235} <ffffffff801177ec>{print_mce+136}
<ffffffff801178c4>{mce_available+0}
<ffffffff80117c72>{do_machine_check+916}
<ffffffff8011134f>{machine_check+127}
<ffffffff802446f6>{sysdev_driver_register+29}
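
For anyone who wants to compare this trace against another one, here is a
minimal sketch in Python that pulls the address, symbol and offset out of
each entry.  The function name parse_call_trace and the regular expression
are illustrative assumptions based only on the entries shown above, not a
specification of the kernel's Oops format; offsets are assumed to be
decimal, and a stray ')' in place of the closing '}' is tolerated.

```python
import re

# Matches entries such as <ffffffff8011bf0b>{smp_send_stop+52}.
# Assumptions: 16 hex digit addresses, decimal offsets, and a closing
# '}' that may have been mistyped as ')'.
ENTRY = re.compile(r"<([0-9a-f]{16})>\{(\w+)\+(\d+)[})]")


def parse_call_trace(text):
    """Return (address, symbol, offset) tuples in the order they appear."""
    return [(int(addr, 16), sym, int(off))
            for addr, sym, off in ENTRY.findall(text)]


# The call trace from the Oops above, copied as posted.
TRACE = """\
Call Trace:<ffffffff8011bece>{smp_really_stop_cpu+0}
<ffffffff8011bf0b>{smp_send_stop+52}
<ffffffff801367fe>{panic+235} <ffffffff801177ec>{print_mce+136}
<ffffffff801178c4>{mce_available+0}
<ffffffff80117c72>{do_machine_check+916}
<ffffffff8011134f>{machine_check+127}
<ffffffff802446f6>{sysdev_driver_register+29)
"""

if __name__ == "__main__":
    for addr, sym, off in parse_call_trace(TRACE):
        print(f"{addr:016x} {sym}+{off}")
```

Run on the trace above, it lists the entries from smp_really_stop_cpu
through machine_check and sysdev_driver_register in the order posted.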

I tried Knoppix (which has a newer kernel) and there was a similar
result.

One odd detail is that just before the Oops appears, some MCE events are
listed that refer to CPU 6 and CPU 7, even though those CPUs don't exist
on this machine.


Adam