> [root@turyxsrv ~]# mcelog
> MCE 0
> HARDWARE ERROR. This is *NOT* a software problem!
> Please contact your hardware vendor
> CPU 1 4 northbridge TSC 89a560bb249
> ADDR 1dfa49690
> Northbridge Chipkill ECC error
> Chipkill ECC syndrome = 2021
> bit46 = corrected ecc error
> bus error 'local node response, request didn't time out
> generic read mem transaction
> memory access, level generic'
> STATUS 9410c00020080a13 MCGSTATUS 0
> The error repeats whenever I do any kind of operation...
> How severe are ChipKill errors? Should I consider throwing away CPU 1
> and getting another one?
That sounds to me more like some of the RAM attached to CPU1 is bad.
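
For what it's worth, the STATUS word in the log can be decoded by hand, and it points the same way: a corrected memory ECC error, not an uncorrected one. Below is a minimal sketch, assuming the AMD K8 northbridge MCA status layout (valid = bit 63, uncorrected = bit 61, corrected ECC = bit 46, ChipKill syndrome split across bits 54:47 and 31:24). That layout and the helper names are assumptions for illustration, not something taken from the mcelog source, so check the BKDG for the CPU family in question before relying on it.

```python
# Sketch only: the bit positions below are an ASSUMED layout of the
# AMD K8 northbridge MCA status register, not taken from mcelog itself.

STATUS = 0x9410C00020080A13  # STATUS value from the mcelog output above


def bit(value, n):
    """Return True if bit n of value is set."""
    return bool((value >> n) & 1)


def decode_nb_status(status):
    """Split a 64-bit northbridge status word into named fields."""
    syndrome_lo = (status >> 47) & 0xFF  # syndrome bits 7:0 (status 54:47)
    syndrome_hi = (status >> 24) & 0xFF  # syndrome bits 15:8 (status 31:24)
    return {
        "valid": bit(status, 63),
        "overflow": bit(status, 62),
        "uncorrected": bit(status, 61),
        "addr_valid": bit(status, 58),
        "context_corrupt": bit(status, 57),
        "corrected_ecc": bit(status, 46),
        "uncorrected_ecc": bit(status, 45),
        "syndrome": (syndrome_hi << 8) | syndrome_lo,
        "ext_error_code": (status >> 16) & 0xF,
        "error_code": status & 0xFFFF,
    }


if __name__ == "__main__":
    for name, value in decode_nb_status(STATUS).items():
        # Print flags as True/False and numeric fields in hex.
        shown = value if isinstance(value, bool) else hex(value)
        print(f"{name:16} {shown}")
```

Under that layout the syndrome comes out as 0x2021 and the corrected-ECC flag is set while the uncorrected flag is clear, which agrees with the "Chipkill ECC syndrome = 2021" and "bit46 = corrected ecc error" lines mcelog printed.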
I took out CPU1. The errors went away, but so did half of the RAM
(which is accessible only to CPU1).
Okay, I would test by swapping the RAM of CPU0 over to CPU1. If
I get the messages again, I would change the RAM.
Thanks.
Om.