Intel Woodcrest Crash under heavy load with FC5 and MySql

Hello all (especially the very technical),

I have been experiencing hardware lockups and crashes under Linux (Fedora Core 5 latest kernel version 2.6.17-1.2174_FC5smp). The crashes occur under what appears to be very heavy disk access and possibly multiple concurrent access (i.e. multiple threads).

I experience crashes using MySQL (MySQL-server-4.1.21-0.glibc23), the latest 4.1 stable release. In this case we also have multiple threads generating a database of approx. 13-30 GB in size over a period of about 18 hours.

I have also experienced crashes using rsync for local-disk-to-local-disk copies; this creates multiple threads (unlike a simple cp command, which is a single thread).
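For anyone who wants to try reproducing this kind of load without MySQL or rsync, here is a minimal sketch of several threads writing to the same disk at once. It is not our actual workload; the function names, file names and sizes are all made up for illustration, and the defaults are deliberately tiny, so they would need to be scaled up a lot to resemble 18 hours of database generation.

```python
import os
import tempfile
import threading


def write_file(path, total_bytes, block_size, results, index):
    """Write total_bytes of random data to path in block_size chunks, then fsync."""
    written = 0
    with open(path, "wb") as handle:
        while written < total_bytes:
            chunk = os.urandom(min(block_size, total_bytes - written))
            handle.write(chunk)
            written += len(chunk)
        handle.flush()
        os.fsync(handle.fileno())  # push the data out to the disk, not just the page cache
    results[index] = written


def run_disk_stress(directory, workers=4, bytes_per_worker=1 << 20, block_size=64 * 1024):
    """Start `workers` threads that each write one file in `directory` concurrently.

    Returns the total number of bytes written by all threads.
    """
    results = [0] * workers
    threads = []
    for i in range(workers):
        path = os.path.join(directory, "stress_%d.bin" % i)  # hypothetical file name
        thread = threading.Thread(
            target=write_file,
            args=(path, bytes_per_worker, block_size, results, i),
        )
        threads.append(thread)
        thread.start()
    for thread in threads:
        thread.join()
    return sum(results)


if __name__ == "__main__":
    # Uses a scratch directory that is removed afterwards; point it at the
    # array under test and raise the sizes for a real run.
    with tempfile.TemporaryDirectory() as scratch:
        total = run_disk_stress(scratch, workers=4, bytes_per_worker=1 << 20)
        print("wrote %d bytes with 4 concurrent writers" % total)
```

One writer per core (4 here) seemed the natural choice given that the crashes depend on how many cores are enabled, but that is an assumption and not something I have measured.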

The servers are 10 x:

Woodcrest 5160 3 GHz (dual core + dual Xeon) (1333 FSB)
Supermicro servers http://www.supermicro.com/products/system/1U/6015/SYS-6015P-8R.cfm
Motherboard http://www.supermicro.com/products/motherboard/Xeon1333/5000P/X7DBP-8.cfm (BIOS 1.1c)
16 GB FB-DIMM RAM 667 MHz - approved and personally tested by Supermicro USA
3ware 9550SX-4
4 x 500 GB SATA Seagate drives, 16 MB cache


HINTS
====

The crashes ONLY happen if we enable all 4 Cores in the BIOS (Dual core = enabled)

Our tests run perfectly if we disable the second core of each Xeon (i.e. one core from each Xeon)!
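To confirm which of these two configurations a given boot is actually running, the core count the kernel sees can be read from /proc/cpuinfo. Below is a rough sketch that counts logical CPUs, physical packages and cores per package from the "processor", "physical id" and "core id" fields; the function name is hypothetical, and it assumes those fields are present (if "physical id" is missing, everything is lumped into package "0").

```python
import os


def summarize_cpuinfo(text):
    """Count logical CPUs, physical packages and cores per package.

    `text` is the content of /proc/cpuinfo: one block of "key : value"
    lines per logical CPU, with blocks separated by blank lines.
    """
    packages = {}  # physical id -> set of core ids seen in that package
    logical = 0
    current = {}
    # The extra "" flushes the last block if the text has no trailing blank line.
    for line in text.splitlines() + [""]:
        if not line.strip():
            if "processor" in current:
                logical += 1
                package = current.get("physical id", "0")
                core = current.get("core id", current["processor"])
                packages.setdefault(package, set()).add(core)
            current = {}
            continue
        key, _, value = line.partition(":")
        current[key.strip()] = value.strip()
    return {
        "logical_cpus": logical,
        "packages": len(packages),
        "cores_per_package": {p: len(c) for p, c in packages.items()},
    }


if __name__ == "__main__":
    if os.path.exists("/proc/cpuinfo"):
        with open("/proc/cpuinfo") as handle:
            print(summarize_cpuinfo(handle.read()))
```

With dual core enabled on these machines I would expect this to report 2 packages with 2 cores each, and 2 packages with 1 core each when the second core is disabled.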

My questions
=========

Are there any "known" problems with Dual Core Xeons under load, e.g. microcode issues or kernel bugs?

From the kernel's perspective, is there any difference in the code that runs on Dual Core Xeons (i.e. ignoring superficial things like what /proc/cpuinfo reports)?

I assumed that Dual Core would use exactly the same code as the SMP kernel. Is this correct? I'm told it's not.

Are there any specific patches for Dual Core? (I did notice a changelog entry in RH AS 4 that stated something like "improved scheduling for Dual Core".)

Things I've tried
===========
I have tried most combinations of BIOS settings (e.g. ACPI disabled in the BIOS) and kernel parameters (acpi=off, noacpi, noapic, etc.), none of which make any difference: the machines all crash unless I disable Dual Core.

I've had extensive contact with Supermicro, 3ware and now Intel, all of whom are blaming each other.

I've also recompiled the kernel from the FC5 source RPM, with exactly the same results.

I'm told that AMD had a similar problem with one of their dual cores, but this was fixed long ago and I assume that fix was specific to AMD chips and would not apply to Intel due to differences in architecture.

Any suggestions for helping me solve these crash problems would be very much appreciated.


Thanks in advance.

Albert.



BIOS Output on boot:

Phoenix TrustedCore(tm) Server
Copyright 1985-2005 Phoenix Technologies Ltd.
All Rights Reserved

Supermicro X7DBP-8/X7DBP-I BIOS Rev 1.1b

CPU = 2 Processors Detected, Cores per Processor = 2
Intel(R) Xeon(R) CPU            5160  @ 3.00GHz
Intel(R) Xeon(R) CPU            5160  @ 3.00GHz
DRAM Type : DDR2-667, FSB at 1333MHz
16384M System RAM Passed
4096 KB L2 Cache
System BIOS shadowed
Video BIOS shadowed

I will post some crash traces from our serial console server as a reply to this message shortly.







