Re: EDAC error

Ric Moore wrote:
On Sat, 2008-03-22 at 10:03 -0500, Roger Heflin wrote:
Ric Moore wrote:
On Thu, 2008-03-20 at 21:58 -0500, Roger Heflin wrote:
Brent Snow, Mr. wrote:
Hi All,

            I am having a problem with a new Dell PowerEdge 1900 Server
running Fedora 8.

            The System setup is as follows:

            2 - Xeon  E5310 (Quad-Core 1.6 GHz) processors

16 GB of RAM, 1 SATA 80 GB HDD.
Holy Smokes! 2 quad cores? That's 8 cores total(?) and 16 GIGS of Ram??
My Gawd, not only am I jealous as all hell, I'm wondering what kinda
kernel are you running?? Any sort of stock kernel would roll over and
join the Choir Eternal.
Actually, fairly normal kernels work just fine on the large boxes. I have run stock FC6 kernels on up to 8 CPUs/16 cores and up to 64 GB of RAM with no issues.

Wouldn't you be running some sort of mini clustering setup?? Setup
right, it should really blow serious coal. Your problem might lie in
that direction. You might have training wheels on a Dodge Hemi. With a
machine like that, I could almost do without eating! <huge drooling grins> Ric
Clustering setups are only needed when you have more than one machine. Having lots of CPUs in a single machine is much easier than clustering: you don't have to worry about the networking, and the memory can be shared easily between the CPUs.

Huh, I wonder then why he's having problems. In the -OLD- days he'd be
rolling a new kernel. Is the stock kernel multi-cpu aware or does he
need a more specialized kernel, or is it the kernel at all?? That's
where I would be looking, fer sure. God, I want one like he's got.
<scratching strong itch> I always stay a couple of years behind. :) Ric


Hyperthreading and dual core have both been around long enough that pretty much everyone ships with SMP on *NOW*. And you are correct: several years ago SMP was off by default on a number of distributions, so you almost always had to compile your own kernel.

EDAC errors mean either that the memory is actually bad (or not correctly seated, or has dirty connectors, or has some other issue), or that EDAC has some sort of problem with his BIOS or his hardware. I guess the easiest test would be to start from a minimum RAM configuration and see if *ANY* configuration gets no EDAC errors. If he can find a configuration that has no errors, then it is fairly likely that EDAC actually works on that MB, and that he has one of the other problems.
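To compare one RAM configuration against another, it helps to look at the raw counters rather than wait for messages in the log. Below is a rough sketch in Python, not something I have run on that box. It assumes an EDAC driver is loaded and exposes the usual per-controller ce_count (corrected) and ue_count (uncorrected) files under /sys/devices/system/edac/mc; the function name read_edac_counts is just made up for the example.

```python
from pathlib import Path

# Assumed location of the EDAC memory-controller counters in sysfs.
EDAC_ROOT = Path("/sys/devices/system/edac/mc")


def read_edac_counts(root=EDAC_ROOT):
    """Return {controller: {"ce": int|None, "ue": int|None}} for each mc* dir.

    "ce" is the corrected-error count, "ue" the uncorrected-error count.
    A value of None means the counter file was missing or unreadable.
    """
    counts = {}
    root = Path(root)
    if not root.is_dir():
        # No EDAC driver loaded, or a kernel without EDAC support.
        return counts
    for mc in sorted(root.glob("mc[0-9]*")):
        entry = {}
        for key, fname in (("ce", "ce_count"), ("ue", "ue_count")):
            try:
                entry[key] = int((mc / fname).read_text().strip())
            except (OSError, ValueError):
                entry[key] = None
        counts[mc.name] = entry
    return counts


if __name__ == "__main__":
    counts = read_edac_counts()
    if not counts:
        print("No EDAC memory controllers found (is an EDAC driver loaded?)")
    for name, c in counts.items():
        print(f"{name}: corrected={c['ce']} uncorrected={c['ue']}")
```

Run it after booting each DIMM configuration and giving the box some load. A configuration where both counters stay at zero is the "no errors" case described above.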

It is really much harder to build the big machines. They have more DIMMs to start with, and each DIMM has 2x-4x the number of chips that a normal PC DIMM has (ignoring the ECC chips). This is because the DIMMs are often double-sided, and sometimes on top of that have 2 or 4 chips stacked on top of each other to increase the capacity (I don't remember the term for that). Once you start stacking, the fanout on the memory controller rises, and everything gets a lot nastier and harder to get to work reliably. Timing has to be changed, and how much it has to be changed depends on the number of DIMMs on the controller. It just gets messy. I have seen some really weird failures when using all of the DIMM slots on MBs; often things are not adequately tested by the MB companies and/or not noted in the MB manual.


                                 Roger

