Ric Moore wrote:
On Sat, 2008-03-22 at 10:03 -0500, Roger Heflin wrote:
Ric Moore wrote:
On Thu, 2008-03-20 at 21:58 -0500, Roger Heflin wrote:
Brent Snow, Mr. wrote:
Hi All,
I am having a problem with a new Dell PowerEdge 1900 Server
running Fedora 8.
The System setup is as follows:
2 - Xeon E5310 (Quad-Core 1.6 GHz) processors
16 GB of RAM, 1 SATA 80 GB HDD.
Holy Smokes! 2 quad cores? That's 8 cores total(?) and 16 GIGS of Ram??
My Gawd, not only am I jealous as all hell, I'm wondering what kinda
kernel are you running?? Any sort of stock kernel would roll over and
join the Choir Eternal.
Actually fairly normal kernels work just fine on the large boxes. I have run
stock FC6 kernels on up to 8 CPUs/16 cores and up to 64GB of RAM with no issues.
Wouldn't you be running some sort of mini clustering setup?? Set up
right, it should really blow serious coal. Your problem might lie in
that direction. You might have training wheels on a Dodge Hemi. With a
machine like that, I could almost do without eating!
<huge drooling grins> Ric
Clustering setups are only needed when you have more than 1 machine. Having lots
of CPUs on a single machine is much easier than clustering, as you don't have
to worry about the networking, and the memory can be shared easily between
the CPUs.
Huh, I wonder then why he's having problems. In the -OLD- days he'd be
rolling a new kernel. Is the stock kernel multi-cpu aware or does he
need a more specialized kernel, or is it the kernel at all?? That's
where I would be looking, fer sure. God, I want one like he's got.
<scratching strong itch> I always stay a couple of years behind. :) Ric
Hyperthreading and dual-core CPUs have both been around long enough that pretty
much every distribution ships with SMP on *NOW*. And you are correct: several
years ago, SMP was off by default on a number of distributions, so you
almost always had to compile your own kernel.
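
A quick way to confirm that a stock kernel really sees all the cores is to count
the "processor" entries in /proc/cpuinfo. A minimal sketch, assuming the usual
x86 layout of that file; the function name count_processors is made up for
illustration and is not from any tool mentioned in this thread:

```python
import os


def count_processors(cpuinfo_text):
    """Count logical CPUs listed in /proc/cpuinfo-style text.

    Assumes the x86 layout, where each logical CPU starts with a
    line of the form 'processor : N'.
    """
    return sum(
        1
        for line in cpuinfo_text.splitlines()
        if line.split(":")[0].strip() == "processor"
    )


if __name__ == "__main__":
    path = "/proc/cpuinfo"
    if os.path.exists(path):
        with open(path) as f:
            print("logical CPUs seen by the kernel:", count_processors(f.read()))
    else:
        print(path, "not found (not a Linux box?)")
```

If the count is lower than the number of cores physically installed, SMP support
in the running kernel would be the first thing to look at.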
EDAC errors mean either that the memory is actually bad (or not correctly
seated, or has dirty connectors, or has some other issue), or that EDAC has some
sort of issue with either his BIOS or his hardware. I guess the easiest way
to test would be to try a minimum RAM configuration and see if *ANY* config
gets no EDAC errors. If he can find a configuration that has no errors, then it
is fairly likely that EDAC actually works on that MB, and it is likely he has
one of the other problems.
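
To compare RAM configurations, it may help to read the EDAC counters directly
rather than waiting for log messages. A rough sketch, assuming the EDAC driver
for the chipset is loaded and exposes the usual per-controller ce_count
(corrected) and ue_count (uncorrected) files under /sys/devices/system/edac/mc;
the function name read_edac_counts is hypothetical:

```python
import glob
import os


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {controller: {'ce_count': int|None, 'ue_count': int|None}}.

    A counter is None when its file is missing or unreadable.
    """
    counts = {}
    for mc_dir in sorted(glob.glob(os.path.join(root, "mc*"))):
        entry = {}
        for name in ("ce_count", "ue_count"):
            try:
                with open(os.path.join(mc_dir, name)) as f:
                    entry[name] = int(f.read().strip())
            except (OSError, ValueError):
                entry[name] = None
        counts[os.path.basename(mc_dir)] = entry
    return counts


if __name__ == "__main__":
    result = read_edac_counts()
    if not result:
        print("no EDAC memory controllers found (driver not loaded?)")
    for mc, entry in result.items():
        print(mc, "corrected:", entry["ce_count"],
              "uncorrected:", entry["ue_count"])
```

Running this after each change of DIMM population and noting whether the counts
climb gives a simple record of which configurations stay clean.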
It is really much harder to build the big machines. They have more DIMMs to
start with, and each of the DIMMs has 2x-4x the number of chips that a
normal PC DIMM has (ignoring the ECC chips the DIMM has). This is because the
DIMMs are often double-sided and sometimes on top of that have 2 or 4 chips
stacked on top of each other to increase the capacity (I don't remember the term
for that). Once you start stacking, the fanout on the memory controller
rises, and everything gets a lot nastier and harder to get to work reliably:
timing has to be changed (and how much it has to be changed depends on the
number of DIMMs on the controller). It just gets messy. I have seen some really
weird failures when using all of the DIMM slots on MBs; often things are not
adequately tested by the MB companies and/or noted in the MB manual.
Roger