On Mon, 8 Oct 2007, George N. White III wrote:
The P4 has been around for years, so that type of system has been pretty well
tested.
This is true!
OK, within 12 hours of starting up the new machine, which runs the same
software and the exact same data feed as the other, slower machines,
I get
kernel: journal commit I/O error
Don't assume the problem is related to your heavy disk I/O. Try some other
workloads. I like to run a suite of benchmarks on new hardware.
They often reveal problems with the initial setup, and are helpful
later on when something seems broken, e.g., why did the last kernel
update cause disk I/O to slow by 50%?
Yessir, I'm trying it with no load right now to see what happens.
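For what it's worth, the "keep a baseline and compare later" idea above could
be sketched roughly like this. It is only a minimal illustration, not a real
benchmark suite: it times one sequential write and nothing else, and the
function names (write_throughput_mb_s, slowdown_fraction) and the default
sizes are made up for the example.

```python
import os
import tempfile
import time


def write_throughput_mb_s(directory, total_mb=64, block_kb=1024):
    """Time a sequential write of total_mb megabytes; return MB/s.

    Hypothetical helper: a single sequential write is a very rough
    stand-in for a proper suite of benchmarks.
    """
    block = b"\0" * (block_kb * 1024)
    blocks = (total_mb * 1024) // block_kb
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(blocks):
                f.write(block)
            f.flush()
            # Force the data out to disk so the page cache does not
            # hide the real cost of the write.
            os.fsync(f.fileno())
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)
    return total_mb / elapsed


def slowdown_fraction(baseline_mb_s, current_mb_s):
    """Fraction by which throughput dropped relative to the baseline."""
    return (baseline_mb_s - current_mb_s) / baseline_mb_s
```

Record the number once when the machine is first set up, then rerun the same
measurement after a kernel update; slowdown_fraction(baseline, current)
returns 0.5 for a 50% drop. Results on a busy machine will be noisy, so it
is best run with the box otherwise idle.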
Are you using x86_64 kernels?
Nope.
I suspect most people with similar workloads
will be using x86_64, so you may be encountering problems specific to code that
hasn't been thoroughly exercised on i386 kernels.
For that reason, I stay away from x86_64 kernels.
In the past, there have
been problems with RH's 4k stack size, particularly during error handling,
that can mask the real source of the problem.
That is true, and it makes me wonder if that is what is happening here.
If you are really stuck with 32-bit kernels, you might try the 16k
versions from linuxant.
Hmmm. Where are they at? And thanks for the thoughts!
*******************************************************************************
Gilbert Sebenste ********
(My opinions only!) ******
*******************************************************************************