Re: F7: Trying to figure out why kernel crashes with journal commit I/O error

Gilbert Sebenste wrote:
Hello all,

I am having an absolutely vexing problem that somebody might be able to shed some light on.

I just got 2 new computers, both running F7. They each have one Seagate 750 GB SATA 3 Gb/s, 7200 RPM, 16 MB drive. Each machine has 4 GB of RAM and a Core 2 Quad 6700 on an ASUS motherboard.

OK. I run the computers pretty hard. But I have two Pentium 4's that work just as hard, all getting a 20 MB/sec peak (1 MB/sec avg) weather feed from the National Weather Service. They run flawlessly for months at a time, until I install a new kernel and reboot.

OK, within 12 hours of starting up a new machine, running the same software and the exact same data feed as the other, slower machines, I get

kernel: journal commit I/O error

I can log in, but can't do commands. A manual power-down (shutdown -r now won't work) and reboot clears it fine.

At first I suspected a hard drive error on both machines. But then replacement hard drives came in. That seemed to stop the problem for a few days, so I closed a bugzilla I had. Nope: this weekend, it went back to crashing every 4-18 hours.

I tried cutting the reads and writes in half by reducing the amount of data/files coming in, to no effect.

I have:

Replaced the hard drive 3 times with new ones (to no avail)

Reduced the read/writes by around half

Turned off legacy USB support, which also caused my keyboard and mouse to stop working with errors (that's been cleared and is OK)

Filed a bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=318661

Tonight, I tried using the original kernel that came with F7
(2.6.21-1.3194.fc7) instead of the latest (2.6.22.9-91.fc-7).
As of two hours into this, so far so good, but I'm not confident.

Two other machines, Pentium 4's at 3 GHz with ASUS motherboards, purr like kittens.

Has anyone seen anything like this, or know what could be the problem?

As always, grateful for any help, and thanks for reading this!

Gilbert

*******************************************************************************
Gilbert Sebenste                                                     ********
(My opinions only!)                                                  ******
*******************************************************************************

I would suspect a hardware issue with the motherboards as my first port of call. I had a similar problem with a new Pentium 4 board recently, where the ATA disc interface went offline every 18 hours or so; having replaced the drive with a SATA one, the system purrs for weeks.

Secondly, the kernel version may be important: Core 2 Quad processors are newish, so a later kernel SHOULD have better support. Maybe try a development kernel on one of the machines, e.g. 2.6.23.-----

Thirdly, have you run a full fsck on the drives after they fail? Reboot into single-user mode and run fsck -f. You may find that the problem is a disc structure corruption ... then you have to find out why.
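
One way to start on the "why" is to look at what the kernel logged just before the error. A minimal Python sketch, assuming the kernel messages end up in a plain-text log file; the function name and the default context size are made up for illustration, and only the error string itself comes from the report above:

```python
from collections import deque

# The message quoted in the original report.
MARKER = "journal commit I/O error"


def lines_before_error(lines, marker=MARKER, context=20):
    """Return up to `context` lines preceding the first line that contains
    `marker`, followed by that line. Returns [] if the marker never appears."""
    recent = deque(maxlen=context)  # rolling window of the latest lines
    for line in lines:
        line = line.rstrip("\n")
        if marker in line:
            return list(recent) + [line]
        recent.append(line)
    return []
```

It takes any iterable of lines, so an open log file can be passed in directly (for example /var/log/messages, if that is where your syslog writes kernel messages). Disc or controller errors just ahead of the journal message would point towards hardware rather than the filesystem.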

You do not say which journalling file system you are using: is it ext3, jfs, reiserfs, ...?
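
Checking which one is in use is easy to script. A small sketch, assuming a Linux /proc/mounts in its usual layout (device, mount point, type, options, ...); the function name is hypothetical and the default list of types is just the three named above, so extend it as needed:

```python
def journaling_mounts(mounts_text, fs_types=("ext3", "jfs", "reiserfs")):
    """Return (device, mount_point, fs_type) for each mount whose type is in
    `fs_types`. `mounts_text` is the text content of /proc/mounts."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        device, mount_point, fs_type = fields[:3]
        if fs_type in fs_types:
            found.append((device, mount_point, fs_type))
    return found
```

On the affected machine it could be called as journaling_mounts(open("/proc/mounts").read()).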

Finally, have you run memtest86+ on these machines? A memory dropout could be going unnoticed (especially if they do not have ECC memory).

Not sure if this will help, but I hope it is not just noise....

--
Howard Wilkinson
Coherent Technology Limited
23 Northampton Square, United Kingdom, EC1V 0HL
Phone: +44(20)76907075
Mobile: +44(7980)639379
Email: howard@xxxxxxxxxxx

