Re: F7: Trying to figure out why kernel crashes with journal commit I/O error

Hello Howard,

OK, within 12 hours of starting up the new machine, which runs the same software and the exact same data feed as the other, slower machines, I get

kernel: journal commit I/O error

I can log in, but can't run any commands. A manual power-down (shutdown -r now won't work) and reboot clears it fine.

At first I suspected a hard drive error on both machines. But then
replacement hard drives came in. They seemed to stop the problem for a few days, so I closed a bugzilla I had. Nope: this weekend, it went back to crashing every 4-18 hours.

I tried cutting the reads/writes roughly in half by reducing the
amount of data/files coming in, to no effect.

I have:

Replaced the hard drive 3 times with new ones (to no avail)

Reduced the read/writes by around half

Turned off legacy USB support, which also caused my keyboard and mouse to stop working with errors (that's been cleared and is OK)

Filed a bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=318661

Tonight, I tried using the original kernel that came with F7
(2.6.21-1.3194.fc7) instead of the latest (2.6.22.9-91.fc7).
As of two hours into this, so far so good, but I'm not confident.

Two other machines, Pentium 4s at 3 GHz with ASUS motherboards, purr like kittens.

Has anyone seen anything like this, or know what could be the problem?

As always, grateful for any help, and thanks for reading this!

Gilbert

*******************************************************************************
Gilbert Sebenste                                                     ********
(My opinions only!)                                                  ******
*******************************************************************************

> I would suspect a hardware issue with the motherboards as my first port of call. I had a similar problem with a new Pentium 4 board recently, where the ATA disc interface went offline every 18 hours or so, but since I replaced the drive with a SATA one the system purrs for weeks.

On two new PC's? Showing identical symptoms? I find that hard to believe.
But on the other hand...

> Secondly, the kernel version may be important - Core 2 Quad processors are newish, so a later kernel SHOULD have better support. Maybe try a development kernel on one of the machines, e.g. 2.6.23.-----

This is what I am wondering...if it *is* the kernel, udev, or something like that. This thing has 2 gb/sec throughput...it shouldn't be doing this.

> Finally, have you run a full fsck on the drives after they fail? Reboot into single-user mode and run fsck -f. You may find that the problem is a disc structure corruption ... then you have to find out why.

I need to do that...thanks for the reminder.
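
For the "find out why" part, it may help to look at what the kernel logged just before each failure. Below is a minimal Python sketch, not a definitive tool: it pulls out suspicious-looking kernel lines that appear shortly before every "journal commit I/O error" entry. The function name, the 20-line window, and the patterns in SUSPECT are all assumptions made for illustration; only the journal error string itself comes from this thread. Note also that if the filesystem holding the logs went read-only at the time of the error, the last messages may never have reached disk.

```python
import re

# The only string taken from this thread is the journal error itself.
JOURNAL_ERROR = "journal commit I/O error"

# Assumed patterns for disk/controller trouble; adjust to what your logs show.
SUSPECT = re.compile(r"I/O error|ata\d+|EXT3-fs error|Buffer I/O", re.IGNORECASE)


def context_before_journal_error(lines, window=20):
    """For each journal error line, collect suspicious lines in the
    `window` lines that precede it.

    Returns a list of (error_line, [suspicious_lines]) tuples.
    """
    reports = []
    for i, line in enumerate(lines):
        if JOURNAL_ERROR in line:
            start = max(0, i - window)
            hits = [l.rstrip("\n") for l in lines[start:i] if SUSPECT.search(l)]
            reports.append((line.rstrip("\n"), hits))
    return reports


# Example use (the path is an assumption; adjust for your syslog setup):
#
#   with open("/var/log/messages", errors="replace") as fh:
#       for error, hits in context_before_journal_error(fh.readlines()):
#           print(error)
#           for hit in hits:
#               print("    " + hit)
```

If the same controller or device name shows up before every failure, that would point toward hardware or the driver rather than the filesystem; an empty list proves little, for the reason given above.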

> You do not say which journalling file system you are using - is this ext3, jfs, reiserfs, ...

ext3.

> Finally, have you run memtest86+ on these machines? There may be memory dropout going unnoticed (especially if they do not have ECC memory).

Not yet. But I can tell you "top" reports the full 4 GB I have installed. Of course, that doesn't mean much. Again, I find it very difficult to believe that two machines would both have this problem. That said, I'm not ruling out anything.
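
The figure "top" shows comes from the kernel's own accounting, which can also be read directly. A minimal Python sketch follows, assuming the usual "MemTotal: <n> kB" line in /proc/meminfo; the function name is made up for illustration. It only confirms how much RAM the kernel detected, and says nothing about whether that RAM is reliable, which is what memtest86+ is for.

```python
def mem_total_kb(meminfo_text):
    """Return the MemTotal value (in kB) from the text of /proc/meminfo."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            # Line looks like: "MemTotal:        4045312 kB"
            return int(line.split()[1])
    raise ValueError("MemTotal not found")


# Example use on a Linux machine:
#
#   with open("/proc/meminfo") as fh:
#       print(mem_total_kb(fh.read()), "kB detected by the kernel")
```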

> Not sure if this will help but hope it is not just noise....

No, it helped, thanks. Any other suggestions, I'll take them.

*******************************************************************************
Gilbert Sebenste                                                     ********
(My opinions only!)                                                  ******
*******************************************************************************

