Re: Fake ext3 corruption on raid5 in 2.6 .9 smp

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

I saw this exact same error (EXT3-fs error (device dm-x): ext3_readdir:
bad entry in directory #nnnnnnn: rec_len % 4 != 0 - offset=0, inode=xxxxxxxx,
rec_len=...

from time to time in my non-SMP RAID system with 512Mb RAM, with ext3 on LVM on top of RAID5.

Never caused actual corruption - run FSCK, no errors, remount rw
successfully until next time; error rarely in the same place, but always
in a directory and rec_len % 4 != 0.  Looks like an 'in-kernel' thing,
because (e.g.) running find on the volume after remounting rw produced
no issues, so presumably the on-disk directory wasn't *really* the
issue.

Filesystems between about 8 and 50 Gb, and not what I'd characterise as a
heavy load.

This was with about 2.6.4 - 2.6.7.  I'm running 2.6.11 now and haven't
seen it in some time; so it was either fixed by 2.6.11, or mounting ro
by default has just reduced my exposure.


John Pearson.


On Thu, May 26, 2005 at 11:34:24AM +0200, Olivier Guerrier wrote
> Hello,
> 
> I've encoutered the following strangeness with a newly installed box.
> This box was not previously running Linux, but has been carefully tested
> with memtest86, cpuburn, and some smalls scripts of mine. So hardware
> failure may be excluded at first.
> 
> The box is an asus a7m266D with dual athlon-MP (true MP), 2Go RAM, and
> the raid 5 set is build on 4 maxtor 160Go connected to 2 additionals
> sil0680 ata133 IDE controllers.
> 
> I use dm aes encryption over sotfware raid5, and under heavy load,
> I get theses error messages, and the partition is remounted ro:
> 
> 8<--
> EXT3-fs error (device dm-6): ext3_readdir: bad entry in directory
> #20515395: rec_len % 4 != 0 - offset=0, inode=1605429031, rec_len=14026,
> name_len=177
> Aborting journal on device dm-6.
> ext3_abort called.
> EXT3-fs error (device dm-6): ext3_journal_start_sb: Detected aborted journal
> Remounting filesystem read-only
> EXT3-fs error (device dm-6) in start_transaction: Journal has aborted
> (repeated 3 times)
> EXT3-fs error (device dm-6) in ext3_ordered_writepage: IO failure
> EXT3-fs error (device dm-6) in start_transaction: Journal has aborted
> (repeated 53 times)
> __journal_remove_journal_head: freeing b_committed_data (repeated 5 times)
> -->8
> 
> It is easily reproducible.
> 
> fsck show no error, no hardware related message found in dmesg, the raid
> set is not marked as bad, no reconstruction needed. I just remount the
> partition rw, and now try to keep the box out of pressure.
> 
> Googling around I found theses threads, but it does not contain a
> conclusion:
> http://marc.theaimsgroup.com/?l=linux-kernel&m=107913912311421&w=2
> and
> http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=+295657
> 
> The kernel is patched with grsec, I have disabled it all but same
> behaviour occur, maybe a vanilla will change something, but previous
> links show same error on non-grsec kernel.
> 
> attached is my .config and boot-time dmesg.
> 
> -------------- next part --------------
> An embedded and charset-unspecified text was scrubbed...
> Name: config.txt
> Url: http://lists.us.dell.com/pipermail/linux-kernel-daily-digest/attachments/20050526/15e8620a/config.txt
> -------------- next part --------------
> An embedded and charset-unspecified text was scrubbed...
> Name: dmesg_boot.txt
> Url: http://lists.us.dell.com/pipermail/linux-kernel-daily-digest/attachments/20050526/15e8620a/dmesg_boot.txt
> 
> ------------------------------

-- 
Voice: +61 8 8202 9040
Email: [email protected]

Oasis Systems Pty Ltd
288 Glen Osmond Road
Fullarton, South Australia 5063

Ph: + 61 8 82029000
Fax: +61 8 82029001

CAUTION: This email and any attachments may contain information that is
confidential and subject to copyright. If you are not the
intended recipient, you must not read, use, disseminate, distribute or
copy this email or any attachments. If you have received this
email in error, please notify the sender immediately by reply email and
erase this email and any attachments.

DISCLAIMER: OASIS Systems uses virus-scanning technology but accepts
no responsibility for loss or damage arising from the use of the
information transmitted by this email including damage from virus.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux