Re: xfs|loop|raid: attempt to access beyond end of device

On Sun, Dec 23, 2007 at 08:21:08PM +0100, Janos Haar wrote:
> Hello, list,
> 
> I have a little problem on one of my productive system.
> 
> The system sometimes crashed, like this:
> 
> Dec 23 08:53:05 Albohacen-global kernel: attempt to access beyond end of
> device
> Dec 23 08:53:05 Albohacen-global kernel: loop0: rw=1, want=50552830649176,
> limit=3085523200
> Dec 23 08:53:05 Albohacen-global kernel: Buffer I/O error on device loop0,
> logical block 6319103831146
> Dec 23 08:53:05 Albohacen-global kernel: lost page write due to I/O error on
> loop0

So a long way beyond the end of the device.
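
To put numbers on that, here is a rough sketch in Python that works only from the figures in the log and the /proc/mdstat output quoted further down. It assumes that want= and limit= are counted in 512-byte sectors, that want= is the end sector of the failed request, that /proc/mdstat reports 1 KiB blocks, and that the buffer size on loop0 is 4 KiB. None of that is stated in the report itself, so treat it as a sanity check, not a diagnosis.

```python
SECTOR_BYTES = 512                  # assumption: want=/limit= are 512-byte sectors

want_sectors = 50552830649176       # "want=" from the loop0 message
limit_sectors = 3085523200          # "limit=" from the loop0 message
md2_kib_blocks = 1542761600         # md2 size from /proc/mdstat (assumed 1 KiB blocks)
failed_block = 6319103831146        # "logical block" from the Buffer I/O error line

# Assumption: 4 KiB buffers, i.e. 8 sectors per logical block.
SECTORS_PER_BLOCK = 4096 // SECTOR_BYTES


def sectors_to_tib(sectors):
    """Convert a 512-byte sector count to TiB."""
    return sectors * SECTOR_BYTES / 2**40


# Does the loop device limit match the size of the md2 array underneath it?
limit_matches_md2 = limit_sectors == md2_kib_blocks * 2

# If want= is the end of the request, the block it starts at is one block earlier.
request_start_block = (want_sectors - SECTORS_PER_BLOCK) // SECTORS_PER_BLOCK

# How many times larger than the device is the requested offset?
overshoot = want_sectors / limit_sectors

print("limit matches md2 size:", limit_matches_md2)
print("device size (TiB):", round(sectors_to_tib(limit_sectors), 2))
print("requested end offset (TiB):", round(sectors_to_tib(want_sectors), 2))
print("start block equals logged block:", request_start_block == failed_block)
print("overshoot factor:", round(overshoot))
```

Under those assumptions the limit is exactly the size of md2 and the logged logical block lines up with the want= value, so the messages are self-consistent and the bad offset is coming from above the loop device.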

[snip soft lockup warnings]

> Dec 23 09:08:19 Albohacen-global kernel: Filesystem "loop0": Access to block
> zero in inode 397821447 start_block: 0 start_off: 0 blkcnt: 0 extent-state:
> 0 lastx: e4

And that's to block zero of the filesystem. Sure signs of a corrupted inode
extent btree. We've seen a few of these corruptions reported on loopback
devices recently.

You'll need to unmount and repair the filesystem to make this go away,
but it's hard to know what is causing the btree corruption.

> Dec 23 09:08:22 Albohacen-global last message repeated 19 times
> 
> some more info:
> 
> [root@Albohacen-global ~]# uname -a
> Linux Albohacen-global 2.6.21.1 #3 SMP Thu May 3 04:33:36 CEST 2007 x86_64
> x86_64 x86_64 GNU/Linux
> [root@Albohacen-global ~]# cat /proc/mdstat
> Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
> [multipath] [faulty]
> md1 : active raid4 sdf2[1] sde2[0] sdd2[5] sdc2[4] sdb2[3] sda2[2]
>       19558720 blocks level 4, 64k chunk, algorithm 0 [6/6] [UUUUUU]
>       bitmap: 8/239 pages [32KB], 8KB chunk
> 
> md2 : active raid4 sdf3[1] sde3[0] sdd3[5] sdc3[4] sdb3[3] sda3[2]
>       1542761600 blocks level 4, 64k chunk, algorithm 0 [6/6] [UUUUUU]
>       bitmap: 0/148 pages [0KB], 1024KB chunk
> 
> md0 : active raid1 sdb1[1] sda1[0]
>       104320 blocks [2/2] [UU]
> 
> unused devices: <none>
> [root@Albohacen-global ~]# losetup /dev/loop0
> /dev/loop0: [0010]:6598 (/dev/md2), encryption blowfish (type 18)

You're using an encrypted block device? What mechanism are you using for
encryption (it doesn't appear to be dmcrypt)? Does it handle readahead bio
cancellation correctly? We had similar XFS corruption problems on dmcrypt
between 2.6.14 and ~2.6.20 because dmcrypt did not handle aborted
readahead I/O correctly....

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group
