Hello list,
----- Original Message -----
From: "David Chinner" <[email protected]>
To: "Janos Haar" <[email protected]>
Cc: <[email protected]>
Sent: Tuesday, December 25, 2007 10:13 AM
Subject: Re: xfs|loop|raid: attempt to access beyond end of device
> On Sun, Dec 23, 2007 at 08:21:08PM +0100, Janos Haar wrote:
> > Hello, list,
> >
> > I have a little problem on one of my production systems.
> >
> > The system sometimes crashes, like this:
> >
> > Dec 23 08:53:05 Albohacen-global kernel: attempt to access beyond end of device
> > Dec 23 08:53:05 Albohacen-global kernel: loop0: rw=1, want=50552830649176, limit=3085523200
> > Dec 23 08:53:05 Albohacen-global kernel: Buffer I/O error on device loop0, logical block 6319103831146
> > Dec 23 08:53:05 Albohacen-global kernel: lost page write due to I/O error on loop0
>
> So a long way beyond the end of the device.
>
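(If I read the units right, want= and limit= are 512-byte sector counts: the
limit of 3085523200 sectors matches the ~1.58 TB size of md2 below, while the
requested sector 50552830649176 would be about 25.9 PB, so the write landed
roughly 16000 times past the end of the device.)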
> [snip soft lockup warnings]
>
> > Dec 23 09:08:19 Albohacen-global kernel: Filesystem "loop0": Access to block zero in inode 397821447 start_block: 0 start_off: 0 blkcnt: 0 extent-state: 0 lastx: e4
>
> And that's to block zero of the filesystem. Sure signs of a corrupted
> inode extent btree. We've seen a few of these corruptions on loopback
> devices reported recently.
>
> You'll need to unmount and repair the filesystem to make this go away,
> but it's hard to know what is causing the btree corruption.
Yes, if I do that, the fs is clean again and works well until we move a big
amount of data onto the storage.
But then the problem comes back again and again.
If I can, I will try the latest stable kernel, which has some loop fixes, but
it is not easy: this is a production system.
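For the record, the repair cycle looks roughly like this (/mnt/storage is
only an example mount point, not the real path):

[root@Albohacen-global ~]# umount /mnt/storage
[root@Albohacen-global ~]# xfs_repair /dev/loop0
[root@Albohacen-global ~]# mount -t xfs /dev/loop0 /mnt/storage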
>
> > Dec 23 09:08:22 Albohacen-global last message repeated 19 times
> >
> > some more info:
> >
> > [root@Albohacen-global ~]# uname -a
> > Linux Albohacen-global 2.6.21.1 #3 SMP Thu May 3 04:33:36 CEST 2007 x86_64 x86_64 x86_64 GNU/Linux
> > [root@Albohacen-global ~]# cat /proc/mdstat
> > Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath] [faulty]
> > md1 : active raid4 sdf2[1] sde2[0] sdd2[5] sdc2[4] sdb2[3] sda2[2]
> > 19558720 blocks level 4, 64k chunk, algorithm 0 [6/6] [UUUUUU]
> > bitmap: 8/239 pages [32KB], 8KB chunk
> >
> > md2 : active raid4 sdf3[1] sde3[0] sdd3[5] sdc3[4] sdb3[3] sda3[2]
> > 1542761600 blocks level 4, 64k chunk, algorithm 0 [6/6] [UUUUUU]
> > bitmap: 0/148 pages [0KB], 1024KB chunk
> >
> > md0 : active raid1 sdb1[1] sda1[0]
> > 104320 blocks [2/2] [UU]
> >
> > unused devices: <none>
> > [root@Albohacen-global ~]# losetup /dev/loop0
> > /dev/loop0: [0010]:6598 (/dev/md2), encryption blowfish (type 18)
>
> You're using an encrypted block device?
Yes.
> What mechanism are you using for
> encryption (doesn't appear to be dmcrypt)?
No, this is the "old way", not dm-crypt.
This is cryptoloop.
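The setup is something like this (approximate; /mnt/storage is only an
example mount point, and the passphrase is asked interactively):

[root@Albohacen-global ~]# losetup -e blowfish /dev/loop0 /dev/md2
Password:
[root@Albohacen-global ~]# mount -t xfs /dev/loop0 /mnt/storage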
> Does it handle readahead bio
> cancellation correctly?
I don't know exactly, but in my log these are always write failures! (I
think so because of rw=1 and the "lost page write" messages.)
Is readahead important in this case?
> We had similar XFS corruption problems on dmcrypt
> between 2.6.14 and ~2.6.20 due to a bug in dmcrypt's failure to handle
> aborted readahead I/O correctly....
OK, what is the next step? :-)
(I will try the latest kernel, if I can....)
Can I switch from cryptoloop to dm-crypt without rebuilding the fs?
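(My guess is it would be something like the following with dm-crypt's plain
mode, but I am not sure the cipher mode, IV and passphrase hashing match what
cryptoloop uses, so this is only a guess:

[root@Albohacen-global ~]# cryptsetup create --cipher blowfish-cbc-plain --hash ripemd160 md2crypt /dev/md2
[root@Albohacen-global ~]# mount -o ro -t xfs /dev/mapper/md2crypt /mnt/storage

If the settings don't match, the fs simply won't mount, so trying it
read-only first should be safe.)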
Thanks a lot,
Janos Haar
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> Principal Engineer
> SGI Australian Software Group