Re: [patch 4/5] fail-injection capability for disk IO

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Aug 23 2006, Ric Wheeler wrote:
> Jens Axboe wrote:
> >On Wed, Aug 23 2006, Andrew Morton wrote:
> >
> >>On Wed, 23 Aug 2006 14:03:55 +0200
> >>Jens Axboe <[email protected]> wrote:
> >>
> >>
> >>>On Wed, Aug 23 2006, Akinobu Mita wrote:
> >>>
> >>>>This patch provides fail-injection capability for disk IO.
> >>>>
> >>>>Boot option:
> >>>>
> >>>>	fail_make_request=<probability>,<interval>,<times>,<space>
> >>>>
> >>>>	<probability>
> >>>>
> >>>>		specifies how often it should fail in percent.
> >>>>
> >>>>	<interval>
> >>>>
> >>>>		specifies the interval of failures.
> >>>>
> >>>>	<times>
> >>>>
> >>>>		specifies how many times failures may happen at most.
> >>>>
> >>>>	<space>
> >>>>
> >>>>		specifies the size of free space where disk IO can be issued
> >>>>		safely in bytes.
> >>>>
> >>>>Example:
> >>>>
> >>>>	fail_make_request=100,10,-1,0
> >>>>
> >>>>generic_make_request() fails once per 10 times.
> >>>
> >>>Hmm dunno, seems a pretty useless feature to me.
> >>
> >>We need it.  What is the FS/VFS/VM behaviour in the presence of IO
> >>errors?  Nobody knows, because we rarely test it.  Those few times where
> >>people _do_ test it (the hard way), bad things tend to happen.  reiserfs
> >>(for example) likes to go wobble, wobble, wobble, BUG.
> >
> >
> >You misunderstood me - a global parameter is useless, as it makes it
> >pretty impossible for people to use this for any sort of testing (unless
> >it's very specialized). I didn't say a feature to test io errors was
> >useless!
> >
> >
> >>>Wouldn't it make a lot
> >>>more sense to do this per-queue instead of a global entity?
> >>
> >>Yes, I think so.  /sys/block/sda/sda2/make-it-fail.
> >
> >
> >Precisely.
> >
> 
> I think that this is very useful for testing file systems.
> 
> What this will miss is the error path through the lower levels of the IO 
> path (i.e., the libata/SCSI error handling confusion that Mark Lord has 
> been working on patches for would need some error injection at or below 
> the libata level).
> 
> We currently test this whole path with either weird fault injection gear 
> to hit the s-ata bus or the old fashion pile of moderately flaky disks 
> that we try hard not to fix or totally kill.
> 
> It would be really useful to get something (target mode SW disk? libata 
> or other low level error injection?) to test this whole path in software...

Yes, this approach only tests the layer(s) above the device. To simulate
hardware failure or timeouts, I _think_ scsi_debug can already help you
quite a bit. If not, it should be easy enough to extend do add these
sorts of things.

-- 
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux