Re: [PATCH] scsi_lib.c: continue after MEDIUM_ERROR



Mark Lord wrote:

Eric D. Mudama wrote:
Actually, it's possibly worse, since each failure in libata will generate 3-4 retries. With existing ATA error recovery in the drives, that's about 3 seconds per retry on average, or 12 seconds per failure. Multiply that by the number of blocks past the error to complete the request..
It really beats the alternative of a forced reboot
due to, say, superblock I/O failing because it happened
to get merged with an unrelated I/O which then failed..
Etc..

Definitely an improvement.

The number of retries is an entirely separate issue.
If we really care about it, then we should fix SD_MAX_RETRIES.

The current value of 5 is *way* too high.  It should be zero or one.

Cheers

I think that drives retry enough; we should leave retry at zero for normal (non-removable) drives. Should this be a policy we can set like we do with NCQ queue depth via /sys?

We need to be able to layer things like MD on top of normal drive errors in a way that will produce a system that provides reasonable response time despite any possible IO error on a single component. Another case that we end up doing on a regular basis is drive recovery. Errors need to be limited in scope to just the impacted area and dispatched up to the application layer as quickly as we can so that you don't spend days watching a copy of a huge drive (think 750GB or more) ;-)
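
To put rough numbers on the retry cost, here is a minimal sketch of the arithmetic using the figures Eric quoted above (3-4 retries per failure, about 3 seconds per retry). The function and parameter names are made up for illustration and are not kernel identifiers, and the model pessimistically assumes every block past the first error fails as well:

```python
# Back-of-the-envelope stall estimate for one request that hits bad media.
# Figures come from the thread: 3-4 libata retries per failure at roughly
# 3 seconds each. Names are illustrative only, not kernel identifiers.

def stall_seconds(blocks_after_error, retries_per_failure=4, seconds_per_retry=3.0):
    """Worst-case time spent on the remainder of a request.

    Assumes (pessimistically) that every block past the first error
    also fails and goes through the full retry sequence.
    """
    seconds_per_failure = retries_per_failure * seconds_per_retry
    return blocks_after_error * seconds_per_failure


if __name__ == "__main__":
    # Hypothetical request sizes, counted in blocks remaining after the error.
    for blocks in (1, 8, 256):
        total = stall_seconds(blocks)
        print(f"{blocks:4d} blocks -> {total:7.0f} s ({total / 60:.1f} min)")
```

With these assumptions a single bad block costs about 12 seconds, while a request with 256 blocks left to go could stall for the better part of an hour, which is why the retry count and the scope of the error both matter.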


ric

