Re: [usb-storage] single bit errors on files stored on USB-HDDs via USB2/usb_storage

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Alan Stern wrote:
> On Wed, 6 Dec 2006, Matthias Schniedermeyer wrote:
> 
> 
>>Hi
>>
>>
>>I'm using a Bunch auf HDDs in USB-Enclosures for storing files.
>>(currently 38 HDD, with a total capacity of 9,5 TB of which 8,5 TB is used)
>>
>>After i realised about a year(!) ago that the files copied to the HDDs
>>sometimes aren't identical to the "original"-files i changed my
>>procedured so that each file is MD5 before and after and deleted/copied
>>again if an error is detected.
>>
>>My averate file size is about 1GB with files from about 400MB to 5000MB
>>I estimate the average error-rate at about one damaged file in about
>>10GB of data.
>>
>>I'm not sure and haven't checked if the files are wrongly written or
>>"only" wrongly read back as i delete the defective files and copy them
>>again.
>>
>>Today i copied a few files back and checked them against the stored MD5
>>sums and 5 files of 86 (each about 700 MB) had errors. So i copied the 5
>>files again. 4 of the files were OK after that and coping the last file
>>the third time also resulted in the correct MD5.
>>
>>This time i kept the defective files and used "vbindiff" to show me the
>>difference. Strangly in EVERY case the difference is a single bit in a
>>sequence of "0xff"-Bytes inside a block of varing bit-values that
>>changed a "0xff" into a "0xf7".
>>Also interesting is that each error is at a 0xXXXXXXX5-Position
>>
>>Attached is a file with 5 of the 6 differences named 1-5. Of each of the
>>5 2x3 lines-blocks the first 3 lines are the original the following 3
>>lines contain the error in the middle row 6th value.
>>
>>NEVER did i see any messages in syslog regarding erros or an aborting
>>program due to errors passed down from the kernel or something like that.
> 
> 
> This was almost certainly caused by hardware flaws in the USB interface 
> chips of the enclosures.  There's nothing the kernel can do about it 
> because the errors aren't reported; all that happens is that incorrect 
> data is sent to or from the drive.

So pretty much all ich can do is to pray that the errors don't corrupt
the Filesystem-Metadata (XFS).

So i should definetly consider writing me a "NO-FS" where the
"filesystem"-part is stored elsewhere and the HDD contains 100% content
(Minus a Dummy-MBR-Block for sector 0). On the plus side such a
filesystem won't have any overhead at all, but on the flipside you loose
pretty much the whole content if you lose the metadata. But i guess in
my case it would considerably lower the risk of loosing data.




Bis denn

-- 
Real Programmers consider "what you see is what you get" to be just as
bad a concept in Text Editors as it is in women. No, the Real Programmer
wants a "you asked for it, you got it" text editor -- complicated,
cryptic, powerful, unforgiving, dangerous.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux