Re: the " 'official' point of view" expressed by kernelnewbies.org regarding reiser4 inclusion

Alan Cox wrote:

On Tue, 2006-08-01 at 11:44 -0500, David Masover wrote:
Yikes.  Undetected.
Wait, what? Disks, at least, would be protected by RAID. Are you telling me RAID won't detect such an error?
Yes.

RAID deals with the case where a device fails. RAID 1 with 2 disks can
in theory detect an internal inconsistency but cannot fix it.

Still, if it does that, that should be enough. The scary part wasn't that there's an internal inconsistency, but that you wouldn't know.

And it can fix it if you can figure out which disk went. Or give it 3 disks and it should be entirely automatic -- admin gets paged, admin hotswaps in a new disk, done.
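
To make the 2-disk versus 3-disk point concrete, here is a minimal sketch of comparing the same block as read from each mirror. It is only an illustration of the voting idea; `check_mirrors` is a hypothetical helper, not how md or any real RAID implementation behaves on reads.

```python
from collections import Counter

def check_mirrors(copies):
    """Compare one block as read from each mirror (hypothetical helper).

    Returns (status, data):
      "ok"       - every copy agrees
      "repaired" - a strict majority agrees; data is the majority value
      "mismatch" - copies disagree and there is no majority (the 2-disk case)
    """
    counts = Counter(copies)
    value, votes = counts.most_common(1)[0]
    if votes == len(copies):
        return "ok", value
    if votes > len(copies) // 2:
        return "repaired", value
    return "mismatch", None

# Two mirrors: the inconsistency is visible, but nothing says which copy is right.
print(check_mirrors([b"balance=1000", b"balance=0000"]))
# Three mirrors: the odd one out can be outvoted automatically.
print(check_mirrors([b"balance=1000", b"balance=1000", b"balance=0000"]))
```

With two copies the best possible outcome is "mismatch"; a third copy (or a per-block checksum) is what turns detection into repair.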

we're OK with that, so long as our filesystems are robust enough. If it's an _undetected_ error, doesn't that cause way more problems (impossible problems) than FS corruption? Ok, your FS is fine -- but now your bank database shows $1k less on random accounts -- is that ok?


Not really no. Your bank is probably using a machine (hopefully using a
machine) with ECC memory, ECC cache and the like. The UDMA and SATA
storage subsystems use CRC checksums between the controller and the
device. SCSI uses various similar systems - some older ones just use a
parity bit so have only a 50/50 chance of noticing a bit error.
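
A small sketch of why a lone parity bit is so much weaker than a CRC: parity notices any odd number of flipped bits and misses any even number, which is roughly where the "50/50" figure comes from. The helpers `parity` and `flip_bits` are made up for this illustration, and `zlib.crc32` merely stands in for whatever CRC a real link uses.

```python
import zlib

def parity(data: bytes) -> int:
    """Even-parity bit over every bit in data (hypothetical helper)."""
    return sum(bin(b).count("1") for b in data) & 1

def flip_bits(data: bytes, positions) -> bytes:
    """Return a copy of data with the given bit positions inverted."""
    out = bytearray(data)
    for p in positions:
        out[p // 8] ^= 1 << (p % 8)
    return bytes(out)

block = b"account balance: 1000"
one_flip = flip_bits(block, [3])
two_flips = flip_bits(block, [3, 12])

print("parity notices 1 flipped bit: ", parity(one_flip) != parity(block))
print("parity notices 2 flipped bits:", parity(two_flips) != parity(block))
print("CRC32 notices 2 flipped bits: ", zlib.crc32(two_flips) != zlib.crc32(block))
```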

Similarly the media itself is recorded with a lot of FEC (forward error
correction) so will spot most changes.

Unfortunately when you throw this lot together with astronomical amounts
of data you get burned now and then, especially as most systems are not
using ECC ram, do not have ECC on the CPU registers and may not even
have ECC on the caches in the disks.

It seems like this is the place to fix it, not the software. If the software can fix it easily, great. But I'd much rather rely on the hardware looking after itself, because when hardware goes bad, all bets are off.

Specifically, it seems like you do mention lots of hardware solutions, that just aren't always used. It seems like storage itself is getting cheap enough that it's time to step back a year or two in Moore's Law to get the reliability.

The sort of changes this needs hit the block layer and every fs.

Seems it would need to hit every application also...


Depending how far you propagate it. Some people working with huge
data sets already write and check user level CRC values for this reason
(in fact bitkeeper does it for one example). It should be relatively
cheap to get much of that benefit without doing application to
application just as TCP gets most of its benefit without going app to
app.
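
As a rough sketch of the user-level check described above: the writer appends a CRC to each record and the reader verifies it before trusting the payload. The record layout and the names `seal` and `unseal` are assumptions made for this example; this is not bitkeeper's actual format.

```python
import struct
import zlib

def seal(payload: bytes) -> bytes:
    """Append a 4-byte big-endian CRC32 of payload (hypothetical layout)."""
    return payload + struct.pack(">I", zlib.crc32(payload))

def unseal(record: bytes) -> bytes:
    """Return the payload if its CRC matches, else raise ValueError."""
    if len(record) < 4:
        raise ValueError("record too short to hold a CRC")
    payload = record[:-4]
    (stored,) = struct.unpack(">I", record[-4:])
    if zlib.crc32(payload) != stored:
        raise ValueError("CRC mismatch: data changed between write and read")
    return payload

record = seal(b"row 42: balance=1000")
assert unseal(record) == b"row 42: balance=1000"

# Simulate a single bit flipped somewhere between the writer and the reader.
damaged = bytearray(record)
damaged[5] ^= 0x01
try:
    unseal(bytes(damaged))
except ValueError as err:
    print("detected:", err)
```

The check covers everything between the writing and the reading application (RAM, bus, controller, media), which is the end-to-end property being argued about here. It only detects the damage; recovery still needs a second copy.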

And yet, if you can do that, I'd suspect you can, should, must do it at a lower level than the FS. Again, FS robustness is good, but if the disk itself is going, what good is having your directory (mostly) intact if the files themselves have random corruptions?

If you can't trust the disk, you need more than just an FS which can mostly survive hardware failure. You also need the FS itself (or maybe the block layer) to support bad block relocation and all that good stuff, or you need your apps designed to do that job by themselves.

It just doesn't make sense to me to do this at the FS level. You mention TCP -- ok, but if TCP is doing its job, I shouldn't also need to implement checksums and other robustness at the protocol layer (http, ftp, ssh), should I? Because in this analogy, it looks like TCP is the "block layer" and a protocol is the "fs".

As I understand it, TCP only lets the protocol/application know when something's seriously FUBARed and it has to drop the connection. Similarly, the FS (and the apps) shouldn't have to know about hardware problems until it really can't do anything about it anymore, at which point the right thing to do is for the FS and apps to go "oh shit" and drop what they're doing, and the admin replaces hardware and restores from backup. Or brings a backup server online, or...

I guess my main point was that _undetected_ problems are serious, but if you can detect them, and you have at least a bit of redundancy, you should be good. For instance, if your RAID reports errors that it can't fix, you bring that server down and let the backup server run.
