Rogério Brito wrote:
> On Jul 01 2005, Jens Axboe wrote:
>> On Fri, Jul 01 2005, David Masover wrote:
>>> Not always possible. Some disks lie and leave caching on anyway.
>> And the same (and others) disks will not honor a flush anyways.
>> Moral of that story - avoid bad hardware.
> But how does the end-user know what hardware is "good hardware"? Which
> vendors don't lie (or, at least, lie less than others) regarding HDs?
> Thanks, Rogério Brito.
The only real way is to test the drive (and retest when you get a new
version of firmware) and the whole fsync -> write barrier code path.
We use a bus analyzer to make sure that when you fsync() a file, you
will see a cache flush command coming across the bus. Of course, that is
the easy step ;-)
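
For anyone who wants to reproduce the first step, here is a minimal sketch of
the userspace side: write a file and return only after fsync() has returned.
The function name write_durably is hypothetical, and this is not our test
code. Whether a cache flush command actually reaches the drive depends on the
kernel, the file system, and the drive, which is exactly what the bus analyzer
is there to check.

```python
import os


def write_durably(path: str, data: bytes) -> None:
    """Write data to path and return only after fsync() has returned.

    Hypothetical helper for illustration. fsync() asks the kernel to push
    the file's data to the storage device; it cannot prove that the drive
    really emptied its write cache.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write() may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        # This is the call whose effect you would watch for on the bus.
        os.fsync(fd)
    finally:
        os.close(fd)
```

Calling write_durably("/mnt/test/file", b"...") in a loop while watching the
bus gives you one flush to look for per call (the path is only an example).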
The second step is to test your system across power failures. We have a
"wbtest" program that we have used to catch bugs. The basic idea is to
write a file to a disk with the cache turned off, write the same file to
the disk with the write barrier (and working cache flush command) and
then randomly drop power to the box. It is important to really drop
power to the whole box since a "reset button" push often does not drop
power to the drives and will give you false passes.
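
A rough sketch of the writer half of such a test follows. It is not wbtest
itself; the record layout (8-byte sequence number, CRC32, fixed payload, 4096
bytes in total) and all names are assumptions made for illustration. Each
record is appended and fsync()ed first on the reference disk (write cache off)
and then on the disk under test, so the two files can be compared after power
is cut.

```python
import os
import struct
import zlib

# Hypothetical record layout: sequence number + CRC32 of the payload.
HEADER = struct.Struct("<QI")
RECORD_SIZE = 4096
PAYLOAD_SIZE = RECORD_SIZE - HEADER.size


def make_record(seq: int) -> bytes:
    """Build one self-checking record for the given sequence number."""
    payload = bytes([seq % 256]) * PAYLOAD_SIZE
    return HEADER.pack(seq, zlib.crc32(payload)) + payload


def append_and_flush(fd: int, record: bytes) -> None:
    """Append a record and return only after fsync() has returned."""
    view = memoryview(record)
    while view:
        view = view[os.write(fd, view):]
    os.fsync(fd)


def run_writer(reference_path: str, test_path: str, count: int) -> None:
    """Write the same records to both files, reference copy first."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_APPEND
    ref_fd = os.open(reference_path, flags, 0o644)
    test_fd = os.open(test_path, flags, 0o644)
    try:
        for seq in range(count):
            record = make_record(seq)
            append_and_flush(ref_fd, record)   # disk with write cache off
            append_and_flush(test_fd, record)  # disk relying on barriers
    finally:
        os.close(ref_fd)
        os.close(test_fd)
```

In a real run, count would be large enough that the loop is still going when
power is dropped at a random moment.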
Our wbtest used to be good at finding holes in the write barrier code
using 2.4 kernels and PATA drives, but we have had no luck yet in
catching known bugs with this test on 2.6 with S-ATA drives.
Ideas on how to get a more effective test are welcome - it is a very
small window that you need to hit to catch a misbehaving drive (i.e.,
your write cache flush command has returned, you want to drop power and
on reboot, validate that the platter contains that last IO correctly).
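
The check on reboot could look something like the sketch below. It assumes a
hypothetical layout of 4096-byte records, each starting with an 8-byte
sequence number and a CRC32 of the payload, and it assumes that every record
was flushed to the reference copy before being written to the drive under
test. Under those assumptions, the drive under test may be at most one record
behind the reference copy; anything more means an acknowledged flush was lost.

```python
import struct
import zlib

# Hypothetical record layout: sequence number + CRC32 of the payload.
HEADER = struct.Struct("<QI")
RECORD_SIZE = 4096


def count_valid_records(data: bytes) -> int:
    """Return how many intact, in-order records lead the buffer."""
    count = 0
    for offset in range(0, len(data) - RECORD_SIZE + 1, RECORD_SIZE):
        seq, crc = HEADER.unpack_from(data, offset)
        payload = data[offset + HEADER.size:offset + RECORD_SIZE]
        if seq != count or zlib.crc32(payload) != crc:
            break  # torn, stale, or out-of-order record
        count += 1
    return count


def flushes_were_honored(reference: bytes, under_test: bytes) -> bool:
    """Compare the reference copy with the copy from the drive under test.

    If the reference holds N intact records, the flush of record N-2 on
    the drive under test had already returned, so at least N-1 records
    must have survived there.
    """
    acknowledged = count_valid_records(reference)
    survived = count_valid_records(under_test)
    return survived >= acknowledged - 1
```

A pass from this check only means the window was not hit on that run, so it
has to be repeated over many power cycles.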
If you had enough NVRAM in a test system, you might be able to
substitute an NVRAM-backed file system for the write-cache disabled drive
and get closer to catching the window.
The alternative is to either run with the write cache disabled (again,
you will need to validate that the drive really disabled the cache) or
to buy a mid-range or better storage array that provides a non-volatile
(battery backed) write cache.