Re: Linux does not care for data integrity

Matthias Andree wrote:

On Wed, 01 Jun 2005, Bill Davidsen wrote:

It's a matter of enforcing write order. To what extent such ordering
constraints are propagated by file systems and the VFS layer, down to
the hardware, is the grand question.

The problem is that many of the options required to make that happen in the o/s, hardware, and application are going to kill performance. And even if you can control the order of writes, unless you can also control when the data reaches the final non-volatile media, you can get a sane database but still lose transactions.

If there were a way for the o/s to know when a physical write was done other than using flushes to force completion, then overall performance could be higher, although individual transactions might have greater latency. And the app could use fsync to force the order of writes as needed. In many cases groups of writes can be done in any order as long as they are all done before the next logical step takes place.
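
As a rough illustration of that last point, here is a minimal user-space sketch in Python. The writes within one batch are left unordered, and fsync is used only at the point where the next logical step (a commit marker) depends on them. The names `write_all`, `commit_batch`, the `COMMIT` marker, and the file name are hypothetical, chosen for this example. Whether `os.fsync()` really means "on the platter" still depends on the file system and the drive's write cache, which is the open question in this thread.

```python
import os
import tempfile


def write_all(fd, data):
    """Write every byte of data, looping in case os.write() is partial."""
    view = memoryview(data)
    while view:
        written = os.write(fd, view)
        view = view[written:]


def commit_batch(path, records, commit_marker=b"COMMIT\n"):
    """Append a batch of records, then a commit marker (hypothetical format).

    The records may reach the disk in any order relative to each other;
    the only ordering requirement is that all of them are flushed before
    the marker is written.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        for rec in records:
            write_all(fd, rec)      # no ordering needed inside the batch
        os.fsync(fd)                # ordering point: batch before marker
        write_all(fd, commit_marker)
        os.fsync(fd)                # marker flushed before we report success
    finally:
        os.close(fd)


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "journal.log")
    commit_batch(demo_path, [b"debit 10\n", b"credit 10\n"])
    with open(demo_path, "rb") as f:
        print(f.read())
```

The cost being complained about is visible here: each fsync stalls the caller until the flush returns, even though the application only needs to know that the batch precedes the marker.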

I have a déjà-vu, and I do believe that this discussion has taken place
in this list before, perhaps with a slightly different alignment, and
likely in the context of mail transfer agents and perhaps synchronous
directory (data) updates (file creation and such). Exposing a bit of the
queueing to the user space through new syscalls may be an interesting
experiment, although I do not have the resources to provide code.
Something like fsync() that doesn't flush the whole file system (which
appears to be the most common implementation) but tracks what is needed,
and that returns when data for a given file is on disk.

What I had in mind was not a "push" to flush anything anywhere, but rather a watch. As a hypothetical, I open a file and every time a write() is done a counter is incremented in the fd. That's the easy part. Then every time a physical write is completed the count is reduced. To allow for write combining, the count could be in bytes rather than in syscalls and physical operations. The decrement is the hard part; I don't think the hardware is telling. In addition, writes belonging to several fds may obviously be combined into one physical operation. But if that could be done, then fsync becomes "wait until my buffered byte count drops to zero," which could be an ioctl. Just having such a checkpoint would address some of the data coherency issues.
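
To make the bookkeeping concrete, here is a small user-space simulation in Python. It is only a sketch of the counting idea, not a kernel interface: the class `PendingWriteCounter` and its methods are hypothetical names, and the "physical write completed" notification is faked with a timer, since getting that signal from real hardware is exactly the hard part described above.

```python
import threading


class PendingWriteCounter:
    """Per-descriptor count of bytes written but not yet on stable media."""

    def __init__(self):
        self._pending = 0
        self._cond = threading.Condition()

    @property
    def pending(self):
        with self._cond:
            return self._pending

    def note_write(self, nbytes):
        """Called for each write(): the easy part."""
        with self._cond:
            self._pending += nbytes

    def note_completion(self, nbytes):
        """Called when nbytes are reported physically written (simulated)."""
        with self._cond:
            if nbytes > self._pending:
                raise ValueError("completed more bytes than were pending")
            self._pending -= nbytes
            if self._pending == 0:
                self._cond.notify_all()

    def wait_until_drained(self, timeout=None):
        """The 'watch': block until the count is zero, forcing nothing out.

        Returns True once drained, False if the timeout expired first.
        """
        with self._cond:
            return self._cond.wait_for(lambda: self._pending == 0, timeout)


if __name__ == "__main__":
    counter = PendingWriteCounter()
    counter.note_write(4096)
    counter.note_write(512)
    # Pretend the device reports one combined 4608-byte write after 50 ms.
    device = threading.Timer(0.05, counter.note_completion, args=(4608,))
    device.start()
    print("drained:", counter.wait_until_drained(timeout=2.0))
    device.join()
```

Counting in bytes is what lets a single combined physical write retire two write() calls at once. The sketch does not attempt the case where one physical write covers data from several fds, which would need the completion to be split across their counters.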

AFAIK this isn't possible with common ATA devices, and it clearly doesn't address every desirable feature. In spite of that, if someone better qualified to assess the problems and benefits cares to comment, fine. If not, at least I think I explained what I was thinking more clearly.

This would change the meaning of fsync from "force out the data" to "wait for the data to be written" in some implementations.

Naming suggestion: flazysync()



--
bill davidsen <[email protected]>
 CTO TMR Associates, Inc
 Doing interesting things with small computers since 1979
