David Chinner wrote:
On Thu, Jan 25, 2007 at 12:43:23AM +1100, Nick Piggin wrote:
And why not just leave it in the pagecache and be done with it?
because what is in cache is then not coherent with what is on disk,
and a direct read is supposed to read the data that is present
in the file at the time it is issued.
So after a writeout it will be coherent of course, so the point in
question is what happens when someone comes in and dirties it at the
worst possible moment? That relates to the paragraph below...
All you need is to do a writeout before a direct IO read, which is
what generic dio code does.
No, that's not good enough - after writeout but before the
direct I/O read is issued a process can fault the page and dirty
it. If you do a direct read, followed by a buffered read you should
get the same data. The only way to guarantee this is to chuck out
any cached pages across the range of the direct I/O so they are
fetched again from disk on the next buffered I/O. i.e. coherent
at the time the direct I/O is issued.
... so surely if you do a direct read followed by a buffered read,
you should *not* get the same data if there has been some activity
to modify that part of the file in the meantime (whether that be a
buffered or direct write).
but in that case you'll either have to live with some racyness
(which is what the generic code does), or have a higher level
synchronisation to prevent buffered + direct IO writes I suppose?
The XFS inode iolock - direct I/O writes take it shared, buffered
writes takes it exclusive - so you can't do both at once. Buffered
reads take is shared, which is another reason why we need to purge
the cache on direct I/O writes - they can operate concurrently
(and coherently) with buffered reads.
Ah, I'm glad to see somebody cares about doing the right thing ;)
Maybe I'll use XFS for my filesystems in future.
--
SUSE Labs, Novell Inc.
Send instant messages to your online friends http://au.messenger.yahoo.com
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]