On Wed, 2007-01-24 at 09:37 +1100, David Chinner wrote:
> With the recent changes to cancel_dirty_pages(), XFS will
> dump warnings in the syslog because it can truncate_inode_pages()
> on dirty mapped pages.
>
> I've determined that this is indeed correct behaviour for XFS
> as this can happen in the case of races on mmap()d files with
> direct I/O. In this case when we do a direct I/O read, we
> flush the dirty pages to disk, then truncate them out of the
> page cache. Unfortunately, between the flush and the truncate
> the mmap could dirty the page again. At this point we toss a
> dirty page that is mapped.
This sounds iffy, why not just leave the page in the pagecache if it's
mapped anyway?
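To make the ordering concrete, here is a toy userspace model of the race as I read it from the description above. It is a sketch only, not kernel code: `struct toy_page` and every `toy_*` function are hypothetical names made up for illustration, and the "disk" is just an integer field.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model of a page cache page; NOT kernel code. */
struct toy_page {
    bool in_cache;  /* page is present in the page cache */
    bool mapped;    /* page is mapped into a process via mmap() */
    bool dirty;     /* page holds data not yet written to disk */
    int  mem_data;  /* contents in memory */
    int  disk_data; /* contents on disk */
};

/* Write a dirty page back to "disk" and mark it clean. */
static void toy_flush(struct toy_page *p)
{
    if (p->dirty) {
        p->disk_data = p->mem_data;
        p->dirty = false;
    }
}

/* A store through the mapping redirties the page. */
static void toy_mmap_store(struct toy_page *p, int value)
{
    assert(p->in_cache && p->mapped);
    p->mem_data = value;
    p->dirty = true;
}

/* Drop the page from the cache; returns true if dirty data was discarded. */
static bool toy_truncate(struct toy_page *p)
{
    bool lost = p->dirty;

    p->in_cache = false;
    p->mapped = false;
    p->dirty = false;
    return lost;
}

/* Replay the racy ordering: flush, mmap store, truncate. */
static bool toy_race_loses_data(void)
{
    struct toy_page p = { true, true, true, 1, 0 };

    toy_flush(&p);          /* disk now holds the old value */
    toy_mmap_store(&p, 2);  /* page redirtied between flush and truncate */
    return toy_truncate(&p) && p.disk_data != 2;
}
```

In this model `toy_race_loses_data()` reports that the second store never reaches the disk, which is the case the warning is flagging.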
> None of the existing functions for truncating pages or invalidating
> pages work in this situation. Invalidating a page only works for
> non-dirty pages with non-dirty buffers, and they only work for
> whole pages and XFS requires partial page truncation.
>
> On top of that the page invalidation functions don't actually
> call into the filesystem to invalidate the page and so the filesystem
> can't actually invalidate the page properly (e.g. do stuff based on
> private buffer head flags).
Have you seen the new launder_page() a_op? It is called from
invalidate_inode_pages2_range().
> So that leaves us needing to use truncate semantics and the problem
> is that none of them unmap pages in a non-racy manner - if they
> unmap pages they do it separately to the truncate of the page,
> leading to races with mmap redirtying the page between the unmap and
> the truncate of the page.
Isn't there still a race where the page fault path doesn't yet lock the
page and can just reinsert it?
Nick's pagefault rework should rid us of this by always locking the page
in the fault path.
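As a rough illustration of why locking the page in the fault path closes that window, here is a toy userspace model. It is only a sketch under my own assumptions: `struct toy_page` and the `toy_*` helpers are hypothetical, the lock is a plain flag rather than a real page lock, and the "concurrent" fault is simulated by a call made in the middle of the truncate.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model of a lockable page; NOT kernel code. */
struct toy_page {
    bool locked;    /* stand-in for the page lock */
    bool mapped;    /* page is mapped into a process */
    bool dirty;     /* page holds unwritten data */
    bool in_cache;  /* page is present in the page cache */
};

static bool toy_trylock(struct toy_page *p)
{
    if (p->locked)
        return false;
    p->locked = true;
    return true;
}

static void toy_unlock(struct toy_page *p)
{
    p->locked = false;
}

/*
 * Fault path that takes the page lock before mapping and dirtying.
 * Returns false if it had to back off (real code would wait and retry).
 */
static bool toy_fault_and_dirty(struct toy_page *p)
{
    if (!toy_trylock(p))
        return false;
    if (!p->in_cache) {         /* page already truncated */
        toy_unlock(p);
        return false;
    }
    p->mapped = true;
    p->dirty = true;
    toy_unlock(p);
    return true;
}

/*
 * Truncate that unmaps and removes the page under a single lock hold.
 * Returns true if dirty data was discarded.
 */
static bool toy_truncate_under_lock(struct toy_page *p)
{
    bool got = toy_trylock(p);
    bool lost;

    assert(got);
    (void)got;
    p->mapped = false;              /* unmap while the page is locked */
    (void)toy_fault_and_dirty(p);   /* simulated racing fault: backs off */
    lost = p->dirty;
    p->dirty = false;
    p->in_cache = false;            /* remove from the page cache */
    toy_unlock(p);
    return lost;
}
```

Starting from a clean, mapped page, the simulated fault cannot get in between the unmap and the removal, so nothing dirty is thrown away in this model. Once the lock is dropped, the fault sees the page is gone and would have to bring in a fresh one.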
> Hence we need a truncate function that unmaps the pages while they
> are locked for truncate in a similar fashion to
> invalidate_inode_pages2_range(). The following patch (unchanged from
> the last time it was sent) does this. The XFS changes are in a
> second patch.
>
> The patch has been tested on ia64 and x86-64 via XFSQA and a lot
> of fsx mixing mmap and direct I/O operations.
>
> Signed-off-by: Dave Chinner <[email protected]>