Re: Fixes for NFS file truncation race condition(s)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thursday March 2, [email protected] wrote:
> Attached are two file, the first being a diff patch file, and the second 
> being a test program that can be used to invoke the problem and confirm 
> that it has been fixed after the patches are applied.
> 
...
> 
> kernel BUG at lib/radix-tree.c:372!

....


Hey, I've been looking at that bug too!  It a good one :-)

I think your patch addresses the symptom rather than the cause.  It
tries to notice that the race has been lost and avoid an oops.

The following patch - which I sent to Trond (nfs maintainer) for
passing on upstream (though much of the patch is from him in the first
place) fixes the cause, which is subtle.

NeilBrown

----------------------------------------

Subject: Nfs: Avoid races between writebacks and truncation
From: Trond Myklebust <[email protected]>
Patch-mainline: pending
References: 144058

Author: Trond Myklebust <[email protected]> (Modified by NeilBrown <[email protected]>)
Currently, there is no serialisation between NFS asynchronous writebacks
and truncation at the page level due to the fact that nfs_sync_inode()
cannot lock the pages that it is about to write out.

This means that it is possible to be flushing out data (and calling something
like set_page_writeback()) while the page cache is busy evicting the page.
Oops...

Use the hooks provided in try_to_release_page() to ensure that dirty pages
are not evictged before being written out to store.

Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Neil Brown <[email protected]>


Index: linux-2.6.15/fs/nfs/file.c
===================================================================
--- linux-2.6.15.orig/fs/nfs/file.c	2006-02-24 14:56:42.000000000 +1100
+++ linux-2.6.15/fs/nfs/file.c	2006-03-02 13:42:42.000000000 +1100
@@ -316,6 +316,23 @@
 	return status;
 }
 
+static int nfs_invalidate_page(struct page *page, unsigned long offset)
+{
+	BUG_ON(PagePrivate(page));
+	return 1;
+}
+
+static int nfs_release_page(struct page *page, gfp_t gfp)
+{
+	/* If this was called, then PagePrivate is set, so we have
+	 * pending write back, so the page cannot be released.
+	 * However we can clear the Uptodate flag so we get the desired
+	 * effect of the page being invalidated.
+	 */
+	ClearPageUptodate(page);
+	return 0;
+}
+
 struct address_space_operations nfs_file_aops = {
 	.readpage = nfs_readpage,
 	.readpages = nfs_readpages,
@@ -324,6 +341,8 @@
 	.writepages = nfs_writepages,
 	.prepare_write = nfs_prepare_write,
 	.commit_write = nfs_commit_write,
+	.invalidatepage = nfs_invalidate_page,
+	.releasepage = nfs_release_page,
 #ifdef CONFIG_NFS_DIRECTIO
 	.direct_IO = nfs_direct_IO,
 #endif
Index: linux-2.6.15/fs/nfs/pagelist.c
===================================================================
--- linux-2.6.15.orig/fs/nfs/pagelist.c	2006-02-24 14:56:42.000000000 +1100
+++ linux-2.6.15/fs/nfs/pagelist.c	2006-03-02 13:40:23.000000000 +1100
@@ -85,6 +85,7 @@
 	atomic_set(&req->wb_complete, 0);
 	req->wb_index	= page->index;
 	page_cache_get(page);
+	SetPagePrivate(page);
 	req->wb_offset  = offset;
 	req->wb_pgbase	= offset;
 	req->wb_bytes   = count;
@@ -147,8 +148,10 @@
  */
 void nfs_clear_request(struct nfs_page *req)
 {
-	if (req->wb_page) {
-		page_cache_release(req->wb_page);
+	struct page *page = req->wb_page;
+	if (page != NULL) {
+		ClearPagePrivate(page);
+		page_cache_release(page);
 		req->wb_page = NULL;
 	}
 }
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux