Re: 2.6.19 file content corruption on ext3

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 2006-12-19 at 21:58 +1100, Nick Piggin wrote:
> Peter Zijlstra wrote:
> > On Tue, 2006-12-19 at 02:32 -0800, Andrew Morton wrote:
> 
> >>Well it used to be.  After 2.6.19 it can do the wrong thing for mapped
> >>pages.  But it turns out that we don't feed it mapped pages, apart from
> >>pagevec_strip() and possibly races against pagefaults.
> > 
> > 
> > So how about this:
> 
> Well that's still racy. Anyway several earlier patches (including
> the one I posted) closed this race. Some were still reported to
> trigger corruption IIRC.

I can't remember a patch that removes mapped pages from this code path,
however I could have missed it. All out removing the mapping branch in
ttfb() did also fix the problem - which is a superset of page_mapped().

I'm now building a kernel with this patch, and will submit that to
rtorrent with mem=256M on a 1k ext3 filesystem on x86_64 smp preempt.

---
 fs/buffer.c |   32 +++++++++++++++++++++++++++++++-
 1 file changed, 31 insertions(+), 1 deletion(-)

Index: linux-2.6/fs/buffer.c
===================================================================
--- linux-2.6.orig/fs/buffer.c
+++ linux-2.6/fs/buffer.c
@@ -2798,11 +2798,38 @@ static inline int buffer_busy(struct buf
 		(bh->b_state & ((1 << BH_Dirty) | (1 << BH_Lock)));
 }
 
+/*
+ * AKPM sayeth:
+ *
+ * - a process does a one-byte-write to a file on a 64k pagesize, 4k
+ *   blocksize ext3 filesystem.  The page is now PageDirty, !PgeUptodate and
+ *   has one dirty buffer and 15 not uptodate buffers.
+ *
+ * - kjournald writes the dirty buffer.  The page is now PageDirty,
+ *   !PageUptodate and has a mix of clean and not uptodate buffers.
+ *
+ * - try_to_free_buffers() removes the page's buffers.  It MUST now clear
+ *   PageDirty.  If we were to leave the page dirty then we'd have a dirty, not
+ *   uptodate page with no buffer_heads.
+ *
+ *   We're screwed: we cannot write the page because we don't know which
+ *   sections of it contain garbage.  We cannot read the page because we don't
+ *   know which sections of it contain modified data.  We cannot free the page
+ *   because it is dirty.
+ *
+ * However for mapped pages this is not true; mapped pages will be fully
+ * loaded and thus cannot have not uptodate buffers.
+ *
+ * Hence allow the PG_dirty bit to stay for pages that had no not uptodate
+ * buffers (and assert that mapped pages never have those).
+ */
+
 static int
 drop_buffers(struct page *page, struct buffer_head **buffers_to_free)
 {
 	struct buffer_head *head = page_buffers(page);
 	struct buffer_head *bh;
+	int uptodate = 1;
 
 	bh = head;
 	do {
@@ -2818,11 +2845,14 @@ drop_buffers(struct page *page, struct b
 
 		if (!list_empty(&bh->b_assoc_buffers))
 			__remove_assoc_queue(bh);
+		if (!buffer_uptodate(bh))
+			uptodate = 0;
 		bh = next;
 	} while (bh != head);
 	*buffers_to_free = head;
 	__clear_page_buffers(page);
-	return 1;
+	VM_BUG_ON(page_mapped(page) && !uptodate);
+	return !uptodate;
 failed:
 	return 0;
 }


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux