Re: 2.6.19 file content corruption on ext3

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Andrew Morton wrote:
On Tue, 19 Dec 2006 20:56:50 +1100
Nick Piggin <[email protected]> wrote:


Linus Torvalds wrote:


NOTICE? First you make a BIG DEAL about how dirty bits should never get lost, but THE VERY SAME FUNCTION actually very much on purpose DOES drop the dirty bit for when it's not in the page tables.

try_to_free_buffers is quite a special case, where we're transferring
the page dirty metadata from the buffers to the page. I think Andrew
would have a better grasp of it so he could correct me, but what it
does is legitimate.


Well it used to be.  After 2.6.19 it can do the wrong thing for mapped
pages.

Yes, that is what I was trying to get at.

 But it turns out that we don't feed it mapped pages, apart from
pagevec_strip() and possibly races against pagefaults.

True, and I think we have pretty well established that this isn't the
cause of Andrei's problem, but I think we all agree it is *a* bug?

And surely Andrei's data corruption will be of the same flavour in
that test_clear_page_dirty somewhere is now stripping pte dirty bits
where it shouldn't? (because it went away after Peter nooped that
behaviour)

I think it could be very likely that indeed the bug is a latent one in
a clear_page_dirty caller, rather than dirty-tracking itself.


The only callers are try_to_free_buffers(), truncate and a few scruffy
possibly-wrong-for-fsync filesytems which aren't being used here.


<spots a race in do_no_page()>

If a write-fault races with a read-fault and the write-fault loses, we forget
to mark the page dirty.

Hmm.. in that case will the pte still be readonly, and thus the write
faulter will have to try again I think?


Something like this, but it's probably wrong - I didn't try very hard (am
feeling ill, and vaguely grumpy)


From: Andrew Morton <[email protected]>

Signed-off-by: Andrew Morton <[email protected]>
---

 mm/memory.c |   12 ++++++++++++
 1 file changed, 12 insertions(+)

diff -puN mm/memory.c~a mm/memory.c
--- a/mm/memory.c~a
+++ a/mm/memory.c
@@ -2264,10 +2264,22 @@ retry:
 		}
 	} else {
 		/* One of our sibling threads was faster, back out. */
+		if (write_access) {
+			/*
+			 * We might have raced against a read-fault.  We still
+			 * need to dirty the page.
+			 */
+			dirty_page = vm_normal_page(vma, address, *page_table);
+			if (dirty_page) {
+				get_page(dirty_page);
+				goto dirty_it;
+			}
+		}
 		page_cache_release(new_page);
 		goto unlock;
 	}
+dirty_it:
 	/* no need to invalidate: a not-present page shouldn't be cached */
 	update_mmu_cache(vma, address, entry);
 	lazy_mmu_prot_update(entry);
_




--
SUSE Labs, Novell Inc.
Send instant messages to your online friends http://au.messenger.yahoo.com -
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux