Re: 2.6.19 file content corruption on ext3

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On Wed, 20 Dec 2006, Peter Zijlstra wrote:

> On Tue, 2006-12-19 at 14:58 -0800, Andrew Morton wrote:
> 
> > Well... we'd need to see (corruption && this-not-triggering) to be sure.
> > 
> > Peter, have you been able to trigger the corruption?
> 
> Yes; however the mail I send describing that seems to be lost in space.

Btw, can somebody actually explain the mess that is ext3 "dirtying".

Ext3 does NOT use __set_page_dirty_buffers. It does

	static int ext3_journalled_set_page_dirty(struct page *page)
	{
	        SetPageChecked(page);
	        return __set_page_dirty_nobuffers(page);
	}

and uses that "Checked" bit as a "whole page is dirty" bit (which it tests 
in "writepage()".

You realize what this all means? It means that ANYTHING that actually 
clears the _real_ dirty bit won't actually be doing anything at all for 
ext3, because the Checked bit will still stay set, and any IO down the 
line on that page would totally ignore the dirty bits on the buffer heads 
and just write out everything.

That is "The Mess(tm)".

It also basically means that anything that clears the dirty bit without 
just calling "writepage()" had _better_ call "invalidatepage()" for the 
whole page, because otherwise the PageChecked bit will never be cleared as 
far as I can see. Happily, at least ext3 seems to _test_ for that case in 
the release_page() function, so it appears that we do do this.

But this seems to just strengthen my argument: you can NEVER clean a page, 
unless you (a) do IO on it immediately afterwards (writeback) or (b) 
invalidate it entirely (truncate).

I'd really like to see just those two functions exist. Preferably in a 
form where you can see easily that we actually follow those rules. Rather 
than having a confusing set of "clear_page_dirty()" and
"test_and_clear_page_dirty()" functions that are called from random 
places.

IOW, I think the "clear_page_dirty_for_io()" is fine (it's case (a)) 
above, and then we should probably have a "cancel_dirty_page()" function 
that does all the current clear_page_dirty() but also makes sure that we 
actually call the invalidate_page() function itself. 

Hmm?

			Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux