Re: [rfc][patch] remove racy sync_page?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, 29 May 2006 19:34:09 +1000
Nick Piggin <[email protected]> wrote:

> I'm not completely sure whether this is the bug or not,

"the bug".  Are we suposed to know what yo're referring to here?

> nor what would be
> the performance consequences of my attached fix (wrt the block layer). So
> you're probably cc'ed because I've found similar threads with your names
> on them.
> 

The performance risk is that someone will do lock_page() against a page
whose IO is queued-but-not-yet-kicked-off.  We'll go to sleep with no IO
submitted until kblockd or someone else kicks off the IO for us.

Try disabling kblockd completely, see what effect that has on performance.

> 
> lock_page (and other waiters on page flags bits) use sync_page when sleeping
> on a bit. sync_page, however, requires that the page's mapping be pinned
> (which is what we're sometimes trying to lock the page for).
> 
> Blatant offender is set_page_dirty_lock, which falls to the race it purports
> to fix. Perhaps all callers could be fixed, however it seems that the pinned
> mapping lock_page precondition is counter-intuitive and I'd bet other callers
> to lock_page or wait_on_page_bit have got it wrong too.
> 
> Also: splice can change a page's mapping, so it would have been possible to
> use the wrong mapping to "sync" a page.
> 
> Can we get rid of the whole thing, confusing memory barriers and all? Nobody
> uses anything but the default sync_page, and if block rq plugging is terribly
> bad for performance, perhaps it should be reworked anyway? It shouldn't be a
> correctness thing, right?

What this means is that it is not legal to run lock_page() against a
pagecache page if you don't have a ref on the inode.

iirc the main (only?) offender here is direct-io reads into MAP_SHARED
pagecache.  (And similar things, like infiniband and nfs-direct).

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux