Re: [rfc][patch] remove racy sync_page?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Andrew Morton wrote:

On Tue, 30 May 2006 12:54:53 +1000
Nick Piggin <[email protected]> wrote:


I guess so. Is plugging still needed now that the IO layer should
get larger requests? Disabling it might result in a small initial
request (although even that may be good for pipelining)...


Mysterious question, that.  A few years ago I think Jens tried pulling unplugging
out, but some devices still want it (magneto-optical storage iirc).  And I
think we did try removing it, and it caused hurt.


Would be nice to get rid of it. I guess that's out of the question
for several releases, even if there was no performance problems.


Otherwise, we could make set_page_dirty_lock use a weird non-unplugging
variant


Yes, that would work.  In fact the number of times when direct-io actually
calls set_page_dirty_lock() is infinitesimal - I had to jump through hoops
to even test that code.  The speculative
set-the-page-dirty-before-we-do-the-IO thing is very effective.  So the
performace impact of making such a change would be nil.


OK, that'll probably be the way to go for upcoming releases. I'll post
a patch soon.


That's for the direct-io case.  Other cases might be hurt more.

Also, perhaps we could poke kblockd to do it for us.


Hard, I think, to get back to ->mapping. Well not hard if we just use
a non-syncing lock_page, but in that case we don't need kblockd.


sync_page wants to get either the current mapping, or a NULL one.
The sync_page methods must then be able to handle running into a
NULL mapping.

With splice, the mapping can change, so you can have the wrong
sync_page callback run against the page.


Oh.


Don't know what we can do about that off the top of my head. Within
block_sync_page there shouldn't be any problems (at worst, we'd unplug
the wrong dev). Maybe the whole sync_page callback scheme can be removed
so nobody tries to do anything funny with it? Call blk_run_backing_dev
directly.


Well yes, writing to a page would be the main reason to set it dirty.
Is splice broken as well? I'm not sure that it always has a ref on the
inode when stealing a page.


Whereabouts?


The ->pin() calls in pipe_to_file and pipe_to_sendpage?


One for Jens...


It sounds like you think fixing the set_page_dirty_lock callers wouldn't
be too difficult? I wouldn't know (although the ptrace one should be
able to be turned into a set_page_dirty, because we're holding mmap_sem).


No, I think it's damn impossible ;)

get_user_pages() has gotten us a random pagecache page.  How do we
non-racily get at the address_space prior to locking that page?

I don't think we can.


But the vma isn't going to disappear because mmap_sem is held; and the
vma should hold a ref on the inode I think?


That's true during the get_user_pages() call.  Be we run
set_page_dirty_lock() much later, after IO completion.


Oh, I thought you specifically meant the ptrace case. Yes, in
general it is much harder.

Anyway, it is possible that most of the problems could be solved by locking
the page at the time of lookup, and unlocking it on completion/dirtying...
it's just that that would be a bit of a task.


But lock_page() requires access to the address_space.  To kick the IO so we
don't wait for ever.


It shouldn't wait for ever, because the unplug timer will go off
and kblockd will do it eventually. And I was imagining that it would
have a pin on the address space at the point it is looked up... But
reworking all callers is just a pipe dream anyway, so nevermind ;)

Where we have synchronous IO, we do want the queue to be unplugged,
however in most set_page_dirty_lock case this is normally not the
case and a locked page should be rare. So hmm yes that would make
sense to have it use a special lock_page...


Can we somehow add BUG_ONs to
lock_page to ensure we've got an inode ref?


WARN_ONs.


But is it practical? Or I suspect the warning would only ever trigger
in the really rare racy cases anyway, when dentry, inode, etc have
been reclaimed.

Anyway, I'll come up with a less intrusive patch shortly...

--

Send instant messages to your online friends http://au.messenger.yahoo.com -
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux