Re: smp race fix between invalidate_inode_pages* and do_no_page

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, 1 Apr 2006, Andrew Morton wrote:
> Andrea Arcangeli <[email protected]> wrote:
> > 
> > The new PG_truncate bitflag can be used as well to eliminate the
> > truncate_count from all vmas etc... that would save substantial memory
> > and remove some complexity, truncate_count grown a lot since the time we
> > introduced it.

truncate_count is playing a useful role with the vma tree, allowing
us to find our place again after allowing preemption in.  Though I'm
sometimes wondering whether just to make the vma _bigger_ and add a
list instead of that truncate_count: construct a list of vmas from
the prio_tree for unmap_mapping_range, so it can keep place in that,
instead of having to restart the tree whenever preempted.  But I
think NFS doesn't hold i_sem across unmapping, so it wouldn't work.

> > The PG_truncate is needed as well because we can't know in do_no_page if
> > page->mapping is legitimate null or not (think bttv and other device
> > drivers returning page->mapping null because they're private but not
> > reserved pages etc..)

That part is easily dealt with, as Nick and I have suggested,
just by marking vmas liable to having their pages truncated.
But that's certainly no more than one part of a larger solution.

> The patch will pretty clearly fix that.  But it really would be better to
> place all the cost over on the invalidate side, rather than adding overhead
> to the pagefault path.

I see ~2% slowdown in relevant faulting tests e.g. the lmbench ones.
I'd certainly prefer not to be locking pages here: or not without
word from the high-scalability people that Andrea's patch really in
practice doesn't hurt them (but the ~2% I see is on UP as much as MP).

> Could we perhaps check the page_count(page) and/or page_mapped(page) in
> invalidate_complete_page(), when we have tree_lock?

I'm thinking on it, but not finding it easy.  And I've just realized
that invalidate_inode_pages2 isn't the only problematic case here
(though agreed the more urgent one): MADV_REMOVE likewise needs to
invalidate pages without adjusting i_size, so also cannot rely on
the truncate_count/i_size method.

> Do you have handy any test code which others can use to recreate this bug?

I'd be glad of that too.

Hugh
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux