Re: [patch 1/4] mm: page refcount use atomic primitives

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Andrew Morton wrote:
Nick Piggin <[email protected]> wrote:

The VM has an interesting race where a page refcount can drop to zero, but
it is still on the LRU lists for a short time. This was solved by testing
a 0->1 refcount transition when picking up pages from the LRU, and dropping
the refcount in that case.


Tell me about it...


Instead, use atomic_inc_not_zero to ensure we never pick up a 0 refcount
page from the LRU (ie. we guarantee the page will not be touched).


atomic_inc_not_zero() looks rather bloaty, but a single call site is OK.


A little. On generic cmpxchg/cas architectures it isn't too bad, and
LL/SC architectures presently implement it fairly stupidly with cmpxchg
but they can do a much better job using ll/sc directly.


This ensures we can test PageLRU without taking the lru_lock,


Let me write some changelog for you.

isolate_lru_pages() can remove live pages from the LRU at any time and
shrink_cache() can put them back at any time.  As we don't hold the
zone->lock we can race against that.


void fastcall __page_cache_release(struct page *page)
{
	if (PageLRU(page)) {
		unsigned long flags;


isolate_lru_pages() removes the page here.


		struct zone *zone = page_zone(page);
		spin_lock_irqsave(&zone->lru_lock, flags);
		if (!TestClearPageLRU(page))
			BUG();


blam.


		del_page_from_lru(zone, page);
		spin_unlock_irqrestore(&zone->lru_lock, flags);
	}

	BUG_ON(page_count(page) != 0);
	free_hot_page(page);
}



But put_page() wouldn't have entered __page_cache_release() at all, because
isolate_lru_page() is changed by this patch to elevated the page refcount
prior to clearing PG_lru:

		BUG_ON(!PageLRU(page));
		list_del(&page->lru);
		target = src;
		if (get_page_unless_zero(page)) {
			ClearPageLRU(page);


So no blam.

That's from a two-minute-peek.  I haven't thought about this dreadfully
hard.  But I'd like to gain some confidence that you have, please.  This
stuff is tricky.


Right, no blam. This is how I avoid taking the LRU lock.

 "Instead, use atomic_inc_not_zero to ensure we never **pick up a 0 refcount**
  page from the LRU (ie. we guarantee the page will not be touched)."

I don't understand what you're asking?


and allows
further optimisations (in later patches) -- we end up saving 2 atomic ops
including a spin_lock_irqsave in the !PageLRU case, and 2 or 3 atomic ops
in the PageLRU case.


Well yeah, but you've pretty much eliminated all those nice speedups by
adding several BUG_ON(atomic_op)s.  Everyone compiles with CONFIG_BUG.  So
I'd suggest that such new assertions be broken out into a separate -mm-only
patch.


Not quite eliminated because ClearPageXXX is an atomic RMW, __ClearPageXXX
is not. Also TestClearXXX includes memory barriers.

Anyway I wanted to keep it equivalent (ie. keep bugs there). I have a future
patch to move these assertions under CONFIG_DEBUG_VM.

--
SUSE Labs, Novell Inc.

Send instant messages to your online friends http://au.messenger.yahoo.com -
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux