Re: [RFC] non-refcounted pages, application to slab?

Nick Piggin wrote:
On Wed, Jan 25, 2006 at 11:26:01AM +0100, Eric Dumazet wrote:
Nick Piggin wrote:
If an allocator knows exactly the lifetime of its page, then there is no
need to do refcounting or the final put_page_testzero (atomic op + memory
barriers).

This is probably not worthwhile for most cases, but slab did strike me
as a potential candidate (however the complication here is that some
code I think uses the refcount of underlying pages of slab allocations
eg nommu code). So it is not a complete patch, but I wonder if anyone
thinks the savings might be worth the complexity?

Is there any particular code that is really heavy on slab allocations,
and that isn't mostly handled by the slab's internal freelists?
Hi Nick

After reading your patch, I have some crazy idea.

The atomic op + memory barrier you want to avoid could be avoided more generally just by changing atomic_dec_and_test(atomic_t *v).

If the current thread is the last referrer (refcnt == 1), then it can safely set the value to 0, because no other CPU can be touching the value. (If another CPU could still touch it just after us, there would be a bug somewhere: we could end up freeing an object still in use by that other CPU.)


I think that would work for this case, but you change the semantics
of the function for all users which is bad.

Yes :) I did a test with a patched kernel and got:

BUG: atomic counter underflow at:
 <c0103a3a> show_trace+0x20/0x22   <c0103b5b> dump_stack+0x1e/0x20
 <c01d6934> _atomic_dec_and_lock+0x78/0x88   <c0177599> dput+0xbf/0x187
 <c016dc96> path_release+0x14/0x30   <c016e540> __link_path_walk+0x36d/0xd5f
 <c016ef84> link_path_walk+0x52/0xd6   <c016f2ec> do_path_lookup+0xfc/0x220
<c016f467> __path_lookup_intent_open+0x3e/0x73 <c016f4d1> path_lookup_open+0x35/0x37
 <c016fc79> open_namei+0x83/0x631   <c015f811> do_filp_open+0x38/0x56
 <c015fb83> do_sys_open+0x5c/0x99   <c015fbe7> sys_open+0x27/0x29
 <c0102bb3> sysenter_past_esp+0x54/0x75


So we cannot change atomic_dec_and_test(atomic_t *v), but we can introduce a new function like:

int atomic_dec_refcount(atomic_t *v)
{
#ifdef CONFIG_SMP
	/* avoid an atomic op if we are the last user of this refcount */
	if (atomic_read(v) == 1) {
		atomic_set(v, 0); /* not a real atomic op on most machines */
		return 1;
	}
#endif
	return atomic_dec_and_test(v);
}

The cost of the extra conditional branch is worthwhile if it can avoid an atomic op.
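As a sanity check, the helper above can be exercised in userspace, with C11 atomics standing in for the kernel's atomic_t API (a sketch only; the kernel primitives and barriers differ, and the CONFIG_SMP guard is dropped here):

```c
#include <assert.h>
#include <stdatomic.h>

/* Userspace stand-in for the kernel's atomic_t. */
typedef atomic_int atomic_t;

static int atomic_dec_refcount(atomic_t *v)
{
	/* Fast path: we hold the last reference, so no other CPU
	 * may legally touch the count; a plain store suffices. */
	if (atomic_load_explicit(v, memory_order_relaxed) == 1) {
		atomic_store_explicit(v, 0, memory_order_relaxed);
		return 1;
	}
	/* Slow path: a real atomic decrement-and-test. */
	return atomic_fetch_sub(v, 1) == 1;
}
```

As in the kernel version, the function returns nonzero only when the count reaches zero, so callers can free the object on a true return exactly as with atomic_dec_and_test().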



Such a test could be open coded in __free_page, although that does
add a branch + some icache, but that might also be an option. (and
my patch does also add to total icache footprint and is much uglier ;))
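For illustration, open-coding the check in a free path might look like the sketch below; struct page, _count and free_pages_ok() here are hypothetical userspace stand-ins, not the real mm code:

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for struct page and free_pages_ok(),
 * just to show the shape of the open-coded fast path. */
struct page { atomic_int _count; };

static int pages_freed;
static void free_pages_ok(struct page *page) { pages_freed++; }

static void my_free_page(struct page *page)
{
	/* Open-coded fast path: the last holder stores 0 directly
	 * and skips the atomic read-modify-write. */
	if (atomic_load_explicit(&page->_count, memory_order_relaxed) == 1)
		atomic_store_explicit(&page->_count, 0, memory_order_relaxed);
	else if (atomic_fetch_sub(&page->_count, 1) != 1)
		return; /* other references remain */
	free_pages_ok(page);
}
```

This is the branch + icache cost mentioned above: one extra load and compare on every free, paid only to skip the locked decrement when the count is already 1.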

Thanks,
Nick



