Re: [RFC] NUMA futex hashing

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tuesday 08 August 2006 17:11, Nick Piggin wrote:
> Ulrich Drepper wrote:
> > On 8/8/06, Eric Dumazet <[email protected]> wrote:
> >> The validity of the virtual address is still tested by normal get_user()
> >> call.. If the memory was freed by a thread, then a normal EFAULT error
> >> will
> >> be reported... eventually.
> >
> > This is indeed what should be done.  Private futexes are the by far
> > more frequent case and I bet you'd see improvements when avoiding the
> > mm mutex even for normal machines since futexes really are everywhere.
> > For shared mutexes you end up doing two lookups and that's fine IMO
> > as long as the first lookup is fast.
>
> The private futex's namespace is its virtual address, so I don't see
> how you can decouple that from the management of virtual addresses.
>
> Let me get this straight: to insert a contended futex into your rbtree,
> you need to hold the mmap sem to ensure that address remains valid,
> then you need to take a lock which protects your rbtree. Then to wake
> up a process and remove the futex, you need to take the rbtree lock. Or
> to unmap any memory you also need to take the rbtree lock and ensure
> there are no futexes there.
>
> So you just add another lock for no reason, or have I got a few screws
> loose myself? I don't see how you can significantly reduce lock
> cacheline bouncing in a futex heavy workload if you're just going to
> add another shared data structure. But if you can, sweet ;)

We certainly can. But if you insist of using mmap sem at all, then we have a 
problem.

rbtree would not reduce cacheline bouncing, so :

We could use a hashtable (allocated on demand) of size N, N depending on 
NR_CPUS for example. each chain protected by a private spinlock. If N is well 
chosen, we might reduce lock cacheline bouncing. (different threads fighting 
on different private futexes would have a good chance to get different 
cachelines in this hashtable)

As soon a process enters 'private futex' code, the futex code allocates this 
hashtable if the process has a NULL hash table (set to NULL at exec() time, 
or maybe re-allocated because we want to be sure futex syscall always suceed 
(no ENOMEM))

So we really can... but for 'private futexes' which are the vast majority of 
futexes needed by typical program (using POSIX pshared thread mutex attribute 
PTHREAD_PROCESS_PRIVATE, currently not used by NPTL glibc)

Of course we would need a new syscall, and to change glibc to be able to 
actually use this new private_futex syscall.

Probably a lot of work, still, but could help heavy threaded programs not 
touching mmap_sem.

We might have a refcounting problem on this 'hashtable' since several threads 
share this structure, but only at thread creation/destruction, not in futex 
call (ie no cacheline bouncing on the refcount)

Eric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux