Re: [take25 1/6] kevent: Description.

Ulrich Drepper wrote:

> You create worker threads to handle the work for the entire program. Look at something like a web server. When creating several queues, how do you distribute all the connections to the different queues? To ensure every connection is handled as quickly as possible you stuff them all in the same queue and then have all threads use this one queue. Whenever an event is posted a thread is woken. _One_ thread. If two events are posted, two threads are woken. In this situation we have a few atomic ops at userlevel to make sure that the two threads don't pick the same event, but that's all there is wrt "fighting".

> The alternative is the sorry state we have now. In nscd, for instance, we have one single thread waiting for incoming connections and it then has to wake up a worker thread to handle the processing. This is done because we cannot "park" all threads in the accept() call, since when a new connection is announced _all_ the threads are woken. With the new event handling this wouldn't be the case: only one thread is woken and we don't have to wake worker threads. All threads can be worker threads.

Having one specialized thread handle the distribution of work to worker threads is better most of the time. This thread can itself be a worker thread (to avoid context switches), but it can decide to wake up 'slave threads' if it believes it has to (for example, if it notices that a *lot* of requests are pending); see the sketch below.
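A minimal sketch of that policy. The helper names (pending_requests(), handle_one_request(), wake_helper_threads()) and the threshold are purely illustrative, not anything from kevent:

	/* Hypothetical helpers: names are illustrative only. */
	extern int  pending_requests(void);    /* how many requests are queued */
	extern void handle_one_request(void);  /* process a single request     */
	extern void wake_helper_threads(int n);

	#define SPIKE_THRESHOLD 32             /* illustrative tuning knob */

	/* The dispatcher is itself a worker: it keeps its cache hot by
	 * doing the work, and only wakes extra threads on a load spike. */
	void dispatcher_loop(void)
	{
		for (;;) {
			if (pending_requests() > SPIKE_THRESHOLD)
				wake_helper_threads(1);

			handle_one_request();
		}
	}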

This is because, with moderate load, it's better to have only one CPU running 80% of its time, keeping its cache hot, than to 'distribute' the work across four CPUs that are each used 25% of the time, with lots of cache-line ping-pong and poor cache reuse.

If you let 'kevent'/'dumb kernel dispatcher'/'futex'/'whatever' decide to wake up one thread for each new event, you *may* get lower performance because of higher system overhead ('system' meaning the scheduler and kernel internals, but also bus traffic). Only the application writer has a clue about the average use of its worker threads, and can decide to dynamically adjust parameters if needed to handle load spikes.

SMP machines are nice, but for many workloads it's better to avoid spreading a working set across several CPUs that fight for common resources (memory).


Back to 'kevent':
-----------------
I think that having a syscall to commit events should not be mandatory. A syscall is needed only to wait for new events when the ring is empty. But then maybe we don't need yet another syscall to perform the wait:
We already have nice synchronization primitives (futex, for example).

The user program should be able to update a 'uidx' in user space (using atomic ops only if multi-threaded), and when the ring buffer is empty (uidx == kidx) it could just use the futex infrastructure and call FUTEX_WAIT(&kidx, current value = uidx). A sketch follows.
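A minimal consumer-side sketch, assuming the uidx/kidx scheme described here; futex_wait() is a hypothetical wrapper around the raw syscall, not an existing API:

	#define _GNU_SOURCE
	#include <stdint.h>
	#include <unistd.h>
	#include <sys/syscall.h>
	#include <linux/futex.h>

	/* Sleep while *addr still equals 'expected'. */
	static int futex_wait(volatile uint32_t *addr, uint32_t expected)
	{
		return syscall(SYS_futex, addr, FUTEX_WAIT, expected,
			       NULL, NULL, 0);
	}

	/* Block until the kernel publishes at least one new event,
	 * i.e. until kidx moves away from our uidx. */
	static void wait_for_events(volatile uint32_t *uidx,
				    volatile uint32_t *kidx)
	{
		uint32_t u = *uidx;

		while (*kidx == u) {
			/* Ring empty. FUTEX_WAIT returns immediately with
			 * EWOULDBLOCK if kidx already changed, and may also
			 * return on EINTR; the loop rechecks either way. */
			futex_wait(kidx, u);
		}
	}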

I think I already gave my opinion on a ring buffer, but let me just rephrase it:

One part should be read/write for the application (so it can update uidx)
(or the user app just gives the kernel, at init time, the address of a futex in its VM space).

One part could be read-only for the application (though it could be read/write: we don't care if the user application is stupid): the kernel writes its kidx (or a copy of it) and the events there.

For best performance, uidx and kidx should live on different cache lines (basic producer/consumer isolation), as in the layout sketched below.
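A sketch of such a shared layout. All names (ring_shared, ring_event, RING_SIZE) are hypothetical, and the event record is a placeholder:

	#include <stdint.h>

	#define CACHE_LINE 64
	#define RING_SIZE  256          /* illustrative; a power of two */

	struct ring_event {             /* placeholder event record */
		uint32_t type;
		uint32_t ret_data[3];
	};

	struct ring_shared {
		/* Written by user space only. */
		volatile uint32_t uidx;
		char pad0[CACHE_LINE - sizeof(uint32_t)];
		/* Written by the kernel only. */
		volatile uint32_t kidx;
		char pad1[CACHE_LINE - sizeof(uint32_t)];
		/* Event slots, written by the kernel. */
		struct ring_event events[RING_SIZE];
	};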

When the kernel wants to queue a new event in a ring buffer it can (see the sketch after this list):

- See whether the user program consumed some events since the last invocation (the kernel fetches uidx and compares it with its own cached uidx value: no syscall needed).
- Check that a slot is available in the ring buffer.
- Copy the event into the ring buffer, perform a memory barrier, then increment kidx.
- Call futex_wake(&kidx, 1 thread).
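A user-space model of those producer steps (a sketch, not actual kevent code), reusing the hypothetical struct ring_shared / RING_SIZE from the layout sketch above plus the futex includes from the earlier sketch:

	static int futex_wake(volatile uint32_t *addr, int nwake)
	{
		return syscall(SYS_futex, addr, FUTEX_WAKE, nwake,
			       NULL, NULL, 0);
	}

	/* Enqueue one event following the steps above.
	 * Returns -1 when the ring is full. */
	static int post_event(struct ring_shared *r, struct ring_event ev)
	{
		uint32_t u = r->uidx;          /* fetch uidx: no syscall    */
		uint32_t k = r->kidx;

		if (k - u >= RING_SIZE)        /* any free slot?            */
			return -1;

		r->events[k % RING_SIZE] = ev; /* copy the event            */
		__sync_synchronize();          /* barrier before publishing */
		r->kidx = k + 1;               /* publish the new kidx      */

		futex_wake(&r->kidx, 1);       /* wake exactly one waiter   */
		return 0;
	}

The free-running indices (never masked until the array access) make the full/empty test a simple unsigned subtraction.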

The user application is free to have one thread/process or several threads/processes waiting for new events (or even no thread at all :) )

Eric

