I mostly agree with your comments, so I'm only responding to the points I
disagree with.
> So in fact, converting a threaded program to a pure async model should
> not improve it much because of the initial architectural design. But a
> program written from scratch to be purely async should perform better
> simply because it has less operations to perform. And there's no magics
> here : less cycles spend synchronizing and locking = more cycles available
> for the real job.
On the contrary, converting threaded programs to pure async is a disaster.
For one thing, you can never, ever block under any circumstances on pain of
total disaster. This means every single line of code is performance
critical. For all but the most trivial applications, this alone is a deal
killer.
The other problem is that no matter how careful you are, there is still
going to be some unexpected blocking. For example, suppose one connection
triggers some kind of error condition and the code to handle this error has
never faulted in. Now your entire application stalls until the error
handling code can fault in, which can be awhile if the disk is slow/busy.
> It's true that an async model is a lot of pain. But it's always where I
> got the best performance. For instance, with epoll(), I can achieve
> 20000 HTTP reqs/s with 40000 concurrent sessions. The best performance
> I have observed from threaded competitors was an order of magnitude below
> on either value (sometimes both).
I can't see how adding a second thread to your program could possibly make
it all that much worse. And when you do block unexpectedly, there's that
second thread to keep on working. And, of course, you can take advantage of
another CPU.
If adding a few more threads to your design makes it much worse, there's
something wrong with it. Perhaps you are conflating "threaded competitors"
with "thread-per-client competitors".
I also must be misunderstanding something you're saying, or you are very
confused. At one point you say that threaded programs are an order of
magnitude below pure async. At another point you say converting a threaded
program to pure async should not improve it much. I don't see how both of
these can be true unless you mean something different by "threaded" and
"async" in the two statements.
Thread-per-client works reasonably well up to some number of clients. On
most UNIXes, that number is around 300. On Windows, it's around 800. I
personally would only recommend it for applications that plan to handle 100
clients or fewer, or one platforms where you know the threading library
works well this way.
Pure async is just too painful. If you only have a single thread, the
program becomes unmaintainable because you must never ever block anywhere.
And you still can't avoid blocking sometimes, and this can cause a backup
from which you can never recover. (If you always wondered why your servers
were bursty, this is why.)
The best approach, in my experience, is to use multiple threads so that you
can take advantage of multiple CPUs, not be screwed if you block
unexpectedly, and even block intentionally in code that's not performance
critical. On Linux, 'epoll' is the most efficient socket discovery
technique, so you should use it in any design that's not thread-per-client.
DS
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]