Re: sendfile() with 100 simultaneous 100MB files

I took a look at that article, and frankly, it looks a bit off to me. I also looked at the code it referred to: it mmaps the file and optionally copies from the map to a private buffer before writing to the socket.

The double buffering that is enabled by LOCAL_BUFFERING is a complete waste of both CPU and RAM. There is no reason to allocate more memory and burn more CPU cycles making a second copy of the data before handing it to the network layer. The mmap and madvise, though, are a good idea, and I imagine the madvise hint is causing the kernel to perform large-block readahead.
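
The sane version of that pattern looks something like this (an untested sketch with error paths trimmed; sock is assumed to be an already-connected socket):

#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/* Sketch: serve a file over a connected socket using mmap + madvise,
 * with no intermediate copy into a private buffer. */
static int send_mapped(int sock, const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) { close(fd); return -1; }

    void *map = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) { close(fd); return -1; }

    /* Hint sequential access so the kernel does aggressive readahead. */
    madvise(map, st.st_size, MADV_SEQUENTIAL);

    /* write() still copies once into the socket buffers, but there is
     * no extra private-buffer copy in userspace. */
    const char *p = map;
    off_t left = st.st_size;
    while (left > 0) {
        ssize_t n = write(sock, p, left);
        if (n < 0)
            break;
        p += n;
        left -= n;
    }

    munmap(map, st.st_size);
    close(fd);
    return left == 0 ? 0 : -1;
}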

If you really want to be able to simultaneously push hundreds of streams efficiently, though, you want to use zero-copy AIO, which can have tremendous benefits in both throughput and CPU usage. Unfortunately, I believe the current kernel does not support O_DIRECT on sockets.
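
On the file side you can already do O_DIRECT reads through the kernel's native AIO interface today; roughly like this (untested sketch using libaio, link with -laio; the chunk size is arbitrary, and the socket side still needs a plain write() since O_DIRECT on sockets isn't there):

#define _GNU_SOURCE
#include <libaio.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

#define CHUNK (256 * 1024)  /* arbitrary read size for the sketch */

/* Sketch: O_DIRECT read of one chunk via kernel AIO. On success the
 * caller owns *out and must free() it after writing it to the socket. */
static long direct_read_chunk(const char *path, off_t off, void **out)
{
    io_context_t ctx = 0;
    struct iocb cb, *cbs[1] = { &cb };
    struct io_event ev;
    void *buf;
    int fd;
    long rc = -1;

    if (io_setup(1, &ctx) < 0)
        return -1;
    fd = open(path, O_RDONLY | O_DIRECT);
    if (fd < 0)
        goto out_ctx;
    /* O_DIRECT buffers must be block-aligned; 4096 is a safe bet. */
    if (posix_memalign(&buf, 4096, CHUNK))
        goto out_fd;

    io_prep_pread(&cb, fd, buf, CHUNK, off);
    if (io_submit(ctx, 1, cbs) != 1)
        goto out_buf;
    if (io_getevents(ctx, 1, 1, &ev, NULL) != 1)
        goto out_buf;

    *out = buf;
    rc = (long)ev.res;          /* bytes read, or negative errno */
    goto out_fd;

out_buf:
    free(buf);
out_fd:
    close(fd);
out_ctx:
    io_destroy(ctx);
    return rc;
}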

I last looked at the kernel implementation of sendfile about 6 years ago, but I remember it not looking very good. I believe it WAS only transferring a single page at a time, and it was still making a copy from the fs cache to the socket buffers, so it wasn't really doing zero-copy IO (though it was one copy fewer than doing a read and a write).
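
For reference, the userspace side of the call being discussed is just this (sketch; descriptors assumed to be set up by the caller):

#include <sys/sendfile.h>
#include <sys/stat.h>

/* Sketch: push a whole file to a connected socket with sendfile().
 * The kernel moves data from the page cache toward the socket without
 * a round trip through userspace. */
static int send_whole_file(int sock, int fd)
{
    struct stat st;
    off_t off = 0;

    if (fstat(fd, &st) < 0)
        return -1;
    while (off < st.st_size) {
        ssize_t n = sendfile(sock, fd, &off, st.st_size - off);
        if (n <= 0)
            return -1;   /* error or unexpected EOF */
    }
    return 0;
}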

About that time I was writing an ftp server on the NT kernel and discovered zero-copy async IO. I ended up using a small thread pool and an IO completion port to service the async IO requests. The files were mmapped in 64 KB chunks, three at a time, and queued asynchronously to the socket, which was set to use no kernel buffering. This allowed a PII-233 machine to push 11,820 KB/s (real KB, not a salesman's) over a single session on a 100Base-T network, and to saturate dual network interfaces with multiple connections, all while using less than 1% of the CPU, because the NICs were able to perform scatter/gather DMA directly on the filesystem cache pages.
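
The shape of it, heavily condensed and from memory, so treat it as a sketch (setup is omitted: the socket is associated to the port with CreateIoCompletionPort((HANDLE)s, iocp, key, 0), and SO_SNDBUF is set to 0 via setsockopt to disable kernel buffering):

#include <winsock2.h>
#include <windows.h>

#define CHUNK    (64 * 1024)  /* 64 KB mapped views, as described */
#define INFLIGHT 3            /* keep three chunks queued per stream */

typedef struct {
    OVERLAPPED ov;            /* must be first so we can cast back */
    void      *view;          /* mapped file view backing this send */
} SendOp;

/* Sketch: queue one 64 KB slice of a file mapping on an async socket. */
static void queue_chunk(SOCKET s, HANDLE mapping, ULONGLONG off, SendOp *op)
{
    WSABUF buf;

    ZeroMemory(&op->ov, sizeof(op->ov));
    /* 64 KB offsets satisfy the allocation-granularity alignment rule. */
    op->view = MapViewOfFile(mapping, FILE_MAP_READ,
                             (DWORD)(off >> 32), (DWORD)off, CHUNK);
    buf.buf = op->view;
    buf.len = CHUNK;
    /* Overlapped send; the completion is picked up on the IOCP. */
    WSASend(s, &buf, 1, NULL, 0, &op->ov, NULL);
}

/* Worker thread: drain completions, unmap the finished view, and
 * (not shown) queue the next chunk to keep INFLIGHT in flight. */
static DWORD WINAPI worker(LPVOID arg)
{
    HANDLE iocp = arg;
    DWORD nbytes;
    ULONG_PTR key;
    OVERLAPPED *ov;

    while (GetQueuedCompletionStatus(iocp, &nbytes, &key, &ov, INFINITE)) {
        SendOp *op = (SendOp *)ov;
        UnmapViewOfFile(op->view);
        /* ...advance the stream's offset and call queue_chunk() again... */
    }
    return 0;
}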

I'm hopeful that the Linux kernel will be able to do this soon as well, once the network stack supports O_DIRECT on sockets.

Jon Smirl wrote:
I was reading this blog post about the lighttpd web server.
http://blog.lighttpd.net/articles/2005/11/11/optimizing-lighty-for-high-concurrent-large-file-downloads
It describes problems they are having downloading 100 simultaneous 100MB files.

In this post they complain about sendfile() getting into seek storms and
ending up in 72% IO wait. As a result they built a user space
mechanism to work around the problems.

I tried looking at how the kernel implements sendfile(). I have only a
minimal understanding of how the fs code works, but it looks to me like
sendfile() is working a page at a time. I was looking for code that
does something like this...

1) Compute an adaptive window size and read ahead the appropriate
number of pages.  A larger window would minimize disk seeks.

2) Something along the lines of: as soon as a page has been sent, age
it down to the middle of the page ages. That would still favor files
that are repeatedly sent, but reduce the thrashing from files that are
not sent frequently and shouldn't stay in the page cache. (A rough
sketch of both ideas follows.)
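
From userspace you can approximate both with posix_fadvise() wrapped around the sendfile() loop. A sketch (untested; the window size is made up, and POSIX_FADV_DONTNEED evicts pages outright rather than aging them, so it is only an approximation of idea 2):

#include <fcntl.h>
#include <sys/sendfile.h>

#define WINDOW (2 * 1024 * 1024)   /* made-up readahead window */

/* Sketch: sendfile() in windowed steps, asking for readahead just
 * ahead of the send position and dropping pages already sent. */
static int send_windowed(int sock, int fd, off_t size)
{
    off_t off = 0;

    while (off < size) {
        /* Idea 1: read ahead a whole window to batch the disk seeks. */
        posix_fadvise(fd, off, WINDOW, POSIX_FADV_WILLNEED);

        off_t end = off + WINDOW;
        while (off < end && off < size) {
            ssize_t n = sendfile(sock, fd, &off, end - off);
            if (n <= 0)
                return -1;
        }
        /* Idea 2 (approximated): drop pages we've already sent so
         * cold files don't push everything else out of the cache. */
        posix_fadvise(fd, 0, off, POSIX_FADV_DONTNEED);
    }
    return 0;
}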

Any other ideas why sendfile() would get into a seek storm?

--
Jon Smirl
[email protected]



