Re: [PATCH] splice support #2

On Thu, 30 Mar 2006, Jens Axboe wrote:
> On Thu, Mar 30 2006, Ingo Molnar wrote:
> >
> > neat stuff. One question: why do we require fdin or fdout to be a pipe?  
> > Is there any fundamental problem with implementing what Larry's original 
> > paper described too: straight pagecache -> socket transfers? Without a
> > pipe intermediary forced in between. It only adds unnecessary overhead.
> 
> No, not a fundamental problem. I think I even hid that in some comment
> in there, at least if it's decipherable by someone other than myself...

Actually, there _is_ a fundamental problem. Two of them, in fact.

The reason it goes through a pipe is two-fold:

 - the pipe _is_ the buffer. The reason sendfile() sucks is that sendfile 
   cannot work with <n> different buffer representations. sendfile() only 
   works with _one_ buffer representation, namely the "page cache of the 
   file".

   By using the page cache directly, sendfile() doesn't need any extra 
   buffering, but that's also why sendfile() fundamentally _cannot_ work 
   with anything else. You cannot do "sendfile" between two sockets to 
   forward data from one place to another, for example. You cannot do 
   sendfile from a streaming device.

   The pipe is just the standard in-kernel buffer between two arbitrary 
   points. Think of it as a scatter-gather list with a wait-queue. That's 
   what a pipe _is_. Trying to get rid of the pipe totally misses the 
   whole point of splice().

   Now, we could have a splice call that has an _implicit_ pipe, i.e. if 
   neither side is a pipe, we could create a temporary pipe and thus 
   allow what looks like a direct splice. But the pipe should still be 
   there.

 - The pipe is the buffer #2: it's what allows you to do _other_ things 
   with splice that are simply impossible to do with sendfile. Notably, 
   splice very naturally allows the "readv/writev" scatter-gather 
   behaviour of _mixing_ streams. If you're a web-server, with splice you 
   can do

	write(pipefd, header, header_len);
	splice(file, pipefd, file_len);
	splice(pipefd, socket, total_len);

   (this is all conceptual pseudo-code, of course), and this very 
   naturally has none of the issues that sendfile() has with plugging etc. 
   There's never any "send header separately and do extra work to make 
   sure it is in the same packet as the start of the data".

   So having a separate buffer even when you _do_ have a buffer like the 
   page cache is still something you want to do.
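
The web-server case above can be sketched in C against the splice()
system call as it exists in Linux today (fd_in, off_in, fd_out, off_out,
len, flags), which differs from the pseudo-code's three-argument form.
This is a minimal sketch, not code from the patch: send_with_header() is
a hypothetical helper name, and it assumes the header is small (at most
PIPE_BUF bytes) so that the initial write() cannot fill the pipe. The
helper creates its own temporary pipe, which is roughly what an
"implicit pipe" would look like from the caller's side.

```c
#define _GNU_SOURCE	/* exposes splice() in <fcntl.h> on glibc */
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: send `header`, then `file_len` bytes read from the
 * current offset of `file_fd`, to `out_fd`, staging everything in one pipe.
 * Assumption: `header_len` is small enough that write() cannot fill the pipe.
 * Returns 0 on success, -1 on error or if the file ends early.
 */
int send_with_header(int file_fd, int out_fd,
		     const void *header, size_t header_len, size_t file_len)
{
	int p[2];
	int rc = -1;
	size_t in_pipe;		/* bytes currently buffered in the pipe */

	assert(file_fd >= 0 && out_fd >= 0);
	if (pipe(p) < 0)
		return -1;

	/* The header is ordinary user memory: a plain write() into the pipe. */
	if (write(p[1], header, header_len) != (ssize_t)header_len)
		goto out;
	in_pipe = header_len;

	while (file_len > 0 || in_pipe > 0) {
		if (file_len > 0) {
			/* Queue file data behind the header; the kernel can
			 * reference page-cache pages instead of copying. */
			ssize_t n = splice(file_fd, NULL, p[1], NULL,
					   file_len, 0);
			if (n <= 0)	/* error, or EOF before file_len */
				goto out;
			file_len -= (size_t)n;
			in_pipe += (size_t)n;
		}
		/* Drain the pipe: header and data leave as one stream. */
		while (in_pipe > 0) {
			ssize_t n = splice(p[0], NULL, out_fd, NULL, in_pipe,
					   file_len > 0 ? SPLICE_F_MORE : 0);
			if (n <= 0)
				goto out;
			in_pipe -= (size_t)n;
		}
	}
	rc = 0;
out:
	close(p[0]);
	close(p[1]);
	return rc;
}
```

The loop is there because a real pipe has finite capacity: a large file
cannot be queued in one splice() call, so the sketch alternates between
filling and draining. SPLICE_F_MORE is only a hint that more data will
follow; whether the header and the first data actually share a packet is
still up to the network stack.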

So there.

		Linus