Re: [PATCH] splice support #2

On Thu, Mar 30 2006, Linus Torvalds wrote:
> 
> 
> On Thu, 30 Mar 2006, Jens Axboe wrote:
> > On Thu, Mar 30 2006, Ingo Molnar wrote:
> > >
> > > neat stuff. One question: why do we require fdin or fdout to be a pipe?  
> > > Is there any fundamental problem with implementing what Larry's original 
> > > paper described too: straight pagecache -> socket transfers? Without a
> > > pipe intermediary forced inbetween. It only adds unnecessary overhead.
> > 
> > No, not a fundamental problem. I think I even hid that in some comment
> > in there, at least if it's decipharable by someone else than myself...
> 
> Actually, there _is_ a fundamental problem. Two of them, in fact.
> 
> The reason it goes through a pipe is two-fold:
> 
>  - the pipe _is_ the buffer. The reason sendfile() sucks is that sendfile 
>    cannot work with <n> different buffer representations. sendfile() only 
>    works with _one_ buffer representation, namely the "page cache of the 
>    file".
> 
>    By using the page cache directly, sendfile() doesn't need any extra 
>    buffering, but that's also why sendfile() fundamentally _cannot_ work 
>    with anything else. You cannot do "sendfile" between two sockets to 
>    forward data from one place to another, for example. You cannot do 
>    sendfile from a streaming device.
> 
>    The pipe is just the standard in-kernel buffer between two arbitrary 
>    points. Think of it as a scatter-gather list with a wait-queue. That's 
>    what a pipe _is_. Trying to get rid of the pipe totally misses the 
>    whole point of splice().
> 
>    Now, we could have a splice call that has an _implicit_ pipe, ie if 
>    neither side is a pipe, we could create a temporary pipe and thus 
>    allow what looks like a direct splice. But the pipe should still be 
>    there.
> 
>  - The pipe is the buffer #2: it's what allows you to do _other_ things 
>    with splice that are simply impossible to do with sendfile. Notably, 
>    splice allows very naturally the "readv/writev" scatter-gather 
>    behaviour of _mixing_ streams. If you're a web-server, with splice you 
>    can do
> 
> 	write(pipefd, header, header_len);
> 	splice(file, pipefd, file_len);
> 	splice(pipefd, socket, total_len);
> 
>    (this is all conceptual pseudo-code, of course), and this very 
>    naturally has none of the issues that sendfile() has with plugging etc. 
>    There's never any "send header separately and do extra work to make 
>    sure it is in the same packet as the start of the data".
> 
>    So having a separate buffer even when you _do_ have a buffer like the 
>    page cache is still something you want to do.
> 
> So there.

My point was mainly that the buffer itself need not necessarily be a
pipe, it could be implemented with a pipe just using the same buffer
type. But I guess it doesn't make much sense, the pipe has nice
advantages in itself.

-- 
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

References:
- [PATCH] splice support #2
  - From: Jens Axboe <axboe@suse.de>
- Re: [PATCH] splice support #2
  - From: Ingo Molnar <mingo@elte.hu>
- Re: [PATCH] splice support #2
  - From: Jens Axboe <axboe@suse.de>
- Re: [PATCH] splice support #2
  - From: Linus Torvalds <torvalds@osdl.org>

Prev by Date: Re: Save 320K on production machines?
Next by Date: Re: [PATCH] splice support #2
Previous by thread: Re: [PATCH] splice support #2
Next by thread: Re: [PATCH] splice support #2
Index(es):
- Date
- Thread

[Index of Archives] [Kernel Newbies] [Netfilter] [Bugtraq] [Photo] [Stuff] [Gimp] [Yosemite News] [MIPS Linux] [ARM Linux] [Linux Security] [Linux RAID] [Video 4 Linux] [Linux for the blind] [Linux Resources]