Re: [PATCH] splice support #2

* Jens Axboe <axboe@suse.de> wrote:

> > > with pipe-based buffering this approach has still the very same problems 
> > > that sendfile() has with packet boundaries, because it's not enough to 
> > > have "large enough" buffering (like a pipe has), the pipe also has to be 
> > > drained, and the networking layer has to know the precise boundary of 
> > > data.
> > > 
> > > the right solution to the packet boundary problem is to pass in a proper 
> > > "does userspace expect more data right now" flag, or to let userspace 
> > > 'flush' the socket independently - which is independent of the 
> > > pipe-in-slice issue. This solution already exists: the MSG_MORE flag.
> > 
> > We can add a SPLICE_F_MORE flag for this, right now splice doesn't set
> > the MSG_MORE flag for the end of the pipe.
> 
> Ala

>  #define SPLICE_F_MOVE	(0x01)	/* move pages instead of copying */
> +#define SPLICE_F_MORE	(0x02)	/* expect more data */

ok, nice - something like this should work. The direction of the flag is 
a philosophical question i guess: i believe in Linux we prefer to 
default to "buffering enabled", i.e. the default flag should be "expect 
more data". So maybe it would be better to pass in PLICE_F_END, to 
indicate end-of-data. [it doesnt mean 'permanent end', so all the files 
still remain open: this could be something like a HTTP 1.1 pipelined 
request.]

furthermore, the internal implementation should also get smarter and do 
a flush-socket if it would e.g. block on a pagecache page. [we often 
prefer a partial packet in such cases instead of having a half-built 
packet hang around.]

btw., that 'data boundary' detail is likely lost with the pipe 
intermediary solution: there is no direct connection between 'input 
file' and 'output socket', so a 'flush now' event doesnt get propagated 
in a natural way. (unless we extend pipes with 'data boundary' markers, 
or force their flushing, which looks a whole lot of complexity for such 
a simple thing.)

straight pagecache->socket splicing on the other hand preserves 'data 
boundary' markers in a natural way for two reasons: 1) the input and 
output objects are known to the kernel at once 2) because there is no 
internal buffering to begin with.

	Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Follow-Ups:
- Re: [PATCH] splice support #2
  - From: Jens Axboe <axboe@suse.de>

References:
- [PATCH] splice support #2
  - From: Jens Axboe <axboe@suse.de>
- Re: [PATCH] splice support #2
  - From: Ingo Molnar <mingo@elte.hu>
- Re: [PATCH] splice support #2
  - From: Jens Axboe <axboe@suse.de>
- Re: [PATCH] splice support #2
  - From: Linus Torvalds <torvalds@osdl.org>
- Re: [PATCH] splice support #2
  - From: Ingo Molnar <mingo@elte.hu>
- Re: [PATCH] splice support #2
  - From: Jens Axboe <axboe@suse.de>
- Re: [PATCH] splice support #2
  - From: Jens Axboe <axboe@suse.de>

Prev by Date: Re: [PATCH] splice support #2
Next by Date: OOPS 2.6.16 and 2.6.16-git14
Previous by thread: Re: [PATCH] splice support #2
Next by thread: Re: [PATCH] splice support #2
Index(es):
- Date
- Thread

[Index of Archives] [Kernel Newbies] [Netfilter] [Bugtraq] [Photo] [Stuff] [Gimp] [Yosemite News] [MIPS Linux] [ARM Linux] [Linux Security] [Linux RAID] [Video 4 Linux] [Linux for the blind] [Linux Resources]