Re: [PATCH] splice support #2

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Mar 30 2006, Andrew Morton wrote:
> Jens Axboe <[email protected]> wrote:
> >
> > Hi,
> > 
> > This patch should resolve all issues mentioned so far. I'd still like to
> > implement the page moving, but that should just be a separate patch.
> > After this round of patching, I thought I'd toss a fresh version out
> > there for all to look at.
> > 
> > Signed-off-by: Jens Axboe <[email protected]>
> > 
> > ...
> >
> > --- a/arch/ia64/kernel/entry.S
> > +++ b/arch/ia64/kernel/entry.S
> > @@ -1605,5 +1605,6 @@ sys_call_table:
> >  	data8 sys_ni_syscall			// reserved for pselect
> >  	data8 sys_ni_syscall			// 1295 reserved for ppoll
> >  	data8 sys_unshare
> > +	date8 sys_splice
> 
> typo

Oops, fixed.

> > +static void *page_cache_pipe_buf_map(struct file *file,
> > +				     struct pipe_inode_info *info,
> > +				     struct pipe_buffer *buf)
> > +{
> > +	struct page *page = buf->page;
> > +
> > +	lock_page(page);
> > +
> > +	if (!PageUptodate(page)) {
> 
>  || page->mapping == NULL

Fixed

> > +	struct page *pages[PIPE_BUFFERS];
> > +	struct page *page;
> > +	pgoff_t index, pidx;
> > +	int i;
> > +
> > +	index = in->f_pos >> PAGE_CACHE_SHIFT;
> > +	offset = in->f_pos & ~PAGE_CACHE_MASK;
> > +	nr_pages = (len + offset + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT;
> > +
> > +	if (nr_pages > PIPE_BUFFERS)
> > +		nr_pages = PIPE_BUFFERS;
> > +
> > +	/*
> > +	 * initiate read-ahead on this page range
> > +	 */
> > +	do_page_cache_readahead(mapping, in, index, nr_pages);
> > +
> > +	/*
> > +	 * Get as many pages from the page cache as possible..
> > +	 * Start IO on the page cache entries we create (we
> > +	 * can assume that any pre-existing ones we find have
> > +	 * already had IO started on them).
> > +	 */
> > +	i = find_get_pages(mapping, index, nr_pages, pages);
> > +
> > +	/*
> > +	 * common case - we found all pages and they are contiguous,
> > +	 * kick them off
> > +	 */
> > +	if (i && (pages[i - 1]->index == index + i - 1))
> > +		goto splice_them;
> > +
> > +	memset(&pages[i], 0, (nr_pages - i) * sizeof(struct page *));
> > +
> > +	/*
> > +	 * find_get_pages() may not return consecutive pages, so loop
> > +	 * over the array moving pages and filling the rest, if need be.
> > +	 */
> > +	for (i = 0, pidx = index; i < nr_pages; pidx++, i++) {
> > +		if (!pages[i]) {
> > +			int error;
> > +fill_page:
> > +			/*
> > +			 * no page there, look one up / create it
> > +			 */
> > +			page = find_or_create_page(mapping, pidx,
> > +						   mapping_gfp_mask(mapping));
> > +			if (!page)
> > +				break;
> > +
> > +			if (PageUptodate(page))
> > +				unlock_page(page);
> > +			else {
> > +				error = mapping->a_ops->readpage(in, page);
> > +
> > +				if (unlikely(error)) {
> > +					page_cache_release(page);
> > +					break;
> > +				}
> > +			}
> > +			pages[i] = page;
> > +		} else if (pages[i]->index != pidx) {
> > +			page = pages[i];
> > +			/*
> > +			 * page isn't in the right spot, move it and jump
> > +			 * back to filling this one. we know that ->index
> > +			 * is larger than pidx
> > +			 */
> > +			pages[i + page->index - pidx] = page;
> > +			pages[i] = NULL;
> > +			goto fill_page;
> > +		}
> > +	}
> > +
> > +	if (!i)
> > +		return 0;
> > +
> > +	/*
> > +	 * Now we splice them into the pipe..
> > +	 */
> > +splice_them:
> > +	return move_to_pipe(pipe, pages, i, offset, len);
> > +}
> 
> OK, and this returns the number of bytes transferred all the way up to
> userspace.
> 
> And, like read(), if we hit an IO error or ENOMEM partway through we'll
> just return a short read which is indistinguishable from EOF.
> 
> But if we do get an IO error, we'll detect it in page_cache_pipe_buf_map().
> Does the code still follow these read() return value semantics in that
> case?

Yes, see:

                        err = actor(info, buf, &sd);
                        if (err) {
                                if (!ret)
                                        ret = err;

                                break;
                        }

so if we get -EIO from the actor, then we return bytes transferred _or_
-EIO in 0.

> argh, am all reviewed out, and infiniband isn't compiling.  Will look later.

I don't blame you, you've been (as always) a huge help! Thanks a lot.

-- 
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux