On Wed, Apr 26 2006, Andrew Morton wrote:
> Jens Axboe <[email protected]> wrote:
> >
> > On Wed, Apr 26 2006, Andrew Morton wrote:
> > > Jens Axboe <[email protected]> wrote:
> > > >
> > > > Running a splice benchmark on a 4-way IPF box, I decided to give the
> > > > lockless page cache patches from Nick a spin. I've attached the results
> > > > as a png, it pretty much speaks for itself.
> > >
> > > It does.
> > >
> > > What does the test do?
> > >
> > > In particular, does it cause the kernel to take tree_lock once per
> > > page, or once per batch-of-pages?
> >
> > Once per page, it's basically exercising the generic_file_splice_read()
> > path. Basically X number of "clients" open the same file, and fill those
> > pages into a pipe using splice. The output end of the pipe is then
> > spliced to /dev/null to toss it away again.
>
> OK. That doesn't sound like something which a real application is likely
> to do ;)
I don't think it's totally unlikely. Could be streaming a large media
file to many clients, for instance. Of course you are not going to push
gigabytes of data per seconds like this test, but it's still the same
type of workload.
> > The top of the 4-client
> > vanilla run profile looks like this:
> >
> > samples % symbol name
> > 65328 47.8972 find_get_page
> >
> > Basically the machine is fully pegged, about 7% idle time.
>
> Most of the time an acquisition of tree_lock is associated with a disk
> read, or a page-size memset, or a page-size memcpy. And often an
> acquisition of tree_lock is associated with multiple pages, not just a
> single page.
Yeah with mostly io then I'd be hard pressed to show a difference.
> So although the graph looks good, I wouldn't view this as a super-strong
> argument in favour of lockless pagecache.
I didn't claim it was, just trying to show some data on at least one
case where the lockless patches perform well and the stock kernel does
not :-)
Are there cases where the lockless page cache performs worse than the
current one?
> > But boy I wish find_get_pages_contig() was there
> > for that. I think I'd prefer adding that instead of coding that logic in
> > splice, it can get a little tricky.
>
> I guess it'd make sense - we haven't had a need for such a thing before.
>
> umm, something like...
>
> unsigned find_get_pages_contig(struct address_space *mapping, pgoff_t start,
> unsigned int nr_pages, struct page **pages)
> {
> unsigned int i;
> unsigned int ret;
> pgoff_t index = start;
>
> read_lock_irq(&mapping->tree_lock);
> ret = radix_tree_gang_lookup(&mapping->page_tree,
> (void **)pages, start, nr_pages);
> for (i = 0; i < ret; i++) {
> if (pages[i]->mapping == NULL || pages[i]->index != index)
> break;
> page_cache_get(pages[i]);
> index++;
> }
> read_unlock_irq(&mapping->tree_lock);
> return i;
> }
Ah clever, I didn't think of stopping on the first hole. Works well
since you need to manually get a reference on the page anyways.
Let me redo the numbers with this splice updated.
--
Jens Axboe
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]