Jens Axboe wrote:
Hi,
Running a splice benchmark on a 4-way IPF box, I decided to give the
lockless page cache patches from Nick a spin. I've attached the results
as a png, it pretty much speaks for itself.
The test in question splices a 1GiB file to a pipe and then splices that
to some output. Normally that output would be something interesting, in
this case it's simply /dev/null. So it tests the input side of things
only, which is what I wanted to do here. To get adequate runtime, the
operation is repeated a number of times (120 in this example). The
benchmark does that number of loops with 1, 2, 3, and 4 clients each
pinned to a private CPU. The pinning is mainly done for more stable
results.
Thanks Jens!
It's interesting, single threaded performance is down a little. Is
this significant? In some other results you showed me with 3 splices
each running on their own file (ie. no tree_lock contention), lockless
looked slightly faster on the same machine.
In my microbenchmarks, single threaded lockless is quite a bit faster
than vanilla on both P4 and G5.
It could well be that the speculative get_page operation is naturally
a bit slower on Itanium CPUs -- there is a different mix of barriers,
reads, writes, etc. If only someone gave me an IPF system... ;)
As you said, it would be nice to see how this goes when the other end
are 4 gigabit pipes or so... And then things like specweb and file
serving workloads.
Nick
--
SUSE Labs, Novell Inc.
Send instant messages to your online friends http://au.messenger.yahoo.com
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]