Re: splice/vmsplice performance test results

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 2006-11-23 at 12:24 +0100, Jens Axboe wrote: 
> On Wed, Nov 22 2006, Jim Schutt wrote:
> > 
> > On Wed, 2006-11-22 at 09:57 +0100, Jens Axboe wrote:
> > > On Tue, Nov 21 2006, Jim Schutt wrote:
> > [snip]
> > > > 
> > > > Hmmm.  Is it worth me trying to do some sort of kernel 
> > > > profiling to see if there is anything unexpected with 
> > > > my setup?  If so, do you have a preference as to what 
> > > > I would use?  
> > > 
> > > Not sure that profiling would be that interesting, as the problem
> > > probably lies in where we are _not_ spending the time. But it certainly
> > > can't hurt. Try to oprofile the kernel for a 10-20 sec interval while
> > > the test is running. Do 3 such runs for the two test cases
> > > (write-to-file, vmsplice/splice-to-file).
> > > 
> > 
> > OK, I've attached results for 20 second profiles of three
> > runs of each test: read-from-socket + write-to-file, and
> > read-from-socket + vmsplice/splice-to-file.
> > 
> > The test case and throughput is in the name: e.g. rvs-1-306MBps
> > is trial 1 of read/vmsplice/splice case, which ran at 306 MB/s.
> > 
> > Let me know if I can help with more testing, and thanks
> > again for looking into this.
> 
> As I suspected, nothing sticks out in these logs as the problem here is
> not due to a maxed out system. The traces look fairly identical, less
> time spent in copy_user with the splice approach.
> 
> Comparing the generic_file_buffered_write() and splice-to-file path,
> there really isn't a whole lot of difference. It would be interesting to
> try and eliminate some of the differences between the two approaches -
> could you try and change the vmsplice to a write-to-pipe instead? And
> add SPLICE_F_NONBLOCK to the splice-to-file as well. Basically I'm
> interested in a something that only really tests splice-to-file vs
> write-to-file. Perhaps easier if you can just run fio to test that, I'm
> inlining a job file to test that specifically.
> 

Sorry for the delayed reply.

Here's results from three fio runs, using the job file
you gave me, and fio-git-20061124142507.tar.gz:

---
write: (g=0): rw=write, odir=0, bs=64K-64K/64K-64K, rate=0,
ioengine=sync, iodepth=1
splice: (g=1): rw=write, odir=0, bs=64K-64K/64K-64K, rate=0,
ioengine=splice, iodepth=1
write: (g=0): rw=write, odir=0, bs=64K-64K/64K-64K, rate=0,
ioengine=sync, iodepth=1
splice: (g=1): rw=write, odir=0, bs=64K-64K/64K-64K, rate=0,
ioengine=splice, iodepth=1
write: (g=0): rw=write, odir=0, bs=64K-64K/64K-64K, rate=0,
ioengine=sync, iodepth=1
splice: (g=1): rw=write, odir=0, bs=64K-64K/64K-64K, rate=0,
ioengine=splice, iodepth=1
write: (groupid=0): err= 0:
  write: io= 16384MiB, bw=558404KiB/s, runt= 30766msec
    slat (msec): min=    0, max=    0, avg= 0.00, dev= 0.00
    clat (msec): min=    0, max=   35, avg= 0.00, dev= 0.08
    bw (KiB/s) : min=    0, max=644481, per=98.90%, avg=552243.95,
dev=561417.03
  cpu          : usr=0.91%, sys=85.03%, ctx=14121
splice: (groupid=1): err= 0:
  write: io= 16384MiB, bw=486144KiB/s, runt= 35339msec
    slat (msec): min=    0, max=    0, avg= 0.00, dev= 0.00
    clat (msec): min=    0, max=   30, avg= 0.00, dev= 0.10
    bw (KiB/s) : min=    0, max=555745, per=99.06%, avg=481567.28,
dev=488565.60
  cpu          : usr=0.85%, sys=88.87%, ctx=12956

Run status group 0 (all jobs):
  WRITE: io=16384MiB, aggrb=558404, minb=558404, maxb=558404,
mint=30766msec, maxt=30766msec

Run status group 1 (all jobs):
  WRITE: io=16384MiB, aggrb=486144, minb=486144, maxb=486144,
mint=35339msec, maxt=35339msec

Disk stats (read/write):
  md0: ios=20/270938, merge=0/0, ticks=0/0, in_queue=0, util=0.00%
---
write: (g=0): rw=write, odir=0, bs=64K-64K/64K-64K, rate=0,
ioengine=sync, iodepth=1
splice: (g=1): rw=write, odir=0, bs=64K-64K/64K-64K, rate=0,
ioengine=splice, iodepth=1
write: (g=0): rw=write, odir=0, bs=64K-64K/64K-64K, rate=0,
ioengine=sync, iodepth=1
splice: (g=1): rw=write, odir=0, bs=64K-64K/64K-64K, rate=0,
ioengine=splice, iodepth=1
write: (g=0): rw=write, odir=0, bs=64K-64K/64K-64K, rate=0,
ioengine=sync, iodepth=1
splice: (g=1): rw=write, odir=0, bs=64K-64K/64K-64K, rate=0,
ioengine=splice, iodepth=1
write: (groupid=0): err= 0:
  write: io= 16384MiB, bw=547234KiB/s, runt= 31394msec
    slat (msec): min=    0, max=    1, avg= 0.00, dev= 0.00
    clat (msec): min=    0, max=   57, avg= 0.00, dev= 0.14
    bw (KiB/s) : min=    0, max=662568, per=98.94%, avg=541406.71,
dev=550958.67
  cpu          : usr=0.79%, sys=82.80%, ctx=16560
splice: (groupid=1): err= 0:
  write: io= 16384MiB, bw=473313KiB/s, runt= 36297msec
    slat (msec): min=    0, max=    1, avg= 0.00, dev= 0.00
    clat (msec): min=    0, max=   27, avg= 0.00, dev= 0.11
    bw (KiB/s) : min=    0, max=562298, per=99.12%, avg=469142.21,
dev=476426.36
  cpu          : usr=1.05%, sys=84.78%, ctx=16043

Run status group 0 (all jobs):
  WRITE: io=16384MiB, aggrb=547234, minb=547234, maxb=547234,
mint=31394msec, maxt=31394msec

Run status group 1 (all jobs):
  WRITE: io=16384MiB, aggrb=473313, minb=473313, maxb=473313,
mint=36297msec, maxt=36297msec

Disk stats (read/write):
  md0: ios=17/270784, merge=0/0, ticks=0/0, in_queue=0, util=0.00%
---
write: (g=0): rw=write, odir=0, bs=64K-64K/64K-64K, rate=0,
ioengine=sync, iodepth=1
splice: (g=1): rw=write, odir=0, bs=64K-64K/64K-64K, rate=0,
ioengine=splice, iodepth=1
write: (g=0): rw=write, odir=0, bs=64K-64K/64K-64K, rate=0,
ioengine=sync, iodepth=1
splice: (g=1): rw=write, odir=0, bs=64K-64K/64K-64K, rate=0,
ioengine=splice, iodepth=1
write: (g=0): rw=write, odir=0, bs=64K-64K/64K-64K, rate=0,
ioengine=sync, iodepth=1
splice: (g=1): rw=write, odir=0, bs=64K-64K/64K-64K, rate=0,
ioengine=splice, iodepth=1
write: (groupid=0): err= 0:
  write: io= 16384MiB, bw=561140KiB/s, runt= 30616msec
    slat (msec): min=    0, max=    1, avg= 0.00, dev= 0.00
    clat (msec): min=    0, max=   26, avg= 0.00, dev= 0.08
    bw (KiB/s) : min=    0, max=665976, per=98.94%, avg=555198.98,
dev=564632.52
  cpu          : usr=0.82%, sys=85.12%, ctx=14287
splice: (groupid=1): err= 0:
  write: io= 16384MiB, bw=487192KiB/s, runt= 35263msec
    slat (msec): min=    0, max=    0, avg= 0.00, dev= 0.00
    clat (msec): min=    0, max=   29, avg= 0.00, dev= 0.09
    bw (KiB/s) : min=    0, max=566099, per=99.08%, avg=482706.32,
dev=489726.36
  cpu          : usr=0.85%, sys=86.96%, ctx=13072

Run status group 0 (all jobs):
  WRITE: io=16384MiB, aggrb=561140, minb=561140, maxb=561140,
mint=30616msec, maxt=30616msec

Run status group 1 (all jobs):
  WRITE: io=16384MiB, aggrb=487192, minb=487192, maxb=487192,
mint=35263msec, maxt=35263msec

Disk stats (read/write):
  md0: ios=18/270851, merge=0/0, ticks=0/0, in_queue=0, util=0.00%
---

Do these results give you something to work with, or would 
you like me to run some other tests?

Let me know.

Thanks -- Jim Schutt

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux