Re: [RFC] LZO de/compression support - take 6

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Monday 28 May 2007 11:30:55 Adrian Bunk wrote:
> On Mon, May 28, 2007 at 08:10:31PM +0530, Nitin Gupta wrote:
> > Hi,
> >
> > Attached is tester code used for testing.
> > (developed by Daniel Hazelton -- modified slightly to now use 'take 6'
> > version for 'TinyLZO')
> >
> > Cheers,
> > Nitin
> >
> > On 5/28/07, Nitin Gupta <[email protected]> wrote:
> >> (Using tester program from Daniel)
> >>
> >> Following compares this kernel port ('take 6') vs original miniLZO code:
> >>
> >> 'TinyLZO' refers to this kernel port.
> >>
> >> 10000 run averages:
> >> 'Tiny LZO':
> >>        Combined: 61.2223 usec
> >>        Compression: 41.8412 usec
> >>        Decompression: 19.3811 usec
> >> 'miniLZO':
> >>        Combined: 66.0444 usec
> >>        Compression: 46.6323 usec
> >>        Decompression: 19.4121 usec
> >>
> >> Result:
> >> Overall: TinyLZO is 7.3% faster
> >> Compressor: TinyLZO is 10.2% faster
> >> Decompressor: TinyLZO is 0.15% faster
>
> So your the compressor of your version runs 10.2% faster than the
> original version.
>
> That's a huge difference.
>
> Why exactly is it that much faster?

All I've done is write the code used for testing the compressor in userspace, 
but I think it might have something to do with the NOP that cpu_to_le16() is 
on LE systems (the original has an open-coded byte-swap and I don't think it 
ever really gets passed) and the removal of a lot of un-needed casts. 
(version 5 of the code re-introduced one of the macro's that had a lot of 
casts and it showed a drastic slow-down). Anyway, I'm going to get back into 
my testbed and see about actually making a lot of the stuff that is currently 
just a NOP (I made exceedingly minor changes to the files to get the working 
in userspace - one of the changes was to define the kernel-specific macros's 
likely(), unlikely(), noinline and cpu_to_le16() to be NOP's.) into 
functional code. That set of changes may affect the performance and bring it 
closer to the original or it may not.

And the testbed code is open - I guess I should stick the GPL headers in it, 
even though it isn't likely to see much use outside a very small circle. If 
*anyone* is skeptical about the numbers, feel free to use the testbed 
yourself. (It currently only works as intended on LE systems, however) Just 
untar the source into a directory, run make and the run 'tester.pl' to 
actually perform the test runs and gather the numbers. There is no overhead 
included in the numbers - they come directly from running gettimeofday() 
before and after the calls to the respective functions (well, a bit of magic 
is done to make the numbers have meaning).

DRH

PS: the variable '$RUNS' in tester.pl determines how many times the script 
loops each program to generate the numbers. I'm currently using 10000 runs to 
generate the average timings, but if you feel that is too large or too small 
a sample set, go ahead and change it.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux