Re: [Linux-fbdev-devel] [PATCH 1/1 2.6.13] framebuffer: bit_putcs() optimization for 8x* fonts

On Tue, 30 Aug 2005, Knut Petersen wrote:
> > Probably you can make it even faster by avoiding the multiplication, like
> > 
> >    unsigned int offset = 0;
> >    for (i = 0; i < image.height; i++) {
> > 	dst[offset] = src[i];
> > 	offset += pitch;
> >    }
> 
> More than two decades ago I learned to avoid mul and imul. Use shifts, add and
> lea instead, that was the credo those days. The name of the game was CP/M
> 80/86, a86, d86 and ddt ;-)
> 
> But let's get serious again.

On modern CPUs, a multiplication indeed takes 1 cycle, just like an addition.
But on older CPUs (still supported by Linux), this is not true.

> Your proposed change of the patch results in a 21 ms performance decrease on
> my system. Yes, I do know that this is hard to believe. I tested a similar
> variation before, and the results were even worse.
> 
> Avoiding mul is a good idea in assembly language today, but often it is better
> to write a multiplication with the loop counter in C and not to introduce an
> extra variable instead. The compiler will optimize the code and it's easier
> for gcc without that extra variable.

But you are right. On actual inspection of the generated assembly code for a
very simple test case, it turns out both (m68k-linux-)gcc 2.95.2 and 3.3.3 are
smart enough to convert the multiplication to an addition...

And interestingly, if I avoid the multiplication explicitly, gcc 2.95.2 still
generates the same code, but 3.3.3 adds a few extra instructions to
save/restore local vars. So this probably explains why it turned out to be
slower for you. Ugh...
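
For anyone who wants to repeat the comparison, here is a minimal sketch of the
two loop forms under discussion. It assumes the patch indexes the destination
as dst[i * pitch]; the function names, HEIGHT, PITCH and the self-check are
hypothetical and not taken from the patch itself.

```c
#include <assert.h>
#include <string.h>

#define HEIGHT 16	/* hypothetical glyph height in rows */
#define PITCH  3	/* hypothetical destination pitch in bytes */

/* Variant A: destination index is the loop counter times the pitch. */
static void copy_mul(unsigned char *dst, const unsigned char *src,
		     unsigned int height, unsigned int pitch)
{
	unsigned int i;

	for (i = 0; i < height; i++)
		dst[i * pitch] = src[i];
}

/* Variant B: explicit running offset, no multiplication in the source. */
static void copy_add(unsigned char *dst, const unsigned char *src,
		     unsigned int height, unsigned int pitch)
{
	unsigned int i, offset = 0;

	for (i = 0; i < height; i++) {
		dst[offset] = src[i];
		offset += pitch;
	}
}

/* Sanity check: both variants must write identical bytes. */
static void check_variants_agree(void)
{
	unsigned char src[HEIGHT], a[HEIGHT * PITCH], b[HEIGHT * PITCH];
	unsigned int i;

	for (i = 0; i < HEIGHT; i++)
		src[i] = (unsigned char)(i * 7 + 1);
	memset(a, 0, sizeof(a));
	memset(b, 0, sizeof(b));
	copy_mul(a, src, HEIGHT, PITCH);
	copy_add(b, src, HEIGHT, PITCH);
	assert(memcmp(a, b, sizeof(a)) == 0);
}
```

Compiling each variant with "gcc -O2 -S" and diffing the output should show
whether your compiler reduces the multiplication to an addition; the result
may well differ between gcc versions and targets.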

Gr{oetje,eeting}s,

						Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
							    -- Linus Torvalds
