delay_tsc(): inefficient delay loop (2.6.16-mm1)

Hello all,

I discovered that the delay loop used there (which simply found a new place
in 2.6.16-mm1; it existed much longer) is inefficient,
since it does an unnecessary subtraction *in the main loop* which also hits
asm code:

00000024 <delay_tsc>:
  24:   53                      push   %ebx
  25:   89 c3                   mov    %eax,%ebx
  27:   0f 31                   rdtsc
  29:   89 c1                   mov    %eax,%ecx
  2b:   f3 90                   pause
  2d:   0f 31                   rdtsc
  2f:   29 c8                   sub    %ecx,%eax
  31:   39 d8                   cmp    %ebx,%eax
  33:   72 f6                   jb     2b <delay_tsc+0x7>
  35:   5b                      pop    %ebx
  36:   c3                      ret

With such a patch:

--- linux-2.6.16-mm1/arch/i386/lib/delay.c.orig	2006-03-23 12:52:45.000000000 +0100
+++ linux-2.6.16-mm1/arch/i386/lib/delay.c	2006-03-24 12:53:01.000000000 +0100
@@ -44,10 +44,12 @@
 	unsigned long bclock, now;
 
 	rdtscl(bclock);
+	/* offset with bclock to have very simple comparison below */
+	loops += bclock;
 	do {
 		rep_nop();
 		rdtscl(now);
-	} while ((now-bclock) < loops);
+	} while (now < loops);
 }
 
 /*

the result is:

00000024 <delay_tsc>:
  24:   89 c1                   mov    %eax,%ecx
  26:   0f 31                   rdtsc
  28:   01 c1                   add    %eax,%ecx
  2a:   f3 90                   pause
  2c:   0f 31                   rdtsc
  2e:   39 c8                   cmp    %ecx,%eax
  30:   72 f8                   jb     2a <delay_tsc+0x6>
  32:   c3                      ret

Improvement: no unnecessary stuff after having hit the timer target value
(read: faster), better power saving, 4 bytes less opcodes.

The patch above could be considered weird since it fiddles with "loops"
directly. A possibly cleaner way would be to introduce a new variable
end = bclock + loops.

Comments? Anything that I'm missing here as to why it hasn't been done that
way before?

If this holds water then I'll submit a final patch soon.

Thanks!

Andreas Mohr
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Follow-Ups:
- Re: delay_tsc(): inefficient delay loop (2.6.16-mm1)
  - From: Edgar Toernig <froese@gmx.de>
- Re: delay_tsc(): inefficient delay loop (2.6.16-mm1)
  - From: "Ray Lee" <madrabbit@gmail.com>

Prev by Date: Re: [linux-usb-devel] compile error when building multiple EHCI host controllers as modules
Next by Date: [git patches] net driver updates
Previous by thread: 2.6.16: Broken deps of spectrum_cs
Next by thread: Re: delay_tsc(): inefficient delay loop (2.6.16-mm1)
Index(es):
- Date
- Thread

[Index of Archives] [Kernel Newbies] [Netfilter] [Bugtraq] [Photo] [Stuff] [Gimp] [Yosemite News] [MIPS Linux] [ARM Linux] [Linux Security] [Linux RAID] [Video 4 Linux] [Linux for the blind] [Linux Resources]