[patch 2.6.13-rc3a] i386: inline restore_fpu

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



  This patch makes restore_fpu() an inline.  When L1/L2 cache are saturated
it makes a measurable difference.

  Results from profiling Volanomark follow.  Sample rate was 2000 samples/sec
(HZ = 250, profile multiplier = 8) on a dual-processor Pentium II Xeon.


Before:

 10680 restore_fpu                              333.7500
  8351 device_not_available                     203.6829
  3823 math_state_restore                        59.7344
 -----
 22854


After:

 12534 math_state_restore                       130.5625
  8354 device_not_available                     203.7561
 -----
 20888


Patch is "obviously correct" and cuts 9% of the overhead.  Please apply.

Next step should be to physically place math_state_restore() after
device_not_available().  Would such a patch be accepted?  (Yes it
would be ugly and require linker script changes.)

Signed-off-by: Chuck Ebbert <[email protected]>

Index: 2.6.13-rc3a/arch/i386/kernel/i387.c
===================================================================
--- 2.6.13-rc3a.orig/arch/i386/kernel/i387.c
+++ 2.6.13-rc3a/arch/i386/kernel/i387.c
@@ -82,17 +82,6 @@
 }
 EXPORT_SYMBOL_GPL(kernel_fpu_begin);
 
-void restore_fpu( struct task_struct *tsk )
-{
-	if ( cpu_has_fxsr ) {
-		asm volatile( "fxrstor %0"
-			      : : "m" (tsk->thread.i387.fxsave) );
-	} else {
-		asm volatile( "frstor %0"
-			      : : "m" (tsk->thread.i387.fsave) );
-	}
-}
-
 /*
  * FPU tag word conversions.
  */
Index: 2.6.13-rc3a/include/asm-i386/i387.h
===================================================================
--- 2.6.13-rc3a.orig/include/asm-i386/i387.h
+++ 2.6.13-rc3a/include/asm-i386/i387.h
@@ -22,11 +22,20 @@
 /*
  * FPU lazy state save handling...
  */
-extern void restore_fpu( struct task_struct *tsk );
-
 extern void kernel_fpu_begin(void);
 #define kernel_fpu_end() do { stts(); preempt_enable(); } while(0)
 
+static inline void restore_fpu( struct task_struct *tsk )
+{
+	if ( cpu_has_fxsr ) {
+		asm volatile( "fxrstor %0"
+			      : : "m" (tsk->thread.i387.fxsave) );
+	} else {
+		asm volatile( "frstor %0"
+			      : : "m" (tsk->thread.i387.fsave) );
+	}
+}
+
 /*
  * These must be called with preempt disabled
  */
__
Chuck
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]
  Powered by Linux