Re: [PATCH] local_irq_disable removal

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



FWIW - STI takes 3,5 and 7 cycles on 386,486, and 386SX respectively (these
are all still popular embedded processors)

Do not forget the impact of instruction size on fetch cost - the "book"
cycle counts are only for instructions ALREADY fetched and in the pipe/queue
and IGNORE the cost of fetching said instruction.  If you happen to get an
L1 hit, you win.  If that LEA or OR stradles a dword or cache line boundary
the CLI/STI may well wind up running faster under combat conditions if line
load or second memory fetch is needed to get the second half of the
instruction.

On an x86 CPU that which appears to be fast per the book clocks is often
slower, and that which appears to clock faster is often slower in reality
due to increased fetch cost of plump instructions.  I have many benchmarks
that prove this.

----- Original Message ----- 
From: "Thomas Gleixner" <[email protected]>
To: "Daniel Walker" <[email protected]>
Cc: "Ingo Molnar" <[email protected]>; "Esben Nielsen" <[email protected]>;
<[email protected]>; <[email protected]>
Sent: Saturday, June 11, 2005 19:44
Subject: Re: [PATCH] local_irq_disable removal


> On Sat, 2005-06-11 at 13:51 -0700, Daniel Walker wrote:
> > On Sat, 11 Jun 2005, Ingo Molnar wrote:
>
> > Interesting .. So "cli" takes 7 cycles , "sti" takes 7 cycles. The
current
> > method does "lea" which takes 1 cycle, and "or" which takes 1 cycle.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux