Re: REGRESSION: the new i386 timer code fails to sync CPUs

On Mon, Jul 24, 2006 at 07:17:11PM +0200, Matthias Urlichs wrote:
> > Andi: If this is a generic issue, and not specific to Matthias' box, we
> > may need to re-think the assumption that Intel SMP is synced. You're
> > thoughts?
> > 
> "Your". ;-)
> 
> You can probably assume that they're synced on systems with no more
> than one dual-core / hyperthreaded CPU.
> 
> My system obviously has two of those.

According to Intel, on all of their chipset/motherboard reference
designs all the sockets run from a single clock crystal.

I've confirmed this for a long time on 64bit and even to some
extent on 32bit on distro kernels.

Maybe you got a broken BIOS or similar though.

> > Matthias: "clock=pmtmr" is probably the best workaround in the short
> > term. Could you send me your dmesg and dmidecode output? We'll try to
> > find something to key off of so it will mark the tsc as unstable by
> > default on your system.
> > 
> I'd assume that finding (and, possibly, being unable to correct) TSC skew 

The BIOS normally guarantees it at boot. However, maybe you got a broken one.

We used to do TSC sync correction at boot on Intel, but stopped doing
that when we found out that the TSC sync code adds an error
to an already perfectly synchronized system.

Actually I think i386 still does it, just x86-64 stopped.

My first assumption would be that you hit a bug somewhere in the new
clock code. What happens when you boot an older kernel (like 2.6.17)
with clock=tsc?


>  BIOS-e820: 00000000ff800000 - 00000000ffc00000 (reserved)
>  BIOS-e820: 00000000fffffc00 - 0000000100000000 (reserved)
>  BIOS-e820: 0000000100000000 - 0000000128000000 (usable)
> Warning only 4GB will be used.

You should at least set CONFIG_HIGHMEM64G or use a 64bit
kernel if the system supports long mode.

> ENABLING IO-APIC IRQs
> ..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
> checking TSC synchronization across 4 CPUs:
> CPU#0 had 748437 usecs TSC skew, fixed it up.
> CPU#1 had 748437 usecs TSC skew, fixed it up.
> CPU#2 had -748437 usecs TSC skew, fixed it up.
> CPU#3 had -748437 usecs TSC skew, fixed it up.

Hmm, that looks unusual. Maybe the BIOS is really broken.
On most Intel systems I saw 

Normally Linux should fix it up here and then the TSCs should
tick synchronously (but with a small offset that the sync
code cannot entirely avoid).
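
To make the numbers in that log easier to read, here is a minimal sketch
in C. It assumes the boot-time check reports each CPU's offset from the
mean of all sampled TSC values. The function names and the
cycles-per-microsecond parameter are hypothetical; this is not the kernel
code. If that assumption holds, a +748437 / -748437 pattern would mean two
pairs of CPUs whose counters were roughly 1.5 seconds apart.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Mean of the TSC values sampled on each CPU (ideally at the same
 * instant). Hypothetical helper; assumes the sum fits in 64 bits.
 */
static int64_t tsc_mean(const uint64_t *tsc, size_t ncpus)
{
    uint64_t sum = 0;

    assert(ncpus > 0); /* avoid dividing by zero */
    for (size_t i = 0; i < ncpus; i++)
        sum += tsc[i];
    return (int64_t)(sum / ncpus);
}

/*
 * Offset of one CPU's TSC from the mean, converted to microseconds.
 * cycles_per_usec is the TSC frequency in cycles per microsecond
 * (for example 3000 on a hypothetical 3 GHz part).
 */
static int64_t tsc_skew_usecs(uint64_t tsc, int64_t mean,
                              uint64_t cycles_per_usec)
{
    int64_t delta = (int64_t)tsc - mean;

    return delta / (int64_t)cycles_per_usec;
}
```

Because each offset is measured against the mean, two groups of CPUs that
differ by some amount D show up as +D/2 and -D/2, not as 0 and D.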

> 
> Handle 0x0000, DMI type 0, 20 bytes.
> BIOS Information
>         Vendor: Phoenix Technologies LTD
>         Version: 6.00
>         Release Date: 09/29/2005
> 
> Handle 0x0001, DMI type 1, 25 bytes.
> System Information
>         Manufacturer: Intel Corporation
>         Product Name: Nocona/Tumwater Customer Reference Board
>         Version: Revision A0


Hmm, those should definitely have a synced TSC.

However, A0 sounds suspiciously like an engineering sample; production
systems normally have higher revision numbers. If it's just
a beta hardware bug we probably won't care.

Asit, do you know of any TSC sync issues between CPUs on that
board/BIOS version?

-Andi


>         Serial Number: 0123456789
>         UUID: 0A0A0A0A-0A0A-0A0A-0A0A-0A0A0A0A0A0A
>         Wake-up Type: Power Switch
> 
> Handle 0x0002, DMI type 2, 8 bytes.
> Base Board Information
>         Manufacturer: Intel Corporation
>         Product Name: TYAN Tiger-i7320-S5350
>         Version: Revision A0
>         Serial Number: 9876543210
> 