Re: [discuss] [OOPS] powernow on smp dual core amd64

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Jun 13, 2005 at 03:20:45PM -0700, Tom Duffy wrote:
> On Mon, 2005-06-13 at 16:47 -0500, Langsdorf, Mark wrote:
> > Okay, I think I have figured this out.  During initialization,
> > the cpufreq infrastruture only initializes the first core of
> > each processor.  When a request comes into the second core,
> > it's data structre is unitialized and we get the null point
> > dereference.
> > 
> > The solution is to assign the pointer to the data structure for
> > the first core to all the other cores.
> > 
> > Tom, could you try this patch and see if it helps?
> 
> Yes!  It fixed the panic.  I get much further.
> 
> Thanks!
> 
> Unfortunately, after starting cpuspeed daemon, I get this:
> 
> Starting cpuspeed: [  OK  ]
> Starting pcmcia:  Starting PCMCIA services:
> CPU 6: Machine Check Exception:                4 Bank 4: b200000000070f0f
> TSC 4129a3d70d

it is 

CPU 6 4 northbridge
  Northbridge Watchdog error
       bit57 = processor context corrupt
       bit61 = error uncorrected
  bus error 'generic participation, request timed out
      generic error mem transaction
      generic access, level generic'
STATUS b200000000070f0f MCGSTATUS 4

Something tried to access a physical memory address that was not
mapped in the CPU.




> Kernel panic - not syncing: Machine check
>  <1>Unable to handle kernel NULL pointer dereference at 00000000000000ff RIP:
> [<00000000000000ff>]
> PGD 41460067 PUD 3f12c067 PMD 0
> Oops: 0010 [1] SMP
> CPU 6
> Modules linked in: video container button battery ac ohci_hcd ehci_hcd i2c_nforce2 i2c_core shpchp usbnet mii dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod sata_nv libata mptscsih mptbase sd_mod scsi_mod
> Pid: 1672, comm: usb.agent Tainted: G   M  2.6.12-rc6andro
> RIP: 0010:[<00000000000000ff>] [<00000000000000ff>]
> RSP: 0000:ffff81003fe63fa0  EFLAGS: 00010006
> RAX: ffff81007f5b1fd8 RBX: 0000000000000008 RCX: 0000ffff0000ffff
> RDX: 00000000000000ff RSI: 0000000000000008 RDI: 0000ffff0000ffff
> RBP: 0000000000000000 R08: 0000000000000005 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> R13: 0000004129a3cf88 R14: ffffffff80370a7d R15: 0000000000000001
> FS:  00002aaaaaae0ee0(0000) GS:ffffffff80498400(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 00000000000000ff CR3: 000000003e7e5000 CR4: 00000000000006e0
> Process usb.agent (pid: 1672, threadinfo ffff81007f5b0000, task ffff81007fddcff0)
> Stack: ffffffff8011ac99 ffffffff80370a7d ffffffff8010f247 ffff81007fea2c88  <EOI>
>        0000000000000005 0000000000000030 00000000000000fa 0000000000000000
>        ffffffff8011a860 0000000000000000
> Call Trace: <IRQ> <ffffffff8011ac99>{smp_call_function_interrupt+73}
>        <ffffffff8010f247>{call_function_interrupt+99}  <EOI>  <#MC> <ffffffff8011a860>{smp_really_stop_cpu+0}
>        <ffffffff8011aa68>{smp_send_stop+72} <ffffffff80136ebc>{panic+140}

Looks like the machine was too confused after the MCE to even
run panic correctly. I will investigate.

But of course fixing that will not fix the cause of the MCE.

-Andi
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux