Re: [PATCH] kexec: force x86_64 arches to boot kdump kernels on boot cpu

On Dec 11, 2007 4:52 PM, Neil Horman <[email protected]> wrote:
> On Tue, Dec 11, 2007 at 04:16:32PM -0800, Ben Woodard wrote:
> > We may need to go back and do some additional work on this. It doesn't
> > seem to be quite as cut and dried as we initially thought.
> >
> > This quirk doesn't appear to work on virtually the same motherboard with
> > the barcelona processors in it. It also may be sensitive to the firmware
> > version.  More extensive testing on a larger number of pre-production systems
> > is not showing it to be as effective as it appeared to be initially on the
> > testbed.
> >
> > I'm doing some retesting to figure out what exact situations and
> > collection of patches were able to make it work before.
> >
> Ben, please let's be clear about this.  You say this patch doesn't help on a new
> system.  Even though it's almost the exact same system, it's not the same system.
> Does this patch work consistently on the system you initially reported the
> problem on?  I've done enough work on this at this point that I'm invested in
> not abandoning this fix.  If this solves the problem on dual-core systems, but
> not quad-core, I'd much rather move forward with this fix and address your
> quad-core problem as a separate issue.
>
> Neil
>
>
> > -ben
> >
> >
> >
> > Neil Horman wrote:
> > > Recently a kdump bug was discovered in which a system would hang inside
> > > calibrate_delay during the booting of the kdump kernel.  This was caused by the
> > > fact that the jiffies counter was not being incremented during timer
> > > calibration.  The root cause of this problem was found to be a BIOS
> > > misconfiguration of the hypertransport bus.  On systems affected by this hang,
> > > the BIOS had assigned APIC ids which used extended APIC bits (more than the
> > > nominal 4-bit ids), but failed to configure bit 17 of the hypertransport
> > > transaction config register, which indicates the mask for the destination
> > > field of interrupt packets across the ht bus (see section 3.3.9 of
> > > http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/26094.PDF).
> > > If a crash occurs on a cpu with an APIC id that extends beyond 4 bits, it will
> > > not receive interrupts during the kdump kernel boot, and this hang will be the
> > > result.  The fix is this patch, which adds an early pci quirk check to
> > > forcibly enable this bit in the httcfg register.  This enables all cpus on a
> > > system to receive interrupts, and allows kdump kernel bootup to proceed
> > > normally.
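
For readers following along, the read-modify-write described above can be
sketched roughly as below. This is only an illustration, not the code from
the patch (which is elided in this message): the helper name and the macro
are made up here, and the actual PCI config space access is left out, so the
function works on a plain in-memory copy of the register value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 17 of the hypertransport transaction config register, as named
 * in the description above.  The macro name is hypothetical. */
#define HTTCFG_BIT17 (UINT32_C(1) << 17)

/*
 * Hypothetical helper: set bit 17 in a copy of the register value.
 * Returns 1 if the value was changed (a real quirk would then write it
 * back to the device), or 0 if the bit was already set.
 */
static int set_httcfg_bit17(uint32_t *htcfg)
{
    assert(htcfg != NULL);

    if (*htcfg & HTTCFG_BIT17)
        return 0;           /* already configured, nothing to do */

    *htcfg |= HTTCFG_BIT17; /* leave every other bit untouched */
    return 1;
}
```

Checking the bit first means an already-correct BIOS setting is left alone,
and the write-back only happens on the misconfigured systems.
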
> > >
> > > Regards
> > > Neil
> > >
> > >
> > > Signed-off-by: Neil Horman <[email protected]>
> > >
...
> > >  static struct chipset early_qrk[] __initdata = {
> > > -   { PCI_VENDOR_ID_NVIDIA, nvidia_bugs },
> > > -   { PCI_VENDOR_ID_VIA, via_bugs },
> > > -   { PCI_VENDOR_ID_ATI, ati_bugs },
> > > +   { PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID, PCI_CLASS_BRIDGE_PCI, PCI_ANY_ID, nvidia_bugs },
> > > +   { PCI_VENDOR_ID_VIA, PCI_ANY_ID, PCI_CLASS_BRIDGE_PCI, PCI_ANY_ID, via_bugs },
> > > +   { PCI_VENDOR_ID_ATI, PCI_ANY_ID, PCI_CLASS_BRIDGE_PCI, PCI_ANY_ID, ati_bugs },
> > > +   { PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_K8_NB, PCI_CLASS_BRIDGE_HOST, PCI_ANY_ID, fix_hypertransport_config },

==>

+   { PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_K8_NB, PCI_CLASS_BRIDGE_HOST, PCI_ANY_ID, fix_hypertransport_config },
+   { PCI_VENDOR_ID_AMD, 0x1200, PCI_CLASS_BRIDGE_HOST, PCI_ANY_ID, fix_hypertransport_config },
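
To show why the extra entry matters, here is a rough sketch of how a table
like this might be matched against a device. It is not the kernel's code:
the struct, the function name, and the ANY_ID stand-in are hypothetical, the
fourth field of each entry is left out because the quoted hunk does not show
what it means, and the numeric IDs are written out only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ANY_ID 0xffffffffu /* stand-in for a PCI_ANY_ID style wildcard */

/* Hypothetical, simplified quirk entry (handler reduced to a name). */
struct quirk_entry {
    uint32_t vendor;
    uint32_t device;
    uint32_t dev_class;
    const char *handler;
};

/* Illustrative table: the same fix registered for two device ids. */
static const struct quirk_entry example_table[] = {
    { 0x1022, 0x1100, 0x0600, "fix_hypertransport_config" },
    { 0x1022, 0x1200, 0x0600, "fix_hypertransport_config" },
};

/* Return the first entry matching the device, or NULL if none does. */
static const struct quirk_entry *
find_quirk(const struct quirk_entry *table, size_t n,
           uint32_t vendor, uint32_t device, uint32_t dev_class)
{
    assert(table != NULL);

    for (size_t i = 0; i < n; i++) {
        const struct quirk_entry *q = &table[i];

        if (q->vendor == vendor &&
            (q->device == ANY_ID || q->device == device) &&
            (q->dev_class == ANY_ID || q->dev_class == dev_class))
            return q;
    }
    return NULL;
}
```

With only the first entry present, a device reporting id 0x1200 would fall
through the table and the quirk would never run for it.
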

I still think the better way is to ask Supermicro to update their BIOS
to use newer code from AMD.

YH