Re: [BUG] 2.6.21-rc7 hpt366 driver broken

On Mon, 16 Apr 2007 20:25:15 -0700
Mike Mattie <[email protected]> wrote:

I have added Sergei Shtylyov to the address list after seeing his recent posts on hpt366 issues and the
git changelog for the hpt366.c driver. I am very confident that I have pinpointed the defect in the driver.

> On Mon, 16 Apr 2007 19:43:03 -0700
> Mike Mattie <[email protected]> wrote:
> 
> > On Mon, 16 Apr 2007 18:21:12 -0700
> > Mike Mattie <[email protected]> wrote:
> > 
> > > On Mon, 16 Apr 2007 16:36:13 +0200
> > > Adrian Bunk <[email protected]> wrote:
> > > 
> > > > [ Cc's added, full bug report was in
> > > > http://lkml.org/lkml/2007/4/16/18 ]
> > > > 
> > > > On Mon, Apr 16, 2007 at 04:38:22AM -0700, Mike Mattie wrote:
> > > > > On Sun, 15 Apr 2007 22:48:46 -0700
> > > > > Mike Mattie <[email protected]> wrote:
> > > > > 
> > > > > > Hello,
> > > > > > 
> > > > > > I am testing the 2.6.21-rc7 kernel release. The IDE hpt366
> > > > > > > driver is crashing, hanging the boot. I have basically the
> > > > > > same config as 2.6.20.7 which works fine (except for
> > > > > > netconsole mentioned in a previous mail).
> > > > > > 
> > > > > > here is the hand-copied info:
> > > > > > 
> > > > > > * "unable to handle paging request" , null deref
> > > > > > * EIP @ init_chipset_hpt366
> > > > > > 
> > > > > 
> > > > > > I am running a git-bisect to see if I can resolve it to a
> > > > > > commit.
> > > > > 
> > > > > This was identified as the first broken commit:
> > > > > 
> > > > > commit 7b73ee05d0acb926923d43d78b61add776ea4bb1
> > > > > Author: Sergei Shtylyov <[email protected]>
> > > > > Date:   Wed Feb 7 18:18:16 2007 +0100
> > > > > 
> > > > >     hpt366: init code rewrite
> > > > > 
> > > > > Reverting is conflicted so it will be a bit longer before I
> > > > > pin-point any other build-breaks.
> > > > 
> > > > Thanks for your report.
> > > > 
> > > > Can you use a digital camera for taking a photograph of the
> > > > crash?
> > > 
> > > I can later on tonight, by about 11PM west coast. I also saw
> > > some hex offsets after the function pointed to by EIP, is there
> > > a way to decode that to a line number? I have debugging symbols
> > > enabled.
> > > 
> > > I am also doing printk breadcrumbs to pin it down to a block
> > > or a line.
> > 
> > I have narrowed the crash with breadcrumbs down to these lines:
> > 
> > 
> > 	/*
> > 	 * Only try the DPLL if we don't have a table for the PCI clock that
> > 	 * we are running at for HPT370/A, always use it for anything newer...
> > 	 *
> > 	 * NOTE: Using the internal DPLL results in slow reads on 33 MHz PCI.
> > 	 * We also don't like using the DPLL because this causes glitches
> > 	 * on PRST-/SRST- when the state engine gets reset...
> > 	 */
> > 	if (info->chip_type >= HPT374 || info->settings[clock] == NULL) {
> > 		u16 f_low, delta = pci_clk < 50 ? 2 : 4;
> > 		int adjust;
> > 
> > 		printk(KERN_INFO "inside the if\n");
> > 
> > 		 /*
> > 		  * Select 66 MHz DPLL clock only if UltraATA/133 mode is
> > 		  * supported/enabled, use 50 MHz DPLL clock otherwise...
> > 		  */
> > 		if (info->max_mode == 0x04) {
> > 			dpll_clk = 66;
> > 			clock = ATA_CLOCK_66MHZ;
> > 		} else if (dpll_clk) {	/* HPT36x chips don't have DPLL */
> > 			dpll_clk = 50;
> > 			clock = ATA_CLOCK_50MHZ;
> > 		}
> > 
> > 		if (info->settings[clock] == NULL) {
>                 ^^^^^^^^ crashes here
> 
> Since info is dereferenced all over the place, I am assuming it is the
> array that is blowing up.
> 
> I printk'd the value of clock, which is "4". That array is either not
> set up correctly, or the index is out of bounds (speculation).

Here, on line 493: hpt302n (the chipset I have) is the only struct without
a .settings initializer. I am extremely confident this is the exact location of the bug.

static struct hpt_info hpt302n __devinitdata = {
	.chip_type	= HPT302N,
	.max_mode	= HPT302_ALLOW_ATA133_6 ? 4 : 3,
	.dpll_clk	= 77,
};

I do not know enough about the HPT chips to select which settings table
this field should point to. Please take a look; the fix should now be very
easy.
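
To illustrate the failure mode, here is a minimal standalone sketch in userspace C. It is not the driver code: struct hpt_info_sketch, NUM_ATA_CLOCKS, the placeholder timing value, and lookup_timings() are all hypothetical names invented for this example. It only shows that a member left out of a designated initializer is zero-initialized, so the pointer is NULL and indexing it (as info->settings[clock] does) would fault unless the pointer is checked first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: clock indexes run 0..4, matching the printk'd value of 4. */
#define NUM_ATA_CLOCKS 5

/* Hypothetical, simplified stand-in for the driver's struct hpt_info. */
struct hpt_info_sketch {
	uint8_t chip_type;
	uint8_t max_mode;
	uint8_t dpll_clk;
	const uint32_t **settings;	/* one timing table per clock index */
};

/* Placeholder timing table; the value is made up for illustration. */
static const uint32_t timings_66[] = { 0x1 };

static const uint32_t *example_settings[NUM_ATA_CLOCKS] = {
	[4] = timings_66,
};

/* Like hpt302n above: no .settings initializer, so the member is NULL. */
static struct hpt_info_sketch without_settings = {
	.chip_type	= 1,
	.max_mode	= 4,
	.dpll_clk	= 77,
};

/* Same struct with the table pointer filled in. */
static struct hpt_info_sketch with_settings = {
	.chip_type	= 1,
	.max_mode	= 4,
	.dpll_clk	= 77,
	.settings	= example_settings,
};

/*
 * Return the timing table for a clock index, or NULL if there is none.
 * Checking info->settings first avoids indexing through a NULL pointer.
 */
static const uint32_t *lookup_timings(const struct hpt_info_sketch *info,
				      int clock)
{
	assert(clock >= 0 && clock < NUM_ATA_CLOCKS);
	if (info->settings == NULL)
		return NULL;
	return info->settings[clock];
}
```

With this guard, lookup_timings(&without_settings, 4) returns NULL instead of faulting. In the driver itself the proper fix is presumably to give hpt302n a .settings entry; which table it should be is exactly what I cannot determine.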

> > 			printk(KERN_ERR "%s: unknown bus timing!\n", name);
> > 			kfree(info);
> > 			return -EIO;
> > 		}
> > 
> > 		printk(KERN_INFO "select DPLL clock\n");
> > 
> > This is right around line 1171 (skewed by the crumbs I added). The last
> > message I receive is "inside the if"; it dies before "select DPLL
> > clock".
> > 
> > Without knowing much about the structs I am not sure what to
> > print-out. I will narrow it further, and maybe even compare against
> > what the old working kernel had for variable values. That would take
> > some time though.
> > 
> > > 
> > > > cu
> > > > Adrian
> > > > 
> > > > --
> > > > 
> > > >        "Is there not promise of rain?" Ling Tan asked suddenly
> > > > out of the darkness. There had been need of rain for many
> > > > days. "Only a promise," Lao Er said.
> > > >                                        Pearl S. Buck - Dragon
> > > > Seed
> > > > 
