RE: [Fastboot] [RFC] [PATCH 2/2] kdump: cciss driver initialization issue fix

> -----Original Message-----
> From: Eric W. Biederman [mailto:[email protected]] 
> Sent: Monday, June 26, 2006 11:38 AM
> To: Miller, Mike (OS Dev)
> Cc: [email protected]; Maneesh Soni; Andrew Morton; 
> [email protected]; [email protected]; 
> [email protected]; [email protected]
> Subject: Re: [Fastboot] [RFC] [PATCH 2/2] kdump: cciss driver 
> initialization issue fix
> 
> "Miller, Mike (OS Dev)" <[email protected]> writes:
> 
> > All,
> > Sorry to come in late and top post. I've been out of the office and 
> > I'm trying to get to the gist of this issue.
> > Exactly what is the problem? I'm not familiar with kdump so I don't 
> > have a clue about what's going on.
> > There are a couple of reset features supported by _some_ cciss 
> > controllers. I'd have to go back to the open spec to see what's in the 
> > public domain. We're trying to get the open spec updated and more 
> > complete but we're waiting on the lawyers. :(
> 
> 
> kdump, or taking crash dumps using the kexec-on-panic 
> mechanism, could be called a driver's worst nightmare.  In the 
> latest distros this is becoming the way crash dump style 
> information is captured.
> 
> Because the initial kernel is broken we do a jump into 
> another kernel that is sufficient to record a crash dump.  
> That second kernel initializes the hardware from whatever 
> random state the first kernel left the drivers in.  That 
> first kernel is not permitted to do any device shutdown activities.
> 
> The problem is that a command completes which the running 
> instance of the driver did not initiate.  At least if I read 
> Vivek's patch 2/2 correctly.
> 
> So we have three options.
> - reset the card during initialization.
> - handle the case of a command we did not initiate completing.
> - mark the driver/card as impossibly hopeless for use in a crash
>   dump scenario.
> 
> 
> Eric
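
Eric's second option, handling a completion that the running driver
instance did not initiate, could look roughly like the sketch below. It is
a user-space illustration in C, not cciss code: `struct ctlr`, `MAX_CMDS`,
`submit_cmd`, `complete_cmd` and the tag-indexed table are all hypothetical
names, and a real driver would match completions against its own command
pool instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CMDS 8  /* hypothetical queue depth, for illustration only */

/* Hypothetical per-controller state: tags issued by THIS driver instance. */
struct ctlr {
    bool in_flight[MAX_CMDS];
    unsigned stale_completions;  /* completions we never asked for */
};

/* Record that this instance submitted the command with the given tag. */
static bool submit_cmd(struct ctlr *h, uint32_t tag)
{
    assert(h != NULL);
    if (tag >= MAX_CMDS || h->in_flight[tag])
        return false;
    h->in_flight[tag] = true;
    return true;
}

/*
 * Handle a completion reported by the controller.
 * Returns true if the command was ours; returns false and counts the
 * event if it was left over from the previous (crashed) kernel.
 */
static bool complete_cmd(struct ctlr *h, uint32_t tag)
{
    assert(h != NULL);
    if (tag >= MAX_CMDS || !h->in_flight[tag]) {
        h->stale_completions++;  /* discard instead of dereferencing */
        return false;
    }
    h->in_flight[tag] = false;
    return true;
}
```

The point of the sketch is only that an unknown tag is counted and dropped
rather than treated as a valid command.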

Thanks Eric, that helps me understand. Section 8.2.2 of the open cciss
spec describes a reset message; target 0x00 is the controller. We could
send this from the init routine to make sure the board is sane again, but
that would drastically increase init time under normal circumstances.
I also suspect this is a hard reset, and I'm not sure whether that would
negatively impact kdump. If there were some condition we could test for,
and we performed the reset only when that condition is met, it would not
impact 99.9% of users.
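
To make the idea concrete, here is a minimal sketch in C of a conditional
reset at init time. Everything except the target value 0x00 is an
assumption for illustration: `struct init_ctx`, its two flags,
`reset_needed`, `ctlr_init` and the `send_reset_msg` stand-in are
hypothetical, and the actual message layout from the spec is not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RESET_TARGET_CONTROLLER 0x00  /* target 0x00 is the controller */

/* Hypothetical conditions the init routine could test. */
struct init_ctx {
    bool crash_capture_boot;       /* assumed: we are the kdump kernel */
    bool stale_commands_detected;  /* assumed: board shows leftover work */
};

/* Stand-in for sending the reset message; only records the call. */
static unsigned resets_issued;
static uint8_t last_target = 0xff;

static int send_reset_msg(uint8_t target)
{
    last_target = target;
    resets_issued++;
    return 0;  /* a real implementation would report hardware errors */
}

/* Decide whether the slow reset path is worth taking. */
static bool reset_needed(const struct init_ctx *ctx)
{
    assert(ctx != NULL);
    return ctx->crash_capture_boot || ctx->stale_commands_detected;
}

/* Returns 0 on success, including the common path with no reset. */
static int ctlr_init(const struct init_ctx *ctx)
{
    if (reset_needed(ctx))
        return send_reset_msg(RESET_TARGET_CONTROLLER);
    return 0;  /* normal boot: init time is unchanged */
}
```

With both flags false the reset is skipped, so a normal boot pays no extra
init time; which condition is actually safe to test is the open question.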

Thoughts, comments, flames?

mikem