Re: [PATCH 1/6] PCIERR : interfaces for synchronous I/O error detection on driver

Linas Vepstas wrote:

On Fri, Mar 24, 2006 at 04:47:25PM +0900, Hidetoshi Seto wrote:

However, some difficulty still remains to cover all possible error
situations even if we use callbacks. It will not help keeping data
integrity, passing no broken data to drivers and user lands, preventing
applications from going crazy or sudden death.

This is not true.  Although there are some subtle issues, (which

I invite you to describe), the goal of the current design is toinsure data integrity, and make sure that neither the driver northe userland gets corrupted data. There shouldn't be any "crazy

or sudden death" if the device drivers are any good.

OK, we are sharing the same goal even now.

I failed to mention that as you know this synchronous error detection
would be required if the async-callback needs to touch the hardware
due to recover it or to pick up diagnostic data during kernel-initiated
recovery. (I found a word "pci_check_whatever() API" at your comment
in document, Documentation/pci-error-recovery.txt)

Of course, this depends on the hardware implementation. If

your PCI bus sends corrupt data up to the driver ... all betsare off. The design is predicated on the assumption that the

hardware sends either good data or no data, ad that the latter
is associated with a bus state indicating an error has ocurred.

 - It will be useful if arch chooses panic on bus errors not to pass
   any broken data to un-reliable drivers.

I assume you meant "if arch chooses NOT to panic on bus errors ..."

Hmm, what I meant is that:
  There is an arch that chooses reboot on bus error.
  The reason why it do so is that the design is based on the assumption
  that no driver is able to handle bus error and that almost all drivers
  will go without checking hardware status. So the arch chooses rebooting
  rather than polluting user data.
  The design allows OS to determine whether the system goes reboot or not,
  but OS has no idea to know which driver actually check hardware state.
Therefore this interface will help OS to know which driver is reliable.

Of course there are some arch that chooses not to panic/reboot on bus error.
I think they are believing that all drivers working on the arch can handle
any type of errors, or they have their special feature against errors...,
or just being idiot about hardware errors.

Anyway, all that is certain is that:
 - To check the data from hardware, driver need to ask anywhere synchronously.
 - "Anywhere" would be a register, and/or something in kernel/hardware.
 - State check would be architecture dependent routine work.

Thanks,
H.Seto

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Follow-Ups:
- Re: [PATCH 1/6] PCIERR : interfaces for synchronous I/O error detection on driver
  - From: linas@austin.ibm.com (Linas Vepstas)

References:
- [PATCH] PCIERR : interfaces for synchronous I/O error detection on driver
  - From: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
- Re: [PATCH] PCIERR : interfaces for synchronous I/O error detection on driver
  - From: Greg KH <greg@kroah.com>
- [PATCH 1/6] PCIERR : interfaces for synchronous I/O error detection on driver
  - From: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
- Re: [PATCH 1/6] PCIERR : interfaces for synchronous I/O error detection on driver
  - From: linas@austin.ibm.com (Linas Vepstas)

Prev by Date: Re: [Alsa-devel] nm256: register strangeness related to lockups
Next by Date: Re: [PATCH 1/2] create struct compat_timex and use it everywhere
Previous by thread: Re: [PATCH 1/6] PCIERR : interfaces for synchronous I/O error detection on driver
Next by thread: Re: [PATCH 1/6] PCIERR : interfaces for synchronous I/O error detection on driver
Index(es):
- Date
- Thread

[Index of Archives] [Kernel Newbies] [Netfilter] [Bugtraq] [Photo] [Stuff] [Gimp] [Yosemite News] [MIPS Linux] [ARM Linux] [Linux Security] [Linux RAID] [Video 4 Linux] [Linux for the blind] [Linux Resources]