Re: noisy edac

Dave Peterson wrote:

> On Monday 30 January 2006 15:44, Gunther Mayer wrote:
>>> For each individual type of error that is specific to a particular
>>> low-level chipset driver (e752x, amd76x, etc.) there could be an entry
>>> in the appropriate part of the sysfs hierarchy under the given chipset
>>> driver.  This entry could have several settings that the user may choose
>>> from such as { ignore, syslog, panic }.  For the implementation, there
>>> could be a generic piece of code in the core EDAC module that a chipset
>>> driver calls into.  The generic code would do the dirty work of creating
>>> the sysfs entries (and destroying them when the chipset module is
>>> unloading).  How does this sound?
>> Over-Engineered.
>
> Do you have an alternate suggestion?
Just printk() the exact driver-specific low-level error, even if non-fatal.

Single non-fatal errors just show your system recovers correctly.

Multiple (i.e. noisy) non-fatal errors either indicate a serious problem
 (e.g. after how many corrected ECC errors on the same address, within what
   time interval, will you replace your DIMM? How many S-ATA CRC errors
    indicate marginal or bad cabling?)
or show that the problem needs root-cause analysis. Either way, don't disable
the messages, as this will only hide the real problem.
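
To make the "how many errors in which time interval" question concrete, here is a minimal userspace sketch of a per-address counter. Every error would still be logged as usual; the counter only decides when repeated errors deserve escalation. The names ce_tracker, CE_WINDOW_SECS and CE_THRESHOLD and their values are assumptions made up for illustration, not anything taken from EDAC, and timestamps are assumed to be monotonically increasing seconds.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical policy values, chosen only for illustration. */
#define CE_WINDOW_SECS 3600u /* look-back interval in seconds */
#define CE_THRESHOLD   4u    /* errors inside the window that trigger escalation */

/* Ring buffer holding the last CE_THRESHOLD timestamps for one address. */
struct ce_tracker {
    uint64_t addr;
    uint64_t stamps[CE_THRESHOLD];
    size_t count; /* valid entries, at most CE_THRESHOLD */
    size_t next;  /* slot that the next record overwrites */
};

void ce_tracker_init(struct ce_tracker *t, uint64_t addr)
{
    t->addr = addr;
    t->count = 0;
    t->next = 0;
}

/*
 * Record one corrected error seen at time `now` (seconds, monotonic).
 * Returns true when the last CE_THRESHOLD errors, including this one,
 * all fall within CE_WINDOW_SECS.
 */
bool ce_tracker_record(struct ce_tracker *t, uint64_t now)
{
    t->stamps[t->next] = now;
    t->next = (t->next + 1) % CE_THRESHOLD;
    if (t->count < CE_THRESHOLD)
        t->count++;
    if (t->count < CE_THRESHOLD)
        return false;

    /* Buffer is full, so t->next now indexes the oldest retained stamp. */
    return now - t->stamps[t->next] <= CE_WINDOW_SECS;
}
```

A caller would keep one tracker per failing address, print the low-level error on every call, and add a louder "replace this DIMM" style message only when ce_tracker_record() returns true.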

Concerning non-fatal PCI Express errors, the error cause registers need
to be printed when an error occurs, too (see the Intel chipset specifications).
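
As a rough sketch of what "printing the error cause registers" could look like, the userspace snippet below decodes a status word into named causes. It assumes the standard PCI Express Advanced Error Reporting Correctable Error Status bit layout as a stand-in; the chipset-specific registers in the Intel documents may be laid out differently, in which case a driver would substitute its own table. The function name decode_cor_status is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct err_bit {
    uint32_t mask;
    const char *name;
};

/* Assumed layout: PCIe AER Correctable Error Status register bits. */
static const struct err_bit cor_err_bits[] = {
    { 0x00000001u, "Receiver Error" },
    { 0x00000040u, "Bad TLP" },
    { 0x00000080u, "Bad DLLP" },
    { 0x00000100u, "REPLAY_NUM Rollover" },
    { 0x00001000u, "Replay Timer Timeout" },
    { 0x00002000u, "Advisory Non-Fatal Error" },
};

/*
 * Print the raw status word followed by one line per known bit that is set.
 * Returns the number of known bits that were set.
 */
int decode_cor_status(uint32_t status, FILE *out)
{
    int set = 0;

    fprintf(out, "PCIe correctable error status: 0x%08x\n", (unsigned)status);
    for (size_t i = 0; i < sizeof(cor_err_bits) / sizeof(cor_err_bits[0]); i++) {
        if (status & cor_err_bits[i].mask) {
            fprintf(out, "  %s\n", cor_err_bits[i].name);
            set++;
        }
    }
    return set;
}
```

Printing the raw value first keeps any bits the table does not know about visible in the log.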

-
Gunther

