Ross Biro wrote:
Currently Linux 2.6 assumes the BIOS (or firmware) sets the master abort
mode flag on PCI bridge chips in a coherent fashion. This is not always
the case and the consequences of getting this flag incorrect can cause
hardware to fail or silent data corruption. This patch lets the user
override the BIOS master abort setting at boot time and the distro
maintainer to set a default according to their target audience.
The comments in the patch are probably a bit too verbose, but I think it
is a good patch to start discussions around. If it is decided that
something should be done about this problem, this patch could be
included in a -mm release and migrate into Linus's kernel as appropriate.
The comments were helpful to me.
This incarnation of the patch has had minimal testing. For our internal
kernels, we always force the master abort mode to 1 and then let the
device drivers for hardware we know can't handle target aborts switch
the master abort mode to 0. This does not seem appropriate for general
release.
Some background for those who do not spend most of their waking hours
exploring buses and what can go wrong.
Is this related (or could it be -- or should it be) at all to the
current discussion on the linux-pci mailing list
[email protected]) about "PCI Error Recovery
API Proposal" ?
The master abort flag tells a PCI bridge what to do when a bus master
behind the bridge requests the bus and the bridge is unable to get the
bus. With the flag clear, for master reads the bridge returns all
0xff's (hence silent data corruption) and for master writes, it throws
the data away. With the bit set, the bridge sends a target abort to the
master. This can only happen when the system is heavily loaded.
or a PCI device isn't playing nicely?
The problem with always setting the bit is that some PCI hardware,
notably some Intel E-1000 chips (Ethernet controller: Intel Corporation:
Unknown device 1076) cannot properly handle the target abort bit. In
the case of the E-1000 chip, the driver must reset the chip to recover.
This usually leads to the machine being off the network for several
seconds, or sometimes even minutes, which can be bad for servers.
I even have a single motherboard with both a device that cannot handle
the target abort and an IDE controller that can handle the target abort
behind the same bridge. For this motherboard, I have to choose the
lesser of two evils, network hiccups or potential data corruption.
For the record, I have seen both occur. Other people may make wish to
make a different choice than we did, hence this patch allows the user to
choose the mode at runtime.
Ross
------------------------------------------------------------------------
diff -ur linux-2.6.11/drivers/pci/Kconfig linux-2.6.11-new/drivers/pci/Kconfig
--- linux-2.6.11/drivers/pci/Kconfig 2005-03-01 23:37:51.000000000 -0800
+++ linux-2.6.11-new/drivers/pci/Kconfig 2005-04-01 07:19:32.000000000 -0800
@@ -47,3 +47,38 @@
When in doubt, say Y.
+choice
+ prompt "Enable PCI Master Abort Mode"
+ depends on PCI
+ default PCI_MASTER_ABORT_DEFAULT
+ help
+ On PCI systems, when a bus is unavailable to a bus master, a
+ master abort occurs. Older bridges satisfy the master request
+ with all 0xFF's. This can lead to silent data corruption. Newer
+ bridges can send a target abort to the bus master. Some PCI
+ hardware cannot handle the target abort. Some x86 BIOSes configure
+ the buses in a suboptimal way. This option allows you to override
^^^ extra spaces
+ the BIOS setting. If unsure chose default. This choice can be
choose
+ overridden at boot time with the pci_enable_master_abort={default,
+ enable, disable}
boot option.
+
+config PCI_MASTER_ABORT_DEFAULT
+ bool "Default"
+ help
+ Choose this option if you are unsure, or believe your
+ firmware does the right thing.
+
+config PCI_MASTER_ABORT_ENABLE
+ bool "Enable"
+ help
+ Choose this option if it is more important for you to prevent
+ silent data loss than to have more hardware configurations work.
^^^^ ??
+
+
+config PCI_MASTER_ABORT_DISABLE
+ bool "Disable"
+ help
+ Choose this option if it is more important for you to have more
^^^^
The phrase "have more hardware configurations work" need something....
Maybe add something like: "Some devices are known not to work with
PCI Master Aborts. If you have one of these devices, you probably
want to Disable this option."
+ hardware configurations work than to prevent silent data loss.
+
+endchoice
diff -ur linux-2.6.11/drivers/pci/probe.c linux-2.6.11-new/drivers/pci/probe.c
--- linux-2.6.11/drivers/pci/probe.c 2005-03-01 23:38:13.000000000 -0800
+++ linux-2.6.11-new/drivers/pci/probe.c 2005-04-05 12:07:53.000000000 -0700
@@ -28,6 +28,15 @@
LIST_HEAD(pci_devices);
+/* used to force master abort mode on or off at runtime.
+ PCI_MASTER_ABORT_DEFAULT means leave alone, the BIOS got it correct.
+ PCI_MASTER_ABORT_ENABLE means turn it on everywhere.
+ PCI_MASTER_ABORT_DISABLE means turn it off everywhere.
+*/
+
+static int pci_enable_master_abort=PCI_MASTER_ABORT_VAL;
Nitpick: spaces around the '=' would enhance readability and be
appreciated.
@@ -429,6 +438,20 @@
pci_write_config_word(dev, PCI_BRIDGE_CONTROL,
bctl & ~PCI_BRIDGE_CTL_MASTER_ABORT);
+ /* Some BIOSes disable master abort mode, even though it's
+ usually a good thing (prevents silent data corruption).
+ Unfortunately some hardware (buggy e-1000 chips for
+ example) require Master Abort Mode to be off, or they will
+ not function properly. So we enable master abort mode
+ unless the user told us not to. The default value
+ for pci_enable_master_abort is set in the config file,
+ but can be overridden at setup time. */
Nit #2: kernel long-comment style is:
/*
* line1
* line2
*/
+ if (pci_enable_master_abort == PCI_MASTER_ABORT_ENABLE) {
+ bctl |= PCI_BRIDGE_CTL_MASTER_ABORT;
+ } else if (pci_enable_master_abort == PCI_MASTER_ABORT_DISABLE) {
+ bctl &= ~PCI_BRIDGE_CTL_MASTER_ABORT;
+ }
+
pci_enable_crs(dev);
if ((buses & 0xffff00) && !pcibios_assign_all_busses() && !is_cardbus) {
@@ -932,6 +955,22 @@
kfree(b);
return NULL;
}
+
+static int __devinit pci_enable_master_abort_setup(char *str)
Why __devinit? Looks to me like __init would be fine.
+{
+ if (strcmp(str, "enable") == 0) {
+ pci_enable_master_abort = PCI_MASTER_ABORT_ENABLE;
+ } else if (strcmp(str, "disable") == 0) {
+ pci_enable_master_abort = PCI_MASTER_ABORT_DISABLE;
+ } else if (strcmp(str, "default") == 0) {
+ pci_enable_master_abort = PCI_MASTER_ABORT_DEFAULT;
+ } else {
+ printk (KERN_ERR "PCI: Unknown Master Abort Mode (%s).", str);
+ }
+}
+
+__setup("pci_enable_master_abort=", pci_enable_master_abort_setup);
--
~Randy
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]