Re: Fedora 11 boot fails on md with two HBAs [SOLVED]

Switched to dracut, which does not use kernel raid autodetect.  Works great, though subject to BZ 513267.




----- Original Message ----
From: S Murthy Kambhampaty <smk_va@xxxxxxxxx>
To: fedora-list@xxxxxxxxxx
Sent: Thursday, September 3, 2009 3:52:43 AM
Subject: Fedora 11 boot fails on md with two HBAs

[Fixed typo in header line, etc.; added info at end.]


With Fedora 11 on IBM x3650 M2 (x86_64), I am having problems booting after adding a second, add-in, SAS controller with 12 SAS disks in an external enclosure.  Without the add-in SAS controller, the system boots fine and provides file sharing over samba, etc.

The main problem with this machine is that it has an EFI BIOS which appears to bind the external/add-on SAS HBA before the internal one on which the system drives are hosted (they are both LSI SAS 3801e HBAs - the internal one has an IBM part number, while the external one is LSI).  Changing the BIOS order in the option ROMs and disabling boot services on the external/add-on HBA does not seem to affect the adapter binding order, and the EFI BIOS does not provide any mechanism for controlling this.  Which brings us to the bootup issues.

Note that after booting into rescue mode from the installer image, mdadm works as expected: the raid arrays on the disks attached to both controllers are detected and started without any problems.  The md configuration is:

/boot is on /dev/md0 with devices  /dev/sd[mn]3, raid1
/root is on /dev/md1 with devices /dev/sd[op]1, raid1
/usr, /var and swap are on an LVM vg on /dev/md2 with devices /dev/sd[m-p]2, raid10
/boot/efi is on /dev/md64 with devices /dev/sd[mn]2, raid1

A separate vg for the data volume is on /dev/md127 with devices /dev/sd[a-l], raid6 (whole disks)

/dev/sd[m-p] are hosted on the internal HBA, and appear as /dev/sd[a-d] when the add-in HBA is disabled.  /dev/sd[a-l] hang off the add-in HBA.  Note that in rescue mode, the disks start at /dev/sdc, as I'm booting from a virtual console, and /dev/sda and /dev/sdb are assigned to virtual USB disks.

The problem seems to be that during bootup the raid arrays are autodetected rather than using mdadm.  If the raid456 module is not included in the initrd image, booting fails with raid6 personality not detected, when the boot process tries to start /dev/md1 incrementally with /dev/sda as its first member.  (This appears not to reference /etc/mdadm.conf in the initrd at all.)

If the raid456 module is included in the initrd image (using --with=raid456), booting fails with /dev/sda added incrementally to /dev/md1 and /dev/sdb added incrementally to /dev/md2; it appears autodetect fails because the raid device was built from rescue mode, so the components are listed with different letters in the superblock.

If I put a line in /etc/mdadm.conf in the initrd image to scan only partitioned disks (DEVICE /dev/sd*[1234]), boot hangs after loading the raid modules. (Potentially on the call to mkblkdevs after scsi_wait_scan is rmmod-ed.)

Partitioning /dev/sd[a-l] and setting the partition type to something other than 'raid' does not seem to make any difference: during bootup the kernel still tries to assemble the root raid device (/dev/md1) from /dev/sda, though it is on /dev/sd[op]1.

This seems to suggest that the md devices are being started by kernel raid autodetection rather than mdadm.  Simply switching to mdadm would likely solve the problem, given it works fine in rescue mode.  Alternative suggestions are welcome, of course.
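
For illustration only, here is a minimal sketch of what an mdadm-driven setup might look like, assuming the initrd actually runs mdadm against this file (which does not appear to be the case above).  The UUID values are placeholders, not taken from this system.  The point is that each array is identified by its superblock UUID, so it should not matter which sdX letters the two HBAs end up with:

```
# /etc/mdadm.conf (sketch; every UUID below is a placeholder)
# Consider all block devices listed in /proc/partitions, whole disks included,
# so the raid6 data array on unpartitioned disks is still found.
DEVICE partitions
# Match arrays by superblock UUID rather than by member device name.
ARRAY /dev/md0   UUID=<uuid-of-md0>
ARRAY /dev/md1   UUID=<uuid-of-md1>
ARRAY /dev/md2   UUID=<uuid-of-md2>
ARRAY /dev/md64  UUID=<uuid-of-md64>
ARRAY /dev/md127 UUID=<uuid-of-md127>
```

The real ARRAY lines could be taken from the output of "mdadm --detail --scan" in rescue mode, where all the arrays assemble correctly.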

Thanks for the help,
   Murthy


      

