Re: very strange issue with sata,<4G Ram, and ext3

I forgot to mention the kernels we have tried: 2.6.8.1, 2.6.11.7,
2.6.12-rc3, and a Red Hat 2.6.9.


On Thursday 28 April 2005 12:16 pm, Rick Warner wrote:
> Hello,
>  We are having a very strange issue on some 64-bit systems.  We have a
> 32-node cluster of EM64T machines (Supermicro boards).  We are using our
> node restore software to propagate a Linux install onto them.  We PXE boot
> to a kernel and initrd image.  The initrd has some config info, a basic
> root filesystem, and a restore script.  The kernel is passed init=/restore
> (the restore script itself).  The script runs DHCP, gets an IP address,
> then NFS-mounts the master node of the cluster.  The backup image is stored
> on the master node's NFS export.  The script then applies a backed-up
> partition table, runs mkfs on the partitions, mounts them, untars a backup
> tar onto the drive, and makes the drive bootable with grub.
>
>  On these systems, we are getting ext2 errors from the initrd during the
> untarring.  Soon after, we start getting segfaults in random processes
> (they look like they come from the still-running DHCP client), and then a
> continuous stream of segfaults in the restore script itself (restore[1]).
>
>  The systems being restored are dual EM64Ts with 2G of RAM and 200G SATA
> drives.  If we raise the memory to 4G, the restores complete without error.
> If we reduce it to 512M, the segfaults start at the mkfs stage instead of
> the untar stage.  We've tried different SATA drives and controllers without
> any change.  Switching to IDE drives works.  Switching to reiserfs instead
> of ext3 for the destination drives works too.  We've tried enabling the
> SCSI debug options as well as the jbd debug options for ext3 without
> getting any more info.  We also enabled the kernel debug options.  We've
> also tried the deprecated IDE-based SATA drivers instead of the SCSI-based
> ones without success.  We have tried restoring to Intel's Jarrell EM64T
> systems as well as an Arima HDAMA Opteron, with the same errors.  We've
> also tried adding swap space as early as possible in the initrd image.
>
>  This problem is really baffling us and we're not quite sure what to look
> into next.  Any ideas?
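
The restore sequence described in the quoted message boils down to roughly
the following order of operations.  This is only a dry-run sketch in Python,
not the actual restore script: the DHCP client (dhclient), the interface,
the host name "master", the export path, the device names, the mount points,
and the tar file name are all assumptions made for illustration.  It only
builds and prints the commands; it does not run anything.

```python
# Dry-run sketch of the restore sequence. Every host, path, device name,
# and tool choice below is hypothetical and not taken from the real script.
import shlex


def restore_plan(master="master", export="/backup", disk="/dev/sda",
                 root_part="/dev/sda1", fstype="ext3"):
    """Return the restore steps as argv lists, in order, without running them."""
    return [
        # 1. Get an IP address (assumed DHCP client and interface).
        ["dhclient", "eth0"],
        # 2. NFS-mount the master node, where the backup image lives.
        ["mount", "-t", "nfs", f"{master}:{export}", "/mnt/backup"],
        # 3. Apply the backed-up partition table (sfdisk reads it from stdin).
        ["sfdisk", disk],
        # 4. Create the destination filesystem (ext3 fails, reiserfs works).
        ["mkfs", "-t", fstype, root_part],
        # 5. Mount the new filesystem.
        ["mount", root_part, "/mnt/target"],
        # 6. Unpack the backup tar onto the drive.
        ["tar", "-xpf", "/mnt/backup/image.tar", "-C", "/mnt/target"],
        # 7. Make the drive bootable with grub.
        ["grub-install", "--root-directory=/mnt/target", disk],
    ]


if __name__ == "__main__":
    for argv in restore_plan():
        print(shlex.join(argv))
```

With 2G of RAM the failures begin at step 6 (the untar); with 512M they
begin at step 4 (mkfs).  Calling restore_plan(fstype="reiserfs") gives the
variant that completes without error.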

-- 
Richard Warner
Lead Systems Integrator
Microway, Inc
(508)732-5517