Re: F12 NFS Failures

On Tue, 2009-12-01 at 11:00 -0500, Todd Denniston wrote:
> John Austin wrote, On Tue, 24 Nov 2009 12:21:58 +0000:
> > On Mon, 2009-11-23 at 15:00 -0800, Rick Stevens wrote:
> >> On 11/21/2009 10:41 AM, John Austin wrote:
> >>> On Sat, 2009-11-21 at 11:11 -0700, Greg Woods wrote:
> >>>> On Sat, 2009-11-21 at 10:09 +0000, John Austin wrote:
> >>>>
> >>>>> When copying a large file (2.7GB) from the server to the
> >>>>> F12 m/c a complete freeze of the F12 machine occurs.
> >>>>
> >>>> I haven't seen freezes, but I have seen corruption when trying to copy
> >>>> large files (e.g. like a DVD iso image) via NFS. In fact, this happened
> >>>> to me when I was trying to install an F12 virtual machine on my F11 box
> >>>> (so I could try it out before deciding whether or not to bite the bullet
> >>>> and upgrade the host OS). I copied over the DVD iso image, then tried to
> >>>> install a VM from it, and it failed the media test. Sure enough, it also
> >>>> failed the sha256sum test. Copying the same DVD iso file via scp instead
> >>>> worked fine. I do not trust NFS for large files.
> >>>>
> >>>> --Greg
> >>>>
> >>>>
> >>> Hi Greg
> >>>
> >>> That's interesting and very worrying - surely it can't/shouldn't happen!
> >>>
> >>> I have been using NFS for years for all types/sizes of files and
> >>> never had a problem until the last couple of months.
> >>>
> >>> 1.  The CentOS/RHEL 5.3/5.4 kernel had a serious bug that has been fixed with the
> >>>     latest kernel update
> >>>
> >>> 2.  Now this F12 problem
> >>>
> >>> Surely a very large worldwide community uses NFS ?
> >>>
> >>> OK the F12 case could be my finger trouble or even a hardware problem
> >>>
> >>> I will install F12 on a second machine and test again (against the same server)
> >> Can you verify that you run into the same issue if you run NFS over TCP
> >> as opposed to NFS over UDP (it's an option in the mount command on the
> >> client, use either "proto=tcp" or "proto=udp").
> >>
> >> By default, the system queries the server and selects a protocol based
> >> on what's being asked of it.  See the "TRANSPORT METHODS" section of
> >> "man nfs".
> >> ----------------------------------------------------------------------
> >> - Rick Stevens, Systems Engineer                      ricks@xxxxxxxx -
> >> - AIM/Skype: therps2        ICQ: 22643734            Yahoo: origrps2 -
> >> -                                                                    -
> >> -               The Theory of Rapitivity: E=MC Hammer                -
> >> -                                  -- Glenn Marcus (via TopFive.com) -
> >> ----------------------------------------------------------------------
> > 
> > 
> > Hi Rick
> > 
> > Many thanks for the reply - you have found a work-around !!
> > 
> > Just tested my machine with UDP and TCP
> > This was using md5sum for about 10GB over the NFS mount
> > 
> > 1. The default for F12/Centos5.4 appears to be TCP - which freezes
> > 2. Forcing UDP gives NO errors for 10GB transfer
> > 3. Forcing TCP gives a freeze
> > 
> > Having briefly read the man pages this is the opposite of what I would
> > expect and of what you suggest !!
> > 
> > There must be a timing problem somewhere - 
> > 
> > Please see the other thread "Sky2 NIC Problem? - Was F12 NFS Failures"
> > for other tests I have carried out
> > 
> > Regards
> > 
> > John
> > 
> > 
> > 
> > 
> 
> What are your other mount options?
> Having seen the "Sky2 NIC Problem" message, your card/driver may be having issues, but some NFS 
> options may help/hurt.
> 
> I am assuming that you only have 'hard' and not 'hard,intr' as options to the mount.
> And for transferring large files over NFS, my experience says to stay away from 'soft' NFS.
> 
> It is interesting that NFS over TCP locks the machine and fails to copy the very large file, while UDP 
> succeeds in copying the same file with the same device/driver. BTW, when you say that UDP gave no 
> errors, do you mean that from the user program perspective (cp, and then sha256sum) there were no 
> errors, or that from both the user and syslog perspective there were no errors?

Purely from the user point of view. I did not check the number of
retransmissions, the log files, etc.
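
For anyone wanting to look beyond the user-level view, the client's RPC
counters could be compared before and after a transfer. This is only a
sketch: it assumes the client exposes /proc/net/rpc/nfs with an "rpc" line
whose first two numbers are total calls and retransmissions (the figures
that "nfsstat -rc" reports). The function names are made up for
illustration.

```python
def parse_rpc_retrans(text):
    """Return (calls, retransmissions) from the 'rpc' line of the stats text.

    Assumes the layout 'rpc <calls> <retrans> <authrefresh>'.
    """
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "rpc":
            return int(fields[1]), int(fields[2])
    raise ValueError("no 'rpc' line found in NFS client statistics")


def read_rpc_retrans(path="/proc/net/rpc/nfs"):
    """Read the current counters from the (assumed) procfs location."""
    with open(path) as stats:
        return parse_rpc_retrans(stats.read())


def retrans_delta(before, after):
    """Retransmissions that occurred between two (calls, retrans) samples."""
    return after[1] - before[1]
```

Call read_rpc_retrans() once before the copy and once after it, then pass
both results to retrans_delta(). A non-zero result on a run that looked
clean from cp/md5sum would suggest packets were being lost and recovered
quietly.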

> I am wondering if 
> you have found a place where the UDP code deals with a bad packet correctly and the TCP version has 
> not seen enough (bad environment) testing.


> You wouldn't happen to have a serial cable around so you can 
> capture where the kernel goes bonkers, would you? (Note: I have never set up a serial console myself.)
> 
I've probably got a serial cable in the roof somewhere but the machine
has no serial ports! Shuttle SA76G2.

Hi Todd

I must admit that I have basically given up with the sky2 driver for the
moment.

I gave up after reading about problems with the sky2 driver going back
as far as something like kernel 2.6.18.

I had a spare D-Link gigabit NIC and have been using that.

My whole network depends on NFS working perfectly so a dodgy driver is
no use to me.

It must be a very subtle bug, as I cannot cause the freeze with any of:
1. scp 10GB across the network
2. md5sum across a CIFS samba mount
3. md5sum across NFS4 UDP

Maybe you are right and it would fail if I tried harder/longer.

Regards

John


  





