On Tue, 2009-12-01 at 11:00 -0500, Todd Denniston wrote:
> John Austin wrote, On Tue, 24 Nov 2009 12:21:58 +0000:
> > On Mon, 2009-11-23 at 15:00 -0800, Rick Stevens wrote:
> >> On 11/21/2009 10:41 AM, John Austin wrote:
> >>> On Sat, 2009-11-21 at 11:11 -0700, Greg Woods wrote:
> >>>> On Sat, 2009-11-21 at 10:09 +0000, John Austin wrote:
> >>>>
> >>>>> When copying a large file (2.7GB) from the server to the
> >>>>> F12 m/c a complete freeze of the F12 machine occurs.
> >>>>
> >>>> I haven't seen freezes, but I have seen corruption when trying to copy
> >>>> large files (e.g. like a DVD iso image) via NFS. In fact, this happened
> >>>> to me when I was trying to install an F12 virtual machine on my F11 box
> >>>> (so I could try it out before deciding whether or not to bite the bullet
> >>>> and upgrade the host OS). I copied over the DVD iso image, then tried to
> >>>> install a VM from it, and it failed the media test. Sure enough, it also
> >>>> failed the sha256sum test. Copying the same DVD iso file via scp instead
> >>>> worked fine. I do not trust NFS for large files.
> >>>>
> >>>> --Greg
> >>>>
> >>> Hi Greg
> >>>
> >>> That's interesting and very worrying - surely it can't/shouldn't happen!
> >>>
> >>> I have been using NFS for years for all types/sizes of files and
> >>> never had a problem until the last couple of months.
> >>>
> >>> 1. The Centos/RHEL 5.3/5.4 kernel had a serious bug that has been fixed with the
> >>> latest kernel update
> >>>
> >>> 2. Now this F12 problem
> >>>
> >>> Surely a very large worldwide community uses NFS ?
> >>>
> >>> OK the F12 case could be my finger trouble or even a hardware problem
> >>>
> >>> I will install F12 on a second machine and test again (against the same server)
> >> Can you verify that you run into the same issue if you run NFS over TCP
> >> as opposed to NFS over UDP (it's an option in the mount command on the
> >> client, use either "proto=tcp" or "proto=udp").
> >>
> >> By default, the system queries the server and selects a protocol based
> >> on what's being asked of it. See the "TRANSPORT METHODS" section of
> >> "man nfs".
> >> ----------------------------------------------------------------------
> >> - Rick Stevens, Systems Engineer                      ricks@xxxxxxxx -
> >> - AIM/Skype: therps2        ICQ: 22643734            Yahoo: origrps2 -
> >> -                                                                    -
> >> -              The Theory of Rapitivity: E=MC Hammer                 -
> >> -                                  -- Glenn Marcus (via TopFive.com) -
> >> ----------------------------------------------------------------------
> >
> > Hi Rick
> >
> > Many thanks for the reply - you have found a work-around !!
> >
> > Just tested my machine with UDP and TCP
> > This was using md5sum for about 10GB over the NFS mount
> >
> > 1. The default for F12/Centos5.4 appears to be TCP - which freezes
> > 2. Forcing UDP gives NO errors for 10GB transfer
> > 3. Forcing TCP gives a freeze
> >
> > Having briefly read the man pages this is the opposite of what I would
> > expect and of what you suggest !!
> >
> > There must be a timing problem somewhere -
> >
> > Please see the other thread "Sky2 NIC Problem? - Was F12 NFS Failures"
> > for other tests I have carried out
> >
> > Regards
> >
> > John
> >
>
> what are your other mount options?
> having seen the "Sky2 NIC Problem" message, your card/driver may be having issues, but some nfs
> options may help/hurt.
>
> I am assuming that you only have 'hard' and not 'hard,intr' as options to the mount.
> And for transferring large files over NFS, I have had experiences that say stay away from 'soft' NFS.
>
> it is interesting that TCP nfs locks the machine and fails to copy the very large file, while UDP
> succeeds in copying the same file with the same device/driver. BTW when you say that UDP gave no
> errors, do you mean that from the user program perspective (cp, and then sha256sum) there were no
> errors, or that from both the user and syslog perspective there were no errors?

Purely from the user point of view. I did not check the number of
retransmissions, the log files, etc.

> I am wondering if
> you have found a place where the UDP code deals with a bad packet correctly and the TCP version has
> not seen enough (bad environment) testing.
> Wouldn't happen to have a serial cable around so you can
> capture where the kernel goes bonkers at would you? (note, never done the serial console myself.)
>

I've probably got a serial cable in the roof somewhere, but the machine
(a Shuttle SA76G2) has no serial ports!

Hi Todd

I must admit that I have basically given up on the sky2 driver
for the moment. I gave up after reading about problems with it
going back to something like kernel 2.6.18.

I had a spare D-Link gigabit NIC and have been using that instead.
My whole network depends on NFS working perfectly, so a dodgy driver
is no use to me.

It must be a very subtle bug, as I cannot cause the freeze with any of:

1. scp of 10GB across the network
2. md5sum across a CIFS (Samba) mount
3. md5sum across NFS4 over UDP

Maybe you are right and it would fail if I tried harder/longer.

Regards

John
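
For anyone who wants to repeat the checksum comparison used in this thread (md5sum / sha256sum on the original file and on the copy read over the NFS mount), the following is a minimal Python sketch of the same idea. It is an illustration only, not what was actually run in the tests above; the function names `file_digest` and `copies_match` are made up for this example, and the file is read in chunks so that a multi-GB image does not have to fit in memory.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"" (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(original, copy, algorithm="sha256"):
    """Return True if both files hash to the same digest."""
    return file_digest(original, algorithm) == file_digest(copy, algorithm)
```

Usage would be something like `copies_match("/srv/iso/F12.iso", "/mnt/nfs/F12.iso")`, where both paths are hypothetical placeholders. A `False` result corresponds to the failed sha256sum test Greg described; it says nothing about freezes, retransmissions, or anything in syslog.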