Re: nfs udp 1000/100baseT issue

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 3/16/06, Neil Brown <[email protected]> wrote:
> On Thursday March 16, [email protected] wrote:
> > On 3/16/06, Jan Engelhardt <[email protected]> wrote:
> > > >
> > > >a while ago i noticed a issue when one has a nfs server that has
> > > >gigabit connection
> > > >to a network and a client that connects to that network instead via 100baseT
> > > >that udp connection from client to server fails the client gets a
> > > >server not responding
> > > >message when trying to access a file, interesting bit is you can get a directory
> > > >listing without issue
> > > >work around i found for this is adding proto=tcp to the client side
> > > >and all works
> > > >without error
> > >
> > > UDP has its implications, like silently dropping packets when the link
> > > is full, by design. Try tcpdump on both systems and compare what packets
> > > are sent and which do arrive. The error message is then probably because
> > > the client is confused of not receiving some packets.
> >
> > after compairing a working and not working client i found that
> > packets containing offset 19240, 20720, 22200 are missing
> > and the 100baseT client had an extra offset of 32560
> > on the working client it ends at 31080
> >
> > the missing ones are mostly constantly missing 22200 appears every so often
> > on retransmission and 23680 also disappears every so often
> >
> > i hope that isnt too confusing i dont use tcpdump type stuff much
> > (well i did give up on tcpdump and had to use ethereal...)
>
> This is all to be expected.  I remember having this issue with a
> server on 100M and clients in 10M...
>
> There is no flow control in UDP

is this a linux design flaw or just nature of udp?

>.  If anything gets lots, the client
> has to resend the request, and the server then has to respond again.
> If the respond is large (e.g. a read) and gets fragmented (if > 1500bytes)
> then there is a good chance that one or more fragments of a reply will
> get lots in the switch stepping down from 1G to 100M.  Every time.
>
> Your options include:
>
>   - use tcp

im wondering why this isnt the default to begin with

>   - get a switch with a (much) bigger packet buffer
>   - drop the server down to 100M
>   - drop the nfs rsize down to 1024 to you don't get fragments.
these last 2 options sound rather painfull speed wise
tcp work around is prob by far the easiest

>
> NeilBrown
>
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux