Re: TSO and IPoIB performance degradation

On Mon, Mar 20, 2006 at 02:37:04AM -0800, David S. Miller wrote:
> From: "Michael S. Tsirkin" <[email protected]>
> Date: Mon, 20 Mar 2006 12:22:34 +0200
> 
> > Quoting r. David S. Miller <[email protected]>:
> > > The path an SKB can take is opaque and unknown until the very last
> > > moment it is actually given to the device transmit function.
> > 
> > Why, I was proposing looking at dst cache. If that's NULL, well,
> > we won't stretch ACKs. Worst case we apply the wrong optimization.
> > Right?
> 
> Where you receive a packet from isn't very useful for determining
> even the full path on which that packet itself flowed.
> 
> More importantly, packets also do not necessarily go back out over the
> same path on which packets are received for a connection.  This is
> actually quite common.
> 
> Maybe packets for this connection come in via IPoIB but go out via
> gigabit ethernet and another route altogether.
> 
> > What I'd like to clarify, however: rfc2581 explicitly states that in
> > some cases it might be OK to generate ACKs less frequently than
> > every second full-sized segment. Given Matt's measurements, TCP on
> > top of IP over InfiniBand on Linux seems to hit one of these cases.
> > Do you agree to that?
> 
> I disagree with Linux changing its behavior.  It would be great to
> turn off congestion control completely over local gigabit networks,
> but that isn't determinable in any way, so we don't do that.
> 
> The IPoIB situation is no different, you can set all the bits you want
> in incoming packets, the barrier to doing this remains the same.
> 
> It hurts performance if any packet drop occurs because it will require
> an extra round trip for recovery to begin to be triggered at the
> sender.
> 
> The network is a black box, routes to and from a destination are
> arbitrary, and so is packet rewriting and reflection, so being able to
> say "this all occurs on IPoIB" is simply infeasible.
> 
> I don't know how else to say this, we simply cannot special case IPoIB
> or any other topology type.

David is right. If you care about performance, you are already using SDP
or the verbs layer for the transport anyway. If I am going to be doing IPoIB,
it's because eventually I expect the packet might get off the IB network
and onto some other network and go halfway across the country.
