On Thu, Nov 30, 2006 at 08:22:06PM -0800, David Miller wrote:
>
> What MAX_HEADER's setting is trying to do is optimistically allocate
> enough for a single level of tunnelling. It does not handle nested
> tunneling at all, of course.

Agreed, I should've said MAX_HEADER.

> Actually, I wonder how antiquated this all is. I bet we could get rid
> of MAX_HEADER, then if we have to realloc headroom, we adjust some
> per-device header thing which will behave like your global value idea
> does. On the next allocation, we'll do the right thing. Although I
> cannot come up with a scheme that works without reintroducing another
> net_device pointer to sk_buff, which seems necessary to handle arbitrary
> nesting. :-/

Actually the scarier part is that neither TCP nor ip_route_me_harder
guarantees enough headroom for IPsec. Fortunately TCP reserves
enough room (128 bytes) by default that it's unlikely to break with
non-nested IPsec. But it's still pretty nasty.

So in general when allocating packets we have two scenarios:

1) The dst is known and fixed, i.e., all datagram protocols. This is
the easy case where the headroom is known exactly beforehand.

2) The dst is unknown or may vary; this includes TCP, SCTP and DCCP.
This is where we currently use MAX_HEADER plus some protocol-specific
headroom.

Right now the normal (non-IPsec) dst output path always checks for
sufficient headroom and reallocates if necessary (ip_finish_output2).
I propose that we make IPsec do the same thing.
This change will make the stack safe from underflow crashes in IPsec.
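
To make the proposal concrete, here is a rough userspace sketch of the
check-and-reallocate step. It is not kernel code: struct toy_pkt and the
toy_* helpers are made-up stand-ins for sk_buff and its headroom handling,
and the assert in toy_push plays the role of the underflow crash.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for sk_buff (not the kernel structure):
 * [head, data) is headroom, [data, data + len) is packet data.
 */
struct toy_pkt {
	unsigned char *head;	/* start of the allocation */
	unsigned char *data;	/* start of the packet data */
	size_t len;		/* bytes of packet data */
};

static size_t toy_headroom(const struct toy_pkt *p)
{
	return (size_t)(p->data - p->head);
}

/* Allocate room for len data bytes with headroom spare bytes in front. */
static int toy_alloc(struct toy_pkt *p, size_t headroom, size_t len)
{
	size_t sz = headroom + len;

	p->head = malloc(sz ? sz : 1);
	if (!p->head)
		return -1;
	p->data = p->head + headroom;
	p->len = len;
	return 0;
}

/*
 * Last-ditch check on the output path: make sure at least needed bytes
 * of headroom exist, copying into a larger buffer if they do not.
 * Returns 0 if nothing was done, 1 if reallocated, -1 on failure.
 */
static int toy_ensure_headroom(struct toy_pkt *p, size_t needed)
{
	unsigned char *nhead;

	if (toy_headroom(p) >= needed)
		return 0;
	nhead = malloc(needed + p->len);
	if (!nhead)
		return -1;
	memcpy(nhead + needed, p->data, p->len);
	free(p->head);
	p->head = nhead;
	p->data = nhead + needed;
	return 1;
}

/* Prepend hdr_len bytes; without the check above this could underflow. */
static void *toy_push(struct toy_pkt *p, size_t hdr_len)
{
	assert(toy_headroom(p) >= hdr_len);
	p->data -= hdr_len;
	p->len += hdr_len;
	return p->data;
}
```

The idea is that every output path, IPsec included, calls something like
toy_ensure_headroom() before it pushes its headers, so an allocation that
guessed too low costs a copy rather than a crash.
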

There is also the ip_route_me_harder path where the dst varies. It
also tries to reallocate the packet if there isn't enough headroom for
the new dst. As long as both IPsec and the normal path do the headroom
check, this reallocation can in fact be removed.

We can then optimise further because in the cases of TCP/SCTP/DCCP
we usually have a dst object. The only problem of course is that it
may vary. However, the common case by far is that the dst stays
constant. So we can optimise for it by getting the headroom from the
current dst and rely on the last-ditch reallocation to fix things up
if needed.
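
As a rough illustration of that optimisation, the sketch below picks the
headroom to reserve from the current dst when there is one and falls back
to a MAX_HEADER-style constant otherwise. Everything here is hypothetical:
struct toy_dst, its header_len field, TOY_MAX_HEADER and both helpers are
invented for the example and are not the kernel's definitions.

```c
#include <stddef.h>

/* Illustrative fallback only; not the kernel's MAX_HEADER value. */
#define TOY_MAX_HEADER 128

/* Hypothetical route entry: just the lower-layer header bytes it needs. */
struct toy_dst {
	size_t header_len;
};

/*
 * Headroom to reserve at allocation time. With a cached dst we trust
 * its header length (the common case is that the dst stays constant);
 * without one we fall back to the pessimistic constant.
 */
static size_t toy_reserve_hint(const struct toy_dst *cached, size_t proto_hdr)
{
	size_t lower = cached ? cached->header_len : TOY_MAX_HEADER;

	return lower + proto_hdr;
}

/*
 * Bytes the last-ditch reallocation has to make up for when the packet
 * ends up leaving through a different dst than the one we guessed from.
 */
static size_t toy_shortfall(size_t reserved, const struct toy_dst *actual,
			    size_t proto_hdr)
{
	size_t needed = actual->header_len + proto_hdr;

	return needed > reserved ? needed - reserved : 0;
}
```

If the dst does not change, toy_shortfall() is zero and no copy happens.
If the route switches to something with more encapsulation, the shortfall
is non-zero and the output path's headroom check pays for it once.
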

For standard MTU-sized packets this discussion is moot since we have
2K of memory in each chunk. However, for ACKs it could save a bit of
memory.

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt