Re: [RFC][PATCH 0/3] TCP/IP Critical socket communication mechanism

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> Also, all this stuff is just a band aid because linux OOM behavior is so
> fucked up.

In our internal discussions, characterizing this as "OOM" came
up a lot, and I don't think of it as that at all. OOM is exactly what the
scheme is trying to avoid!

The actual situation we have in mind is a swap device management system
in a cluster where a remote system tells you (via socket communication to
a user-land management app) that a swap device is going to fail over and
it'd be a good idea not to do anything that requires paging out or
swapping for a short period of time. The socket communication must work,
but the system is not at all out of memory, and the important point is
that it never will be if you limit allocations to those things that are
required for the critical socket to work (and nothing/little else).
        Receiver side allocations are unavoidable, because you don't know
if you can drop the packet or not until you look at it. Some 
infrastructure
must work. But everything else can fail or succeed based on ordinary churn
in ordinary memory pools, until the "in_emergency" condition has passed.
        The critical socket(s) simply have to be out of the zero-sum game
for the rest of the allocations, because those are the (only) path to
getting a working swap device again.

If you're out of memory without a network mechanism to get you more,
this doesn't do anything for you (and it isn't intended to). And if you
mark any socket that isn't going to get you failed over or otherwise
get you more swap, it isn't going to help you, either. It isn't a priority
scheme for low-memory, it's a failover mechanism that relies on 
networking.
There are exactly 2 priorities: critical (as in "you might as well crash 
if
these aren't satisfied") and everything else.

Doing other, more general things that handle low memory, or OOM, or 
identified
priorities are great, but the problem we're interested in solving here is
really just about making socket communication work when the alternative is
a completely dead system. I think these patches do that in a reasonable 
way.
A better solution would be great, too, if there is one. :-)

                                                        +-DLS

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux