Re: [RFC 0/7] Postphone reclaim laundry to write at high water marks

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 2007-08-22 at 13:16 -0700, Christoph Lameter wrote:
> On Wed, 22 Aug 2007, Peter Zijlstra wrote:


> > > > As shown, there are cases where there just isn't any memory to reclaim.
                                                                       ^^^^^^^
> > > > Please accept this.

> > > That is an extreme case that AFAIK we currently ignore and could be 
> > > avoided with some effort.
> > 
> > Its not extreme, not even rare, and its handled now. Its what
> > PF_MEMALLOC is for.
> 
> No its not. If you have all pages allocated as anonymous pages and your 
> writeout requires more pages than available in the reserves then you are 
> screwed either way regardless if you have PF_MEMALLOC set or not.

Christoph, we were talking about memory to reclaim, no about exhausting
the reserves.

> > > The initial PF_MEMALLOC patchset seems to be 
> > > still enough to deal with your issues.
> > 
> > Take the anonyous workload, user-space will block once the page
> > allocator hits ALLOC_MIN. Network will be able to receive until
> > ALLOC_MIN|ALLOC_HIGH - if the completion doesn't arrive by then it will
> > start dropping all packets until there is memory again. But userspace is
> > wedged and hence will not consume the network traffic, hence we
> > deadlock.
> > 
> > Even if there is something to reclaim initially, if the pressure
> > persists that can eventually be exhausted.
> 
> Sure ultimately you will end up with pages that are all unreclaimable if 
> you reclaim all reclaimable memory.
> 
> > > multiple critical tasks on various devices that have various memory needs. 
> > > So multiple critical spots can happen concurrently in multiple 
> > > application contexts.
> > 
> > yes, reclaim can be unbounded concurrent, and that is one of the
> > (theoretically) major problems we currently have.
> 
> So your patchset is not fixing it?

No, and I never said it would. I've been meaning to do one that does
though. Just haven't come around to actually doing it :-/

> > > We have that with PF_MEMALLOC.
> > 
> > Exactly. But if you recognise the need for PF_MEMALLOC then what is this
> > argument about?
> 
> The PF_MEMALLOC patchset f.e. is about avoiding to go out of 
> memory when there is still memory available even if we are doing a 
> PF_MEMALLOC allocation and would OOM otherwise.

Right, but as long as there is a need for PF_MEMALLOC there is a need
for the patches I proposed.

> > Networking can currently be seen as having two states:
> > 
> >  1 receive packets and consume memory
> >  2 drop all packets (when out of memory)
> > 
> > I need a 3rd state:
> > 
> >  3 receiving packets but not consuming memory
> 
> So far a good idea. If you are not consuming memory then why are the 
> allocators involved?

Because I do need to receive some packets, its just that I'll free them
again. So it won't keep consuming memory. This needs a little pool of
memory in order to operate in a stable state.

Its: alloc, receive, inspect, free
total memory use: 0
memory delta: a little
 
(its just that you need to be able to receive a significant number of
packets, not 1, due to funny things like ip-defragmentation before you
can be sure to actually receive 1 whole tcp packet - but the idea is the
same)

> > Now, I need this state when we're in PF_MEMALLOC territory, because I
> > need to be able to process an unspecified amount of network traffic in
> > order to receive the writeout completion.
> > 
> > In order to operate this 3rd network state, some memory is needed in
> > which packets can be received and when deemed not important freed and
> > reused.
> > 
> > It needs a bounded amount of memory in order to process an unbounded
> > amount of network traffic.
> > 
> > What exactly is not clear about this? If you accept the need for
> > PF_MEMALLOC you surely must also agree that at the point you're using it
> > running reclaim is useless.
> 
> Yes looks like you would like to add something to the network layer to 
> filter important packets. As long as you stay within PF_MEMALLOC 
> boundaries you can allocate and throw packets away. If you want to have a 
> reserve that is secure and just for you then you need to take it away from 
> the reserves (which in turn will lead reclaim to restore them).

Ah, but also note that _using_ PF_MEMALLOC is the trigger to enter that
3rd network state. These two are tightly coupled. You only need this 3rd
state when under PF_MEMALLOC, otherwise we could just receive normally.

So, my thinking was that, if the current reserves are good enough to
keep the system 'deadlock' free, I can just enlarge the reserves by
whatever it is I need for that network state and we're all good, no?

Why separate these two? If the current reserve is large enough (and
theoretically it is not - but I'm meaning to fix that) it will not
consume the extra memory I added below.

Note how:
  [PATCH 09/10] mm: emergency pool
pushes up the current reserves in a fashion so as to maintain the
relative operating range of the page allocator (distance between
min,low,high and scaling of the wmarks under ALLOC_HIGH|ALLOC_HARDER).

> > > > Also, failing a memory allocation isn't bad, why are you so worried
> > > > about that? It happens all the time.
> > > 
> > > Its a performance impact and plainly does not make sense if there is 
> > > reclaimable memory availble. The common action of the vm is to reclaim if 
> > > there is a demand for memory. Now we suddenly abandon that approach?
> > 
> > I'm utterly confused by this, on one hand you recognise the need for
> > PF_MEMALLOC but on the other hand you're saying its not needed and
> > anybody needing memory (even reclaim itself) should use reclaim.
> 
> The VM reclaims memory on demand but in exceptional limited cases where we 
> cannot do so we use the reserves. I am sure you know this.

Its the abandon part I got confused about. I'm not at all abandoning
reclaim, its just that I must operate under PF_MEMALLOC, so reclaim is
pointless.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux