Re: [PATCH] VM: add vm.free_node_memory sysctl

On Wed, Aug 03, 2005 at 02:59:22PM -0500, Ray Bryant wrote:
> On Wednesday 03 August 2005 09:38, Andi Kleen wrote:
> > On Wed, Aug 03, 2005 at 10:24:40AM -0400, Martin Hicks wrote:
> > > On Wed, Aug 03, 2005 at 04:15:29PM +0200, Andi Kleen wrote:
> > > > On Wed, Aug 03, 2005 at 09:56:46AM -0400, Martin Hicks wrote:
> > > > > Here's the promised sysctl to dump a node's pagecache.  Please
> > > > > review!
> > > > >
> > > > > This patch depends on the zone reclaim atomic ops cleanup:
> > > > > http://marc.theaimsgroup.com/?l=linux-mm&m=112307646306476&w=2
> > > >
> > > > Doesn't numactl --bind=node memhog nodesize-someslack do the same?
> > > >
> > > > It just might kick in the oom killer if someslack is too small
> > > > or someone has unfreeable data there. But then there should be
> > > > already a sysctl to turn that one off.
> > >
> Hmmm.... What happens if there are already mapped pages (e. g. mapped in the 
> sense that pages are mapped into an address space) on the node and you want 
> to allocate some more, but can't because the node is full of clean page cache 
> pages?   Then one would have to set the memhog argument to the right thing to 

If you have a bind policy in the memory grabbing program, then the standard
try_to_free_pages should DTRT. That is because we generate a custom zonelist
containing only the zones of the bound nodes, and zone reclaim only looks
into those.

With preferred or other policies it's different though: in those cases
t_t_f_p will also look into other nodes because the policy is not binding.
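
As a rough illustration of the difference between binding and non-binding
policies, here is a toy model in Python. It is not kernel code: the function
name build_zonelist, the integer node ids, and the simple "preferred first,
then everything else in order" fallback are all assumptions made for the
sketch (a real fallback order would depend on node distance).

```python
# Toy model only: nodes are integers, and a "zonelist" is the ordered list
# of nodes that allocation and reclaim may consider. Not kernel code.

def build_zonelist(policy, nodes, all_nodes):
    """Return the ordered nodes eligible under the given policy."""
    if policy == "bind":
        # Binding policy: only the bound nodes are eligible, nothing else.
        return [n for n in all_nodes if n in nodes]
    if policy == "preferred":
        # Non-binding policy: preferred nodes first, then every other node.
        preferred = [n for n in all_nodes if n in nodes]
        others = [n for n in all_nodes if n not in nodes]
        return preferred + others
    raise ValueError(f"unknown policy: {policy}")

ALL_NODES = [0, 1, 2, 3]
bind_list = build_zonelist("bind", {1}, ALL_NODES)
preferred_list = build_zonelist("preferred", {1}, ALL_NODES)

print(bind_list)       # reclaim in this model would scan node 1 only
print(preferred_list)  # reclaim in this model may also scan nodes 0, 2, 3
```

In this model a reclaim pass that walks the returned list stays on node 1
under "bind", but is free to wander to other nodes under "preferred".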

That said, it might be possible to make non-bind policies more aggressive
at freeing in the current node before looking into other nodes.
I think the zone balancing has been mostly tuned on non-NUMA systems, so
some improvements might be possible here.
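
The idea of freeing on the current node before falling back can be sketched
with another toy model. Everything here is hypothetical: the allocate
function, the reclaim_local_first switch, and the page counts are invented
for illustration and do not describe existing kernel behaviour.

```python
# Toy model only: "free" and "reclaimable" map node id -> page count, where
# reclaimable stands for clean page cache that could simply be dropped.

def allocate(pages_needed, order, free, reclaimable, reclaim_local_first):
    """Return the node the allocation is served from, or None on failure.

    order lists the candidate nodes with the local node first.
    """
    local = order[0]
    if reclaim_local_first and free[local] < pages_needed:
        # Drop just enough clean page cache on the local node, if available.
        shortfall = pages_needed - free[local]
        reclaimed = min(shortfall, reclaimable[local])
        reclaimable[local] -= reclaimed
        free[local] += reclaimed
    for node in order:
        if free[node] >= pages_needed:
            free[node] -= pages_needed
            return node
    return None

def fresh_state():
    # Node 0 is nearly full of clean page cache; node 1 is mostly free.
    return {0: 10, 1: 100}, {0: 90, 1: 0}

free, cache = fresh_state()
fallback_node = allocate(50, [0, 1], free, cache, reclaim_local_first=False)

free, cache = fresh_state()
local_node = allocate(50, [0, 1], free, cache, reclaim_local_first=True)

print(fallback_node, local_node)
```

With the switch off, the allocation spills to the remote node even though
the local node only holds droppable page cache; with it on, the local node
is reclaimed first and the allocation stays local.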

Most people don't use BIND, and changing the default policies like this
might give NUMA systems a better "out of the box" experience.  However, this
memory balancing code is very subtle and easy to break, so this would need
some care.

I don't think sysctls or new syscalls are the way to go here though.

-Andi