On (28/09/07 11:41), Christoph Lameter didst pronounce:
> On Fri, 28 Sep 2007, Peter Zijlstra wrote:
>
> > memory got massively fragemented, as anti-frag gets easily defeated.
> > setting min_free_kbytes to 12M does seem to solve it - it forces 2 max
> > order blocks to stay available, so we don't mix types. however 12M on
> > 128M is rather a lot.
>
> Yes, strict ordering would be much better. On NUMA it may be possible to
> completely forbid merging.
The forbidding of merging is trivial and the code is isolated to one function
__rmqueue_fallback(). We don't do it because the decision at development
time was that it was better to allow fragmentation than take a reclaim step
for example[1] and slow things up. This is based on my initial assumption
of anti-frag being mainly of interest to hugepages which are happy to wait
long periods during startup or fail.
> We can fall back to other nodes if necessary.
> 12M is not much on a NUMA system.
>
> But this shows that (unsurprisingly) we may have issues on systems with a
> small amounts of memory and we may not want to use higher orders on such
> systems.
>
This is another option if you want to use a higher order for SLUB by
default. Use order-0 unless you are sure there is enough memory. At boot
if there is loads of memory, set the higher order and up min_free_kbytes on
each node to reduce mixing[2]. We can test with Peters uber-hostile
case to see if it works[3].
> The case you got may be good to use as a testcase for the virtual
> fallback. Hmmmm...
For sure.
> Maybe it is possible to allocate the stack as a virtual
> compound page. Got some script/code to produce that problem?
>
[1] It might be tunnel vision but I still keep hugepages in mind as the
principal user of anti-frag. Andy used to have patches that force evicted
pages of the "foreign" type when mixing occured so the end result was
no mixing. We never fully completed them because it was too costly
for hugepages.
[2] This would require the identification of mixed blocks to be a
statistic available in mainline. Right now, it's only available in -mm
when PAGE_OWNER is set
[3] The definition of working in this case being that order-0
allocations fail which he has produced
--
Mel Gorman
Part-time Phd Student Linux Technology Center
University of Limerick IBM Dublin Software Lab
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]