This is V2 of the memory compaction patches. They depend on the first two
patches from the memory hot-remove patchset, which I've included here rolled
up as the first patch. All comments are welcome and they should be in a state
useful for wider testing.

Changelog since V1
o Bug fix when checking if a given node ID is valid or not
o Using latest patch from Kame-san to compact memory in-kernel
o Added trigger for direct compaction instead of direct reclaim
o Obey watermarks in split_pagebuddy_pages()
o Do not call lru_add_drain_all() frequently

The patchset implements memory compaction for the page allocator, reducing
external fragmentation so that free memory exists as fewer, but larger
contiguous blocks. Rather than being a full defragmentation solution,
it focuses exclusively on pages that are movable via the page migration
mechanism.

The compaction mechanism operates within a zone and moves movable pages
towards the higher PFNs. Grouping pages by mobility biases the location
of unmovable pages towards the lower addresses, so the two strategies
work in conjunction.

A full compaction run involves two scanners operating within a zone - a
migration and a free scanner. The migration scanner starts at the beginning
of a zone and finds all movable pages within one pageblock_nr_pages-sized
area and isolates them on a migratepages list. The free scanner begins at
the end of the zone and searches on a per-area basis for enough free pages to
migrate all the pages on the migratepages list. As each area is respectively
migrated or exhausted of free pages, the scanners are advanced one area.
A compaction run completes within a zone when the two scanners meet.
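
As a rough illustration of the two-scanner walk described above, here is a
toy user-space model. It is a sketch under stated assumptions, not the code
in the patches: the zone is a flat list of page states, area_pages stands in
for pageblock_nr_pages, and the names compact_zone, take_free_page and
largest_free_run are made up for this example. Pages are migrated one at a
time, and anything still isolated when the scanners meet simply stays put.

```python
FREE, MOVABLE, UNMOVABLE = "F", "M", "U"


def take_free_page(zone, area, area_pages):
    """Return the highest free PFN in the given area, or None if it has none."""
    start = area * area_pages
    for pfn in range(start + area_pages - 1, start - 1, -1):
        if zone[pfn] == FREE:
            return pfn
    return None


def compact_zone(zone, area_pages):
    """Compact a toy zone in place and return the number of pages migrated."""
    if len(zone) % area_pages:
        raise ValueError("zone length must be a multiple of area_pages")
    migrate_area = 0                          # migration scanner: zone start
    free_area = len(zone) // area_pages - 1   # free scanner: zone end
    migrated = 0
    while migrate_area < free_area:
        start = migrate_area * area_pages
        # Isolate every movable page in the current area on a list.
        migratepages = [pfn for pfn in range(start, start + area_pages)
                        if zone[pfn] == MOVABLE]
        for src in migratepages:
            dst = None
            # Advance the free scanner down one area whenever it runs dry.
            while migrate_area < free_area:
                dst = take_free_page(zone, free_area, area_pages)
                if dst is not None:
                    break
                free_area -= 1
            if dst is None:
                break                         # scanners met, leave the rest
            zone[dst], zone[src] = MOVABLE, FREE
            migrated += 1
        migrate_area += 1
    return migrated


def largest_free_run(zone):
    """Length of the longest run of contiguous free pages."""
    best = run = 0
    for state in zone:
        run = run + 1 if state == FREE else 0
        best = max(best, run)
    return best


if __name__ == "__main__":
    zone = list("UMFM" "FMFM" "FFMF" "FFFF")
    print("before:", "".join(zone), largest_free_run(zone))
    print("migrated:", compact_zone(zone, area_pages=4))
    print("after: ", "".join(zone), largest_free_run(zone))
```

In this model the unmovable page at the low end stays where it is while the
movable pages collect at the high PFNs, which is the interaction with
grouping pages by mobility described earlier.
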
This is what /proc/buddyinfo looks like before and after a compaction run.
mel@arnold:~/results$ cat before-buddyinfo.txt
Node 0, zone      DMA    150     33      6      4      2      1      1      1      1      0      0
Node 0, zone   Normal   7901   3005   2205   1511    758    245     34      3      0      1      0
mel@arnold:~/results$ cat after-buddyinfo.txt
Node 0, zone      DMA    150     33      6      4      2      1      1      1      1      0      0
Node 0, zone   Normal   1900   1187    609    325    228    178    110     32      6      4     24
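
To read the two snapshots, recall that each column of /proc/buddyinfo is the
number of free blocks of order 0, 1, 2 and so on, so a count at order n
covers 2^n base pages. The sketch below totals the free pages per zone and
the share held in large blocks. The helper names parse_buddyinfo and
free_pages are hypothetical, and the order-9 threshold is only an example
cut-off, not something the patchset defines.

```python
BEFORE = """\
Node 0, zone DMA 150 33 6 4 2 1 1 1 1 0 0
Node 0, zone Normal 7901 3005 2205 1511 758 245 34 3 0 1 0
"""

AFTER = """\
Node 0, zone DMA 150 33 6 4 2 1 1 1 1 0 0
Node 0, zone Normal 1900 1187 609 325 228 178 110 32 6 4 24
"""


def parse_buddyinfo(text):
    """Map (node, zone name) to a list of free block counts indexed by order."""
    zones = {}
    for line in text.splitlines():
        fields = line.replace(",", " ").split()
        # Expected shape: Node <n> zone <name> <order-0 count> <order-1 count> ...
        if len(fields) < 5 or fields[0] != "Node" or fields[2] != "zone":
            continue
        zones[(int(fields[1]), fields[3])] = [int(c) for c in fields[4:]]
    return zones


def free_pages(counts, min_order=0):
    """Free base pages held in blocks of at least min_order."""
    return sum(n << order for order, n in enumerate(counts)
               if order >= min_order)


if __name__ == "__main__":
    for label, text in (("before", BEFORE), ("after", AFTER)):
        counts = parse_buddyinfo(text)[(0, "Normal")]
        print(label, free_pages(counts), free_pages(counts, min_order=9))
```

For the Normal zone the total free page count is nearly the same in both
snapshots (57859 before, 57950 after), while the pages held in blocks of
order 9 or above go from 512 to 26624. Compaction has not freed memory, it
has rearranged what was already free into larger blocks.
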
Memory compaction may be triggered explicitly by writing a node number to
/proc/sys/vm/compact_node. When a process fails to allocate a high-order
page, it may compact memory in an attempt to satisfy the allocation. Explicit
compaction does not finish until the two scanners meet. Direct compaction
ends if a suitable page becomes available.
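
For anyone wanting to script the explicit trigger during testing, a minimal
sketch follows. It assumes a kernel with this patchset applied and enough
privilege to write to the file. The function name compact_node and the
trigger_path parameter are made up for the example, with the path as a
parameter only so the helper can be pointed at an ordinary file for a dry
run.

```python
def compact_node(node_id, trigger_path="/proc/sys/vm/compact_node"):
    """Request explicit compaction of one node by writing its number."""
    if node_id < 0:
        raise ValueError("node number must be non-negative")
    # Equivalent to: echo <node_id> > /proc/sys/vm/compact_node
    with open(trigger_path, "w") as trigger:
        trigger.write(f"{node_id}\n")


if __name__ == "__main__":
    compact_node(0)
```

Capturing /proc/buddyinfo immediately before and after the call gives the
kind of comparison shown earlier.
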
The first patch is a rollup from the memory hot-remove patchset. The two
patches after that are changes to page migration. The second patch allows
CONFIG_MIGRATION to be set without CONFIG_NUMA. The third patch allows
LRU pages to be isolated in batch instead of repeatedly acquiring and
releasing the LRU lock.

The fourth patch exports some metrics on external fragmentation which
are relevant to memory compaction. The fifth patch is what implements
memory compaction for a single zone. The sixth patch enables a node to be
compacted explicitly by writing to a special file in /proc and the final
patch implements direct compaction.

This version of the patchset should be usable on all machines and I
consider it ready for testing. It's passed tests here on x86, x86_64 and
ppc64 machines.

Here are some outstanding items on a TODO list, in no particular order.
o Have split_pagebuddy_order make blocks MOVABLE when the free page order
is greater than pageblock_order
o Avoid racing with other allocators during direct compaction by taking the
page the moment it becomes free
o Implement compaction_debug boot-time option like slub_debug
o Implement compaction_disable boot-time option just in case
o Investigate using debugfs as the manual compaction trigger instead of proc
--
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab