Mel Gorman wrote:
On Mon, 4 Dec 2006, Andrew Morton wrote:
, but I would of course prefer to avoid merging the anti-frag patches
simply based on their stupendous size. It seems to me that lumpy-reclaim
is suitable for the e1000 problem, but perhaps not for the hugetlbpage
problem.
I believe you'll hit similar problems even with lumpy-reclaim for the
e1000 (I've added Andy to the cc so he can comment more). Lumpy provides
a much smarter way of freeing higher-order contiguous blocks without
having to reclaim 95%+ of memory - this is good. However, if you are
currently seeing situations where the allocation fails even after you
page out everything possible, smarter reclaim that eventually pages out
everything anyway will not help you (chances are it's something like
page tables that are in your way).
The pre-lumpy algorithm is capable of producing reasonable numbers of
very low order pages. Lumpy should improve on that, producing successful
reclaim at higher orders. Its success is limited, however, by the
percentage of non-reclaimable pages and their distribution.
The e1000 problem is that it wants order=3 pages, i.e. 8 pages in size.
For lumpy to have a high chance of success we would need the average
unmovable page count to be significantly less than 1 in 8 pages
(assuming a random distribution), i.e. less than ~12% of memory pinned.
In stress testing we find we can reclaim on the order of 70% of memory,
which tends to indicate that the pinned memory is more like 25% than
10%. That suggests reclaim success rates above order=2 are going to be
poor without explicit placement control.
Obviously this all depends on the workload. Our test workloads are
known to be fairly hostile in terms of fragmentation. So I would love
to see lumpy tested in the problem scenario to get some data on that setup.
This is where anti-frag comes in. It clusters pages together based on
their type - unmovable, reapable (inode caches, short-lived kernel
allocations, skbuffs etc) and movable. When kswapd kicks in, the slab
caches will be reaped. As reapable pages are clustered together, that
will free some contiguous areas - probably enough for the e1000
allocations to succeed!
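For illustration, here is a minimal userspace sketch of the clustering
idea (not the actual anti-frag patch; the flag names are made up):
allocations are tagged by mobility type and kept in separate regions,
so reaping one region yields contiguous space.

/* Simplified sketch of the clustering idea, not the real patch: tag each
 * allocation with a mobility type and satisfy it from a per-type region,
 * so unmovable pages never end up scattered through movable memory.
 */
#include <stdio.h>

enum mobility { UNMOVABLE, REAPABLE, MOVABLE, NR_TYPES };

/* Hypothetical flag bits standing in for the allocator's GFP hints. */
#define ALLOC_KERNEL_PINNED	0x1	/* e.g. page tables */
#define ALLOC_SHORT_LIVED	0x2	/* e.g. inode caches, skbuffs */

static enum mobility classify(unsigned int flags)
{
	if (flags & ALLOC_KERNEL_PINNED)
		return UNMOVABLE;
	if (flags & ALLOC_SHORT_LIVED)
		return REAPABLE;
	return MOVABLE;			/* user pages, page cache */
}

int main(void)
{
	static const char *names[NR_TYPES] = {
		"unmovable", "reapable", "movable"
	};

	printf("page table -> %s\n", names[classify(ALLOC_KERNEL_PINNED)]);
	printf("skbuff     -> %s\n", names[classify(ALLOC_SHORT_LIVED)]);
	printf("user page  -> %s\n", names[classify(0)]);
	return 0;
}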
If that doesn't work, kswapd and direct reclaim will start reclaiming
the "movable" pages. Without lumpy reclaim, 95%+ of memory could be
paged out, which is bad. Lumpy finds the contiguous pages faster and
with less IO, which is why it is important.
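A toy model of the lumpy idea (not the kernel implementation): given an
LRU reclaim candidate, try to free the rest of its naturally aligned
order-N block rather than pulling unrelated pages off the LRU.

/* Toy model of the lumpy idea, not the kernel code: given a reclaim
 * candidate at pfn, try to free the whole naturally aligned order-N
 * block around it rather than unrelated pages elsewhere on the LRU.
 */
#include <stdio.h>
#include <stdbool.h>

#define NR_PAGES 32
#define ORDER    3			/* order-3 == 8 pages, the e1000 case */

static bool pinned[NR_PAGES];		/* pages we cannot reclaim */
static bool in_use[NR_PAGES];

/* Try to empty the aligned 2^order block containing pfn. */
static bool lumpy_reclaim(unsigned int pfn, unsigned int order)
{
	unsigned int base = pfn & ~((1U << order) - 1);
	unsigned int i;

	for (i = base; i < base + (1U << order); i++)
		if (pinned[i])
			return false;	/* block unusable, give up early */

	for (i = base; i < base + (1U << order); i++)
		in_use[i] = false;	/* "write back and free" each page */
	return true;
}

int main(void)
{
	unsigned int pfn;

	for (pfn = 0; pfn < NR_PAGES; pfn++) {
		in_use[pfn] = true;
		pinned[pfn] = (pfn % 13 == 0);	/* scatter a few pinned pages */
	}

	for (pfn = 0; pfn < NR_PAGES; pfn++)
		if (lumpy_reclaim(pfn, ORDER)) {
			printf("freed an order-%d block at pfn %u\n",
			       ORDER, pfn & ~((1U << ORDER) - 1));
			break;
		}
	return 0;
}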
Tests I am aware of show that lumpy-reclaim on its own makes little or
no difference to the hugetlb page problem. However, with anti-frag,
hugetlb-sized allocations succeed much more often even when under memory
pressure.
At high order both traditional and lumpy reclaim are next to useless
without placement controls.
Whereas anti-fragmentation adds
vastly more code, but can address both problems? Or something.
Anti-frag goes a long way to addressing both problems. Lumpy-reclaim
increases its success rates under memory pressure and reduces the
amount of reclaim that occurs.
IOW: big-picture where-do-we-go-from-here stuff.
Start with lumpy reclaim, then I'd like to merge page clustering piece
by piece, ideally with one of the people with e1000 problems testing to
see whether it makes a difference.
Assuming they are shown to help, where we'd go from there would be stuff
like:
1. Keep non-movable and reapable allocations at the lower PFNs as much as
   possible. This is so DIMMs for higher PFNs can be removed (doesn't
   exist)
2. Use page migration to compact memory rather than depending solely on
   reclaim (doesn't exist; a toy sketch of the idea follows this list)
3. Introduce a mechanism for marking a group of pages as being offlined so
that they are not reallocated (code that does something like this
exists)
4. Resurrect the hotplug-remove code (exists, but probably very stale)
5. Allow allocations for hugepages outside of the pool as long as the
   process remains within its locked_vm limits (patches were posted to
   libhugetlbfs last Friday; will post to linux-mm tomorrow).
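For item 2, a purely illustrative toy model (nothing like this exists
yet): migrate movable pages toward one end of a zone so that contiguous
free blocks open up at the other end.

/* Toy model of compaction-by-migration (item 2 above), not real code:
 * movable pages are migrated toward low pfns so that contiguous free
 * space accumulates at the high end of the "zone".
 */
#include <stdio.h>
#include <stdbool.h>

#define NR_PAGES 16

static bool movable_in_use[NR_PAGES];

static void compact(void)
{
	unsigned int free_scan = 0;	/* finds free slots from the bottom */
	unsigned int migrate_scan;	/* finds movable pages from the top */

	for (migrate_scan = NR_PAGES; migrate_scan-- > 0; ) {
		if (!movable_in_use[migrate_scan])
			continue;
		while (free_scan < migrate_scan && movable_in_use[free_scan])
			free_scan++;
		if (free_scan >= migrate_scan)
			break;		/* scanners met, nothing left to move */
		/* "Migrate" the page: copy it down and free the old frame. */
		movable_in_use[free_scan] = true;
		movable_in_use[migrate_scan] = false;
	}
}

static void dump(const char *when)
{
	unsigned int pfn;

	printf("%s: ", when);
	for (pfn = 0; pfn < NR_PAGES; pfn++)
		putchar(movable_in_use[pfn] ? 'M' : '.');
	putchar('\n');
}

int main(void)
{
	unsigned int pfn;

	for (pfn = 0; pfn < NR_PAGES; pfn++)
		movable_in_use[pfn] = (pfn % 3 == 0);	/* scattered movable pages */

	dump("before");
	compact();
	dump("after ");
	return 0;
}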
-apw