Re: [PATCH] BUG: gfp_zone() not mapping zone modifiers correctly and bad ordering of fallback lists

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, 13 Jan 2006, Dave Hansen wrote:

> On Fri, 2006-01-13 at 12:16 -0800, Andrew Morton wrote:
> > [email protected] (Mel Gorman) wrote:
> > > build_zonelists() attempts to be smart, and uses highest_zone() so that it
> > > doesn't attempt to call build_zonelists_node() for empty zones.  However,
> > > build_zonelists_node() is smart enough to do the right thing by itself and
> > > build_zonelists() already has the zone index that highest_zone() is meant
> > > to provide. So, remove the unnecessary function highest_zone().
> >
> > Dave, Andy: could you please have a think about the fallback list thing?
>
> It's bogus.  Mel, I didn't take a close enough look when we were talking
> about it earlier, and I fear I led you astray.  I misunderstood what it
> was trying to do, and though that the zone_populated() check replaced
> the highest_zone() check, when they actually do completely different
> things.
>
> highest_zone(zone_nr) actually means, given these "zone_bits" (which is
> actually a set of __GFP_XXXX flags), what is the highest zone number
> that we could possibly use to satisfy an allocation with those __GFP
> flags.
>

True, but what is passed in is not "zone_bits", but a ZONE_something type.
Given the bits __GFP_HIGHMEM, it would return ZONE_HIGHMEM. What actually
happens is highest_zone(ZONE_HIGHMEM) gets called and returns ZONE_DMA.
The fallback lists then look like;

[17179569.184000] Dumping pgdat fallback lists
[17179569.184000] 0   Normal      DMA
[17179569.184000] 1      DMA
[17179569.184000] 2  HighMem   Normal      DMA
[17179569.184000] 3      DMA

This currently happens to work. It goes wrong when an additional zone is
created unless more bits are consumed for the zone modifiers than is
really required.

The additional zone is for a zone-based approach to anti-fragmentation.

> We can't just get rid of it.  If we do, we might put a highmem zone in
> the fallback list for a normal zone.  Badness.
>

That would be obviously broken, but I am having trouble seeing how it
could happen as with this patch, we start with the highest possible zone
that can be used and count back.

> So, Mel, I have couple of patches that I put together that the two
> copies of build_zonelists(), and move some of build_zonelists() itself
> down into build_zonelists_node(), including the highest_zone() call.
> They're no good to you by themselves.  But, I think we can make a little
> function to go into the loop in build_zonelists_node().  The new
> function would ask, "can this zone be used to satisfy this GFP mask?"
> We'd start the loop at the absolutely highest-numbered zone.  I think
> that's a decently clean way to do what you want with the reclaim zone.
>

Ok.

> In the process of investigating this, I noticed that Andy's nice
> calculation and comment for GFP_ZONETYPES went away.  Might be nice to
> put it back, just so we know how '5' got there:
>
> http://www.kernel.org/git/gitweb.cgi?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=ac3461ad632e86e7debd871776683c05ef3ba4c6
>
> Mel, you might also want to take a look at what Linus is suggesting
> there.
>
> -- Dave
>

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux