Re: 2.6.15-rc1-mm2 -- Bad page state at free_hot_cold_page (in process 'aplay', page c18eef30)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



At Mon, 21 Nov 2005 15:46:50 +0000 (GMT),
Hugh Dickins wrote:
> 
> On Mon, 21 Nov 2005, Takashi Iwai wrote:
> > 
> > Sorry, I still don't figure out how __GFP_COMP solved this problem.
> > Could you enlighten me a bit?
> 
> The sequence of problems was this.
> 
> Nick's core PageReserved removal patch in 2.6.15-rc1 (and -rc2) 
> changed VM_RESERVED vmas never to free their pages on unmapping (e.g.
> on exit) - fine for remap_pfn_range areas, but a leak where others set
> VM_RESERVED; and PageReserved not to inhibit decrementing page count.
> 
> In -rc1-mm2 I tried to fix that leak by restoring VM_RESERVED to its
> previous behaviour, and using a different flag VM_UNPAGED, set in
> remap_pfn_range, for the don't-free-when-unmapping behaviour.
> 
> But there's then a problem when the underlying page was allocated
> as a high-order page, but the separate individual 0-order constituent
> pages are mapped into userspace by nopage: the page count of the first
> 0-order is raised by allocation, but the following 0-order pages are
> left with page count 0.  nopage's get_page raises that to 1,
> zap_pte_range (or whatever it uses to actually do the freeing) lowers
> that to 0, and hence frees the page, even though it's a constituent of
> the not-yet-freed high-order page.  (This had not been a problem while
> PageReserved was inhibiting decrementing the page count.)
> 
> So another of my patches in -rc1-mm2 made the PageCompound technique
> available always, no longer under #ifdef CONFIG_HUGETLB_PAGE: so that
> get_page and put_page on the later constituents of the high-order
> page get redirected to the first one, and it should work okay again.
> 
> Except that I'd missed that you actually have to choose to have your
> high-order pages supplied as compound pages, by passing __GFP_COMP.
> Since I wasn't passing that, they still weren't allocated as compound
> pages, so were still being freed too soon - and the PG_reserved flag
> found while freeing gave rise to the "Bad page state" messages seen.

I see, thanks for explanation!

Now another question arises:  Which is the recommended method for
mmapping RAM pages, vma nopage callback or remap_pfn_range()?

IIRC, in the ealier versions, the former was recommended because
remap_page_range() with page-reserve was regarded as a hack.
But, looking through these changes, I feel that remap_pfn_range() is
better (easier and stabler) than vma nopage...


> > Isn't it needed for dma_alloc_coherent() (for i386, particularly),
> > too?  dma_alloc_coherent() also gets pages with __get_free_pages().
> 
> Didn't I deal with that by adding __GFP_COMP in snd_malloc_dev_pages?

Oh yes, I overlooked it.  It must be fine.


> And (in a separate patch run past davem and wli first, to be aggregated
> with the sound/core/memalloc patch when I sign off and send to Andrew)
> in the sparc and sparc64 sbus_alloc_consistent.
> 
> It's only an issue when the high-order page might be mapped into
> userspace, then its constituents freed up by zap_pte_range; or
> locked down with get_user_pages and later released: when constituents
> of a high-order page pass through common code designed for 0-order pages.
> 
> > Also, I think we can remove all Set/ClearPageReserved() in memalloc.c
> > now.  It was there just for mmap...
> 
> That is so, but we'd prefer you to hold off for now.  The way it's
> proceeding is, for 2.6.15 we're not actually removing the Set/Clear
> PageReserved from any of the drivers or from any of the architecture
> initialization; but PageReserved is no longer serving any functional
> purpose, except where PG_reserved acts as a double-check when the page
> is freed, as to whether it's all working right.  Which was useful in
> this case, to identify where I'd forgotten all about __GFP_COMP; and
> I fear may prove useful in some other cases too.  Retaining this use
> of PG_reserved for now, gives greater confidence in our safety when we
> later advance to removing all the Set/ClearPageReserved hocus-pocus.

OK.  Let's fix them right later.


Takashi
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux