Re: assert/crash in __rmqueue() when enabling CONFIG_NUMA

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Bob Picco wrote:
> Nick Piggin wrote:	[Tue May 02 2006, 10:25:57AM EDT]
> 
>>Martin J. Bligh wrote:
>>
>>>>Oh that's a 32bit kernel. I don't think the 32bit NUMA has ever worked
>>>>anywhere but some Summit systems (at least every time I tried it it 
>>>>blew up on me and nobody seems to use it regularly). Maybe it would be 
>>>>finally time to mark it CONFIG_BROKEN though or just remove it (even 
>>>>by design it doesn't work very well) 
>>>
>>>
>>>Bollocks. It works fine, and is tested every single day, on every git
>>>release, and every -mm tree.
>>
>>Whatever the case, there definitely does not appear to be sufficient
>>zone alignment enforced for the buddy allocator. I cannot see how it
>>could work if zones are not aligned on 4MB boundaries.
>>
>>Maybe some architectures / subarch code naturally does this for us,
>>but Ingo is definitely hitting this bug because his config does not
>>(align, that is).
>>
>>I've randomly added a couple more cc's.
>>
> 
> The patch below isn't compile tested or correct for those cases where
> alloc_remap is called or where arch code has allocated node_mem_map for
> CONFIG_FLAT_NODE_MEM_MAP. It's just conveying what I believe the issue is.
> 
> Andy added code to buddy allocator which doesn't require the zone's endpoints
> to be aligned to MAX_ORDER. I think the issue is that the buddy
> allocator requires the node_mem_map's endpoints to be MAX_ORDER aligned. 
> Otherwise __page_find_buddy could compute a buddy not in node_mem_map
> for partial MAX_ORDER regions at zone's endpoints. page_is_buddy will
> detect that these pages at endpoints aren't PG_buddy (they were zeroed
> out by bootmem allocator and not part of zone).  Of course the negative
> here is we could waste a little memory but the positive is eliminating
> all the old checks for zone boundary conditions.

Yes this is correct.  The buddy location algorithm uses the relative pfn
number to locate the buddy.  Both the old anew new free detect
algorithms require a struct page exist for that buddy regardless of
whether the page exists in memory.  Thus the node_mem_map needs to exist
out to a MAX_ORDER boundry in both directions.  With real machines we
would likely get this as memory is mostly in larger chunks than
MAX_ORDER and generally maximally aligned.  From what I can see we could
potentially not be allocating the end correctly but the rmap allocation
would likely be larger than the request so we'd get away with it.

> 
> SPARSEMEM won't encounter this issue because of MAX_ORDER size
> constraint when SPARSEMEM is configured. ia64 VIRTUAL_MEM_MAP doesn't
> need the logic either because the holes and endpoints are handled
> differently.  This leaves checking alloc_remap and other arches which
> privately allocate for node_mem_map.
> 
> Any how I could be totally wrong but like I said this requires more
> thought.

I'll have a go at testing this to see what difference it makes.  I think
there might be a problem with the buddy merging at zone boundries as we
are anyhow (or the code commentry is incomplete), so am going to have a
look at that at the same time.

-apw
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux