On Tue, 2006-09-12 at 09:26 +0100, Andy Whitcroft wrote:
> keith mannthey wrote:
> > On Mon, 2006-09-11 at 22:44 +0100, Andy Whitcroft wrote:
> >
> >
> >>>> The primary reason that the mem_map is cut from the end of ZONE_NORMAL
> >>>> is so that memory that would back that stolen KVA gets pushed out into
> >>>> ZONE_HIGHMEM, the boundary between them is moved down. By using
> >>>> reserve_bootmem we will mark the pages which are currently backing the
> >>>> KVA you are 'reusing' as reserved and prevent their release; we pay
> >>>> double for the mem_map.
> >>> Perhaps just freeing the reserve pages and remapping them at an
> >>> appropriate time could accomplish this? Sorry I don't know the KVA
> >>> "freeing" path can you describe it a little more? When are these pages
> >>> returned to the system? It was my understanding that that KVA pages
> >>> were lost (the original wayu shrinks ZONE_NORMAL and creates a hole
> >>> between the zones).
> >>
> >> No it does seem like we loose the memory at the end of NORMAL when we
> >> shrink it, but really happens is we move the boundary down. Any page
> >> above the boundary is then in HIGHMEM and available to be allocated.
> >
> > How is it available for allocation? I see it is in highmem but the
> > pmd's for the kva area are set with node local information. I don't see
> > any special code to reclaim the kva area or extend ZONE_HIGHMEM.... How
> > does having the KVA area in ZONE_HIGHMEM allow you to reclaim it?
> > (sorry if this is an easy question but I an still sorting out how it is
> > "reclaimed" in the original implementation and why it can't be reclaimed
> > as part of ZONE_NORMAL).
>
> If the boundary is at page 1024 then pages 0-1024 are direct mapped into
> KVA, page 1024 is not mapped at all and part of HIGHMEM, when that zone
> is freed up later those pages will be in the HIGHMEM zone and
> allocatable. Move the boundary down to 1000 then 24 pages of KVA will
> no longer be used as direct maps, and pages from 1000 onwards are in
> zone HIGHMEM, non direct mapped, but available to allocatable. Cut a
> section out of the middle of NORMAL and you just have to bin the pages
> 'behind' that KVA as you can persuade them to be part of HIGHMEM. You
> can't adjust the boundary to allow for it.
>
> The key here is that moving the end of NORMAL downwards moves the
> beginning of HIGHMEM downwards also pulling the pages into HIGHMEM
> implicitly, that cannot be done anywhere but at the end of NORMAL.
>
> >>>> If the initrd's are falling into this space, can we not allocate some
> >>>> bootmem for those and move them out of our way? As filesystem images
> >>>> they are essentially location neutral so this should be safe?
> >>> AFAIK bootloaders choose where map initrds. Grub seems to put it around
> >>> the top of ZONE_NORMAL but it is pretty free to map it where it wants. I
> >>> suppose INITRD_START INITRD_END and all that could be dynamic and moved
> >>> around a bit but it seems a little messy. I would rather see the special
> >>> case (i386 numa the rare beast it is) jump thought a few extra hoops
> >>> than to muck with the initrd code.
> >> Right we can't change where grub puts it. But doesn't it tell us where
> >> it is as part of the kernel parameterisation. That would allow us to
> >> move it out of our way and change the parameters to that new location,
> >> allowing normal processing to find it in the new location.
> >
> > Yea we know right where the initrd is at. All this code is running
> > before the bootmem allocator is even setup in fact this function is
> > setting everything up to call setup_bootmem_allocator (at the end of the
> > function)...
> >
>
> Right, so the instant we detect it at the location grub specifies can we
> not just move it somewhere else random like 20Mb up or something and
> change the pointer, then carry on?
>
> > Are you sure there isn't another way to reclaim these pages?
>
> Not that I am aware of, as if a page is in NORMAL it is expected to be
> direct mapped, you have stolen their mapping for KVA thus they have to
> be junked.
>
> >> Be interested to see the layout during boot on one of these boxes :).
> >
> > It is as easy as booting with an initrd :) I can post some initrd
> > locations it a little while.
kernel /vmlinuz-2.6.18-km ro root=/dev/VolGroup00/LogVol00
console=ttyS0,115200
console=tty0
[Linux-bzImage, setup=0x1e00, size=0x1ca562]
initrd /initrd-18km
[Linux-initrd @ 0x37ddf000, 0x21008a bytes]
Linux version 2.6.17 (root@elm3a25) (gcc version 4.1.1 20060817 (Red Hat
4.1.1-18)) #1 SMP Wed Sep 13 01:42:30 PDT 2006
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009c400 (usable)
BIOS-e820: 000000000009c400 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 00000000eff91840 (usable)
BIOS-e820: 00000000eff91840 - 00000000eff9c340 (ACPI data)
BIOS-e820: 00000000eff9c340 - 00000000f0000000 (reserved)
BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
BIOS-e820: 0000000100000000 - 00000001d0000000 (usable)
Node: 0, start_pfn: 0, end_pfn: 156
Node: 0, start_pfn: 256, end_pfn: 982929
Node: 0, start_pfn: 1048576, end_pfn: 1900544
My box isn't booting numa right now. My SRAT isn't getting loaded right
I am looking at that first.
Thanks,
Keith
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]