Re: [Lhms-devel] [PATCH 0/7] Fragmentation Avoidance V19

>> just to make sure I didn't get it wrong, wouldn't we get most of the 
>> benefits Andy is seeking by having a boot-time option which sets aside 
>> a "hugetlb zone", with an additional sysctl to grow (or shrink) the pool 
>> - with the growing happening on a best-effort basis, without guarantees?
> 
> Boot-time option to set the hugetlb zone, yes.
> 
> Grow-or-shrink, probably not. Not in practice after bootup on any machine 
> that is less than idle.
> 
> The zones have to be pretty big to make any sense. You don't just grow 
> them or shrink them - they'd be on the order of tens of megabytes to 
> gigabytes. In other words, sized big enough that you will _not_ be able to 
> create them on demand, except perhaps right after boot.
> 
> Growing these things later simply isn't reasonable. I can pretty much 
> guarantee that any kernel I maintain will never have dynamic kernel 
> pointers: when some memory has been allocated with kmalloc() (or 
> equivalent routines - pretty much _any_ kernel allocation), it stays put. 
> Which means that if there is a _single_ kernel alloc in such a zone, it 
> won't ever be then usable for hugetlb stuff.
> 
> And I don't want excessive complexity. We can have things like "turn off 
> kernel allocations from this zone", and then wait a day or two, and hope 
> that there aren't long-term allocs. It might even work occasionally. But 
> the fact is, a number of kernel allocations _are_ long-term (superblocks, 
> root dentries, "struct thread_struct" for long-running user daemons), and 
> it's simply not going to work well in practice unless you have set aside 
> the "no kernel alloc" zone pretty early on.

Exactly. But that's what all the anti-fragmentation stuff was about - trying
to pack unfreeable stuff together. 
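
To make the packing argument concrete, here is a toy model of it (not kernel
code, and not how the patch set is implemented). The block size, the
"pinned"/"movable" labels and all function names are assumptions made up for
illustration. It only counts how many large blocks stay free of unmovable
allocations when those allocations are scattered versus grouped.

```python
# Toy model only: memory is a list of blocks, each holding up to
# PAGES_PER_BLOCK base pages. All names and sizes here are hypothetical.
PAGES_PER_BLOCK = 4  # assumed tiny value to keep the example readable


def _check_capacity(allocs, num_blocks):
    if len(allocs) > num_blocks * PAGES_PER_BLOCK:
        raise ValueError("more allocations than page slots")


def place_interleaved(allocs, num_blocks):
    """Spread allocations round-robin over blocks, ignoring their type."""
    _check_capacity(allocs, num_blocks)
    blocks = [[] for _ in range(num_blocks)]
    for i, kind in enumerate(allocs):
        blocks[i % num_blocks].append(kind)
    return blocks


def place_grouped(allocs, num_blocks):
    """Pack pinned allocations into the lowest blocks, movable ones after."""
    _check_capacity(allocs, num_blocks)
    # Stable sort: pinned entries (key False) come before movable (key True).
    ordered = sorted(allocs, key=lambda kind: kind != "pinned")
    blocks = [[] for _ in range(num_blocks)]
    for i, kind in enumerate(ordered):
        blocks[i // PAGES_PER_BLOCK].append(kind)
    return blocks


def reclaimable_blocks(blocks):
    """Count blocks with no pinned page, i.e. usable as a large page
    once any movable contents have been migrated or freed."""
    return sum(1 for block in blocks if "pinned" not in block)


allocs = ["pinned"] * 4 + ["movable"] * 4
print(reclaimable_blocks(place_interleaved(allocs, 4)))  # 0
print(reclaimable_blocks(place_grouped(allocs, 4)))      # 3
```

With the same eight allocations, the scattered layout leaves no block that
could ever become a large page, while the grouped layout leaves three. A real
allocator cannot sort allocations after the fact, of course; it has to steer
them at allocation time.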

I don't think anyone is proposing dynamic kernel pointers inside Linux,
except that the hypervisor could possibly change the P-V mapping
underneath us, so that the physical address would change without the
kernel seeing it. Trouble is, that's mostly done at a larger-than-page
granularity, so we need SOME larger chunk to switch out (preferably at
least a large-page size, so we can continue to use large TLB entries for
the kernel mapping).

However, the statically sized option is hugely problematic too.

M.
