Re: [PATCH 4/7] ppc64 - Specify amount of kernel memory at boot time

On Thu, 23 Feb 2006, Dave Hansen wrote:

On Thu, 2006-02-23 at 18:01 +0000, Mel Gorman wrote:
On Thu, 23 Feb 2006, Dave Hansen wrote:
OK, back to the hapless system admin using kernelcore. They have a
4-node system with 2GB of RAM in each node for 8GB total.  They use
kernelcore=1GB.  They end up with 4x1GB ZONE_DMA and 4x1GB
ZONE_EASYRCLM.  Perfect.  You can safely remove 4GB of RAM.

Now, imagine that the machine has been heavily used for a while, there
is only 1 node's memory available, but CPUs are available in the same
places as before.  So, you start up your partition again and have 8GB of
memory in one node.  Same kernelcore=1GB option.  You get 1x7GB ZONE_DMA
and 1x1GB ZONE_EASYRCLM.  I'd argue this is going to be a bit of a
surprise to the poor admin.


That sort of surprise is totally unacceptable but the behaviour of
kernelcore needs to be consistent on both the x86 and the ppc (and any
other arch). How about:

1. kernelcore=X determines the total amount of memory for !ZONE_EASYRCLM
    (be it ZONE_DMA, ZONE_NORMAL or ZONE_HIGHMEM)

Sounds reasonable.  But, if you're going to do that, should we just make
it the opposite and explicitly be easy_reclaim_mem=?  Do we want the
limit to be set as "I need this much kernel memory", or "I want this
much removable memory".  I dunno.


I think we should keep it at kernelcore=. If you have too little easyrclm
memory, then hot-remove and hugetlb availability is impaired. If you have
too little kernel memory, you have a really bad day.

2. For every node that can have ZONE_EASYRCLM, split the kernelcore across
    the nodes as a percentage of the node size

    Example: 4 nodes, 1GiB each, kernelcore=512MB
        node 0 ZONE_DMA = 128MB
        node 1 ZONE_DMA = 128MB
        node 2 ZONE_DMA = 128MB
        node 3 ZONE_DMA = 128MB

    Example: 2 nodes, 3GiB and 1GiB, kernelcore=512MB
        node 0 ZONE_DMA = 384MB
        node 1 ZONE_DMA = 128MB

It gets a bit more complex on NUMA for x86 because ZONE_NORMAL is
involved but the idea would essentially be the same.
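
The proportional split in the examples above can be sketched in a few
lines of C. This is only an illustration of the arithmetic, not code from
the patch: split_kernelcore() and its parameters are hypothetical names,
sizes are plain megabyte counts, and the rounding remainder from the
integer division is simply dropped, which a real implementation would
have to assign somewhere.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a kernel function): give each node a share of
 * kernelcore_mb proportional to that node's size.  All sizes are in MB.
 * The result for node i is written to kernel_mb[i].
 */
static void split_kernelcore(const unsigned long long *node_mb, int nr_nodes,
                             unsigned long long kernelcore_mb,
                             unsigned long long *kernel_mb)
{
    unsigned long long total_mb = 0;
    int i;

    assert(node_mb != NULL && kernel_mb != NULL);

    /* Total memory across all nodes that take part in the split. */
    for (i = 0; i < nr_nodes; i++)
        total_mb += node_mb[i];

    for (i = 0; i < nr_nodes; i++) {
        unsigned long long share;

        if (total_mb == 0) {
            kernel_mb[i] = 0;
            continue;
        }

        /* Node's share = kernelcore * (node size / total size). */
        share = kernelcore_mb * node_mb[i] / total_mb;

        /* A node cannot hold more kernel memory than it has. */
        kernel_mb[i] = share > node_mb[i] ? node_mb[i] : share;
    }
}
```

With node sizes {1024, 1024, 1024, 1024} and kernelcore_mb = 512 every
node gets 128MB; with {3072, 1024} the nodes get 384MB and 128MB, matching
the two examples.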

Yes, chopping it up seems like the right thing (or as close as we can
get) to me.


Ok, will rework the code to make it happen.

--
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
