On Wed, 2 Nov 2005, Kamezawa Hiroyuki wrote:
> Ingo Molnar wrote:
> > so it's all about expectations: _could_ you reasonably remove a piece of
> > RAM? Customer will say: "I have stopped all nonessential services, and free
> > RAM is at 90%, still I cannot remove that piece of faulty RAM, fix the
> > kernel!". No reasonable customer will say: "True, I have all RAM used up in
> > mlock()ed sections, but i want to remove some RAM nevertheless".
> >
> Hi, I'm one of men in -lhms
>
> In my understanding...
> - Memory Hotremove on IBM's LPAR? approach is
> [remove some amount of memory from somewhere.]
> For this approach, Mel's patch will work well.
> But this will not guaranntee a user can remove specified range of
> memory at any time because how memory range is used is not defined by an
> admin
> but by the kernel automatically. But to extract some amount of memory,
> Mel's patch is very important and they need this.
>
> My own target is NUMA node hotplug, what NUMA node hotplug want is
> - [remove the range of memory] For this approach, admin should define
> *core* node and removable node. Memory on removable node is removable.
> Dividing area into removable and not-removable is needed, because
> we cannot allocate any kernel's object on removable area.
> Removable area should be 100% removable. Customer can know the limitation
> before using.
>
In this case, we would want some mechanism that says "don't put awkward
pages in this NUMA node" in a clear way. One way we could do this is;
1. Move fallback_allocs to be per-node. fallback_allocs is currently
defined as
int fallback_allocs[RCLM_TYPES-1][RCLM_TYPES+1] = {
{RCLM_NORCLM, RCLM_FALLBACK, RCLM_KERN, RCLM_EASY, RCLM_TYPES},
{RCLM_EASY, RCLM_FALLBACK, RCLM_NORCLM, RCLM_KERN, RCLM_TYPES},
{RCLM_KERN, RCLM_FALLBACK, RCLM_NORCLM, RCLM_EASY, RCLM_TYPES}
};
The effect is that a RCLM_NORCLM allocation, falls back to
RCLM_FALLBACK, RCLM_KERN, RCLM_EASY and then gives up.
2. Architectures would need to provide a function that allocates and
populates a fallback_allocs[][] array. If they do not provide one, a
generic function uses array like the one above
3. When adding a node that must be removable, make the array look like
this
int fallback_allocs[RCLM_TYPES-1][RCLM_TYPES+1] = {
{RCLM_NORCLM, RCLM_TYPES, RCLM_TYPES, RCLM_TYPES, RCLM_TYPES},
{RCLM_EASY, RCLM_FALLBACK, RCLM_NORCLM, RCLM_KERN, RCLM_TYPES},
{RCLM_KERN, RCLM_TYPES, RCLM_TYPES, RCLM_TYPES, RCLM_TYPES},
};
The effect of this is only allocations that are easily reclaimable will
end up in this node. This would be a straight-forward addition to build
upon this set of patches. The difference would only be visible to
architectures that cared.
> What I'm considering now is this:
> - removable area is hot-added area
> - not-removable area is memory which is visible to kernel at boot time.
> (I'd like to achieve this by the limitation : hot-added node goes into only
> ZONE_HIGHMEM)
ZONE_HIGHMEM can still end up with PTE pages if allocating PTE pages from
highmem is configured. This is bad. With the above approach, nodes that
are not hot-added that have a ZONE_HIGHMEM will be able to use it for PTEs
as well. But when a node is hot-added, it will have a ZONE_HIGHMEM that is
not used for PTE allocations because they are not RCLM_EASY allocations.
--
Mel Gorman
Part-time Phd Student Java Applications Developer
University of Limerick IBM Dublin Software Lab
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]