Hello, Linus-san.
> NOTE! Even if the machine has 4GB or more of memory, it's entirely likely
> that the quick "use NODE(0)" hack will work fine.
>
> Why? Because the bootmem memory should still be allocated low-to-high by
> default, which means that as long as NODE(0) has _enough_ memory in the
> DMA range, we should be ok.
>
> So I _think_ the simple one-liner NODE(0) patch is sufficient, and should
> work (and is a lot more acceptable for 2.6.14 than switching the node
> ordering around yet again, or doing bigger surgery on the bootmem code).
>
> So the only thing that worried me (and made me ask whether there might be
> machines where it doesn't work) is if some machines might have their high
> memory (or no memory at all) on NODE(0). It does sound unlikely, but I
> simply don't know what kind of strange NUMA configs there are out there.
>
> And I'm definitely only interested in machines that are out there, not
> some theoretical issues.
On the IA64 machine we are developing, node 0 might not have any low memory,
and another node can have the low memory instead.
This can happen when a whole node is hot-added.
For example, please imagine the following case.
1) In this case, the firmware remembers that the node with pxm = 1 has
   low memory.

        node 0             node 1
   +--------------+   +-----------+
   |   pxm = 1    |   |  pxm = 2  |
   |  low memory  |   |           |
   +--------------+   +-----------+
2) If one node is hot-added at pxm = 0 (the firmware decides the pxm from
   the physical location), the new node will be node 2.

      node 2             node 0             node 1
   +-----------+   +--------------+   +-----------+
   |  pxm = 0  |   |   pxm = 1    |   |  pxm = 2  |
   |           |   |  low memory  |   |           |
   +-----------+   +--------------+   +-----------+
3) If the user reboots the machine, Linux assigns node ids in pxm order.
   But the firmware still remembers which node has low memory.
   So node 0 will not have any low memory.

      node 0             node 1             node 2
   +-----------+   +--------------+   +-----------+
   |  pxm = 0  |   |   pxm = 1    |   |  pxm = 2  |
   |           |   |  low memory  |   |           |
   +-----------+   +--------------+   +-----------+
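
The three steps above can be modelled with a small sketch. This is Python,
not kernel code; the function names are made up for illustration, and it
only assumes what is described above: node ids follow pxm order at boot,
and a hot-added node gets the next unused node id.

```python
def assign_boot_node_ids(pxms):
    """Assumption: at boot, node ids are handed out in ascending pxm order."""
    return {pxm: node_id for node_id, pxm in enumerate(sorted(pxms))}


def hot_add(node_ids, pxm):
    """Assumption: a hot-added node takes the next unused node id."""
    updated = dict(node_ids)
    updated[pxm] = len(node_ids)
    return updated


LOW_MEMORY_PXM = 1  # the firmware keeps low memory on the pxm = 1 node

# 1) First boot with pxm 1 and pxm 2 only.
first_boot = assign_boot_node_ids([1, 2])

# 2) A node is hot-added at pxm = 0.
after_hot_add = hot_add(first_boot, 0)

# 3) Reboot: node ids are assigned again from pxm order.
after_reboot = assign_boot_node_ids(after_hot_add.keys())

for label, ids in (("first boot", first_boot),
                   ("after hot-add", after_hot_add),
                   ("after reboot", after_reboot)):
    print(label, "-> low memory is on node", ids[LOW_MEMORY_PXM])
```

In this model the low memory is on node 0 until the reboot, and on node 1
afterwards.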
So the simple "use NODE(0)" hack is not enough for our machine.
If "use NODE(0)" is chosen, the kernel must sort the pgdat list and
the node ids by memory address. I think the hot-add code would become a
bit messy as a result.
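
To show what sorting by memory address would mean, here is a rough sketch
(again Python, not kernel code). The helper name and the start addresses
are invented for illustration; the only point is that node ids would
follow memory address instead of pxm order, so that node 0 always holds
the lowest memory.

```python
def renumber_by_address(nodes):
    """nodes: list of (pxm, start_addr), start_addr is None if memoryless.

    Returns {pxm: node_id}, lowest start address first and memoryless
    nodes last. Hypothetical helper, not an existing kernel function.
    """
    ordered = sorted(nodes, key=lambda n: (n[1] is None, n[1] or 0))
    return {pxm: node_id for node_id, (pxm, _addr) in enumerate(ordered)}


# Made-up layout matching step 3: only the pxm = 1 node has low memory.
nodes = [(0, 0x1_0000_0000), (1, 0x0), (2, 0x2_0000_0000)]
node_ids = renumber_by_address(nodes)
print(node_ids)
```

With this ordering the pxm = 1 node becomes node 0 again, but every
hot-add would have to keep the ordering consistent.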
Thanks.
--
Yasunori Goto