On 10/30/06, Balbir Singh <[email protected]> wrote:
+----+---------+------+---------+------------+----------------+-----------+
|ii  | No      | Yes  | configfs| Memory,    | Plans to       | Yes       |
|    |         |      |         | task limit.| provide a      |           |
|    |         |      |         | Plans to   | framework      |           |
|    |         |      |         | allow      | to write new   |           |
|    |         |      |         | CPU and I/O| controllers    |           |
I have a port of Rohit's memory controller to run over my generic containers.
d. Fake NUMA Nodes
This approach was suggested while discussing the memory controller.
Advantages
(i) Accounting for zones is already present
(ii) Reclaim code can directly deal with zones
Disadvantages
(i) The approach leads to hard partitioning of memory.
(ii) It's complex to resize the node. Resizing is required to allow
change of limits for resource management.
(iii) Addition/deletion of a resource group would require memory hotplug
support to add/delete a node. On deletion of a node, its memory is
not utilized until a new node of the same or lesser size is created.
Addition of a node requires reserving memory for it upfront.
A much simpler way of adding/deleting/resizing resource groups is to
partition the system at boot time into a large number of fake NUMA
nodes (say one node per 64MB in the system) and then use cpusets to
assign the appropriate number of nodes to each group. We're finding a
few inefficiencies in the current code when using such a large number
of small nodes (e.g. slab alien node caches), but we're confident that
we can iron those out.
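
As a rough illustration of the bookkeeping this implies (a sketch only,
not the cpusets interface): with the 64MB node size used as the example
above, a group's memory limit rounds up to a whole number of fake nodes,
and resizing a group just moves node IDs between the group and a free
pool. The names nodes_needed and resize_group, and the free-pool list,
are hypothetical and exist only for this example.

```python
NODE_SIZE_MB = 64  # per-node size taken from the example in the text


def nodes_needed(limit_mb, node_size_mb=NODE_SIZE_MB):
    """Number of fake nodes needed to cover limit_mb, rounded up."""
    return -(-limit_mb // node_size_mb)  # ceiling division


def resize_group(group_nodes, free_nodes, limit_mb):
    """Grow or shrink a group's node list to match a new limit.

    group_nodes and free_nodes are lists of node IDs and are modified
    in place. Raises ValueError if the free pool is too small.
    """
    target = nodes_needed(limit_mb)
    if target - len(group_nodes) > len(free_nodes):
        raise ValueError("not enough free nodes")
    while len(group_nodes) < target:
        group_nodes.append(free_nodes.pop(0))   # take a node from the pool
    while len(group_nodes) > target:
        free_nodes.append(group_nodes.pop())    # return a node to the pool
    return group_nodes


if __name__ == "__main__":
    free = list(range(16))            # e.g. a 1GB machine split into 16 nodes
    group = []
    resize_group(group, free, 200)    # a 200MB limit rounds up to 4 nodes
    print(group, len(free))
```

The cost of this simplicity is granularity: limits can only be set in
multiples of the node size, so a 200MB limit is really a 256MB one here.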
(iv) How do we account for shared pages? Should a shared page be charged
to the first container which touches it, or should the charge be split
equally among all containers sharing the page?
A third option is to allow inodes to be associated with containers in
their own right, and charge all pages for those inodes to the
associated container. So if several different containers are sharing a
large data file, you can put that file in its own container, and you
then have an exact count of how many pages are in use in that shared
file.
This is cheaper than having to keep track of multiple users of a page,
and is also useful when you're trying to do scheduling, to decide who
to evict. Suppose you have two jobs each allocating 100M of anonymous
memory and each accessing all of a 1G shared file, and you need to
free up 500M of memory in order to run a higher-priority job.
If you charge the first user, then it will appear that the first job
is using 1.1G of memory and the second is using 100M of memory. So you
might evict the first job, thinking it would free up 1.1G of memory -
but it would actually only free up 100M of memory, since the shared
pages would still be in use by the second job.
If you share the charge between both users, then it would appear that
each job is using 600M of memory - but it's still the case that
evicting either one would only free up 100M of memory.
If you can see that the shared file that they're both using is
accounting for 1G of the memory total, and that they're each using
100M of anon memory, then it's easier to see that you'd need to evict
*both* jobs in order to free up 500M of memory.
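
The arithmetic in this example can be checked with a small sketch. It
assumes 1G is taken as 1000M (which is what the 1.1G and 600M figures
above imply) and that the first job is the one that touched the file
first; all names are made up for illustration.

```python
ANON_MB = {"job1": 100, "job2": 100}  # private anonymous memory per job
SHARED_FILE_MB = 1000                 # the shared file, taking 1G as 1000M
FIRST_TOUCHER = "job1"                # assumption: job1 touched the file first


def charge_first_user():
    """Charge the whole shared file to whichever job touched it first."""
    return {job: anon + (SHARED_FILE_MB if job == FIRST_TOUCHER else 0)
            for job, anon in ANON_MB.items()}


def charge_split():
    """Split the shared file's charge equally among the jobs using it."""
    share = SHARED_FILE_MB // len(ANON_MB)
    return {job: anon + share for job, anon in ANON_MB.items()}


def charge_separate():
    """Charge the shared file to a container of its own."""
    charges = dict(ANON_MB)
    charges["shared_file"] = SHARED_FILE_MB
    return charges


def freed_by_evicting(evicted):
    """Memory that actually becomes free when the given jobs are evicted.

    The shared file's pages only become reclaimable once no remaining
    job is using them.
    """
    freed = sum(ANON_MB[job] for job in evicted)
    if set(evicted) == set(ANON_MB):
        freed += SHARED_FILE_MB
    return freed


if __name__ == "__main__":
    print(charge_first_user())
    print(charge_split())
    print(charge_separate())
    print(freed_by_evicting(["job1"]), freed_by_evicting(["job1", "job2"]))
```

Only the third scheme reports numbers that match what eviction really
frees: 100M for either job alone, and the 1G file only once both are gone.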
Paul