Pavel Emelianov wrote:
> Balbir Singh wrote:
>> Account RSS usage of a task and the associated container. The definition
>> of RSS was debated and discussed in the following thread
>>
>> http://lkml.org/lkml/2006/10/10/130
>>
>>
>> The code tracks all resident pages (including shared pages) as RSS. This patch
>> can easily adapt to the definition of RSS that will be agreed upon. This
>> implementation provides a proof of concept RSS controller.
>>
>> The accounting is inspired from Rohit Seth's container patches.
>>
>> TODO's
>>
>> 1. Merge file_rss and anon_rss tracking with the current rss tracking to
>> maximize code reuse
>> 2. Add/remove RSS tracking as the definition of RSS evolves
>>
>>
>> Signed-off-by: Balbir Singh <[email protected]>
>> ---
>>
>
> [snip]
>
>> --- linux-2.6.19-rc2/kernel/res_group/memctlr.c~container-memctlr-acct 2006-11-09 21:46:22.000000000 +0530
>> +++ linux-2.6.19-rc2-balbir/kernel/res_group/memctlr.c 2006-11-09 21:47:06.000000000 +0530
>> @@ -37,6 +37,8 @@ static struct resource_group *root_rgrou
>> static const char version[] = "0.01";
>> static struct memctlr *memctlr_root;
>>
>> +#define MEMCTLR_MAGIC 0xdededede
>> +
>> struct mem_counter {
>> atomic_long_t rss;
>> };
>> @@ -49,6 +51,7 @@ struct memctlr {
>> /* Statistics */
>> int successes;
>> int failures;
>> + int magic;
>
> What is this magic for? Is it just for debugging?
>
Yes
> [snip]
>
>> +static inline struct memctlr *get_memctlr_from_page(struct page *page)
>> +{
>> + struct resource_group *rgroup;
>> + struct memctlr *res;
>> +
>> + /*
>> + * Is the resource groups infrastructure initialized?
>> + */
>> + if (!memctlr_root)
>> + return NULL;
>> +
>> + rcu_read_lock();
>> + rgroup = (struct resource_group *)rcu_dereference(current->container);
>> + rcu_read_unlock();
>> +
>> + res = get_memctlr(rgroup);
>> + if (!res)
>> + return NULL;
>> +
>> + BUG_ON(res->magic != MEMCTLR_MAGIC);
>> + return res;
>> +}
>
> I don't see how page passed to this function is involved into
> 'struct memctlr *res' determining. Could you comment this?
>
Yeah, from page is a misnomer. We just use the current task task.
I'll fix the naming convention
> [snip]
>
>> --- linux-2.6.19-rc2/mm/rmap.c~container-memctlr-acct 2006-11-09 21:46:22.000000000 +0530
>> +++ linux-2.6.19-rc2-balbir/mm/rmap.c 2006-11-09 21:46:22.000000000 +0530
>> @@ -537,6 +537,7 @@ void page_add_anon_rmap(struct page *pag
>> if (atomic_inc_and_test(&page->_mapcount))
>> __page_set_anon_rmap(page, vma, address);
>> /* else checking page index and mapping is racy */
>> + memctlr_inc_rss(page);
>> }
>>
>> /*
>> @@ -553,6 +554,7 @@ void page_add_new_anon_rmap(struct page
>> {
>> atomic_set(&page->_mapcount, 0); /* elevate count by 1 (starts at -1) */
>> __page_set_anon_rmap(page, vma, address);
>> + memctlr_inc_rss(page);
>> }
>>
>> /**
>> @@ -565,6 +567,7 @@ void page_add_file_rmap(struct page *pag
>> {
>> if (atomic_inc_and_test(&page->_mapcount))
>> __inc_zone_page_state(page, NR_FILE_MAPPED);
>> + memctlr_inc_rss(page);
>
> Consider a task maps one file page 100 times in different places
> and touches 'all of them'. In this case I see that you'll get
> 100 in rss counter while real rss will be just 1.
>
Hmmm... something for me to think about. Depending on how we define
RSS, the code for accounting should be easy to add & modify depending
on how we define RSS. But you bring up a very good point.
>> }
>>
>> /**
>> @@ -596,8 +599,9 @@ void page_remove_rmap(struct page *page)
>> if (page_test_and_clear_dirty(page))
>> set_page_dirty(page);
>> __dec_zone_page_state(page,
>> - PageAnon(page) ? NR_ANON_PAGES : NR_FILE_MAPPED);
>> + PageAnon(page) ? NR_ANON_PAGES : NR_FILE_MAPPED);
>
> What is this extra space after a question-mark for?
This is again something I changed and looks my undo was not very good.
Please ignore it, I'll remove it from the diff.
>
>> }
>> + memctlr_dec_rss(page, mm);
>> }
>>
>> /*
>> diff -puN include/linux/rmap.h~container-memctlr-acct include/linux/rmap.h
>> --- linux-2.6.19-rc2/include/linux/rmap.h~container-memctlr-acct 2006-11-09 21:46:22.000000000 +0530
>> +++ linux-2.6.19-rc2-balbir/include/linux/rmap.h 2006-11-09 21:46:22.000000000 +0530
>> @@ -8,6 +8,7 @@
>> #include <linux/slab.h>
>> #include <linux/mm.h>
>> #include <linux/spinlock.h>
>> +#include <linux/memctlr.h>
>>
>> /*
>> * The anon_vma heads a list of private "related" vmas, to scan if
>> @@ -84,6 +85,7 @@ void page_remove_rmap(struct page *);
>> static inline void page_dup_rmap(struct page *page)
>> {
>> atomic_inc(&page->_mapcount);
>> + memctlr_inc_rss(page);
>> }
>
> I'm not sure this is correct. page_dup_rmap() happens in the context
> of forking process and thus you'll increment rss counter on current.
> But this must be incremented at new task's counter, mustn't it?
This is fixed in the next patch container-memctlr-task-migration.
Thanks for spotting it.
--
Thanks,
Balbir Singh,
Linux Technology Center,
IBM Software Labs
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]