Re: [Lhms-devel] [PATCH 0/7] Fragmentation Avoidance V19

> > A per-task stat requires walking the tasklist, to build a list of the
> > tasks to query.
> 
> Nope, just task->mm->whatever.

Agreed - once you have the task, then sure, that's enough.

However - a batch scheduler will end up having to figure out what tasks
there are to inquire, by either listing the tasks in a cpuset, or
by listing /proc.  Either way, that's a tasklist scan.  And it will
have to do that pretty much every iteration of polling, since it has
no a priori knowledge of what tasks a job is firing up.
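To make the polling cost concrete, here is a minimal sketch of the two ways a user-space batch scheduler might build its task list on each polling pass. The function names are hypothetical, and the cpuset "tasks" file path is an assumption about where the cpuset filesystem is mounted; neither comes from the patch under discussion.

```python
import os

def list_task_ids(proc_root="/proc"):
    """Return the numeric directory names under proc_root as sorted ints.

    Each numeric entry in /proc is one process, so this amounts to a
    full scan of the task list as seen from user space.
    """
    pids = []
    for name in os.listdir(proc_root):
        if name.isdigit():
            pids.append(int(name))
    return sorted(pids)

def read_cpuset_tasks(tasks_path):
    """Parse a cpuset 'tasks' file, which lists one pid per line.

    tasks_path is an assumed location such as /dev/cpuset/<name>/tasks.
    """
    with open(tasks_path) as f:
        return sorted(int(line) for line in f if line.strip())
```

Either function has to be rerun on every polling iteration, because the scheduler cannot know in advance which tasks a job has started since the last pass.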


> Well no.  Because the filtered-whatsit takes two spinlocks and does a bunch
> of arith for each and every task, each time it calls try_to_free_pages(). 

Neither spinlock is global - the task and a lock in its cpuset.

I see a fair number of existing locks and semaphores, some global
and some in loops, that look to be in the code invoked by
try_to_free_pages(). And far more arithmetic than in that little
filter.

Granted, it's a cost seen by all, for the benefit of few.  But other sorts
of per-task or per-mm stats are not going to be free either.  I would
have figured that doing something per-page, even the most trivial
"counter++" (better have that mm locked) will likely cost more than
doing something per try_to_free_pages() call.


> The frequency of that could be very high indeed, even when nobody is
> interested in the metric which is being maintained(!)

When I have a task start allocating memory as fast as it can, it is only
able to call try_to_free_pages() about 10 times a second on an idle
ia64 SN2 system, with a single thread, or about 20 times a second
running several threads at once allocating memory.

That's not "very high" in my book.

What sort of load would hit this much more often?  


If more folks need these detailed stats, then that's how it should be.

But I am no fan of exposing more than the minimum kernel vm details for
use by production software.

We agree that my per-cpuset memory_reclaim_rate meter certainly hides
more detail than the sorts of stats you are suggesting.  I thought that
was good, so long as what was needed was still present.
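For readers unfamiliar with this kind of meter, below is a rough sketch of a decaying event-rate meter that is bumped once per reclaim call rather than once per page. It is an illustration under stated assumptions, not the code from the patch: the class name, the half-life parameter, and the use of floating point are all hypothetical, and the real filter would also need the locking discussed above.

```python
import math

class RateMeter:
    """Exponentially decaying event counter (hypothetical sketch)."""

    def __init__(self, half_life=10.0):
        # Decay constant chosen so the stored value halves every half_life seconds.
        self.decay = math.log(2) / half_life
        self.value = 0.0
        self.last = 0.0

    def _age(self, now):
        # Decay the stored value for the time elapsed since the last update.
        dt = now - self.last
        if dt > 0:
            self.value *= math.exp(-self.decay * dt)
            self.last = now

    def mark(self, now):
        # Called once per event, e.g. once per try_to_free_pages() call.
        self._age(now)
        self.value += 1.0

    def read(self, now):
        # Under a steady event stream this approximates events per second.
        self._age(now)
        return self.value * self.decay
```

At the 10 to 20 calls per second measured above, the meter does one multiply and one add per call, and a reader sees only a single smoothed number rather than any per-page or per-task detail.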

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <[email protected]> 1.925.600.0401