Re: [patch] swapin rlimit

* Andrew Morton <[email protected]> wrote:

> Ingo Molnar <[email protected]> wrote:
> >
> > * Andrew Morton <[email protected]> wrote:
> > 
> >  > Similarly, that SGI patch which was rejected 6-12 months ago to kill 
> >  > off processes once they started swapping.  We thought that it could be 
> >  > done from userspace, but we need a way for userspace to detect when a 
> >  > task is being swapped on a per-task basis.
> > 
> >  wouldnt the clean solution here be a "swap ulimit"?
> 
> Well it's _a_ solution, but it's terribly specific.
> 
> How hard is it to read /proc/<pid>/nr_swapped_in_pages and if that's 
> non-zero, kill <pid>?

on a system with possibly thousands of tasks, over /proc, on a 
high-performance node where for a 0.5% improvement they are willing to 
sacrifice maidens? :)

Seriously, while nr_swapped_in_pages ought to be OK, i think there is a 
generic problem with /proc based stats.

System instrumentation people are already complaining about how costly 
/proc parsing is. If you have to get some nontrivial stat from all 
threads in the system, and if Linux doesn't offer that counter or summary 
by default, it gets pretty expensive.
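
To make the cost concrete, here is a rough sketch of the userspace polling loop being discussed. The file name nr_swapped_in_pages is only the one proposed above, not an existing /proc entry, and scan_swapped() and its parameters are made up for illustration. The point is that every pass costs one readdir plus an open/read/parse/close per task:

```python
import os

def scan_swapped(proc_root="/proc", stat_name="nr_swapped_in_pages"):
    """Return {pid: count} for every task whose counter is non-zero.

    stat_name is the hypothetical per-task file proposed in this thread;
    it is an assumption, not an existing /proc entry.
    """
    swapped = {}
    for entry in os.listdir(proc_root):      # one readdir pass over all tasks
        if not entry.isdigit():              # skip non-PID entries such as "meminfo"
            continue
        path = os.path.join(proc_root, entry, stat_name)
        try:
            with open(path) as f:            # open + read + close, once per task
                count = int(f.read().strip())
        except (OSError, ValueError):
            continue                         # task exited, file missing, or unparsable
        if count > 0:
            swapped[int(entry)] = count
    return swapped
```

A monitor would call this in a loop and signal each returned PID, for example with os.kill(). With thousands of tasks the per-task syscalls and text parsing are repeated on every pass, which is the overhead described above.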

One solution i can think of would be to make a binary representation of 
/proc/<pid>/stats readonly-mmap-able. This would add a 4K page to every 
task tracked that way, and stats updates would have to update this page 
too - but it would make instrumentation of running apps really 
unintrusive and scalable.
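
A minimal userspace sketch of what reading such a page could look like, assuming a fixed binary layout. The field names, their order and the "<QQQ" format are invented here for illustration, and write_stats_page() merely stands in for the kernel side using an ordinary file:

```python
import mmap
import struct

PAGE_SIZE = 4096
# Hypothetical layout: three little-endian unsigned 64-bit counters at offset 0.
STATS_FORMAT = "<QQQ"
STATS_FIELDS = ("nr_swapped_in_pages", "minor_faults", "major_faults")

def write_stats_page(path, **counters):
    """Stand-in for the kernel side: write one zero-padded 4K page."""
    values = [counters.get(name, 0) for name in STATS_FIELDS]
    packed = struct.pack(STATS_FORMAT, *values)
    with open(path, "wb") as f:
        f.write(packed.ljust(PAGE_SIZE, b"\0"))

def map_stats_page(path):
    """Map the page read-only; the mapping stays valid after the file is closed."""
    with open(path, "rb") as f:
        return mmap.mmap(f.fileno(), PAGE_SIZE, access=mmap.ACCESS_READ)

def read_stats(page):
    """Decode the counters straight from the mapping, with no text parsing."""
    values = struct.unpack_from(STATS_FORMAT, page, 0)
    return dict(zip(STATS_FIELDS, values))
```

The monitor maps each task once and afterwards only calls read_stats() on the existing mapping, so sampling a task costs a few memory loads instead of an open/read/close cycle.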

Another addition would be some mechanism for a monitoring app to capture 
events in the PID space: so that they can mmap() new tasks [if they are 
interested] on a non-polling basis, i.e. not like readdir on /proc. This 
capability probably has to be a system-call though, as /proc seems too 
quirky for it. The system does not wait on the monitoring app(s) to 
catch up - if it's too slow in reacting and the event buffer overflows 
then tough luck - monitoring apps will have no impact on the runtime 
characteristics of other tasks. In theory this is somewhat similar to 
auditing, but the purpose would be quite different, and it only cares 
about PID-space events like 'fork/clone', 'exec' and 'exit'.
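
The "tough luck" overflow rule can be sketched as a bounded buffer whose producer side never waits. This is only an illustration of the semantics: LossyEventBuffer and its methods are hypothetical names, and dropping the newest event on overflow (rather than overwriting the oldest) is an assumption, since the text above does not say which one is lost:

```python
from collections import deque

class LossyEventBuffer:
    """Fixed-capacity event queue: posting never blocks, overflow drops the event."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.events = deque()
        self.dropped = 0             # events lost since the last drain

    def post(self, kind, pid):
        """Task side: record a 'fork', 'exec' or 'exit' event and return at once."""
        if len(self.events) >= self.capacity:
            self.dropped += 1        # monitor is too slow: the event is lost
            return False
        self.events.append((kind, pid))
        return True

    def drain(self):
        """Monitor side: return (pending events, number lost) and reset both."""
        pending = list(self.events)
        lost = self.dropped
        self.events.clear()
        self.dropped = 0
        return pending, lost
```

A non-zero lost count tells the monitor that it missed PID-space events and should fall back to a full rescan, while the tasks generating the events were never delayed.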

	Ingo