* Andrew Morton <[email protected]> wrote:
> Ingo Molnar <[email protected]> wrote:
> >
> > * Andrew Morton <[email protected]> wrote:
> >
> > > Similarly, that SGI patch which was rejected 6-12 months ago to kill
> > > off processes once they started swapping. We thought that it could be
> > > done from userspace, but we need a way for userspace to detect when a
> > > task is being swapped on a per-task basis.
> >
> > wouldnt the clean solution here be a "swap ulimit"?
>
> Well it's _a_ solution, but it's terribly specific.
>
> How hard is it to read /proc/<pid>/nr_swapped_in_pages and if that's
> non-zero, kill <pid>?
on a system with possibly thousands of taks, over /proc, on a
high-performance node where for a 0.5% improvement they are willing to
sacrifice maidens? :)
Seriously, while nr_swapped_in_pages ought to be OK, i think there is a
generic problem with /proc based stats.
System instrumentation people are already complaining about how costly
/proc parsing is. If you have to get some nontrivial stat from all
threads in the system, and if Linux doesnt offer that counter or summary
by default, it gets pretty expensive.
One solution i can think of would be to make a binary representation of
/proc/<pid>/stats readonly-mmap-able. This would add a 4K page to every
task tracked that way, and stats updates would have to update this page
too - but it would make instrumentation of running apps really
unintrusive and scalable.
Another addition would be some mechanism for a monitoring app to capture
events in the PID space: so that they can mmap() new tasks [if they are
interested] on a non-polling basis, i.e. not like readdir on /proc. This
capability probably has to be a system-call though, as /proc seems too
quirky for it. The system does not wait on the monitoring app(s) to
catch up - if it's too slow in reacting and the event buffer overflows
then tough luck - monitoring apps will have no impact on the runtime
characteristics of other tasks. In theory this is somewhat similar to
auditing, but the purpose would be quite different, and it only cares
about PID-space events like 'fork/clone', 'exec' and 'exit'.
Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]