[Patch 0/9] Per-task delay accounting

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



This is the next iteration of the delay accounting patches
last posted at
	http://www.ussg.iu.edu/hypermail/linux/kernel/0602.3/0893.html

The major changes to the patches (and names of commenters whose concerns
have been addressed) are:

- considerable revision of the synchronous block I/O delay collection (Arjan)
Now most delays seen in I/O submission are not being collected.
- removal of dependency on schedstats by using common code (Nick) 
- removal of sysctl or dynamic configurability of delay accounting (Arjan)
Now it it can be set on/off at boot time only.
- providing I/O delays through /proc/<tgid>/stats (Andi)
- revisions to the generic netlink interface (Jamal)

Unaddressed comments are suggestions from Jamal on 
further improving the generic netlink interface which is ongoing.

Series 

nstimestamp-diff.patch
delayacct-setup.patch
delayacct-blkio-init.patch
delayacct-blkio-collect.patch
delayacct-swapin.patch
delayacct-procfs.patch
delayacct-schedstats.patch
genetlink-utils.patch
delayacct-genetlink.patch


A short note on expected usage of this functionality follows which
addresses a comment from Nick in the previous iteration.
Basically, the patches measures a subset of the delays encountered 
by a task waiting for a resource such as cpu, disk I/O and page 
frames. Which subset is chosen depends on whether a task's delay 
is something that can be controlled by playing with its priority 
or rss limit etc.
e.g. cpu delays can directly be affected by its static CPU priority.
Similarly I/O delay can be affected by its I/O priority (assuming 
one uses the right iosched). In page frame delay, we only count 
the page faults due to swapin since that is directly affected by 
the rss limits for a task (well, ok, process). 

Delays can be used as feedback to decide whether something can/should
be done about a task (or group of tasks) which is not performing 
as one expects. 

http://www.research.ibm.com/journal/sj/362/amanaut.html
lists a fairly complicated way of using such data in 
workload management.

At a simpler level, one can think of utilities that can use this data
to control individual apps using a simple feedback loop.
At this point we do not have such a utility. 

Of course, its useful to visually inspect such data. Some of it is 
being exported through /proc as suggested by Andi, but the primary 
interface is through a netlink socket so that userspace can get 
data for exiting tasks too. 

The netlink socket also allows userspace apps to efficiently 
collect  delay data and group it in arbitrary ways 
(as envisaged by CKRM, CSA or ELSA) for reporting or 
control purposes. 


--Shailabh

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux