Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Shailabh Nagar wrote:
jamal wrote:

On Mon, 2006-03-07 at 18:01 -0700, Andrew Morton wrote:

On Mon, 03 Jul 2006 20:54:37 -0400
Shailabh Nagar <[email protected]> wrote:


What happens when a listener exits without doing deregistration
(or if the listener attempts to register another cpumask while a current
registration is still active).


( Jamal, your thoughts on this problem would be appreciated)

Problem is that we have a listener task which has "registered" with taskstats and caused its pid to be stored in various per-cpu lists of listeners. Later, when some other task exits on a given cpu, its exit data is sent using genlmsg_unicast on each pid present on that cpu's list.

If the listener exits without doing a "deregister", its pid continues to be kept around, obviously not a good thing. So we need some way of detecting the situation (task is no longer listening on
these cpus events) that is efficient.


Also need to address the case where the listener has closed off his file
descriptor but continues to run.

So hooking into listener's exit() isn't appropriate - the teardown is
associated with the lifetime of the fd, not of the process. If we do that, exit() gets handled for free.



If you are always going to send unicast messages, then  -ECONNREFUSED
will tell you the listener has closed their fd - this doesnt meant it
has exited.


Thats good. So we have atleast one way of detecting the "closed fd without
deregistering" within taskstats itself.

Besides that one process could open several sockets. I know
that would not be the app you would write - but it doesnt stop other
people from doing it.


As far as API is concerned, even a taskstats listener is not being
prevented from opening multiple sockets. As Andrew also pointed out,
everything needs to be done per-socket.

I think i may not follow what you are doing - for some reason i thought
you may have many listeners in user space and these messages get
multicast to them?


That was the design earlier. In the past week, the design has changed to
one where there are still many listeners in user space but messages
get unicast to each of them. Earlier listeners would get messages generated
on task exit from every cpu, now they get it only from cpus for which
they have explicitly registered interest (via a cpumask passed in through
another genetlink command).

Does the user space program somehow communicate its pid to the kernel?


Yes. When the listener registers interest in a set of cpus, as described
above, its (genl_info->pid) is being stored in the per-cpu list of
listeners for those cpus. When a task exits on one of those cpus, the
exit data is only sent via genetlink_unicast to those pids
(really, nl_pids) who are on that cpu's listener list.


Now that I think more about it, netlink is really maintaining a pidhash
of nl_pids, not process pids, right ? So if one userapp were to open
multiple sockets using NETLINK_GENERIC protocol (regardless of how many
of those are for the taskstats), each of them would have to use a
different nl_pid. Hence, it would be valid for the taskstats layer to use netlink_lookup() at any time to see if the corresponding socket were
closed ?


Here's a strawman for the problem we're trying to solve: get
notification of the close of a NETLINK_GENERIC socket that had
been used to register interest for some cpus within taskstats.

From looking at the netlink code, the way to go seems to be

- it maintains a pidhash of nl_pids that are currently
registered to listen to atleast one cpu. It also stores the
cpumask used.
- taskstats registers a notifier block within netlink_chain
and receives a callback on the NETLINK_URELEASE event, similar
to drivers/scsci/scsi_transport_iscsi.c: iscsi_rcv_nl_event()

- the callback checks to see that the protocol is NETLINK_GENERIC
and that the nl_pid for the socket is in taskstat's pidhash. If so, it
does a cleanup using the stored cpumask and releases the nl_pid
from the pidhash.

We can even do away with the deregister command altogether and
simply rely on this autocleanup.

--Shailabh
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux