Re: [Patch][RFC] Disabling per-tgid stats on task exit in taskstats

Andrew Morton wrote:
> On Thu, 29 Jun 2006 15:10:31 -0400
> Shailabh Nagar <[email protected]> wrote:
>
> I agree, and I'm viewing this as blocking the taskstats merge.  Because if
> this _is_ a problem then it's a big one because fixing it will be
> intrusive, and might well involve userspace-visible changes.


First off, just a reminder that this is inherently a netlink flow control
issue, which was being exacerbated earlier by taskstats' decision to send
per-tgid data (no longer the case).

But I'd like to know what our target is here: how many messages per second
do we want to be able to send and receive without risking any loss of data?
Netlink will lose messages at a high enough rate, so the design point needs
to be known.
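To make that concrete, here is a pure back-of-envelope sketch of how the
socket receive buffer, the message rate and the tolerable reader stall
relate. All three numbers are made up purely for illustration; the real
exit rate and per-record cost are exactly the open question:

    /*
     * Back-of-envelope only: how long a given netlink receive buffer can
     * absorb a burst before the kernel starts dropping.  Numbers are
     * illustrative, not measured.
     */
    #include <stdio.h>

    int main(void)
    {
        double exits_per_sec = 1000;           /* assumed peak exit rate */
        double bytes_per_msg = 1024;           /* assumed skb cost per record */
        double rcvbuf_bytes  = 1024 * 1024;    /* e.g. a 1MB socket buffer */

        printf("a stalled reader is covered for %.2f seconds\n",
               rcvbuf_bytes / (exits_per_sec * bytes_per_msg));
        return 0;
    }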

>> For statistics type usage of the genetlink/netlink, I would have thought
>> that userspace, provided it is reliably informed about the loss of data
>> through ENOBUFS, could take measures to just account for the missing data
>> and carry on?
>
> Could be so.  But we need to understand how significant the impact of this
> will be in practice.
>
> We could find, once this is deployed in real production environments on
> large machines, that the data loss is sufficiently common and sufficiently
> serious that the feature needs a lot of rework.
>
> Now there's always a risk of that sort of thing happening with all
> features, but it's usually not this evident so early in the development
> process.  We need to get a better understanding of the risk before
> proceeding too far.
>
> And there's always a 100% reliable fix for this: throttling.  Make the
> sender of the messages block until the consumer can catch up.

Is blocking exits an option?
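To make the ENOBUFS point above concrete, the userspace side could be a
minimal receive loop along these lines. This is only a sketch, not the
actual listener: genetlink family lookup and the taskstats registration
step are elided, and the counter name and buffer size are illustrative.

    /*
     * Sketch: treat ENOBUFS from recv() as "the kernel dropped some
     * messages", record the gap and carry on.
     */
    #include <errno.h>
    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <linux/netlink.h>

    static unsigned long dropped_windows;   /* intervals where data was lost */

    static void recv_loop(int sd)
    {
        char buf[8192];

        for (;;) {
            ssize_t len = recv(sd, buf, sizeof(buf), 0);

            if (len < 0) {
                if (errno == ENOBUFS) {
                    /* receive buffer overflowed: note the missing
                     * records and keep reading */
                    dropped_windows++;
                    continue;
                }
                if (errno == EINTR)
                    continue;
                perror("recv");
                break;
            }
            /* walk the nlmsghdr chain and parse the stats payload here */
            printf("%zd bytes of stats, %lu lossy windows so far\n",
                   len, dropped_windows);
        }
    }

    int main(void)
    {
        struct sockaddr_nl local = { .nl_family = AF_NETLINK };
        int sd = socket(AF_NETLINK, SOCK_RAW, NETLINK_GENERIC);

        if (sd < 0 || bind(sd, (struct sockaddr *)&local, sizeof(local)) < 0) {
            perror("netlink socket");
            return 1;
        }
        recv_loop(sd);
        return 0;
    }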

> In some situations, that is what people will want to be able to do.  I
> suspect a good implementation would be to run a collection daemon on each
> CPU and make the delivery be cpu-local.  That's sounding more like relayfs
> than netlink.

Yup...the per-cpu, high-speed requirements are right up relayfs' alley,
unless Jamal or the netlink folks are planning something (or can shed light
on how large flows can be managed over netlink). I suspect this discussion
has happened before :-)
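Just to show the shape of the per-cpu collection daemon idea from the
userspace side: one reader per online CPU, pinned with sched_setaffinity()
so delivery stays cpu-local, each draining a per-cpu channel. The debugfs
path below is purely hypothetical (no such taskstats relay interface
exists); this is a sketch of the pattern, nothing more.

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <sched.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/wait.h>

    static void drain_cpu(int cpu)
    {
        cpu_set_t mask;
        char path[64], buf[8192];
        ssize_t n;
        int fd;

        /* pin the reader so it only ever touches its own CPU's data */
        CPU_ZERO(&mask);
        CPU_SET(cpu, &mask);
        if (sched_setaffinity(0, sizeof(mask), &mask) < 0) {
            perror("sched_setaffinity");
            exit(1);
        }

        /* hypothetical per-cpu channel exported by the kernel */
        snprintf(path, sizeof(path), "/sys/kernel/debug/taskstats/cpu%d", cpu);
        fd = open(path, O_RDONLY);
        if (fd < 0) {
            perror(path);
            exit(1);
        }
        while ((n = read(fd, buf, sizeof(buf))) > 0)
            ;   /* hand the records to an aggregator here */
        exit(n < 0);
    }

    int main(void)
    {
        int cpu, ncpus = sysconf(_SC_NPROCESSORS_ONLN);

        for (cpu = 0; cpu < ncpus; cpu++)
            if (fork() == 0)
                drain_cpu(cpu);

        while (wait(NULL) > 0)
            ;
        return 0;
    }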


