On Mon, 2005-03-28 at 13:42 -0800, Paul Jackson wrote: > Guillaume wrote: > > The lmbench shows that the overhead (the construction and the sending > > of the message) in the fork() routine is around 7%. > > Thanks for including the numbers. The 7% seems a bit costly, for a bit > more accounting information. Perhaps dean's suggestion, to not use > ascii, will help. I hope so, though I doubt it will make a huge > difference. Was this 7% loss with or without a user level program > consuming the sent messages? I would think that the number of interest > would include a minimal consumer task. There is no overhead at all using CBUS. On my old P2/256mb SMP machine it took about 950 usec to create+exit process both with fork connector turned on and without it even compiled. Direct connector's method call took about 1000-1100 usec. Current fork connector does not use CBUS [yet, I hope]. > I don't see a good reason to make fork_connector() inline. Since it > calls other subroutines and is not just a few lines, perhaps better to > make it a real routine, so we can see it in "nm --print-size" output and > debug stacks. > > Having the "#ifdef CONFIG_FORK_CONNECTOR" chunk of code right in fork.c > seems unfortunate. Can the real fork_connector() be put elsewhere, and > the ifdef put in a header file that makes it a no-op if not configured, > or simply a function declaration, if configured? > > What's the status of the connector driver patch? I perhaps wasn't > paying close enough attention, but all I see of it now is a couple of > patches sent to lkml, from Evgeniy Polyakov, in September and January. > I don't see it in my copies of *-mm or recent Linus bk trees. Am I > missing something? It was dropped from -mm tree, since bk tree where it lives was in maintenance mode. I think connector will be appeared in the next -mm release. > This still seems to me like more apparatus than is desirable, just to > get another form of session id, as best as I can figure it. However > we've already been there, and apparently my concerns were not > persuasive. If one does go down this path, then using this connector > patch is a good an alternative as any I know of. Well, that or relayfs. > My uneducated assumption is that relayfs might at least batch data > packets up into big buffer chunks better, but someone more knowledgeable > than me needs to consider that. > > It's a little sad, when almost all the required accounting information > comes out in packed 64 byte records, carefully buffered and sent in > big chunks, to minimize per-task costs. Then this one extra detail, > of <parent-pid, child-pid> requires an entire netlink packet of > its own of what size -- another 50 or 100 bytes? Is this packet > received as a separate data packet, on its own recv(2) system call, > by the user task, not in a big block of packets? The efficiency > of getting this one extra <parent-pid, child-pid> out of the kernel > seems to be one or two orders of magnitude worse than the rest of > the accounting data. It can be easily changed. One may send kernel/acct.c acct_t structure out of the kernel - overhead will be the same: kmalloc probably will get new area from the same 256-bytes pool, skb is still in cache. > === > > Hmmm ... perhaps one could add a _second_ accounting file, cutting and > pasting code in kernel/acct.c and enabling writing additional > information to that second file, using the same mechanisms as now used > for the primary file. Use a more extensible record format for the > second file (say start each record with a magic cookie, a byte record > type and a byte record length, then that many bytes). That way, we have > an escape valve for adding additional record types in the future. > And that way we can efficiently write short records, with just say > a couple of interesting values, and minimal overhead. > > Don't worry if the magic cookie appears as part of the raw data. If one > has to resync such a data stream, one can look for a series of records, > each starting with the magic cookie, sensible record type byte, and a > length that ends right at the next such valid record. The occassional > duplication of the same cookie within the data stream would not thwart a > resync for long. And the main purpose of the magic cookie is to make > sure you are still in sync, not reverting to garbage-in, garbage-out, > mode. Almost any magic value other than 0x0000 will suffice for that > purpose. > > I just ran a silly little test on my PC desktop Linux box, scanning > /proc/kcore. The _least_ common 2 byte word seen was 0x2B91, with 31 > instances in a half-billion words scanned, so I nominate that value for > the magic cookie ;). > > The key reason that it might make sense here to adapt the existing > accounting file direct write mechanism, rather than using "connector" or > "relayfs", is that we really do want to get this data to disk initially. > Relayfs is optimized for getting alot of data to a user daemon, and the > connector for sending smaller packets of data to a user daemon. But > accounting processing is sometimes done out of a cron job off-hours. > During the day (the busy hours) you might just want to stash the stuff > with as little performance impact is possible. If one can avoid _any_ > other task having to context switch in, in order to get this data on its > way, that is a huge win. File writing accounting [kernel/acct.c] is slower, it takes global locks and requires process' context to work with system calls. realyfs is interesting project, but it has different aims, as far as I can see, - it is created for transferring huge amounts of data, and it is succeded in it, while connector is purely control/notification mechanism, for example, for gathering short-living per-process accounting data. -- Evgeniy Polyakov Crash is better than data corruption -- Arthur Grabowski
Attachment:
signature.asc
Description: This is a digitally signed message part
- Follow-Ups:
- Re: [patch 1/2] fork_connector: add a fork connector
- From: Paul Jackson <[email protected]>
- Re: [patch 1/2] fork_connector: add a fork connector
- From: Greg KH <[email protected]>
- Re: [patch 1/2] fork_connector: add a fork connector
- References:
- [patch 1/2] fork_connector: add a fork connector
- From: Guillaume Thouvenin <[email protected]>
- Re: [patch 1/2] fork_connector: add a fork connector
- From: Paul Jackson <[email protected]>
- [patch 1/2] fork_connector: add a fork connector
- Prev by Date: Re: How's the nforce4 support in Linux?
- Next by Date: Re: [patch 1/2] fork_connector: add a fork connector
- Previous by thread: Re: [patch 1/2] fork_connector: add a fork connector
- Next by thread: Re: [patch 1/2] fork_connector: add a fork connector
- Index(es):