2) On the client facing side (port 53), I'd very much hope for a
way to
do 'recvv' on datagram sockets, so I can retrieve a whole bunch of
UDP datagrams with only one kernel transition.
I want to highlight this point that Bert is making.
Whenever we talk about AIO and kernel threads some folks are rightly
concerned that we're talking about a thread *per IO* and fear that
memory consumption will be fatal.
Take the case of userspace which implements what we'd think of as
page cache writeback. (*coughs, points at email address*). It wants
to issue thousands of IOs to disjoint regions of a file. "Thousands
of kernel threads, oh crap!"
But it only issues each IO with a separate syscall (or io_submit()
op) because it doesn't have an interface that lets it specify IOs
that vector user memory addresses *and file position*.
If we had a seemingly obvious interface that let it kick off batched
IOs to different parts of the file, the looming disaster of a thread
per IO vanishes in that case.
struct off_vec {
off_t pos;
size_t len;
};
long sys_sgwrite(int fd, struct iovec *memvec, size_t mv_count,
struct off_vec *ovec, size_t ov_count);
It doesn't take long to imagine other uses for this that are less
exotic.
Take e2fsck and its iterating through indirect blocks or directory
data blocks. It has a list of disjoint file regions (blocks) it
wants to read, but it does them serially to keep the code from
getting even more confusing. blktrace a clean e2fsck -f some time..
it's leaving *HALF* of the disk read bandwith on the table by
performing serial block-sized reads. If it could specify batches of
them the code would still be simple but it could tell the kernel and
IO scheduler *exactly* what it wants, without having to mess around
with sys_readahead() or AIO or any of that junk :).
Anyway, that's just something that's been on my mind. If there are
obvious clean opportunities to get more done with single syscalls, it
might not be such a bad thing.
- z
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]