On 1/15/07, Christoph Hellwig <[email protected]> wrote:
On Mon, Jan 15, 2007 at 05:54:50PM -0800, Nate Diller wrote:
> This series is an attempt to generalize the async I/O paths to be
> implementation agnostic. It completely eliminates knowledge of
> the kiocb structure in the generic code and makes it private within the
> current aio code. Things get noticeably cleaner without that layering
> violation.
>
> The new interface takes a file_endio_t function pointer, and a private data
> pointer, which would normally be aio_complete and a kiocb pointer,
> respectively. If the aio submission function gets back EIOCBQUEUED, that is
> a guarantee that the endio function will be called, or *already has been
> called*. If the file_endio_t pointer provided to aio_[read|write] is NULL,
> the FS must block on I/O completion, then return either the number of bytes
> read, or an error.
I don't really like this patchet at all. At some point it's a lot nicer
to have a lot of paramaters that are related and passed down a long
callchain into a structure, and I think the aio code is over that threshold.
The completion function cleanups look okay to me, but I'd rather add
that completion function to struct kiocb instead of removing kiocb use.
I have this slight feeling you want to use this completions for something
else than the current aio code, if that's the case it would help
if you could explain briefly in what direction your heading.
Actually I agree with you more than you might think. I had intended
this to mesh with your struct iodesc idea, where iodesc would contain
the iovec pointer, nr_segs, iov_length, and whatever else needs to be
there, potentially even the endio function and its private data, tying
those to the iovec instead of a separate structure that needs to be
kept in sync. There's a distinct layering that should exist between
things that should accompany the iovec transparently, and private data
that should be attached opaquely by layers above.
The biggest thing I have in mind for this patch, actually, is to fix
up the *sync* paths. I don't think we should be waiting on sync I/O
at the *top* of the call stack, like with wait_on_sync_kiocb(), I'd
say the best place to wait is at the *bottom*, down in the I/O
scheduler. This would make it a lot easier to clean up the completion
paths, because in the sync case, you'd be right back in process
context again as you traverse upward through the RAID, encryption,
loopback, directIO, FS log commit, etc. It doesn't by itself
eliminate the need for all the threads and workqueues and such that
those layers each own, but it is a step in the right direction.
Now if you want to talk about long-term vaporware style ideas, yeah, I
do have my own thoughts on how aio should work. And from Agami's
perspective, this patch also makes it easier for us to do certain
debugging traces that we wish to hack together, in order to profile
performance on our platform. But I'd be hesitant to make those
arguments, cause they are largely irrelevant (we can obviously carry
the patch for debugging without buy-in from the community). This is
the right thing to do from a design perspective. Hopefully it enables
a new architecture that can reduce context switches in I/O completion,
and reduce overhead. That's the real motive ;)
NATE
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]