Re: [RFC][PATCH] New iovec support & VFS changes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 2005-12-20 at 20:00 +0200, Avi Kivity wrote:
> Badari Pulavarty wrote:
> 
> >I was trying to add support for preadv()/pwritev() for threaded
> >databases. Currently the patch is in -mm tree.
> >
> >http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.15-
> >rc5/2.6.15-rc5-mm3/broken-out/support-for-preadv-pwritev.patch
> >
> >This needs a new set of system calls. Ulrich Drepper pointed out
> >that, instead of adding a system call for the limited functionality
> >it provides, why not we add new iovec interface as follows (offset-per-
> >segment) which provides greater functionality & flexibility.
> >
> >+struct niovec
> >+{
> >+	void __user *iov_base;
> >+	__kernel_size_t iov_len;
> >+	__kernel_loff_t iov_off; /* NEW */
> >+};
> >
> >In order to support this, we need to change all the file_operations
> >(readv/writev) and its helper functions to take this new structure.
> >
> >I took a stab at doing it and I want feedback on whether this is
> >acceptable. All the patch does - is to make kernel use new structure,
> >but the existing syscalls like readv()/writev() still deals with
> >original one to keep the compatibility. (pipes and sockets need 
> >changing too - which I have not addressed yet).
> >
> >Is this the right approach ?
> >
> >  
> >
> You can io_submit() a list of IO_CMD_PREAD[V]s and immediately 
> io_getevents() them. In addition to specifying different file offsets 
> you can mix reads and writes, mix file descriptors, and reap nonblocking 
> events quickly (by specifying a timeout of zero).
> 
> Sure, it's two syscalls instead of one, but it's much more flexibles, 
> and databases should be using aio anyway. Oh, and no kernel changes 
> needed, apart from merging vectored aio.


Yes. We discussed this also earlier. Using AIO is the alternative.
But using AIO is not simple as doing preadv()/pwritev() for the
applications doesn't care about using AIO. AIO needs extra coding
to setup context, iocb, submits and getevents etc..

And also, inside the kernel - AIO requests go through lots of
code/routines -- before coming to ->aio_read() -- which I was
planning to avoid by having a direct syscall to do preadv/pwritev.

BTW, we still don't have vectored AIO support in the kernel.
Zack is working on it - which would add another set of
file operations aio_readv/aio_writev.

Thanks,
Badari

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux