Re: [RFC][PATCH] New iovec support & VFS changes

Badari Pulavarty wrote:
> I was trying to add support for preadv()/pwritev() for threaded
> databases. Currently the patch is in -mm tree.
> 
> http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.15-rc5/2.6.15-rc5-mm3/broken-out/support-for-preadv-pwritev.patch
> 
> This needs a new set of system calls. Ulrich Drepper pointed out
> that, instead of adding a system call for the limited functionality
> it provides, we could add a new iovec interface as follows
> (offset-per-segment), which provides greater functionality & flexibility.
> 
> +struct niovec
> +{
> +	void __user *iov_base;
> +	__kernel_size_t iov_len;
> +	__kernel_loff_t iov_off; /* NEW */
> +};
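
To make the offset-per-segment idea concrete, here is a minimal userspace
sketch of what such a call would do, emulated with one pread() per segment.
It is only an illustration under stated assumptions: struct niovec_user,
readv_per_offset() and niovec_demo() are hypothetical names, not part of the
patch, and a real syscall would do this in the kernel in one step.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical userspace stand-in for the proposed struct niovec. */
struct niovec_user {
	void   *iov_base;	/* destination buffer */
	size_t  iov_len;	/* bytes wanted for this segment */
	off_t   iov_off;	/* file offset of this segment */
};

/*
 * Emulate an offset-per-segment read with one pread() per segment.
 * Returns the total bytes read, or -1 (errno set) if nothing was read.
 * Stops at the first short read, as readv() would at end of file.
 */
ssize_t readv_per_offset(int fd, const struct niovec_user *vec, int cnt)
{
	ssize_t total = 0;

	assert(vec != NULL && cnt >= 0);
	for (int i = 0; i < cnt; i++) {
		ssize_t n;

		/* Retry the same segment if a signal interrupts the read. */
		do {
			n = pread(fd, vec[i].iov_base, vec[i].iov_len,
				  vec[i].iov_off);
		} while (n < 0 && errno == EINTR);

		if (n < 0)
			return total > 0 ? total : -1;
		total += n;
		if ((size_t)n < vec[i].iov_len)
			break;
	}
	return total;
}

/* Usage: read two segments, out of file order, in one call. */
int niovec_demo(void)
{
	char path[] = "/tmp/niovecXXXXXX";
	char a[4], b[4];
	struct niovec_user vec[2] = {
		{ a, sizeof a, 8 },	/* file bytes 8..11 */
		{ b, sizeof b, 0 },	/* file bytes 0..3  */
	};
	int fd = mkstemp(path);
	int ok;

	if (fd < 0)
		return -1;
	unlink(path);
	if (write(fd, "0123456789abcdef", 16) != 16) {
		close(fd);
		return -1;
	}
	ok = readv_per_offset(fd, vec, 2) == 8 &&
	     memcmp(a, "89ab", 4) == 0 && memcmp(b, "0123", 4) == 0;
	close(fd);
	return ok ? 0 : -1;
}
```

The point of the per-segment offset is visible in niovec_demo(): the two
segments need not be contiguous or in file order, which plain readv() and
a single-offset preadv() cannot express.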

For a database, it's also helpful to know when an operation is going
to block on I/O (i.e. because the data isn't cached, or the write
buffers are full) and, if that's going to happen, to move it to another
thread, or move other operations to another thread.  This lets a
program keep working on other things concurrently with I/O more
effectively than thread pool guesswork.

So if you add these new syscalls, it would be helpful to add a "flags"
argument to each of them, and define one flag: "don't block on I/O".
When the flag is set, the syscalls should do as much reading or
writing as they can without blocking, and then return the count, or
EAGAIN if nothing could be transferred without blocking.
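
As a sketch of the calling pattern this flag would allow: the code below is
not the interface proposed in this thread. It assumes a later Linux kernel
and glibc that provide preadv2() with the RWF_NOWAIT flag, whose behaviour
is close to the "don't block on I/O" flag described above. The helper names
read_if_cached(), read_blocking() and nowait_demo() are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#ifndef RWF_NOWAIT
#define RWF_NOWAIT 0x00000008	/* value from the Linux UAPI headers */
#endif

enum read_status { READ_DONE, READ_WOULD_BLOCK, READ_ERROR };

/*
 * Try to read without waiting for disk I/O.
 * READ_DONE:        *got holds the byte count (may be short, 0 at EOF).
 * READ_WOULD_BLOCK: nothing was available at once; the caller should
 *                   hand the read to another thread.
 * READ_ERROR:       errno is set.
 */
enum read_status read_if_cached(int fd, void *buf, size_t len, off_t off,
				ssize_t *got)
{
	struct iovec iov = { .iov_base = buf, .iov_len = len };
	ssize_t n = preadv2(fd, &iov, 1, off, RWF_NOWAIT);

	if (n >= 0) {
		*got = n;
		return READ_DONE;
	}
	/*
	 * EAGAIN: the read would block. EOPNOTSUPP/ENOSYS: the flag is not
	 * supported here, so conservatively assume the read might block.
	 */
	if (errno == EAGAIN || errno == EOPNOTSUPP || errno == ENOSYS)
		return READ_WOULD_BLOCK;
	return READ_ERROR;
}

/* Blocking path, standing in for what a worker thread would run. */
ssize_t read_blocking(int fd, void *buf, size_t len, off_t off)
{
	ssize_t n;

	do {
		n = pread(fd, buf, len, off);
	} while (n < 0 && errno == EINTR);
	return n;
}

/* Usage: try the non-blocking path first, fall back only when needed. */
int nowait_demo(void)
{
	char path[] = "/tmp/nowaitXXXXXX";
	char buf[5];
	ssize_t got = -1;
	enum read_status st;
	int fd = mkstemp(path);

	if (fd < 0)
		return -1;
	unlink(path);
	if (write(fd, "hello", 5) != 5) {
		close(fd);
		return -1;
	}

	st = read_if_cached(fd, buf, sizeof buf, 0, &got);
	if (st == READ_ERROR) {
		close(fd);
		return -1;
	}
	/* A real program would queue this for a worker thread instead. */
	if (st == READ_WOULD_BLOCK || got < (ssize_t)sizeof buf)
		got = read_blocking(fd, buf, sizeof buf, 0);

	close(fd);
	return (got == 5 && memcmp(buf, "hello", 5) == 0) ? 0 : -1;
}
```

The decision to defer stays with the program: the main thread only pays for
a thread hand-off when the kernel reports that the data is not immediately
available.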

(FreeBSD's sendfile() has an SF_NODISKIO flag which means this, and it
is used in exactly that way: so a program can move the sendfile() to
another thread iff that is necessary to avoid blocking the program.)

There's also a case for making these into async I/O operations.
However, if there is any possibility of async I/O blocking a task for
a long time (which apparently there is with Linux async I/O), that is
not half as useful as a flag to stop I/O when it would block, and let
the program decide what to do.

I mention this precisely because it's relevant to I/O performance of
databases and similar programs, and therefore a reason to have a
"flags" argument to these new syscalls, even if no flags are defined
at first.

-- Jamie