Re: [patch 1/1] sys_sync_file_range()

Jens Axboe <[email protected]> wrote:
>
> On Mon, Apr 03 2006, Neil Brown wrote:
> > On Friday March 31, [email protected] wrote:
> > > On Thu, Mar 30, 2006 at 06:58:46PM +1100, Neil Brown wrote:
> > > > On Wednesday March 29, [email protected] wrote:
> > > > > Remove the recently-added LINUX_FADV_ASYNC_WRITE and LINUX_FADV_WRITE_WAIT
> > > > > fadvise() additions, do it in a new sys_sync_file_range() syscall
> > > > > instead. 
> > > > 
> > > > Hmmm... any chance this could be split into a sys_sync_file_range and
> > > > a vfs_sync_file_range which takes a 'struct file*' and does less (or
> > > > no) sanity checking, so I can call it from nfsd?
> > > > 
> > > > Currently I implement COMMIT (which has a range) by messing
> > > > around with filemap_fdatawrite and filemap_fdatawait (ignoring the
> > > > range) and I'd rather have a vfs helper.
> > > 
> > > I'm not 100% sure, but it looks like the PF_SYNCWRITE process flag
> > > should be set on the nfsd's while they're doing that, which doesn't
> > > seem to be happening atm.  Looks like a couple of the IO schedulers
> > > will make use of that knowledge now.  All the more reason for a VFS
> > > helper here I guess. ;)
> > 
> > PF_SYNCWRITE? What's that???
> > 
> > (find | xargs grep ...)
> > Oh.  The block device schedulers like to know if a request is sync or
> > async (and all reads are assumed to be sync) - which is reasonable -
> > and so have a per-task flag to tell them - which isn't (IMO).
> > 
> > md/raid (particularly raid5) often does the write from a different
> > process than generated the original request, so that will break
> > completely. 
> 
> I don't think anyone disagrees with you, the sync-write process flag is
> indeed an atrocious beast...

Yeah.  PF_SYNCWRITE was a performance tweak for the anticipatory scheduler.
As cfq is using it as well now (hopefully to good effect) I guess it could
be formalised more.

> > What is wrong with a bio flag I wonder....
> 
> Nothing, in fact I would love for it to be changed. I'm sure such a
> patch would be accepted with open arms! :-)

I think once someone starts coding it, they'll become a big fan of
PF_SYNCWRITE...

For the page writeback functions it's probably possible to use
writeback_control.sync_mode=WB_SYNC_ALL as a trigger and propagate that into
the IO layer.  Maybe that'll always be sufficient - it's hard to tell.  The
writeback paths are twisty and deep...

Then again, using WB_SYNC_ALL as a hint that this process will be waiting
for this writeout to complete is a bit hacky too - it doesn't _really_ mean
that - it just means that I/O should be _started_ against all relevant
dirty data.
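
To make the tradeoff concrete, here is a toy userspace model, not kernel
code.  Every toy_* name is hypothetical and stands in only loosely for
PF_SYNCWRITE, writeback_control.sync_mode and a per-bio flag.  It shows the
md/raid5 problem (a per-task flag is lost when a different thread submits
the write) and the alternative of copying the hint into the request when it
is built:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these are the kernel's real definitions. */
enum toy_sync_mode { TOY_SYNC_NONE, TOY_SYNC_ALL };

struct toy_task { bool sync_write; };              /* models a PF_SYNCWRITE-style task flag */
struct toy_wbc  { enum toy_sync_mode sync_mode; }; /* models writeback_control.sync_mode */
struct toy_req  { bool sync; };                    /* models a per-request (bio-style) flag */

/* Per-task scheme: the scheduler inspects whichever task submits the I/O. */
static bool toy_is_sync_by_task(const struct toy_task *submitter)
{
    return submitter->sync_write;
}

/* Per-request scheme: the hint is copied into the request when it is built. */
static struct toy_req toy_build_req(const struct toy_wbc *wbc)
{
    struct toy_req req = { .sync = (wbc->sync_mode == TOY_SYNC_ALL) };
    return req;
}

static bool toy_is_sync_by_req(const struct toy_req *req)
{
    return req->sync;
}

/*
 * Models a handoff: one task originates a sync write, a separate worker
 * (think raid5 daemon) actually submits it.  Returns true when the
 * per-task view lost the hint while the per-request view kept it.
 */
static bool toy_handoff_loses_task_hint(void)
{
    struct toy_task originator = { .sync_write = true };
    struct toy_task worker     = { .sync_write = false };
    struct toy_wbc  wbc        = { .sync_mode = TOY_SYNC_ALL };
    struct toy_req  req        = toy_build_req(&wbc);

    /* The originator would have been seen as sync had it submitted itself. */
    assert(toy_is_sync_by_task(&originator));

    return !toy_is_sync_by_task(&worker) && toy_is_sync_by_req(&req);
}
```

Note that toy_build_req() bakes in exactly the assumption questioned above:
it treats "sync_mode is SYNC_ALL" as "someone will wait for this request",
which is not quite what WB_SYNC_ALL means.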

Good luck ;)