Re: [RFD] BIO_RW_BARRIER - what it means for devices, filesystems, and dm/md.

On Monday May 28, [email protected] wrote:
> Neil Brown writes:
>  > 
> 
> [...]
> 
>  > Thus the general sequence might be:
>  > 
>  >   a/ issue all "preceding writes".
>  >   b/ issue the commit write with BIO_RW_BARRIER
>  >   c/ wait for the commit to complete.
>  >          If it was successful - done.
>  >          If it failed other than with EOPNOTSUPP, abort
>  >          else continue
>  >   d/ wait for all 'preceding writes' to complete
>  >   e/ call blkdev_issue_flush
>  >   f/ issue commit write without BIO_RW_BARRIER
>  >   g/ wait for commit write to complete
>  >        if it failed, abort
>  >   h/ call blkdev_issue
>  >   DONE
>  > 
>  > steps b and c can be left out if it is known that the device does not
>  > support barriers.  The only way to discover this is to try and see if it
>  > fails.
>  > 
>  > I don't think any filesystem follows all these steps.
> 
> It seems that steps b/ -- h/ are quite generic, and can be implemented
> once in generic code (with some synchronization mechanism like
> wait-queue at d/).

Yes and no.
It depends on what you mean by "preceding write".

If you implement this in the filesystem, the filesystem can wait only
for those writes where it has an ordering dependency.   If you
implement it in common code, then you have to wait for all writes
that were previously issued.

e.g.
  If you have two different filesystems on two different partitions on
  the one device, why should writes in one filesystem wait for a
  barrier issued in the other filesystem?
  If you have a single filesystem with one thread doing lots of
  over-writes (no metadata changes) and another doing lots of
  metadata changes (requiring journalling and barriers), why should the
  data writes be held up by the metadata updates?
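
To put rough numbers on the difference, here is a minimal sketch in
plain userspace C.  Nothing in it is kernel API: struct pending_write
and both helper functions are hypothetical names invented for
illustration.  It only counts how many outstanding writes each
approach would have to wait for before issuing a commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record of a write that was issued but has not completed. */
struct pending_write {
    int  owner;       /* which filesystem (or partition) issued it */
    bool journalled;  /* true if the commit block depends on it */
};

/* Common code cannot see dependencies, so it must wait for everything. */
size_t writes_to_wait_common(const struct pending_write *w, size_t n)
{
    (void)w;
    return n;
}

/* A filesystem waits only for its own writes that the commit depends on. */
size_t writes_to_wait_fs(const struct pending_write *w, size_t n, int owner)
{
    size_t count = 0;

    assert(w != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (w[i].owner == owner && w[i].journalled)
            count++;
    return count;
}
```

With five writes outstanding, two of them journalled writes from
filesystem 1, the common-code version waits for all five while
filesystem 1 would wait for only two.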

So I'm not actually convinced that doing this in common code is the
best approach.  But it is the easiest.  The common code should provide
the barrier and flushing primitives, and the filesystem gets to use
them however it likes.
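
As a rough illustration of a filesystem driving those primitives
itself, the quoted sequence a/ to h/ could be sketched as below.  This
is a userspace mock, not kernel code: struct mock_dev, the mock_*
helpers and commit_transaction() are all hypothetical, mock_flush()
merely stands in for blkdev_issue_flush, and since step h/ is cut
short in the quoted text the sketch assumes it is a second flush.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mock device; not a kernel structure. */
struct mock_dev {
    bool supports_barrier;          /* device accepts barrier writes */
    bool fail_io;                   /* force -EIO on the commit write */
    bool barrier_known_unsupported; /* remembered after -EOPNOTSUPP */
    int  in_flight;                 /* writes issued, not yet completed */
    int  flushes;                   /* cache flushes performed */
    int  barrier_attempts;          /* barrier writes attempted */
};

static void mock_issue_writes(struct mock_dev *dev, int n) { dev->in_flight += n; }
static void mock_wait_writes(struct mock_dev *dev) { dev->in_flight = 0; }
static void mock_flush(struct mock_dev *dev) { dev->flushes++; }

/* Issue the commit write and wait for it; returns 0 or a negative errno. */
static int mock_commit_write(struct mock_dev *dev, bool barrier)
{
    if (barrier) {
        dev->barrier_attempts++;
        if (!dev->supports_barrier)
            return -EOPNOTSUPP;
        dev->in_flight = 0; /* barrier orders commit after earlier writes */
    }
    return dev->fail_io ? -EIO : 0;
}

int commit_transaction(struct mock_dev *dev, int n_preceding)
{
    int err;

    assert(dev != NULL);
    mock_issue_writes(dev, n_preceding);      /* a/ */

    if (!dev->barrier_known_unsupported) {
        err = mock_commit_write(dev, true);   /* b/ and c/ */
        if (err == 0)
            return 0;                         /* successful: done */
        if (err != -EOPNOTSUPP)
            return err;                       /* any other failure: abort */
        dev->barrier_known_unsupported = true;
    }

    mock_wait_writes(dev);                    /* d/ */
    mock_flush(dev);                          /* e/ */
    err = mock_commit_write(dev, false);      /* f/ and g/ */
    if (err)
        return err;                           /* abort */
    mock_flush(dev);                          /* h/ (assumed: second flush) */
    return 0;
}
```

On a device that rejects barriers, the first commit pays for one
failed barrier attempt plus two flushes; later commits skip b/ and c/
because the failure has been remembered.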

NeilBrown