Re: [PATCH] Prevent large file writeback starvation

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



David Chinner <[email protected]> wrote:
>
> > In which case, yes, perhaps leaving big-dirty-expired inode on s_io is the
> > right thing to do.  But should we be checking that it has expired before
> > deciding to do this?
> 
> Well, we are writing it out because it has expired in the first place,
> right? And it remains expired until we actually clean it, so I
> don't see any need for a check such as this....

OK.  I was worried about redirtyings while inode_lock is dropped, but the
I_DIRTY and _LOCK logic looks tight to me.

> > We don't want to get in a situation where continuous
> > overwriting of a large file causes other files on that fs to never be
> > written out.
> 
> Agreed. That's why i proposed the s_more_io queue - it works on those inodes
> that need more work only after all the other inodes have been written out.
> That prevents starvation, and makes large inode flushes background work (i.e.
> occur when there is nothing else to do). it will get much better disk
> utilisation than the method I originally proposed, as well, because it'll keep
> the disk near congestion levels until the data is written out...
> 

Yes, s_more_io does make sense.  So now dirty inodes can be on one of three
lists.  It'll be fun writing the changelog for this one.  And we'll need a
big fat comment describing what the locks do, and the protocol for handling
them.

We need to be extra-careful to not break sys_sync(), umount, etc.  I guess
if !for_kupdate, we splice s_dirty and s_more_io onto s_io and go for it.

So the protocol would be:

s_io: contains expired and non-expired dirty inodes, with expired ones at
the head.  Unexpired ones (at least) are in time order.

s_more_io: contains dirty expired inodes which haven't been fully written. 
Ordering doesn't matter (unless someone goes and changes
dirty_expire_centisecs - but as long as we don't do anything really bad in
response to this we'll be OK).

s_dirty: contains expired and non-expired dirty inodes.  The non-expired
ones are in time-of-dirtying order.



I wonder if it would be saner to have separate lists for expired and
unexpired inodes.  If when writing an expired inode we don't write it all
out, just rotate it to the back of the expired inode list.  On entry to
sync_sb_inodes, do a walk of s_dirty, moving expired inodes onto the
expired list.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux