Re: [PATCH] writeback: avoid possible balance_dirty_pages() lockup on a light-load bdi

On Tue, 2 Oct 2007 10:00:40 +0800 Fengguang Wu <[email protected]> wrote:

> writeback: avoid possible balance_dirty_pages() lockup on a light-load bdi
> 
> On a busy-writing system, a writer could be hold up infinitely on a
> light-load device. It will be trying to sync more than available dirty data.
> 
> The problem case:
> 
> 0. sda/nr_dirty >= dirty_limit;
>    sdb/nr_dirty == 0
> 1. dd writes 32 pages on sdb
> 2. balance_dirty_pages() blocks dd, and tries to write 6MB.
> 3. it never gets there: there's only 128KB dirty data.
> 4. dd may be blocked for a loooong time

Please quantify loooong.

> Fix it by returning on 'zero dirty inodes' in the current bdi.
> (In fact there are slight differences between 'dirty inodes' and 'dirty pages'.
> But there is no available counters for 'dirty pages'.)
> 
> But the newly introduced 'break' could make the nr_writeback drift away
> above the dirty limit. The workaround is to limit the error under 1MB.

I'm still not sure that we fully understand this yet.

If the sdb writer is stuck in balance_dirty_pages() then all sda writers
will be in balance_dirty_pages() too, madly writing stuff out to sda.  And
pdflush will be writing out sda as well.  All this writeout to sda should
release the sdb writer.

Why isn't this happening?

> Cc: Chuck Ebbert <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Signed-off-by: Fengguang Wu <[email protected]>
> ---
>  mm/page-writeback.c |    5 +++++
>  1 file changed, 5 insertions(+)
> 
> --- linux-2.6.22.orig/mm/page-writeback.c
> +++ linux-2.6.22/mm/page-writeback.c
> @@ -250,6 +250,11 @@ static void balance_dirty_pages(struct a
>  			pages_written += write_chunk - wbc.nr_to_write;
>  			if (pages_written >= write_chunk)
>  				break;		/* We've done our duty */
> +			if (list_empty(&mapping->host->i_sb->s_dirty) &&
> +			    list_empty(&mapping->host->i_sb->s_io) &&
> +			    nr_reclaimable + global_page_state(NR_WRITEBACK) <=
> +				    dirty_thresh + (1 << (20-PAGE_CACHE_SHIFT)))
> +				break;
>  		}
>  		congestion_wait(WRITE, HZ/10);
>  	}

Well that has a nice safetly net.  Perhaps it could fail a bit later on,
but that depends on why it's failing.

How well tested was this?

If we merge this for 2.6.23 then I expect that we'll immediately unmerge it
for 2.6.24 because Peter's stuff fixes this problem by other means.

Do we all agree with the above sentence?
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Follow-Ups:
- Re: [PATCH] writeback: avoid possible balance_dirty_pages() lockup on a light-load bdi
  - From: Fengguang Wu <[email protected]>

References:
- Re: A unresponsive file system can hang all I/O in the system on linux-2.6.23-rc6 (dirty_thresh problem?)
  - From: Chuck Ebbert <[email protected]>
- [PATCH] writeback: avoid possible balance_dirty_pages() lockup on a light-load bdi
  - From: Fengguang Wu <[email protected]>

Prev by Date: [PATCH] writeback: avoid possible balance_dirty_pages() lockup on a light-load bdi
Next by Date: Re: [Patch / 002](memory hotplug) Callback function to create kmem_cache_node.
Previous by thread: [PATCH] writeback: avoid possible balance_dirty_pages() lockup on a light-load bdi
Next by thread: Re: [PATCH] writeback: avoid possible balance_dirty_pages() lockup on a light-load bdi
Index(es):
- Date
- Thread

[Index of Archives] [Kernel Newbies] [Netfilter] [Bugtraq] [Photo] [Stuff] [Gimp] [Yosemite News] [MIPS Linux] [ARM Linux] [Linux Security] [Linux RAID] [Video 4 Linux] [Linux for the blind] [Linux Resources]