Re: [PATCH 09/12] mm: remove throttle_vm_writeback

On Thu, 2007-04-05 at 15:44 -0700, Andrew Morton wrote:
> On Thu, 05 Apr 2007 19:42:18 +0200
> root@programming.kicks-ass.net wrote:
> 
> > rely on accurate dirty page accounting to provide enough push back
> 
> I think we'd like to see a bit more justification than that, please.

it should read like this:

        for ( ; ; ) {
		get_dirty_limits(&background_thresh, &dirty_thresh, NULL, NULL);

                /*
                 * Boost the allowable dirty threshold a bit for page
                 * allocators so they don't get DoS'ed by heavy writers
                 */
                dirty_thresh += dirty_thresh / 10;      /* wheeee... */

                if (global_page_state(NR_FILE_DIRTY) + 
		    global_page_state(NR_UNSTABLE_NFS) +
		    global_page_state(NR_WRITEBACK) <= dirty_thresh)
                        	break;

                congestion_wait(WRITE, HZ/10);
        }

[ note the extra NR_FILE_DIRTY ]

now, balance_dirty_pages() is there to ensure:

  nr_dirty + nr_unstable + nr_writeback < dirty_thresh      (1)

reclaim will (with the introduction of dirty page tracking) never
generate dirty pages, so the only disturbance of that equation is an
increase in nr_writeback.

[ pageout() sets wbc.for_reclaim=1, so NFS traffic will not generate
  unstable pages ]

So, what throttle_vm_writeout() does is limit the number of added
writeback pages to 10% of the total limit.

pageout() seems to avoid stuffing pages down a congested bdi 
(TODO: has details), along with the much smaller io-queues, the initial
purpose of this function - which was to avoid all memory getting stuck
in io-queues - seems to be handled.

Now the problems...

Trouble is that it currently does not take nr_dirty into account which
in the worst case limits it to 110% of the limit.

Also, I'm seeing (2.6.23-rc8-mm1) live-locks in throttle_vm_writeback()
where nr_dirty + nr_unstable > thresh - which according to (1) should
not happen, and will not change without explicit action.

Hmm maybe the 10% is < nr_cpus * ratelimit_pages.

2 cpus, mem=128M -> ratelimit_pages ~ 512
threshold ~ 1500

so indeed: 150 < 1024.

Still not conclusive but at least getting somewhere.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Prev by Date: [REGRESSION from 2.6.23-rc8] (was: Re: 2.6.23-rc4-mm1 and -rc6-mm1: boot failure on HP nx6325, related to clockevents)
Next by Date: Re: [PATCH 7/8] taskstats: fix stats->ac_exitcode to work on threads and use group_exit_code
Previous by thread: Re: + uninline-find_task_by_xxx-set-of-functions.patch added to -mm tree
Next by thread: 2.6.23-rc8-rt1
Index(es):
- Date
- Thread

[Index of Archives] [Kernel Newbies] [Netfilter] [Bugtraq] [Photo] [Stuff] [Gimp] [Yosemite News] [MIPS Linux] [ARM Linux] [Linux Security] [Linux RAID] [Video 4 Linux] [Linux for the blind] [Linux Resources]