On Fri, 23 Sep 2005 10:36:16 +1000, Con Kolivas said: (Adding Andrea to the To: list...) > On Fri, 23 Sep 2005 05:59, [email protected] wrote: > > Am seeing reproducible wedging up when writing large (20M+) files to an > > ext3 file system. Oddly enough, if something *else* writes files to the > > file system as well, it will unwedge for a while and make progress. Also, > > a 'sync' command will relieve things temporarily - but after a few > > megabytes it comes to a halt again. Looks like a borkage someplace not > > causing it to actually finish pushing dirty file pages out - gkrellm > > reports little/no disk activity in progress. File activity on *other* > > filesystems continues unimpeded. > > > Could be the write throttling patches. > > Try backing these out (in this order I think): > > http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.14-rc2/2.6 .14-rc2-mm1/broken-out/per-task-predictive-write-throttling-1-tweaks.patch > http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.14-rc2/2.6 .14-rc2-mm1/broken-out/per-task-predictive-write-throttling-1.patch Bingo. I haven't built a kernel with these excluded, but writing 0 to /proc/sys/vm/dirty_ratio_centisecs fixes the problem, so I'm pretty sure this is it. (For the record, I've noticed the starvation issue that Andrea is trying to address, where one process can lock out others, so I *do* think work is needed here...) Tuning/debugging info I gathered: 1) I was seeing 'future_pages' values averaging as high as 7K to 14K - is this a "reasonable" number when writing a 38M file to a fairly sluggish laptop disk (gkrellm showing 1M/sec to 3M/sec often, maybe 10M/sec if it's a single fairly linear write..). Popular values that were seen on multiple trials included 14364, 13851, and 13338 (though a few times, a wedge would hit 14364, and a few seconds later would "burp" up a bunch of disk I/O and drop to 7626 and stay there). The reproducability of the numbers probably says more about the fact that the system is otherwise basically idle at 3AM than anything else (so the number of available pages isn't bouncing around due to other processes). 2) 'centiseconds' value of 0 disabled as designed. The default value of '500' is *waaay* too high on my laptop - even 100 is consistently too much. 40 was consistently low enough, 50 was usually OK, 75 was usually *not* OK, but would sometimes "stutter" through and not completely grind to a halt. I'm not sure exactly where the "knee" is, or if it moves during higher-load (my laptop works harder during the day in the office than 2AM at home, usually). The patch includes documentation: +dirty_ratio_centisecs +----------------- + +Throttle the I/O if the per-task writing bandwidth is high enough for +the dirty_ratio to be reached in less than dirty_ratio_centisecs. This +makes the write throttling per-process and avoids making too much +memory dirty at the same time. Ideally in the future we should add +some feedback from the backing_dev_info to know the max disk bandwidth. I'm pretty convinced that for this patch to work, it *will* need feedback from the actual (not max) disk bandwidth and possibly the actual amount of RAM - what works on Andrea's 1G workstation with (presumably) a real disk system is waay too much for 256M and a single laptop-class disk. For now, I'm leaving centisecs set to 40, and will see how that works - most of my "problem cases" involve an FTP on a 10/100mbit connection, so that will get tried tomorrow.....
Attachment:
pgpBxrUY8xkyd.pgp
Description: PGP signature
- Follow-Ups:
- Re: 2.6.14-rc2-mm1 - ext3 wedging up
- From: Andrea Arcangeli <[email protected]>
- Re: 2.6.14-rc2-mm1 - ext3 wedging up
- From: Con Kolivas <[email protected]>
- Re: 2.6.14-rc2-mm1 - ext3 wedging up
- From: Andrea Arcangeli <[email protected]>
- Re: 2.6.14-rc2-mm1 - ext3 wedging up
- References:
- 2.6.14-rc2-mm1 - ext3 wedging up
- From: [email protected]
- Re: 2.6.14-rc2-mm1 - ext3 wedging up
- From: Con Kolivas <[email protected]>
- 2.6.14-rc2-mm1 - ext3 wedging up
- Prev by Date: Re: [patch 0/6] mm: alloc_percpu and bigrefs
- Next by Date: Re: [Fastboot] [PATCH] Kdump(x86): add note type NT_KDUMPINFO tokernel core dumps
- Previous by thread: Re: 2.6.14-rc2-mm1 - ext3 wedging up
- Next by thread: Re: 2.6.14-rc2-mm1 - ext3 wedging up
- Index(es):