On Thu, Jun 01, 2006 at 06:06:59PM -0700, Andrew Morton wrote:
> On Thu, 01 Jun 2006 11:51:25 +0200
> [email protected] wrote:
>
> > This is an attempt to have per-superblock unused dentry lists.
>
> Fairly significant clashes with
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm2/broken-out/fix-dcache-race-during-umount.patch
>
> I guess Neil's patch will go into the 2.6.18 tree, so you'd be best off
> working against that.
Though this patch series fixes the same problem in a much cleaner
way. It effectively obsoletes Neil's fix.
> Also, you're making what appears to be a quite deep design change to a
> pretty important part of the memory reclaim code and all the info we have
> is this:
>
>
> + /*
> + * Try to be fair to the unused lists:
> + * sb_count/sb_unused ~ count/global_unused
> + *
> + * Additionally, if the age_limit of the
> + * superblock is expired shrink at least one
> + * dentry from the superblock
> + */
> + tmp = sb->s_dentry_stat.nr_unused /
> + ((unused / count) + 1);
> + if (!tmp && time_after(jiffies,
> + sb->s_dentry_unused_age))
> + tmp = 1;
>
>
> Please, we'll need much much more description of what this is trying to
> achieve, why it exists, analysis, testing results, etc, etc. Coz my
> immediate reaction is "wtf is that, and what will that do to my computer?".
Discussed in this thread:
http://marc.theaimsgroup.com/?l=linux-fsdevel&m=114890371801114&w=2
Short summary of the problem: due to SHRINK_BATCH resolution, a proportional
reclaim based on "count" across all superblocks will not shrink anything on
lists 2 orders of magnitude smaller than the longest list as tmp will evaluate
as zero. Hence to prevent small unused lists from never being reclaimed and
pinning memory until >90% of the dentry cache has been reclaimed we need to
turn them over slowly. However, if we turn them over too quickly, the dentry
cache does no caching for small filesystems.
This is not a problem a single global unused list has...
> In particular, `jiffies' has near-to-zero correlation with the rate of
> creation and reclaim of these objects, so it looks highly inappropriate
> that it's in there. If anything can be used to measure "time" in this code
> it is the number of scanned entries, not jiffies.
Sure, but SHRINK_BATCH resolution basically makes it impossible to reconcile
lists of vastly different lengths. If the shrinker simply called us
with the entire count it now hands us in batches, I doubt that this would be
an issue.
In the mean time, we need some other method to ensure we do eventually free
up these items on small lists. The above implements an idle timer used to
determine when we start to turn over a small cache. Maybe if we wrap it in:
> + if (!tmp && dentry_lru_idle(sb))
> + tmp = 1;
with a more appropriate comment it would make more sense?
Suggestions on other ways to resolve the problem are welcome....
Cheers,
Dave.
--
Dave Chinner
R&D Software Enginner
SGI Australian Software Group
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]