On Thu, Sep 01, 2005 at 05:48:28PM +0200, Eric Dumazet wrote:
> dentry cache uses sophisticated RCU technology (and prefetching if
> available) but touches 2 cache lines per dentry during hlist lookup.
>
> This patch moves d_hash into the same cache line as the d_parent and
> d_name fields so that:
>
> 1) One cache line is needed instead of two.
> 2) the hlist_for_each_rcu() prefetching has a chance to bring all the
> needed data in advance, not only the part that includes d_hash.next.
>
> I also changed one old comment that was wrong for 64-bit.
>
> A further optimisation would be to separate dentry into two parts, one that
> is mostly read and one that is written (d_count/d_lock), to avoid false
> sharing on SMP/NUMA, but this would need different field placement on
> 32-bit and 64-bit platforms.
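
The layout claim in the quoted patch can be checked without touching the
kernel. The sketch below is only an illustration: the struct definitions are
simplified, hypothetical stand-ins (not the real struct dentry), the 64-byte
line size is an assumption, and the structs are assumed to start on a
cache-line boundary. It shows whether d_hash, d_parent and d_name fall in one
line; it says nothing about measured performance.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CACHE_LINE 64  /* assumed line size; the real value is CPU-specific */

/* Hypothetical stand-ins for the kernel types, reduced to their sizes. */
struct hlist_node_sketch { struct hlist_node_sketch *next, **pprev; };
struct qstr_sketch { unsigned int hash, len; const unsigned char *name; };

/* "Before": d_hash is separated from the fields compared during lookup. */
struct dentry_before {
    struct hlist_node_sketch d_hash;
    char other[CACHE_LINE];          /* stands in for unrelated fields */
    struct dentry_before *d_parent;
    struct qstr_sketch d_name;
};

/* "After": d_hash sits directly in front of d_parent and d_name. */
struct dentry_after {
    char other[CACHE_LINE];          /* stands in for unrelated fields */
    struct hlist_node_sketch d_hash;
    struct dentry_after *d_parent;
    struct qstr_sketch d_name;
};

/* True if the bytes from first_off to the end of the last field
 * (last_off + last_size - 1) all lie in the same cache line. */
static bool same_line(size_t first_off, size_t last_off, size_t last_size)
{
    return first_off / CACHE_LINE == (last_off + last_size - 1) / CACHE_LINE;
}

static bool before_shares_line(void)
{
    return same_line(offsetof(struct dentry_before, d_hash),
                     offsetof(struct dentry_before, d_name),
                     sizeof(struct qstr_sketch));
}

static bool after_shares_line(void)
{
    return same_line(offsetof(struct dentry_after, d_hash),
                     offsetof(struct dentry_after, d_name),
                     sizeof(struct qstr_sketch));
}

/* Compile-time guard: d_hash must start in the line where d_name ends. */
static_assert(offsetof(struct dentry_after, d_hash) / CACHE_LINE ==
              (offsetof(struct dentry_after, d_name) +
               sizeof(struct qstr_sketch) - 1) / CACHE_LINE,
              "lookup fields span more than one cache line");
```

Calling before_shares_line() and after_shares_line() from a small test
program shows which layout keeps the three lookup fields together. A check
like this only confirms placement; whether it helps in practice still needs
benchmark numbers.
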
Do you have performance numbers that show the benefits? In the
past, I tried some optimizations like this but found no demonstrable
benefits. If it ain't broken .....
Thanks
Dipankar