Dipankar Sarma a écrit :
Eric,
Going by your patch, I converted my nr-files patch to use
percpu counters - except that I just used the existing
percpu counter code. This patch is untested, just for comments.
If you agree with the approach, we can go with it. I will
get some benchmark numbers measured on a 16-CPU box.
Thanks
Dipankar
The way we do file struct accounting is not very suitable for batched
freeing. For scalability reasons, file accounting was constructor/destructor
based. This meant that nr_files was decremented only when
the object was removed from the slab cache. This is
susceptible to slab fragmentation. With RCU based file structure,
consequent batched freeing and a test program like Serge's,
we just speed this up and end up with a very fragmented slab -
llm22:~ # cat /proc/sys/fs/file-nr
587730 0 758844
At the same time, I see only a 2000+ objects in filp cache.
The following patch I fixes this problem.
This patch changes the file counting by removing the filp_count_lock.
Instead we use a separate percpu counter, nr_files, for now and all
accesses to it are through get_nr_files() api. In the sysctl
handler for nr_files, we populate files_stat.nr_files before returning
to user.
Counting files as an when they are created and destroyed (as opposed
to inside slab) allows us to correctly count open files with RCU.
Signed-off-by: Dipankar Sarma <[email protected]>
---
fs/dcache.c | 2 -
fs/file_table.c | 80 +++++++++++++++++++++++++++++++--------------------
include/linux/file.h | 2 -
include/linux/fs.h | 2 +
kernel/sysctl.c | 5 ++-
net/unix/af_unix.c | 2 -
6 files changed, 57 insertions(+), 36 deletions(-)
Hi Dipankar
I believe your patch is good.
However there is a bug in get_empty_filp() :
if security_file_alloc(f) returns TRUE : The goto fail_sec; is done and
fail_sec: does a file_free(f)
This means an extra percpu_counter_mod(&nr_files, -1L); will be done.
So I think you must apply this patch too :
fail_sec:
- file_free(f);
+ kmem_cache_free(filp_cachep, f);
fail:
return NULL;
(Ie no need for a delayed free done via rcu here, just a direct
kmem_cache_free() call)
Eric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]