Re: [PATCH 2/2] fix file counting

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Dipankar Sarma <[email protected]> wrote:
>
> I have benchmarked this on an x86_64 NUMA system and see no
>  significant performance difference on kernbench. Tested on
>  both x86_64 and powerpc.
> 
>  The way we do file struct accounting is not very suitable for batched
>  freeing. For scalability reasons, file accounting was constructor/destructor
>  based. This meant that nr_files was decremented only when
>  the object was removed from the slab cache. This is
>  susceptible to slab fragmentation. With RCU based file structure,
>  consequent batched freeing and a test program like Serge's,
>  we just speed this up and end up with a very fragmented slab -
> 
>  llm22:~ # cat /proc/sys/fs/file-nr
>  587730  0       758844
> 
>  At the same time, I see only a 2000+ objects in filp cache.
>  The following patch I fixes this problem. 
> 
>  This patch changes the file counting by removing the filp_count_lock.
>  Instead we use a separate percpu counter, nr_files, for now and all
>  accesses to it are through get_nr_files() api. In the sysctl
>  handler for nr_files, we populate files_stat.nr_files before returning
>  to user.
> 
>  Counting files as an when they are created and destroyed (as opposed
>  to inside slab) allows us to correctly count open files with RCU.

Fair enough.

What do you think of these changes?




From: Andrew Morton <[email protected]>

- Nuke the blank line between "}" and EXPORT_SYMBOL().  That's never seemed
  pointful to me.

- Make the get_max_files export use _GPL - only unix.ko uses it.

- Use `-1' in the arg to percpu_counter_mod() rather than `-1L'.  The
  compiler will dtrt and we shouldn't be peering inside percpu_counter
  internals here anyway.

- Scrub that - use percpu_counter_dec() and percpu_counter_inc().

- percpu_counters can be inaccurate on big SMP.  Before we actually fail a
  get_empty_filp() attempt, use the (new in -mm) expensive
  percpu_counter_sum() to check whether we're really over the limit.

- Make get_nr_files() static.  Which is just as well - any callers might
  want the percpu_counter_sum() treatment.  In which case we'd be better off
  exporting some

	bool are_files_over_limit(how_many_i_want);

  API.

Cc: Dipankar Sarma <[email protected]>
Cc: "Paul E. McKenney" <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
---

 fs/file_table.c    |   21 +++++++++++++--------
 include/linux/fs.h |    1 -
 2 files changed, 13 insertions(+), 9 deletions(-)

diff -puN fs/file_table.c~fix-file-counting-fixes fs/file_table.c
--- devel/fs/file_table.c~fix-file-counting-fixes	2006-02-18 01:02:43.000000000 -0800
+++ devel-akpm/fs/file_table.c	2006-02-18 01:02:43.000000000 -0800
@@ -22,6 +22,7 @@
 #include <linux/fsnotify.h>
 #include <linux/sysctl.h>
 #include <linux/percpu_counter.h>
+
 #include <asm/atomic.h>
 
 /* sysctl tunables... */
@@ -42,14 +43,14 @@ static inline void file_free_rcu(struct 
 
 static inline void file_free(struct file *f)
 {
-	percpu_counter_mod(&nr_files, -1L);
+	percpu_counter_dec(&nr_files);
 	call_rcu(&f->f_u.fu_rcuhead, file_free_rcu);
 }
 
 /*
  * Return the total number of open files in the system
  */
-int get_nr_files(void)
+static int get_nr_files(void)
 {
 	return percpu_counter_read_positive(&nr_files);
 }
@@ -61,8 +62,7 @@ int get_max_files(void)
 {
 	return files_stat.max_files;
 }
-
-EXPORT_SYMBOL(get_max_files);
+EXPORT_SYMBOL_GPL(get_max_files);
 
 /*
  * Handle nr_files sysctl
@@ -95,15 +95,20 @@ struct file *get_empty_filp(void)
 	/*
 	 * Privileged users can go above max_files
 	 */
-	if (get_nr_files() >= files_stat.max_files &&
-				!capable(CAP_SYS_ADMIN))
-		goto over;
+	if (get_nr_files() >= files_stat.max_files && !capable(CAP_SYS_ADMIN)) {
+		/*
+		 * percpu_counters are inaccurate.  Do an expensive check before
+		 * we go and fail.
+		 */
+		if (percpu_counter_sum(&nr_files) >= files_stat.max_files)
+			goto over;
+	}
 
 	f = kmem_cache_alloc(filp_cachep, GFP_KERNEL);
 	if (f == NULL)
 		goto fail;
 
-	percpu_counter_mod(&nr_files, 1L);
+	percpu_counter_inc(&nr_files);
 	memset(f, 0, sizeof(*f));
 	if (security_file_alloc(f))
 		goto fail_sec;
diff -puN include/linux/fs.h~fix-file-counting-fixes include/linux/fs.h
--- devel/include/linux/fs.h~fix-file-counting-fixes	2006-02-18 01:02:43.000000000 -0800
+++ devel-akpm/include/linux/fs.h	2006-02-18 01:02:43.000000000 -0800
@@ -35,7 +35,6 @@ struct files_stat_struct {
 	int max_files;		/* tunable */
 };
 extern struct files_stat_struct files_stat;
-extern int get_nr_files(void);
 extern int get_max_files(void);
 
 struct inodes_stat_t {
_

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux