Re: VFS: file-max limit 50044 reached

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Oct 17, 2005 at 08:01:21PM +0200, Eric Dumazet wrote:
> Dipankar Sarma a écrit :
> >On Mon, Oct 17, 2005 at 09:16:25AM -0700, Linus Torvalds wrote:
> >
> >>>Absolutely. Keeping a count of (percpu) queued items is basically free 
> >>>if kept
> >>>in the cache line used by list head, so the 'queue length on this cpu' 
> >>>is a
> >>>cheap metric.
> >>
> >>The only downside to TIF_RCUUPDATE is that those damn TIF-flags are 
> >>per-architecture (probably largely unnecessary, but while most 
> >>architectures don't care at all, others seem to have optimized their 
> >>layout so that they can test the work bits more efficiently). So it's a 
> >>matter of each architecture being updated with its TIF_xyz flag and their 
> >>work function.
> >>
> >>Anybody willing to try? Dipankar apparently has a lot on his plate, this 
> >>_should_ be fairly straightforward. Eric?
> >
> >
> >I *had*, when this hit me :) It was one those spurt things. I am going to
> >look at this, but I think we will need to do this with some careful
> >benchmarking.
> >
> >At the moment however I do have another concern - open/close taking too
> >much time as I mentioned in an earlier email. It is nearly 4 times
> >slower than 2.6.13. So, that is first up in my list of things to
> >do at the moment.
> >
> 
> <lazy_mode=ON>
> Do we really need a TIF_RCUUPDATE flag, or could we just ask for a resched ?
> </lazy_mode>
> 
> This patch only take care of call_rcu(), I'm unsure of what can be done 
> inside call_rcu_bh()
> 
> The two stress program dont hit OOM anymore with this patch applied (even 
> with maxbatch=10)

Keeping the per-CPU count of queued callbacks seems eminently reasonable
to me, as does the set_need_resched().  But the proposed (but fortunately
commented out) call of rcu_do_batch() from call_rcu() does have deadlock
issues.

> Eric
> 

> --- linux-2.6.14-rc4/kernel/rcupdate.c	2005-10-11 03:19:19.000000000 +0200
> +++ linux-2.6.14-rc4-ed/kernel/rcupdate.c	2005-10-17 21:52:18.000000000 +0200
> @@ -109,6 +109,10 @@
>  	rdp = &__get_cpu_var(rcu_data);
>  	*rdp->nxttail = head;
>  	rdp->nxttail = &head->next;
> +
> +	if (unlikely(++rdp->count > 10000))
> +		set_need_resched();
> +
>  	local_irq_restore(flags);
>  }
>  
> @@ -140,6 +144,12 @@
>  	rdp = &__get_cpu_var(rcu_bh_data);
>  	*rdp->nxttail = head;
>  	rdp->nxttail = &head->next;
> +	rdp->count++;

Really need an "rdp->count++" in call_rcu_bh() as well, otherwise
the _bh struct rcu_data will have a steadily decreasing count field.
Strictly speaking, this is harmless, since call_rcu_bh() cheerfully
ignores this field, but this situation is bound to cause severe confusion
at some point.

> +/*
> + *  Should we directly call rcu_do_batch() here ?
> + *  if (unlikely(rdp->count > 10000))
> + *      rcu_do_batch(rdp);
> + */

Good thing that the above is commented out!  ;-)

Doing this can result in self-deadlock, for example with the following:

	spin_lock(&mylock);
	/* do some stuff. */
	call_rcu(&p->rcu_head, my_rcu_callback);
	/* do some more stuff. */
	spin_unlock(&mylock);

void my_rcu_callback(struct rcu_head *p)
{
	spin_lock(&mylock);
	/* self-deadlock via call_rcu() via rcu_do_batch()!!! */
	spin_unlock(&mylock);
}


						Thanx, Paul

>  }
>  
> @@ -157,6 +167,7 @@
>  		next = rdp->donelist = list->next;
>  		list->func(list);
>  		list = next;
> +		rdp->count--;
>  		if (++count >= maxbatch)
>  			break;
>  	}
> --- linux-2.6.14-rc4/include/linux/rcupdate.h	2005-10-11 03:19:19.000000000 +0200
> +++ linux-2.6.14-rc4-ed/include/linux/rcupdate.h	2005-10-17 21:02:25.000000000 +0200
> @@ -94,6 +94,7 @@
>  	long  	       	batch;           /* Batch # for current RCU batch */
>  	struct rcu_head *nxtlist;
>  	struct rcu_head **nxttail;
> +	long            count; /* # of queued items */
>  	struct rcu_head *curlist;
>  	struct rcu_head **curtail;
>  	struct rcu_head *donelist;

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux