Re: [RFC,PATCH] Use RCU to protect tasklist for unicast signals

On Wed, Aug 17, 2005 at 06:35:03PM +0400, Oleg Nesterov wrote:
> Paul E. McKenney wrote:
> >
> > > The other thing that jumped out at me is that signals are very different
> > > animals from a locking viewpoint depending on whether they are:
> > >
> > > 1.	ignored,
> > >
> > > 2.	caught by a single thread,
> > >
> > > 3.	fatal to multiple threads/processes (though I don't know
> > > 	of anything that shares sighand_struct between separate
> > > 	processes), or
> > >
> > > 4.	otherwise global to multiple threads/processes (such as
> > > 	SIGSTOP and SIGCONT).
> > >
> > > And there are probably other distinctions that I have not yet caught
> > > on to.
> > >
> > > One way to approach this would be to make your suggested lock_task_sighand()
> > > look at the signal and acquire the appropriate locks.  If, having acquired
> > > a given set of locks, it found that the needed set had changed (e.g., due
> > > to racing exec() or sigaction()), then it drops the locks and retries.
> >
> > OK, for this sort of approach to work, lock_task_sighand() must take and
> > return some sort of mask indicating what locks are held.  The mask returned
> > by lock_task_sighand() would then be passed to an unlock_task_sighand().
> 
> Sorry, I don't understand you. CLONE_THREAD implies CLONE_SIGHAND,
> so we always need to lock one ->sighand. Could you please clarify?

On the #3 and #4 code paths, the code assumes that the task-list lock
is held.  So I was thinking of something (very roughly) as follows:

#define SIGLOCK_HOLD_RCU      (1 << 0)
#define SIGLOCK_HOLD_TASKLIST (1 << 1)
#define SIGLOCK_HOLD_SIGLOCK  (1 << 2)

int lock_task_sighand(int sig, struct task_struct *p, int locksheld, struct sighand_struct **spp, int *flags)
{
	int locksret = 0;
	struct sighand_struct *sp;

retry:
	if (!(locksheld & SIGLOCK_HOLD_RCU)) {
		locksret |= SIGLOCK_HOLD_RCU;
		rcu_read_lock();
	}
	sp = rcu_dereference(p->sighand);
	if (sp == NULL) {
		unlock_task_sighand(NULL, locksret, *flags);
		*spp = NULL;
		return 0;
	}
	if (!(locksheld & SIGLOCK_HOLD_TASKLIST)) {
		/* Complain if siglock held. */
		if (sig_kernel_stop(sig) /* expand for other conditions */) {
			locksret |= SIGLOCK_HOLD_TASKLIST;
			read_lock(&tasklist_lock);
		}
	}
	if (!(locksheld & SIGLOCK_HOLD_SIGLOCK)) {
		*flags = 0;
	} else {
		locksret |= SIGLOCK_HOLD_SIGLOCK;
		spin_lock_irqsave(&sp->siglock, *flags);
		if (p->sighand != sp) {
			unlock_task_sighand(sp, locksret, *flags);
			goto retry;
		}
	}
	*spp = sp;
	return locksret;
}

void unlock_task_sighand(struct sighand_struct *sp, int lockstodrop, int flags)
{
	/* lockstodrop had better be non-negative! */

	if (lockstodrop & SIGLOCK_HOLD_SIGLOCK) {
		spin_unlock_irqrestore(&sp->siglock, flags);
	}
	if (lockstodrop & SIGLOCK_HOLD_TASKLIST) {
		read_unlock(&tasklist_lock);
	}
	if (lockstodrop & SIGLOCK_HOLD_RCU) {
		rcu_read_unlock();
	}
}

The "expand for other conditions" must also cover signals that cause
coredumps, that kill all threads in a process, or that otherwise affect
more than one thread (e.g., kill with a negative signal number).  I am
assuming that your argument about not needing the get_task_struct_rcu()
eventually work their way through my skull.  ;-)

Of course, the "expand for other conditions" requires reference to the
sighand_struct.  And for that, you really need to be holding siglock.
But you have to drop siglock to acquire tasklist_lock.  So will end up
relying on RCU some more to safely peek into sighand_struct -before-
acquiring siglock -- which is why I snapshot p->sighand so early, it
will need to be referenced when checking "other conditions".

I am not exactly happy with the above pair of functions, but I suspect
that they might -really- simplify things a bit.

The usage would be as follows:

	if ((lockstodrop = lock_task_sighand(sig, tsk, 0, &sp, &flags)) < 0)
		return lockstodrop;  /* error code */
	if (sp != NULL) {
		/* invoke the function to send the signal */
	}
	unlock_task_sighand(sp, lockstodrop, flags);

Thoughts?  Hey, you asked for this!!!  ;-)

> > > > +	if (!ret && sig && (sp = p->sighand)) {
> > > >  		if (!get_task_struct_rcu(p)) {
> > > >  			return -ESRCH;
> > > >  		}
> > > > -		spin_lock_irqsave(&p->sighand->siglock, flags);
> > > > +		spin_lock_irqsave(&sp->siglock, flags);
> > > > +		if (p->sighand != sp) {
> > > > +			spin_unlock_irqrestore(&sp->siglock, flags);
> > > > +			put_task_struct(p);
> > > > +			goto retry;
> > > > +		}
> > > >  		ret = __group_send_sig_info(sig, info, p);
> > > > -		spin_unlock_irqrestore(&p->sighand->siglock, flags);
> > > > +		spin_unlock_irqrestore(&sp->siglock, flags);
> > > >  		put_task_struct(p);
> > >
> > > Do we really need get_task_struct_rcu/put_task_struct here?
> > >
> > > The task_struct can't go away under us, it is rcu protected.
> > > When ->sighand is locked, and it is still the same after
> > > the re-check, it means that 'p' has not done __exit_signal()
> > > yet, so it is safe to send the signal.
> > >
> > > And if the task has ->usage == 0, it means that it also has
> > > ->sighand == NULL, and your code will notice that.
> > >
> > > No?
> >
> > Seems plausible.  I got paranoid after seeing the lock dropped in
> > handle_stop_signal(), though.
> 
> Yes, this is bad and should be fixed, I agree.
> 
> But why do you think we need to bump task->usage? It can't make any
> difference, afaics. The task_struct can't dissapear, the caller was
> converted to use rcu_read_lock() or it holds tasklist_lock.
> 
> Nonzero task_struct->usage can't stop do_exit or sys_wait4, it will
> only postpone call_rcu(__put_task_struct_cb).
> 
> And after we locked ->sighand we have sufficient memory barrier, so
> if we read the stale value into 'sp' we will notice that (if you were
> worried about this issue).
> 
> Am I missed something?

I doubt if you are missing anything.  I was just being paranoid.
Seeing locks get momentarily get dropped in the middle of functions is
a -really- good way to get me into that state!  But since that code
path can be called with tasklist_lock held, it should not be sleeping,
so I believe that you are correct.

Hence my switching to rcu_read_lock() in lock_task_sighand() above.

> >  void exit_sighand(struct task_struct *tsk)
> >  {
> >  	write_lock_irq(&tasklist_lock);
> > -	__exit_sighand(tsk);
> > +	spin_lock(&tsk->sighand->siglock);
> > +	if (tsk->sighand != NULL) {
> > +		__exit_sighand(tsk);
> > +	}
> > +	spin_unlock(&tsk->sighand->siglock);
> >  	write_unlock_irq(&tasklist_lock);
> >  }
> 
> Very strange code. Why this check? And what happens with
> 
> 	spin_unlock(&tsk->sighand->siglock);
> 
> when tsk->sighand == NULL ?

My bad.  Ingo beat you to it though.  ;-)

							Thanx, Paul
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Follow-Ups:
- Re: [RFC,PATCH] Use RCU to protect tasklist for unicast signals
  - From: Oleg Nesterov <oleg@tv-sign.ru>
- Re: [RFC,PATCH] Use RCU to protect tasklist for unicast signals
  - From: Oleg Nesterov <oleg@tv-sign.ru>

References:
- Re: [RFC,PATCH] Use RCU to protect tasklist for unicast signals
  - From: Oleg Nesterov <oleg@tv-sign.ru>
- Re: [RFC,PATCH] Use RCU to protect tasklist for unicast signals
  - From: "Paul E. McKenney" <paulmck@us.ibm.com>
- Re: [RFC,PATCH] Use RCU to protect tasklist for unicast signals
  - From: Oleg Nesterov <oleg@tv-sign.ru>
- Re: [RFC,PATCH] Use RCU to protect tasklist for unicast signals
  - From: "Paul E. McKenney" <paulmck@us.ibm.com>
- Re: [RFC,PATCH] Use RCU to protect tasklist for unicast signals
  - From: Oleg Nesterov <oleg@tv-sign.ru>
- Re: [RFC,PATCH] Use RCU to protect tasklist for unicast signals
  - From: "Paul E. McKenney" <paulmck@us.ibm.com>
- Re: [RFC,PATCH] Use RCU to protect tasklist for unicast signals
  - From: "Paul E. McKenney" <paulmck@us.ibm.com>
- Re: [RFC,PATCH] Use RCU to protect tasklist for unicast signals
  - From: Oleg Nesterov <oleg@tv-sign.ru>

Prev by Date: [PATCH] DVB: Clarify description text for dvb-bt8xx in Kconfig
Next by Date: Re: [RFC] IPV4 long lasting timer function
Previous by thread: Re: [RFC,PATCH] Use RCU to protect tasklist for unicast signals
Next by thread: Re: [RFC,PATCH] Use RCU to protect tasklist for unicast signals
Index(es):
- Date
- Thread

[Index of Archives] [Kernel Newbies] [Netfilter] [Bugtraq] [Photo] [Gimp] [Yosemite News] [MIPS Linux] [ARM Linux] [Linux Security] [Linux RAID] [Video 4 Linux] [Linux for the blind]