Re: [Patch 6/6] per task delay accounting taskstats interface: fix clone skbs for each listener

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 2006-07-11 at 07:15, Herbert Xu wrote:
> On Tue, Jul 11, 2006 at 03:57:31AM -0700, Andrew Morton wrote:
> >
> > > >>       down_write(&listeners->sem);
> > > >>       list_for_each_entry_safe(s, tmp, &listeners->list, list) {
> > > >> -             ret = genlmsg_unicast(skb, s->pid);
> > > >> +             skb_next = NULL;
> > > >> +             if (!list_islast(&s->list, &listeners->list)) {
> > > >> +                     skb_next = skb_clone(skb_cur, GFP_KERNEL);
> > > > 
> > > > If we do a GFP_KERNEL allocation with this semaphore held, and the
> > > > oom-killer tries to kill something to satisfy the allocation, and the
> > > > killed task gets stuck on that semaphore, I wonder of the box locks up.
> > > 
> > > We do GFP_KERNEL inside semaphores/mutexes in lots of places.  So if this
> > > can deadlock with the oom-killer we probably should fix that, preferably
> > > by having GFP_KERNEL fail in that case.
> > 
> > This lock is special, in that it's taken on the exit() path (I think).  So
> > it can block tasks which are trying to exit.
> 
> Sorry, missed the context.
> 
> If there is a deadlock then it's not just this allocation that you
> need worry about.  There is also an allocation within genlmsg_uniast
> that would be GFP_KERNEL.


Remove down_write() from taskstats code invoked on the exit() path.

In send_cpu_listeners(), which is called on the exit path, 
a down_write() was protecting operations like skb_clone() and 
genlmsg_unicast() that do GFP_KERNEL allocations. If the oom-killer 
decides to kill tasks to satisfy the allocations,the exit of those 
tasks could block on the same semphore.

The down_write() was only needed to allow removal of invalid 
listeners from the listener list. The patch converts the down_write 
to a down_read and defers the removal to a separate critical region. 
This ensures that even if the oom-killer is called, no other task's 
exit is blocked as it can still acquire another down_read.

Thanks to Andrew Morton & Herbert Xu for pointing out the oom 
related pitfalls, and to Chandra Seetharaman for suggesting this 
fix instead of using something more complex like RCU.

Signed-Off-By: Chandra Seetharaman <[email protected]>
Signed-Off-By: Shailabh Nagar <[email protected]>

---

 kernel/taskstats.c |   24 +++++++++++++++++++-----
 1 files changed, 19 insertions(+), 5 deletions(-)

Index: linux-2.6.18-rc1/kernel/taskstats.c
===================================================================
--- linux-2.6.18-rc1.orig/kernel/taskstats.c	2006-07-11 00:13:00.000000000 -0400
+++ linux-2.6.18-rc1/kernel/taskstats.c	2006-07-12 00:07:53.000000000 -0400
@@ -51,6 +51,7 @@ __read_mostly = {
 struct listener {
 	struct list_head list;
 	pid_t pid;
+	char valid;
 };
 
 struct listener_list {
@@ -127,7 +128,7 @@ static int send_cpu_listeners(struct sk_
 	struct listener *s, *tmp;
 	struct sk_buff *skb_next, *skb_cur = skb;
 	void *reply = genlmsg_data(genlhdr);
-	int rc, ret;
+	int rc, ret, delcount = 0;
 
 	rc = genlmsg_end(skb, reply);
 	if (rc < 0) {
@@ -137,7 +138,7 @@ static int send_cpu_listeners(struct sk_
 
 	rc = 0;
 	listeners = &per_cpu(listener_array, cpu);
-	down_write(&listeners->sem);
+	down_read(&listeners->sem);
 	list_for_each_entry_safe(s, tmp, &listeners->list, list) {
 		skb_next = NULL;
 		if (!list_islast(&s->list, &listeners->list)) {
@@ -150,14 +151,26 @@ static int send_cpu_listeners(struct sk_
 		}
 		ret = genlmsg_unicast(skb_cur, s->pid);
 		if (ret == -ECONNREFUSED) {
-			list_del(&s->list);
-			kfree(s);
+			s->valid = 0;
+			delcount++;
 			rc = ret;
 		}
 		skb_cur = skb_next;
 	}
-	up_write(&listeners->sem);
+	up_read(&listeners->sem);
+
+	if (!delcount)
+		return rc;
 
+	/* Delete invalidated entries */
+	down_write(&listeners->sem);
+	list_for_each_entry_safe(s, tmp, &listeners->list, list) {
+		if (!s->valid) {
+			list_del(&s->list);
+			kfree(s);
+		}
+	}
+	up_write(&listeners->sem);
 	return rc;
 }
 
@@ -290,6 +303,7 @@ static int add_del_listener(pid_t pid, c
 				goto cleanup;
 			s->pid = pid;
 			INIT_LIST_HEAD(&s->list);
+			s->valid = 1;
 
 			listeners = &per_cpu(listener_array, cpu);
 			down_write(&listeners->sem);


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux