On Tue, 2006-07-11 at 07:15, Herbert Xu wrote:
> On Tue, Jul 11, 2006 at 03:57:31AM -0700, Andrew Morton wrote:
> >
> > > >> down_write(&listeners->sem);
> > > >> list_for_each_entry_safe(s, tmp, &listeners->list, list) {
> > > >> - ret = genlmsg_unicast(skb, s->pid);
> > > >> + skb_next = NULL;
> > > >> + if (!list_islast(&s->list, &listeners->list)) {
> > > >> + skb_next = skb_clone(skb_cur, GFP_KERNEL);
> > > >
> > > > If we do a GFP_KERNEL allocation with this semaphore held, and the
> > > > oom-killer tries to kill something to satisfy the allocation, and the
> > > > killed task gets stuck on that semaphore, I wonder of the box locks up.
> > >
> > > We do GFP_KERNEL inside semaphores/mutexes in lots of places. So if this
> > > can deadlock with the oom-killer we probably should fix that, preferably
> > > by having GFP_KERNEL fail in that case.
> >
> > This lock is special, in that it's taken on the exit() path (I think). So
> > it can block tasks which are trying to exit.
>
> Sorry, missed the context.
>
> If there is a deadlock then it's not just this allocation that you
> need worry about. There is also an allocation within genlmsg_uniast
> that would be GFP_KERNEL.
Remove down_write() from taskstats code invoked on the exit() path.
In send_cpu_listeners(), which is called on the exit path,
a down_write() was protecting operations like skb_clone() and
genlmsg_unicast() that do GFP_KERNEL allocations. If the oom-killer
decides to kill tasks to satisfy the allocations,the exit of those
tasks could block on the same semphore.
The down_write() was only needed to allow removal of invalid
listeners from the listener list. The patch converts the down_write
to a down_read and defers the removal to a separate critical region.
This ensures that even if the oom-killer is called, no other task's
exit is blocked as it can still acquire another down_read.
Thanks to Andrew Morton & Herbert Xu for pointing out the oom
related pitfalls, and to Chandra Seetharaman for suggesting this
fix instead of using something more complex like RCU.
Signed-Off-By: Chandra Seetharaman <[email protected]>
Signed-Off-By: Shailabh Nagar <[email protected]>
---
kernel/taskstats.c | 24 +++++++++++++++++++-----
1 files changed, 19 insertions(+), 5 deletions(-)
Index: linux-2.6.18-rc1/kernel/taskstats.c
===================================================================
--- linux-2.6.18-rc1.orig/kernel/taskstats.c 2006-07-11 00:13:00.000000000 -0400
+++ linux-2.6.18-rc1/kernel/taskstats.c 2006-07-12 00:07:53.000000000 -0400
@@ -51,6 +51,7 @@ __read_mostly = {
struct listener {
struct list_head list;
pid_t pid;
+ char valid;
};
struct listener_list {
@@ -127,7 +128,7 @@ static int send_cpu_listeners(struct sk_
struct listener *s, *tmp;
struct sk_buff *skb_next, *skb_cur = skb;
void *reply = genlmsg_data(genlhdr);
- int rc, ret;
+ int rc, ret, delcount = 0;
rc = genlmsg_end(skb, reply);
if (rc < 0) {
@@ -137,7 +138,7 @@ static int send_cpu_listeners(struct sk_
rc = 0;
listeners = &per_cpu(listener_array, cpu);
- down_write(&listeners->sem);
+ down_read(&listeners->sem);
list_for_each_entry_safe(s, tmp, &listeners->list, list) {
skb_next = NULL;
if (!list_islast(&s->list, &listeners->list)) {
@@ -150,14 +151,26 @@ static int send_cpu_listeners(struct sk_
}
ret = genlmsg_unicast(skb_cur, s->pid);
if (ret == -ECONNREFUSED) {
- list_del(&s->list);
- kfree(s);
+ s->valid = 0;
+ delcount++;
rc = ret;
}
skb_cur = skb_next;
}
- up_write(&listeners->sem);
+ up_read(&listeners->sem);
+
+ if (!delcount)
+ return rc;
+ /* Delete invalidated entries */
+ down_write(&listeners->sem);
+ list_for_each_entry_safe(s, tmp, &listeners->list, list) {
+ if (!s->valid) {
+ list_del(&s->list);
+ kfree(s);
+ }
+ }
+ up_write(&listeners->sem);
return rc;
}
@@ -290,6 +303,7 @@ static int add_del_listener(pid_t pid, c
goto cleanup;
s->pid = pid;
INIT_LIST_HEAD(&s->list);
+ s->valid = 1;
listeners = &per_cpu(listener_array, cpu);
down_write(&listeners->sem);
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]