Re: + delay-accounting-taskstats-interface-send-tgid-once.patch added to -mm tree

Shailabh Nagar wrote:

Arjan van de Ven wrote:

also in response to
    Subject: Re: 2.6.17-mm3
    Date: Tue, 27 Jun 2006 16:12:42 +0200
with Message-ID: <[email protected]>
On Mon, 2006-06-26 at 12:06 -0700, [email protected] wrote:

+static inline void taskstats_tgid_alloc(struct signal_struct *sig)
+{
+    struct taskstats *stats;
+
+    stats = kmem_cache_zalloc(taskstats_cache, SLAB_KERNEL);
+    if (!stats)
+        return;
+
+    spin_lock(&sig->stats_lock);
+    if (!sig->stats) {


+static inline void taskstats_tgid_free(struct signal_struct *sig)
+{
+       struct taskstats *stats = NULL;
+       spin_lock(&sig->stats_lock);
+       if (sig->stats) {
+               stats = sig->stats;
+               sig->stats = NULL;
+       }
+       spin_unlock(&sig->stats_lock);
+       if (stats)
+               kmem_cache_free(taskstats_cache, stats);
+}

this is buggy and deadlock prone!
(found by lockdep)

stats_lock nests within tasklist_lock, which is taken in irq context
(and thus needs irqsave treatment). Because of this nesting, it is
not allowed to take stats_lock without disabling irqs, or you can have
the following scenario:

CPU 0                        CPU 1
(in irq)                     (in the code above)
                             stats_lock is taken
tasklist_lock is taken
stats_lock is taken <spin>
                             <interrupt happens>
                             tasklist_lock is taken

which now forms an AB-BA deadlock!


this happens at least in copy_process which can call taskstats_tgid_free
without first disabling interrupts (via cleanup_signal).
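
The rule behind the scenario above can be modelled in a few lines of
userspace C. This is only a toy sketch, not how lockdep is implemented:
struct lock_usage, record_nesting() and deadlock_prone() are hypothetical
names made up for illustration. It encodes one rule: a lock that is taken
in irq context, or nested inside such a lock, must never be taken with
interrupts enabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical usage record for one lock; a toy, not lockdep's bookkeeping. */
struct lock_usage {
    const char *name;
    bool taken_in_irq;        /* acquired from interrupt context */
    bool nested_in_irq_lock;  /* acquired while holding an irq-context lock */
    bool taken_with_irqs_on;  /* acquired somewhere with interrupts enabled */
};

/* Record that 'inner' was acquired while 'outer' was held. */
static void record_nesting(const struct lock_usage *outer,
                           struct lock_usage *inner)
{
    /* An irq handler can end up waiting on 'inner' through 'outer'. */
    if (outer->taken_in_irq || outer->nested_in_irq_lock)
        inner->nested_in_irq_lock = true;
}

/* True if the recorded usage allows the AB-BA scenario described above. */
static bool deadlock_prone(const struct lock_usage *l)
{
    return (l->taken_in_irq || l->nested_in_irq_lock) &&
           l->taken_with_irqs_on;
}
```

With tasklist_lock marked as taken in irq context and stats_lock marked as
taken with interrupts enabled, calling record_nesting(&tasklist, &stats)
is what makes deadlock_prone(&stats) become true. Clearing
taken_with_irqs_on, which is what an irqsave variant achieves, makes it
false again.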


Arjan,

I didn't get how stats_lock nests within tasklist_lock in copy_process. The call
to cleanup_signal (or any call to taskstats_tgid_alloc/free) seems to happen
without the tasklist lock held. Or maybe I'm missing something.

Yup...indeed I was.

In release_task(), __cleanup_signal() is called with tasklist_lock held,
which leads to stats_lock being taken nested within tasklist_lock.


Changing to an irqsave variant isn't a problem of course...

--Shailabh

There may be many other cases; I've not checked deeper yet.

The solution should be to make these functions use the irqsave variant... any
comments from the authors of this patch?

Will submit a patch to switch to irqsave variants.
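
For reference, a rough sketch of what the irqsave form of
taskstats_tgid_free() could look like. This is not the submitted patch.
It is a userspace model so that it compiles outside the kernel:
model_lock_irqsave(), model_unlock_irqrestore(), the irqs_enabled flag and
the cut-down structs are stand-ins invented here, and free() stands in for
kmem_cache_free(). In the kernel the corresponding calls would be
spin_lock_irqsave(&sig->stats_lock, flags) and
spin_unlock_irqrestore(&sig->stats_lock, flags).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Userspace stand-ins for kernel primitives (assumptions, not kernel code). */
static bool irqs_enabled = true;        /* models the local CPU's irq flag */

typedef struct { bool locked; } spinlock_t;

/* Models spin_lock_irqsave(): save irq state, disable irqs, take the lock.
 * The real kernel macro takes 'flags' by name rather than by pointer. */
static void model_lock_irqsave(spinlock_t *l, unsigned long *flags)
{
    *flags = irqs_enabled;
    irqs_enabled = false;
    assert(!l->locked);                 /* single-threaded model: must be free */
    l->locked = true;
}

/* Models spin_unlock_irqrestore(): drop the lock, restore saved irq state. */
static void model_unlock_irqrestore(spinlock_t *l, unsigned long flags)
{
    l->locked = false;
    irqs_enabled = (flags != 0);
}

/* Cut-down versions of the real structures, for illustration only. */
struct taskstats { int dummy; };
struct signal_struct {
    spinlock_t stats_lock;
    struct taskstats *stats;
};

static void taskstats_tgid_free_irqsave(struct signal_struct *sig)
{
    struct taskstats *stats = NULL;
    unsigned long flags;

    model_lock_irqsave(&sig->stats_lock, &flags);
    assert(!irqs_enabled);              /* no irq can interrupt us in here */
    if (sig->stats) {
        stats = sig->stats;
        sig->stats = NULL;
    }
    model_unlock_irqrestore(&sig->stats_lock, flags);

    free(stats);                        /* stand-in for kmem_cache_free() */
}
```

The point of saving and restoring flags, rather than unconditionally
re-enabling interrupts, is that the function stays correct both for callers
that run with interrupts enabled (the copy_process path) and for callers
that already have them disabled (the release_task path under
tasklist_lock).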

Thanks,
Shailabh


