I just rediscovered a deadlock in drivers/ieee1394/nodemgr.c which I
thought no longer existed. It is still there; triggering it is just a
matter of timing. Quoting myself from
http://bugzilla.kernel.org/show_bug.cgi?id=6706 :
------------------------------------------------------------------------
# modprobe ohci1394 && modprobe -r ohci1394
works.
# modprobe ohci1394 && sleep 1 && modprobe -r ohci1394
gets stuck in uninterruptible sleep in kthread_stop(). It is trying to
stop knodemgrd, which meanwhile sleeps uninterruptibly in
bus_rescan_devices_helper().
Call trace of the modprobe -r context:
kthread_stop in kernel/kthread.c
nodemgr_remove_host in drivers/ieee1394/nodemgr.c
__unregister_host in drivers/ieee1394/highlevel.c
highlevel_remove_host in drivers/ieee1394/highlevel.c
hpsb_remove_host in drivers/ieee1394/hosts.c
ohci1394_pci_remove in drivers/ieee1394/ohci1394.c
pci_device_remove in drivers/pci/pci-driver.c
__device_release_driver in drivers/base/dd.c
driver_detach in drivers/base/dd.c
Call trace of the knodemgrd context:
bus_rescan_devices_helper in drivers/base/bus.c
bus_rescan_devices in drivers/base/bus.c
nodemgr_node_probe in drivers/ieee1394/nodemgr.c
nodemgr_host_thread in drivers/ieee1394/nodemgr.c
It seems the following is the culprit:
Since Linux 2.6.16, bus_rescan_devices_helper takes
down(&dev->parent->sem) if a parent device exists. This is true for all
devices that are managed by nodemgr. (FireWire unit directories, "ud"s,
have uds or node entries, "ne"s, as parent, and FireWire nes have hosts
as parent.) And yes, the call in driver_detach to
__device_release_driver is enclosed in down(&dev->sem), so the removal
path already holds a device semaphore while it waits in kthread_stop().
------------------------------------------------------------------------
The relevant change to bus_rescan_devices_helper in 2.6.16 is
http://kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=bf74ad5bc41727d5f2f1c6bedb2c1fac394de731
> commit bf74ad5bc41727d5f2f1c6bedb2c1fac394de731
> Author: Alan Stern <[email protected]>
> Date: Thu Nov 17 16:54:12 2005 -0500
>
> [PATCH] Hold the device's parent's lock during probe and remove
>
> This patch (as604) makes the driver core hold a device's parent's lock
> as well as the device's lock during calls to the probe and remove
> methods in a driver. This facility is needed by USB device drivers,
> owing to the peculiar way USB devices work:
[...]
> I have not tested this patch for conflicts with other subsystems. As
> far as I can see, the only possibility of conflict would lie in the
> bus_rescan_devices pathway, and it seems pretty remote. Nevertheless,
> it would be good for this to get a lot of testing in -mm.
Yes, the conflict is pretty remote, but there is indeed one if I'm not
entirely mistaken.
Right now I don't see a sane fix, but I will sleep on it for a few
nights...
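To make the suspected lock pattern concrete, here is a minimal userspace
sketch with pthreads. It is a model only, not kernel code, and it rests on
my reading above: parent_sem is a hypothetical stand-in for the one
semaphore that is dev->sem on the removal path and dev->parent->sem on the
rescan path, and knodemgrd_model and removal_blocks_rescan are made-up
names. The rescan side uses a timed lock so that the program reports the
blocked state instead of hanging the way the real threads do, and the
thread is started only after the lock is taken so that the timing is
deterministic.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for the semaphore both paths contend for. */
static pthread_mutex_t parent_sem = PTHREAD_MUTEX_INITIALIZER;
static bool rescan_timed_out;

/* Models knodemgrd: the rescan needs the parent's lock before probing. */
static void *knodemgrd_model(void *unused)
{
	struct timespec deadline;
	int err;

	(void)unused;
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += 1;	/* the kernel would wait forever here */

	/* bus_rescan_devices_helper: down(&dev->parent->sem) */
	err = pthread_mutex_timedlock(&parent_sem, &deadline);
	if (err == ETIMEDOUT)
		rescan_timed_out = true;
	else if (err == 0)
		pthread_mutex_unlock(&parent_sem);
	return NULL;
}

/* Models modprobe -r: take the lock, then wait for the thread to exit. */
bool removal_blocks_rescan(void)
{
	pthread_t knodemgrd;
	int rc;

	rescan_timed_out = false;

	/* driver_detach: down(&dev->sem) around __device_release_driver */
	pthread_mutex_lock(&parent_sem);

	rc = pthread_create(&knodemgrd, NULL, knodemgrd_model, NULL);
	assert(rc == 0);

	/* nodemgr_remove_host: kthread_stop() waits for the thread to exit */
	pthread_join(knodemgrd, NULL);

	pthread_mutex_unlock(&parent_sem);
	return rescan_timed_out;
}
```

Calling removal_blocks_rescan() from a small main() (built with -pthread)
takes about one second, because the modelled rescan never gets the lock
while the remover sits in the join. Without the timeout, neither side
could ever make progress.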
--
Stefan Richter
-=====-=-==- =-== =--==
http://arcgraph.de/sr/