Hi! I've been working on some usb chipcard reader drivers in which I use the asynchronous usb devio URB delivery to usrerspace. There have always been some bug reports of kernel oopses while terminating a program that uses the driver. Now I found out a way to relatively easy trigger that oops from pcscd (part of pcsc-lite) on any kernel, at least tested with 2.6.8 to 2.6.12-rc5: Unable to handle kernel NULL pointer dereference at virtual address 00000508 printing eip: c02c8591 *pde = 00000000 Oops: 0000 [#1] SMP Modules linked in: ipv6 autofs4 i810_audio ac97_codec amd74xx snd_intel8x0 snd_ac97_codec snd_pcm snd_timer snd soundcore snd_page_alloc ext3 jbd mbcache lp thermal processor fan button ac battery evdev eth1394 ohci1394 ieee1394 ohci_hcd pl2303 usbserial usblp tg3 e1000 ehci_hcd uhci_hcd parport_serial parport_pc parport hw_random i2c_amd8111 tpm_nsc tpm_atmel tpm dm_mod w83627hf eeprom lm85 w83781d i2c_sensor i2c_isa i2c_amd756 i2c_core ide_cd ide_core genrtc sr_mod cdrom sd_mod sata_sil libata usb_storage aic7xxx scsi_transport_spi sg scsi_mod unix CPU: 1 EIP: 0060:[<c02c8591>] Not tainted VLI EFLAGS: 00010086 (2.6.12-rc5kdb1) EIP is at _spin_lock_irqsave+0x11/0x60 eax: 00000504 ebx: 00000286 ecx: f7def898 edx: f7def890 esi: 00000504 edi: f5cea060 ebp: c1861d84 esp: c1861d74 ds: 007b es: 007b ss: 0068 Process swapper (pid: 0, threadinfo=c1860000 task=dfa87540) Stack: 00000000 c1861da0 00000021 f4664800 c1861da4 c0125fa9 f7def8b0 f7def8a8 00000001 f7def890 f4664800 f7def880 c1861e44 c023cfd9 00000021 c1861db8 f5cea060 00000021 ffffff94 fffffffc 080ef770 c1861e68 c023cf90 341fc000 Call Trace: [<c0103dbf>] show_stack+0x7f/0xa0 [<c0103f60>] show_registers+0x160/0x1c0 [<c0104156>] die+0xf6/0x1a0 [<c0112cd6>] do_page_fault+0x356/0x68f [<c01039bf>] error_code+0x4f/0x54 [<c0125fa9>] send_sig_info+0x39/0x80 [<c023cfd9>] async_completed+0xa9/0xb0 [<c0237e31>] usb_hcd_giveback_urb+0x31/0x80 [<f8a21e84>] finish_urb+0x74/0x100 [ohci_hcd] [<f8a22f88>] finish_unlinks+0x2b8/0x2e0 [ohci_hcd] [<f8a23d1d>] ohci_irq+0x17d/0x190 [ohci_hcd] [<c0237eb8>] usb_hcd_irq+0x38/0x70 [<c0139ab3>] handle_IRQ_event+0x33/0x70 [<c0139bcd>] __do_IRQ+0xdd/0x150 [<c010551c>] do_IRQ+0x1c/0x30 [<c010388a>] common_interrupt+0x1a/0x20 [<c0100ca9>] cpu_idle+0x79/0x80 [<c03ed56a>] start_secondary+0x6a/0xa0 [<00000000>] 0x0 [<c1861fb4>] 0xc1861fb4 Code: ff c9 c3 0f 0b d7 00 81 63 2e c0 eb e9 8d b6 00 00 00 00 8d bc 27 00 00 00 00 55 89 e5 83 ec 10 89 75 fc 89 5d f8 89 c6 9c 5b fa <81> 78 04 ad 4e ad de 75 22 f0 fe 0e 79 13 f7 c3 00 02 00 00 74 Entering kdb (current=0xdfa87540, pid 0) on processor 1 Oops: Oops due to oops @ 0xc02c8591 eax = 0x00000504 ebx = 0x00000286 ecx = 0xf7def898 edx = 0xf7def890 esi = 0x00000504 edi = 0xf5cea060 esp = 0xc1861d74 eip = 0xc02c8591 ebp = 0xc1861d84 xss = 0x00000068 xcs = 0x00000060 eflags = 0x00010086 xds = 0x0000007b xes = 0x0000007b origeax = 0xffffffff ®s = 0xc1861d40 [1]kdb> bt Stack traceback for pid 0 0xdfa87540 0 1 1 1 R 0xdfa87700 *swapper EBP EIP Function (args) 0xc1861d84 0xc02c8591 _spin_lock_irqsave+0x11 (0xf7def8b0, 0xf7def8a8, 0x1, 0xf7def890, 0xf4664800) 0xc1861da4 0xc0125fa9 send_sig_info+0x39 (0x21, 0xc1861db8, 0xf5cea060, 0x21, 0xffffff94) 0xc1861e44 0xc023cfd9 async_completed+0xa9 (0xf6245c80, 0xc1861f60, 0xf79e4800, 0xf6245c80) 0xc1861e5c 0xc0237e31 usb_hcd_giveback_urb+0x31 (0xf79e4800, 0xf6245c80, 0xc1861f60, 0xf6245c80, 0xf6948160) 0xc1861e7c 0xf8a21e84 [ohci_hcd]finish_urb+0x74 (0xf79e4928, 0xf6245c80, 0xc1861f60, 0xc1ac8060, 0xf79e4800) 0xc1861ec4 0xf8a22f88 [ohci_hcd]finish_unlinks+0x2b8 (0xf79e4928, 0x9acb, 0xc1861f60, 0xf79e4800, 0x1) 0xc1861ee4 0xf8a23d1d [ohci_hcd]ohci_irq+0x17d (0xf79e4800, 0xc1861f60, 0xdf940b60, 0x0) 0xc1861efc 0xc0237eb8 usb_hcd_irq+0x38 (0xe1, 0xf79e4800, 0xc1861f60, 0x0, 0xe1) 0xc1861f24 0xc0139ab3 handle_IRQ_event+0x33 (0xe1, 0xc1861f38, 0xc011fcd8, 0xc03dcb1c, 0xdf940b60) 0xc1861f50 0xc0139bcd __do_IRQ+0xdd 0xc1861f58 0xc010551c do_IRQ+0x1c (0xc1860000, 0xc170f2e0, 0xc0100bc0, 0xffffe000, 0xc0410380) 0xc1861f94 0xc010388a common_interrupt+0x1a 0xc0100ca9 cpu_idle+0x79 So what do we learn from that? That usb_hcd_giveback_urb() is called from in_interrupt() context and calls async_completed(). async_completed() calls send_sig_info(), which in turn does a spin_lock(&tasklist_lock) to protect itself from task_struct->sighand from going away. However, the call to "spin_lock_irqsave(task_struct->sighand->siglock)" causes an oops. So either sighand or the task have disappeared. I think there is currently no protection/locking/refcounting going on. If a process issues an URB from userspace and starts to terminate before the URB comes back, we run into the issue described above. This is because the urb saves a pointer to "current" when it is posted to the device, but there's no guarantee that this pointer is still valid afterwards. I'm not familiar with the scheduler code to decide what fix is the way to go. Is it sufficient to do {get,put}_task_struct() from the usb code? Any comments welcome... -- - Harald Welte <[email protected]> http://gnumonks.org/ ============================================================================ "Privacy in residential applications is a desirable marketing option." (ETSI EN 300 175-7 Ch. A6)
Attachment:
pgpR4ZBgpz5Vi.pgp
Description: PGP signature
- Follow-Ups:
- Re: [BUG] oops while completing async USB via usbdevio
- From: Harald Welte <[email protected]>
- Re: [BUG] oops while completing async USB via usbdevio
- Prev by Date: Re: [2.6 patch] security/: possible cleanups
- Next by Date: [-mm patch] drivers/message/i2o/device.c: i2o_parm_issue has to be global
- Previous by thread: [2.4 patch] document that gcc 4 is not supported
- Next by thread: Re: [BUG] oops while completing async USB via usbdevio
- Index(es):