RE: oops with USB Storage on 2.6.14

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I've run into a bug like this several times using 2.6.14-rc4 while
testing dm-multipath's reaction to uevents generated by forcing 
fiber channel transport failures -- which leads to the scsi device
being detached and the queuedata pointer in the device's queue being
reset in scsi_device_dev_release.  The fix I've used is below and
it seems to work well for me.  I was going to place this patch on
dm-devel today or tomorrow anyway.

drivers/scsi/scsi_lib.c:scsi_next_command()
Call scsi_device_get and scsi_device_put around the calls to
scsi_put_command
and scsi_run_queue so that the scsi host structure will not be de-allocated
between scsi_put_command and scsi_run_queue.

*** ../base/linux-2.6.14-rc4/drivers/scsi/scsi_lib.c	Mon Oct 10 20:19:19
2005
--- drivers/scsi/scsi_lib.c	Thu Nov  3 13:30:03 2005
***************
*** 592,601 ****
  
  void scsi_next_command(struct scsi_cmnd *cmd)
  {
! 	struct request_queue *q = cmd->device->request_queue;
  
  	scsi_put_command(cmd);
  	scsi_run_queue(q);
  }
  
  void scsi_run_host_queues(struct Scsi_Host *shost)
--- 592,611 ----
  
  void scsi_next_command(struct scsi_cmnd *cmd)
  {
! 	struct scsi_device *sdev = cmd->device;
! 	struct request_queue *q = sdev->request_queue;
! 
! 	// need to hold a reference on the device before we let go of the
cmd
! 	if (scsi_device_get(sdev)) {
! 		scsi_put_command(cmd);
! 		return;		// maybe sdev_state == SDEV_CANCEL, SDEV_DEL
! 	}
  
  	scsi_put_command(cmd);
  	scsi_run_queue(q);
+ 
+ 	// ok to remove device now
+ 	scsi_device_put(sdev);
  }
  
  void scsi_run_host_queues(struct Scsi_Host *shost)
  

> -----Original Message-----
> From: [email protected] 
> [mailto:[email protected]] On Behalf Of Andrew Morton
> Sent: Monday, November 07, 2005 11:41 PM
> To: Masanari Iida
> Cc: [email protected]; 
> [email protected]; [email protected]
> Subject: Re: oops with USB Storage on 2.6.14
> 
> Masanari Iida <[email protected]> wrote:
> >
> > Hello,
> > I updated my system's kernel from 2.6.13.2 to 2.6.14,
> > then it oops when I connect my Digital Camera via USB connection
> > as USB storage device.
> > I went back to 2.6.14-rc1, still the same panic happen.
> > 2.6.13.2 and before, the kernel has been worked as expected.
> > 
> > CPU Intel P4(2.4Ghz)
> > USB Device   Pentax Optio S40.
> > 
> > Unable to handle kernel paging request at virtual address dc9d1f4c
> >  printing eip:
> > c02b44cc
> > *pde = 00073067
> > *pte = 1c9d1000
> > Oops: 0000 [#1]
> > SMP DEBUG_PAGEALLOC
> > Modules linked in: autofs e100 ipt_LOG ipt_state ip_conntrack
> > ipt_recent iptable
> > _filter ip_tables video rtc
> > CPU:    1
> > EIP:    0060:[<c02b44cc>]    Not tainted VLI
> > EFLAGS: 00010286   (2.6.14)
> > EIP is at scsi_run_queue+0xc/0xd0
> > eax: 00000001   ebx: dc9d1e3c   ecx: d6b67910   edx: dc9d1e3c
> > esi: d5048eb0   edi: dc9d1e3c   ebp: c1507e98   esp: c1507e84
> > ds: 007b   es: 007b   ss: 0068
> > Process ksoftirqd/1 (pid: 6, threadinfo=c1506000 task=dfe2dad0)
> > Stack: 00000292 de3a7bf8 dc9d1e3c d5048eb0 dc9d1e3c 
> c1507ea8 c02b4612 dc9d1e3c
> >        da51bf60 c1507ecc c02b473f d5048eb0 00000000 
> 00000024 00000286 00000001
> >        d5048eb0 00000000 c1507f10 c02b4b2e d5048eb0 
> 00000000 00000024 00000001
> > 
> > Call Trace:
> >  [<c0103abf>] show_stack+0x7f/0xa0
> >  [<c0103c72>] show_registers+0x162/0x1d0
> >  [<c0103e90>] die+0x100/0x1a0
> >  [<c039d7ae>] do_page_fault+0x31e/0x640
> >  [<c0103763>] error_code+0x4f/0x54
> >  [<c02b4612>] scsi_next_command+0x22/0x30
> >  [<c02b473f>] scsi_end_request+0xcf/0xf0
> >  [<c02b4b2e>] scsi_io_completion+0x26e/0x470
> >  [<c02b4fc7>] scsi_generic_done+0x37/0x50
> >  [<c02af9e5>] scsi_finish_command+0x85/0xa0
> >  [<c02af89c>] scsi_softirq+0xcc/0x140
> >  [<c0122085>] __do_softirq+0xd5/0xf0
> >  [<c01220d8>] do_softirq+0x38/0x40
> >  [<c0122685>] ksoftirqd+0x95/0xe0
> >  [<c0131cfa>] kthread+0xba/0xc0
> >  [<c0100ecd>] kernel_thread_helper+0x5/0x18
> > Code: f0 8b 42 44 e8 16 7f 0e 00 89 45 ec 89 1c 24 e8 6b b7 
> ff ff eb aa 89 f6 8d
> >  bc 27 00 00 00 00 55 89 e5 57 56 53 83 ec 08 8b 55 08 <8b> 
> 82 10 01 00 00 8b 38
> >  f6 80 85 01 00 00 80 0f 85 9e 00 00 00
> >   <0>Kernel panic - not syncing: Fatal exception in interrupt
> > 
> 
> Has there been any progress on this?
> 
> If not, can you please test the latest snapshot from
> ftp://ftp.kernel.org/pub/linux/kernel/v2.6/snapshots and if 
> it still fails, raise a bug at bugzilla.kernel.org?
> 
> Thanks.
> -
> To unsubscribe from this list: send the line "unsubscribe 
> linux-scsi" in
> the body of a message to [email protected]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux