Re: [PATCH scsi-misc-2.6 02/07] scsi: make scsi_send_eh_cmnd use its own timer instead of scmd->eh_timeout

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



 Hello, James.

On Mon, Apr 18, 2005 at 10:33:21AM -0500, James Bottomley wrote:
> On Mon, 2005-04-11 at 03:45 +0900, Tejun Heo wrote:
> > 	scmd->eh_timeout is used to resolve the race between command
> > 	completion and timeout.  However, during error handling,
> > 	scsi_send_eh_cmnd uses scmd->eh_timeout.  This creates a race
> > 	condition between eh and normal completion for a request which
> > 	has timed out and in the process of error handling.  If the
> > 	request completes while scmd->eh_timeout is being used by eh,
> > 	eh timeout is lost and the command will be handled by both eh
> > 	and completion path.  This patch fixes the race by making
> > 	scsi_send_eh_cmnd() use its own timer.
> > 
> > Signed-off-by: Tejun Heo <[email protected]>
> 
> The logic is wrong in there.
> 
> The problem is you cannot rely on the timer being pending as a signal
> that the command completed normally.  The kernel doesn't define the
> elapsed time between the eh_action semaphore going up and the process
> waiting for it being scheduled.  If the timer fires within that
> undefined interval, you'll think the command timed out when it, in fact,
> completed normally.

 The original code also uses timer pending status as a signal that
command completed normally in scsi_eh_done() function, and the same
race also exists in the original code, no matter what we do, unless we
make timer expiration and removal of the command atomic, there will be
a window in which command completes normally but considered to have
timed out as long as we use timer pending status as tie breaker.

 The patch moves the test out of scsi_eh_done() into
scsi_send_eh_cmnd() and this does widen the window by delaying removal
of timer until after the original thread gets scheduled, but usually
not by much and that's how timers are done in many cases (through out
the kernel, timer removals are done with intervening scheduling and no
one considers those incorrect).  So...

 * If you're worried about the race itself, it was there before,
   it shouldn't cause any problem, and we really can't help it.
 * If you're worried about the widening of the window, practically,
   it wouldn't cause problem, and it's how timers are done in many
   other places.

 But if you still don't like it, I can rework it (maybe I'll need to
add a field to Scsi_Host or scsi_cmnd).  So, please let me know.

 Thanks a lot.

-- 
tejun

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux