IDE issues with "choose_drive"

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



There seem to be a certain amount of problems with ide_do_request() and
more specifically choose_drive() to pick which drive to service, when
more than one drive is on a given hwgroup (typically, when you have a
slave drive)

The first one is the one I'm trying to fix, it's basically a hang on
wakeup from sleep. What happens is that both drives are blocked
(suspended, drive->blocked is set). Their IO queues contains some
requests that haven't been serviced yet. We receive the resume()
callback for one of them. We react by inserting a wakeup request at the
head of the queue and waiting for it to complete. However, when we reach
ide_do_request(), choose_drive() may return the other drive (the one
that is still sleeping). In this case, we hit the test for blocked queue
and just break out of the loop. We end up never servicing the other
drive queue which is the one we are trying to wakeup, thus we hang.

In general, that uncovers a serie of problems with that code. That is,
the 2 other cases where we "break" out of the loop may trigger a similar
problem where we do not service the "other" drive, and thus unless
something else happens to "kick" the hwgroup, a request pending on the
other drive will be stuck there and the machine may hang.

In a similar vein, if we hit the codepath that does 
ide_stall_queue() (which is meant as doing some fairness by blocking the
slave for some time) we may end up also "missing" the opportunity to
service at all on the next pass, thus hanging forever.  

Also, I think there may be another problem with choose_drive() itself,
though I'm not 100% sure, with the code that tests blk_queue_flushing()
since we do the test on hwgroup->drive only, and thus we never test that
for the slave drive.

I'm not sure what is the best way to fix those issues except by
completely rewriting that code. I was thinking about turning the "break"
into a "continue" in the PM case to service the "other" drive but
nothing guarantees that I'll make any forward progress since
choose_drive() may just return the same drive over and over (since the
queue is not serviced).

Ben.


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux