Block driver freezes when using CFQ

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi all,

   I have written a block driver that registers a virtual device and
routes requests to appropriate real devices after some re-mapping of
the requests. I am testing the driver by creating a filesystem on the
virtual device and copying a large number of files on to it. The test
causes the device to become unresponsive after some time. After some
debugging, I noticed that this happens only if the I/O scheduler being
used is CFQ. I have not had any trouble if the scheduler is noop,
anticipatory or deadline. The problem occurs on all the kernels I have
tested - 2.6.18-rc2, 2.6.18-rc4, 2.6.19-rc3.

Below are some details about the driver and what I have observed during
testing:

The request function registered by my driver is a simple loop -

  while ((req = elv_next_request(q))) {
        blkdev_dequeue_request(req);

        /*
         Add request to an internal queue for further processing
         Wake up thread to start processing the queue
         Update some variables for book-keeping
         */
  }

Completed requests are handled in a different thread -
  while (work to be done) {
      /*
        Dequeue completed requests from internal queue
        Call end_that_request_first() and end_that_request_last()
        Update some variables for book-keeping
      */
  }

Several times during the test run, the while() loop in the request
function comes out without dequeuing any request even though the
elevator queue is not empty. (Confirmed by printing the return value of
elv_queue_empty(), and the values of q->rq.count[] outside the loop).
After one such occurrence, the request function is not called at all
and the device becomes unresponsive.
I added some code that lets me trigger the request function from userspace.
If I nudge the driver this way, I/Os continue for a short while and stop
again.

Since CFQ is the default I/O scheduler in current kernels, it has been
widely used and tested. So I suspect I am not doing something right in my
driver. Since the driver works well with the other schedulers, is there
something CFQ-specific that I should take care of?


Please Cc me on the responses since I am not subscribed to lkml.

Thanks,
Ravi.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux