Outstanding NBD requests bug? (Was: Re: [PATCH 0/4] 2.6.21-rc7 NFS writes: fix a series of issues)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Peter, Paul,

The original message (below) didn't get any response.

Could you elaborate on the impact of this NBD issue?  Based on your
description, having requests just sit on the queue would _seem_ to be
quite serious.  Is this purely a bookkeeping issue or is there a more
subtle side-effect of this inaccurate request count?

You stated that your patch fixes this issue "not in the proper way".
If a fix is trully needed what is the proper way?

please advise, thanks.
Mike

On 4/29/07, Peter Zijlstra <[email protected]> wrote:
On Sun, 2007-04-29 at 21:41 +0200, Rogier Wolff wrote:
> On Tue, Apr 17, 2007 at 10:37:38PM -0700, Andrew Morton wrote:
> > Florin, can we please see /proc/meminfo as well?
> >
> > Also the result of `echo m > /proc/sysrq-trigger'
>
> Hi,
>
> It's been a while since this thread died out, but maybe I'm
> having the same problem. Networking, large part of memory is
> buffering writes.....
>
> In my case I'm using NBD.
>
> Oh,
>
> /sys/block/nbd0/stat gives:
>      636       88     5353     1700      991    19554   162272    63156       43  1452000 61802352
> I put some debugging stuff in nbd, and it DOES NOT KNOW about the
> 43 requests that the io scheduler claims are in flight at the
> driver....

AFAIK nbd is a tad broken; the following patch used to fix it, although
not in the proper way. Hence it never got merged.

There is a race where the plug state of the device queue gets confused,
which causes requests to just sit on the queue, without further action.

---

Subject: nbd: request_fn fixup

Dropping the queue_lock opens up a nasty race, fix this race by
plugging the device when we're done.

Also includes a small cleanup.

Signed-off-by: Peter Zijlstra <[email protected]>
CC: Daniel Phillips <[email protected]>
CC: Pavel Machek <[email protected]>
---
 drivers/block/nbd.c |   67 ++++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 49 insertions(+), 18 deletions(-)

Index: linux-2.6/drivers/block/nbd.c
===================================================================
--- linux-2.6.orig/drivers/block/nbd.c  2006-09-07 17:20:52.000000000 +0200
+++ linux-2.6/drivers/block/nbd.c       2006-09-07 17:35:05.000000000 +0200
@@ -97,20 +97,24 @@ static const char *nbdcmd_to_ascii(int c
 }
 #endif /* NDEBUG */

-static void nbd_end_request(struct request *req)
+static void __nbd_end_request(struct request *req)
 {
        int uptodate = (req->errors == 0) ? 1 : 0;
-       request_queue_t *q = req->q;
-       unsigned long flags;

        dprintk(DBG_BLKDEV, "%s: request %p: %s\n", req->rq_disk->disk_name,
                        req, uptodate? "done": "failed");

-       spin_lock_irqsave(q->queue_lock, flags);
-       if (!end_that_request_first(req, uptodate, req->nr_sectors)) {
+       if (!end_that_request_first(req, uptodate, req->nr_sectors))
                end_that_request_last(req, uptodate);
-       }
-       spin_unlock_irqrestore(q->queue_lock, flags);
+}
+
+static void nbd_end_request(struct request *req)
+{
+       request_queue_t *q = req->q;
+
+       spin_lock_irq(q->queue_lock);
+       __nbd_end_request(req);
+       spin_unlock_irq(q->queue_lock);
 }

 /*
@@ -435,10 +439,8 @@ static void do_nbd_request(request_queue
                        mutex_unlock(&lo->tx_lock);
                        printk(KERN_ERR "%s: Attempted send on closed socket\n",
                               lo->disk->disk_name);
-                       req->errors++;
-                       nbd_end_request(req);
                        spin_lock_irq(q->queue_lock);
-                       continue;
+                       goto error_out;
                }

                lo->active_req = req;
@@ -463,10 +465,13 @@ static void do_nbd_request(request_queue

 error_out:
                req->errors++;
-               spin_unlock(q->queue_lock);
-               nbd_end_request(req);
-               spin_lock(q->queue_lock);
+               __nbd_end_request(req);
        }
+       /*
+        * q->queue_lock has been dropped, this opens up a race
+        * plug the device to close it.
+        */
+       blk_plug_device(q);
        return;
 }



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux