On Tue, Apr 25, 2006 at 03:13:49PM +1000, Neil Brown wrote:
> On Tuesday April 25, [email protected] wrote:
> > Hello,
> >
> > Reduce the raid6_end_write_request() spinlock window.
>
> Andrew: please don't include these in -mm. This one and the
> corresponding raid5 are wrong, and I'm not sure yet the unplug_device
> changes.
I am sure with the unplug_device. Just look follow the path...
>
> In this case, the call to md_error, which in turn calls "error" in
> raid6main.c, requires the lock to be held as it contains:
> if (!test_bit(Faulty, &rdev->flags)) {
> mddev->sb_dirty = 1;
> if (test_bit(In_sync, &rdev->flags)) {
> conf->working_disks--;
> mddev->degraded++;
> conf->failed_disks++;
> clear_bit(In_sync, &rdev->flags);
> /*
> * if recovery was running, make sure it aborts.
> */
> set_bit(MD_RECOVERY_ERR, &mddev->recovery);
> }
> set_bit(Faulty, &rdev->flags);
>
> which is fairly clearly not safe without some locking.
Yes. Let's fix the error(). In any case, the current code is broken. (see raid5/6_end_read_request)
Comments? Thanks.
Signed-off-by: Coywolf Qi Hunt <[email protected]>
---
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 9c24377..192de19 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -638,7 +638,7 @@ static void error(mddev_t *mddev, mdk_rd
raid5_conf_t *conf = (raid5_conf_t *) mddev->private;
PRINTK("raid5: error called\n");
- if (!test_bit(Faulty, &rdev->flags)) {
+ if (!test_and_set_bit(Faulty, &rdev->flags)) {
mddev->sb_dirty = 1;
if (test_bit(In_sync, &rdev->flags)) {
conf->working_disks--;
@@ -650,7 +650,6 @@ static void error(mddev_t *mddev, mdk_rd
*/
set_bit(MD_RECOVERY_ERR, &mddev->recovery);
}
- set_bit(Faulty, &rdev->flags);
printk (KERN_ALERT
"raid5: Disk failure on %s, disabling device."
" Operation continuing on %d devices\n",
diff --git a/drivers/md/raid6main.c b/drivers/md/raid6main.c
index d3deedb..fc0b31d 100644
--- a/drivers/md/raid6main.c
+++ b/drivers/md/raid6main.c
@@ -527,7 +527,7 @@ static void error(mddev_t *mddev, mdk_rd
raid6_conf_t *conf = (raid6_conf_t *) mddev->private;
PRINTK("raid6: error called\n");
- if (!test_bit(Faulty, &rdev->flags)) {
+ if (!test_and_set_bit(Faulty, &rdev->flags)) {
mddev->sb_dirty = 1;
if (test_bit(In_sync, &rdev->flags)) {
conf->working_disks--;
@@ -539,7 +539,6 @@ static void error(mddev_t *mddev, mdk_rd
*/
set_bit(MD_RECOVERY_ERR, &mddev->recovery);
}
- set_bit(Faulty, &rdev->flags);
printk (KERN_ALERT
"raid6: Disk failure on %s, disabling device."
" Operation continuing on %d devices\n",
>
> Coywolf: As I think I have already said, I appreciate your review of
> the md/raid code and your attempts to improve it - I'm sure there is
> plenty of room to make improvements.
> However posting patches with minimal commentary on code that you don't
> fully understand is not the best way to work with the community.
> If you see something that you think is wrong, it is much better to ask
> why it is the way it is, explain why you think it isn't right, and
> quite possibly include an example patch. Then we can discuss the
> issue and find the best solution.
>
> So please feel free to post further patches, but please include more
> commentary, and don't assume you understand something that you don't
> really.
>
> Thanks,
> NeilBrown
>
>
>
> >
> > Signed-off-by: Coywolf Qi Hunt <[email protected]>
> > ---
> >
> > diff --git a/drivers/md/raid6main.c b/drivers/md/raid6main.c
> > index bc69355..820536e 100644
> > --- a/drivers/md/raid6main.c
> > +++ b/drivers/md/raid6main.c
> > @@ -468,7 +468,6 @@ static int raid6_end_write_request (stru
> > struct stripe_head *sh = bi->bi_private;
> > raid6_conf_t *conf = sh->raid_conf;
> > int disks = conf->raid_disks, i;
> > - unsigned long flags;
> > int uptodate = test_bit(BIO_UPTODATE, &bi->bi_flags);
> >
> > if (bi->bi_size)
> > @@ -486,16 +485,14 @@ static int raid6_end_write_request (stru
> > return 0;
> > }
> >
> > - spin_lock_irqsave(&conf->device_lock, flags);
> > if (!uptodate)
> > md_error(conf->mddev, conf->disks[i].rdev);
> >
> > rdev_dec_pending(conf->disks[i].rdev, conf->mddev);
> > -
> > clear_bit(R5_LOCKED, &sh->dev[i].flags);
> > set_bit(STRIPE_HANDLE, &sh->state);
> > - __release_stripe(conf, sh);
> > - spin_unlock_irqrestore(&conf->device_lock, flags);
> > + release_stripe(sh);
> > +
> > return 0;
> > }
> >
--
Coywolf Qi Hunt
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]