Re: segfault mdadm --write-behind, 2.6.14-mm2 (was: Re: RAID1 ramdisk patch)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Sander wrote (ao):
# Sander wrote (ao):
# # Neil Brown wrote (ao):
# # > On Wednesday November 16, [email protected] wrote:
# # > > Sander <[email protected]> wrote:
# # > > > With 2.6.14-mm2 (x86) and mdadm 2.1 I get a Segmentation fault when I
# # > > > try this:
# # > > 
# # > > It oopsed in reiser4.  reiserfs-dev added to Cc...
# # > > 
# # > 
# # > Hmm... It appears that md/bitmap is calling prepare_write and
# # > commit_write with 'file' as NULL - this works for some filesystems,
# # > but not for reiser4.
# # > 
# # > Does this patch help.
# # 
# # Something changed, but it didn't fix it it seems:
# # 
# # # mdadm -C /dev/md1 --bitmap=/storage/raid1.bitmap -l1 -n2 /dev/loop0 --write-behind /dev/loop1
# # mdadm: RUN_ARRAY failed: No such file or directory
# 
# FWIW, the following happens when I point --bitmap to /tmp/raid1.bitmap
# which is tmpfs, and also happens when I attach both loop0 and loop1 to
# files on tmpfs.
# 
# This would suggest that reiser4 is not solely at fault?
# 
# The difference btw is that I can reboot with 'shutdown -r now'
# instead of sysrq. And that mdadm hangs:
# 
# # mdadm -C /dev/md1 --bitmap=/tmp/raid1.bitmap -l1 -n2 /dev/loop0 --write-behind /dev/loop1
# mdadm: RUN_ARRAY failed: No such file or directory
# 
# # mdadm -C /dev/md1 -f --bitmap=/tmp/raid1.bitmap -l1 -n2 /dev/loop0 --write-behind /dev/loop1
# mdadm: /dev/loop0 appears to be part of a raid array:
#     level=raid1 devices=2 ctime=Thu Nov 17 11:04:31 2005
# mdadm: /dev/loop1 appears to be part of a raid array:
#     level=raid1 devices=2 ctime=Thu Nov 17 11:04:31 2005
# Continue creating array? yes
# [hang, no prompt, no reaction to ctrl-c, etc]

And even more info. It seems mdadm spins:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                             
  749 root      25   0  1696  568  492 R 99.9  0.1   8:32.50 mdadm

Would sysrq-t be useful?


# [42949549.780000] md: bind<loop0>
# [42949549.780000] md: bind<loop1>
# [42949549.780000] md: md1: raid array is not clean -- starting background reconstruction
# [42949549.790000] md1: bitmap file is out of date (0 < 1) -- forcing full recovery
# [42949549.790000] md1: bitmap file is out of date, doing full recovery
# [42949549.790000] md1: bitmap initialized from disk: read 0/4 pages, set 0 bits, status: 524288
# [42949549.790000] Bad page state at free_hot_cold_page (in process 'mdadm', page c10dcc20)
# [42949549.790000] flags:0x80000019 mapping:f5155c84 mapcount:0 count:0
# [42949549.790000] Backtrace:
# [42949549.790000]  [<c013b320>] bad_page+0x70/0xb0
# [42949549.790000]  [<c013bab1>] free_hot_cold_page+0x51/0xd0
# [42949549.790000]  [<c02b0a90>] bitmap_file_put+0x30/0x70
# [42949549.790000]  [<c02b1f8e>] bitmap_free+0x1e/0xb0
# [42949549.790000]  [<c02b2126>] bitmap_create+0xd6/0x2a0
# [42949549.790000]  [<c02ab95a>] do_md_run+0x2ba/0x500
# [42949549.790000]  [<c02ac8a7>] add_new_disk+0x157/0x3b0
# [42949549.790000]  [<c0179034>] mpage_writepages+0x124/0x3d0
# [42949549.790000]  [<c013c23e>] __pagevec_free+0x3e/0x60
# [42949549.790000]  [<c013eff9>] release_pages+0x29/0x160
# [42949549.790000]  [<c02adb81>] md_ioctl+0x5a1/0x630
# [42949549.790000]  [<c0137918>] find_get_pages+0x18/0x40
# [42949549.790000]  [<c02ad5e0>] md_ioctl+0x0/0x630
# [42949549.790000]  [<c01ede74>] blkdev_driver_ioctl+0x54/0x60
# [42949549.790000]  [<c01edfb4>] blkdev_ioctl+0x134/0x180
# [42949549.790000]  [<c015e158>] block_ioctl+0x18/0x20
# [42949549.790000]  [<c015e140>] block_ioctl+0x0/0x20
# [42949549.790000]  [<c01674ff>] do_ioctl+0x1f/0x70
# [42949549.790000]  [<c016769c>] vfs_ioctl+0x5c/0x1e0
# [42949549.790000]  [<c0156c91>] __fput+0xe1/0x140
# [42949549.790000]  [<c016785d>] sys_ioctl+0x3d/0x70
# [42949549.790000]  [<c0102f49>] syscall_call+0x7/0xb
# [42949549.790000] Trying to fix it up, but a reboot is needed
# [42949549.790000] md1: failed to create bitmap (524288)
# [42949549.790000] md: pers->run() failed ...
# [42949549.790000] md: md1 stopped.
# [42949549.790000] md: unbind<loop1>
# [42949549.790000] md: export_rdev(loop1)
# [42949549.790000] md: unbind<loop0>
# [42949549.790000] md: export_rdev(loop0)

-- 
Humilis IT Services and Solutions
http://www.humilis.net
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux