2.6.23.1: mdadm/raid5 hung/d-state

# ps auxww | grep D
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root       273  0.0  0.0      0     0 ?        D    Oct21  14:40 [pdflush]
root       274  0.0  0.0      0     0 ?        D    Oct21  13:00 [pdflush]

This is the second time this has happened, each time after several days or weeks of uptime: while doing regular file I/O (decompressing a file), everything on the device went into D-state.

# mdadm -D /dev/md3
/dev/md3:
        Version : 00.90.03
  Creation Time : Wed Aug 22 10:38:53 2007
     Raid Level : raid5
     Array Size : 1318680576 (1257.59 GiB 1350.33 GB)
  Used Dev Size : 146520064 (139.73 GiB 150.04 GB)
   Raid Devices : 10
  Total Devices : 10
Preferred Minor : 3
    Persistence : Superblock is persistent

    Update Time : Sun Nov  4 06:38:29 2007
          State : active
 Active Devices : 10
Working Devices : 10
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 1024K

           UUID : e37a12d1:1b0b989a:083fb634:68e9eb49
         Events : 0.4309

    Number   Major   Minor   RaidDevice State
       0       8       33        0      active sync   /dev/sdc1
       1       8       49        1      active sync   /dev/sdd1
       2       8       65        2      active sync   /dev/sde1
       3       8       81        3      active sync   /dev/sdf1
       4       8       97        4      active sync   /dev/sdg1
       5       8      113        5      active sync   /dev/sdh1
       6       8      129        6      active sync   /dev/sdi1
       7       8      145        7      active sync   /dev/sdj1
       8       8      161        8      active sync   /dev/sdk1
       9       8      177        9      active sync   /dev/sdl1

If I wanted to find out what is causing this, what kind of debugging would I have to enable to track it down? Any attempt to read or write files on the device fails (the process also goes into D-state). Is there any useful information I can gather now, before rebooting the machine?
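
As a starting point for gathering that information, here is a minimal sketch that lists only the tasks in state D together with the kernel symbol each one is sleeping in. It assumes /proc/PID/status and /proc/PID/wchan are available on this kernel; the function name list_dstate and its optional directory argument are made up for illustration and are not part of any existing tool.

```shell
#!/bin/sh
# Sketch: print "PID NAME WCHAN" for every task in uninterruptible sleep (D).
# The optional first argument is a hypothetical override of the proc root,
# so the function can be pointed at a copy of /proc; it defaults to /proc.
list_dstate() {
    proc_root="${1:-/proc}"
    for dir in "$proc_root"/[0-9]*; do
        # Skip entries that vanished or are not readable.
        [ -r "$dir/status" ] || continue
        # The State: line looks like "State:  D (disk sleep)".
        state=$(awk '/^State:/ {print $2; exit}' "$dir/status")
        [ "$state" = "D" ] || continue
        name=$(sed -n 's/^Name:[[:space:]]*//p' "$dir/status")
        # wchan may be missing or empty; print "?" in that case.
        wchan=$(cat "$dir/wchan" 2>/dev/null)
        printf '%s %s %s\n' "${dir##*/}" "$name" "${wchan:-?}"
    done
}
```

Calling list_dstate with no argument scans the live /proc. Unlike "ps auxww | grep D", it matches on the task state only, so it does not also pick up the header line or any command containing a capital D. If the kernel was built with CONFIG_MAGIC_SYSRQ, writing "t" to /proc/sysrq-trigger as root should additionally make the kernel log the state of every task, which can then be read with dmesg.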

# pwd
/sys/block/md3/md
# ls
array_state      dev-sdj1/         rd2@              stripe_cache_active
bitmap_set_bits  dev-sdk1/         rd3@              stripe_cache_size
chunk_size       dev-sdl1/         rd4@              suspend_hi
component_size   layout            rd5@              suspend_lo
dev-sdc1/        level             rd6@              sync_action
dev-sdd1/        metadata_version  rd7@              sync_completed
dev-sde1/        mismatch_cnt      rd8@              sync_speed
dev-sdf1/        new_dev           rd9@              sync_speed_max
dev-sdg1/        raid_disks        reshape_position  sync_speed_min
dev-sdh1/        rd0@              resync_start
dev-sdi1/        rd1@              safe_mode_delay
# cat array_state
active-idle
# cat mismatch_cnt
0
# cat stripe_cache_active
1
# cat stripe_cache_size
16384
# cat sync_action
idle
# cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4]
md1 : active raid1 sdb2[1] sda2[0]
      136448 blocks [2/2] [UU]

md2 : active raid1 sdb3[1] sda3[0]
      129596288 blocks [2/2] [UU]

md3 : active raid5 sdl1[9] sdk1[8] sdj1[7] sdi1[6] sdh1[5] sdg1[4] sdf1[3] sde1[2] sdd1[1] sdc1[0]
      1318680576 blocks level 5, 1024k chunk, algorithm 2 [10/10] [UUUUUUUUUU]

md0 : active raid1 sdb1[1] sda1[0]
      16787776 blocks [2/2] [UU]

unused devices: <none>
#
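
Rather than cat-ing the attributes one at a time, the whole directory could be captured in one pass and saved before the reboot. This is only a sketch: the function name dump_md_attrs is made up for illustration, the default path is the md3 directory shown above, and it assumes that reading these sysfs files does not itself block on the hung array.

```shell
#!/bin/sh
# Sketch: print "attribute: value" for every readable regular file in an
# md sysfs directory. Subdirectories (dev-sdX1/) and symlinks to them (rdN@)
# are skipped, as are write-only attributes whose read fails.
dump_md_attrs() {
    md_dir="${1:-/sys/block/md3/md}"
    for f in "$md_dir"/*; do
        [ -f "$f" ] || continue
        [ -r "$f" ] || continue
        # Attributes that cannot be read make cat fail; skip them.
        value=$(cat "$f" 2>/dev/null) || continue
        printf '%s: %s\n' "${f##*/}" "$value"
    done
}
```

Running "dump_md_attrs > /tmp/md3-state.txt" (the output path is just an example, and it should be on a filesystem that is not on the hung array) would keep a copy of the values for a later report.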

Justin.