2.6.23-rc9 kswapd infinite loop


 



I have a reproducible hang: kswapd stays in the run queue while everything
else is in I/O wait, and the load average keeps climbing.

Using either a HighPoint RR2340 or an LSI8888ELP PCIe 8-lane controller,
I max out the write rate at between 900 MB/s and 1.1 GB/s into
16 Seagate 500GB ES series drives. Eventually the system locks up
with kswapd0 using 100% of one CPU. kswapd1 is not running.

The system is a SuperMicro H8DM3-2 (or an H8DMi-2) with 2222SE or 2216 Opterons
and 16GB of RAM. 2.6.22-5, 2.6.21, 2.6.23-rc7 and 2.6.23-rc9 all lock up.
2.6.20 does not, but its write rate is also 200 MB/s lower.
The base OS is CentOS 5.0.
I can patch in KDB and look around (I did this for 2.6.22-5), but I'm
not sure what to look for in kswapd to see what got lost and keeps
the system locked up. With earlier kernels, the system needs the reset
button to recover. With 2.6.23-rc9 I was left with enough of a system to
capture the following top, ps, and /proc/meminfo data.

Hints anyone (please) as to how to slay this dragon?

Berkley


--

E. F. Berkley Shands, MSc
Exegy Inc.
349 Marshall Road, Suite 100
St. Louis, MO  63119
Direct:  (314) 218-3600 X450
Cell:  (314) 303-2546
Office:  (314) 218-3600
Fax:  (314) 218-3601



top - 11:15:00 up 40 min,  2 users,  load average: 25.51, 19.62, 12.66
Tasks: 147 total,  19 running, 128 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0%us, 75.0%sy,  0.0%ni,  0.0%id, 25.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  16471592k total, 16415040k used,    56552k free,      692k buffers
Swap: 33551712k total,      152k used, 33551560k free, 13462880k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                           
  335 root      15  -5     0    0    0 R  100  0.0   9:56.63 kswapd0                                                                                            
 4811 root      20   0 22280 1372 1044 S  100  0.0   5:46.05 ShiftGen                                                                                           
 4816 root      20   0 22280 1376 1044 S  100  0.0   5:47.71 ShiftGen                                                                                           
 4080 root      20   0  110m 1220  760 S    0  0.0   0:00.46 exegyd                                                                                             
 4826 root      20   0 12716 1092  796 R    0  0.0   0:00.20 top                                                                                                
    1 root      20   0 10316  668  556 R    0  0.0   0:00.60 init                                                                                               
    2 root      15  -5     0    0    0 S    0  0.0   0:00.00 kthreadd                                                                                           
    3 root      RT  -5     0    0    0 S    0  0.0   0:00.00 migration/0                                                                                        
    4 root      15  -5     0    0    0 S    0  0.0   0:00.00 ksoftirqd/0                                                                                        
    5 root      RT  -5     0    0    0 S    0  0.0   0:00.00 watchdog/0                                                                                         
    6 root      RT  -5     0    0    0 R    0  0.0   0:00.00 migration/1                                                                                        
    7 root      15  -5     0    0    0 S    0  0.0   0:00.00 ksoftirqd/1                                                                                        
    8 root      RT  -5     0    0    0 S    0  0.0   0:00.00 watchdog/1                                                                                         
    9 root      RT  -5     0    0    0 R    0  0.0   0:00.00 migration/2                                                                                        
   
local/exegy/init> ps -flea
F S UID        PID  PPID  C PRI  NI ADDR SZ WCHAN  STIME TTY          TIME CMD
4 R root         1     0  0  80   0 -  2579 -      10:34 ?        00:00:00 init [3]                   
1 S root         2     0  0  75  -5 -     0 kthrea 10:34 ?        00:00:00 [kthreadd]
1 S root         3     2  0 -40   - -     0 migrat 10:34 ?        00:00:00 [migration/0]
1 S root         4     2  0  75  -5 -     0 ksofti 10:34 ?        00:00:00 [ksoftirqd/0]
5 S root         5     2  0 -40   - -     0 watchd 10:34 ?        00:00:00 [watchdog/0]
1 R root         6     2  0 -40   - -     0 -      10:34 ?        00:00:00 [migration/1]
1 S root         7     2  0  75  -5 -     0 ksofti 10:34 ?        00:00:00 [ksoftirqd/1]
5 S root         8     2  0 -40   - -     0 watchd 10:34 ?        00:00:00 [watchdog/1]
1 R root         9     2  0 -40   - -     0 -      10:34 ?        00:00:00 [migration/2]
1 S root        10     2  0  75  -5 -     0 ksofti 10:34 ?        00:00:00 [ksoftirqd/2]
5 S root        11     2  0 -40   - -     0 watchd 10:34 ?        00:00:00 [watchdog/2]
1 S root        12     2  0 -40   - -     0 migrat 10:34 ?        00:00:00 [migration/3]
1 S root        13     2  0  75  -5 -     0 ksofti 10:34 ?        00:00:00 [ksoftirqd/3]
5 S root        14     2  0 -40   - -     0 watchd 10:34 ?        00:00:00 [watchdog/3]
1 R root        15     2  0  75  -5 -     0 -      10:34 ?        00:00:00 [events/0]
1 R root        16     2  0  75  -5 -     0 -      10:34 ?        00:00:00 [events/1]
1 R root        17     2  0  75  -5 -     0 -      10:34 ?        00:00:00 [events/2]
1 S root        18     2  0  75  -5 -     0 worker 10:34 ?        00:00:00 [events/3]
1 S root        19     2  0  75  -5 -     0 worker 10:34 ?        00:00:00 [khelper]
1 S root        72     2  0  75  -5 -     0 worker 10:34 ?        00:00:00 [kblockd/0]
1 R root        73     2  0  75  -5 -     0 -      10:34 ?        00:00:01 [kblockd/1]
1 S root        74     2  0  75  -5 -     0 worker 10:34 ?        00:00:00 [kblockd/2]
1 S root        75     2  0  75  -5 -     0 worker 10:34 ?        00:00:02 [kblockd/3]
1 S root        78     2  0  75  -5 -     0 worker 10:34 ?        00:00:00 [kacpid]
1 S root        79     2  0  75  -5 -     0 worker 10:34 ?        00:00:00 [kacpi_notify]
1 S root       245     2  0  75  -5 -     0 worker 10:34 ?        00:00:00 [cqueue/0]
1 S root       246     2  0  75  -5 -     0 worker 10:34 ?        00:00:00 [cqueue/1]
1 S root       247     2  0  75  -5 -     0 worker 10:34 ?        00:00:00 [cqueue/2]
1 S root       248     2  0  75  -5 -     0 worker 10:34 ?        00:00:00 [cqueue/3]
1 S root       250     2  0  75  -5 -     0 worker 10:34 ?        00:00:00 [ksuspend_usbd]
1 S root       256     2  0  75  -5 -     0 hub_th 10:34 ?        00:00:00 [khubd]
1 S root       259     2  0  75  -5 -     0 serio_ 10:34 ?        00:00:00 [kseriod]
1 R root       335     2 25  75  -5 -     0 -      10:34 ?        00:10:13 [kswapd0]
1 S root       336     2  9  75  -5 -     0 kswapd 10:34 ?        00:03:48 [kswapd1]
1 S root       337     2  0  75  -5 -     0 worker 10:34 ?        00:00:00 [aio/0]
1 S root       338     2  0  75  -5 -     0 worker 10:34 ?        00:00:00 [aio/1]
1 S root       339     2  0  75  -5 -     0 worker 10:34 ?        00:00:00 [aio/2]
1 S root       340     2  0  75  -5 -     0 worker 10:34 ?        00:00:00 [aio/3]
1 S root       341     2  0  75  -5 -     0 worker 10:34 ?        00:00:00 [xfslogd/0]
1 S root       342     2  0  75  -5 -     0 worker 10:34 ?        00:00:00 [xfslogd/1]
1 S root       343     2  0  75  -5 -     0 worker 10:34 ?        00:00:00 [xfslogd/2]
1 S root       344     2  0  75  -5 -     0 worker 10:34 ?        00:00:10 [xfslogd/3]
1 S root       345     2  0  75  -5 -     0 worker 10:34 ?        00:00:00 [xfsdatad/0]

1 R root       346     2  2  75  -5 -     0 -      10:34 ?        00:00:55 [xfsdatad/1]
1 S root       347     2  0  75  -5 -     0 worker 10:34 ?        00:00:01 [xfsdatad/2]
1 S root       348     2 22  75  -5 -     0 worker 10:34 ?        00:09:05 [xfsdatad/3]
1 S root       349     2  0  75  -5 -     0 worker 10:34 ?        00:00:00 [xfs_mru_cache]
1 S root       505     2  0  75  -5 -     0 worker 10:34 ?        00:00:00 [kpsmoused]
1 S root       554     2  0  75  -5 -     0 worker 10:34 ?        00:00:00 [ata/0]
1 S root       555     2  0  75  -5 -     0 worker 10:34 ?        00:00:00 [ata/1]
1 S root       556     2  0  75  -5 -     0 worker 10:34 ?        00:00:00 [ata/2]
1 S root       557     2  0  75  -5 -     0 worker 10:34 ?        00:00:00 [ata/3]
1 S root       558     2  0  75  -5 -     0 worker 10:34 ?        00:00:00 [ata_aux]
1 S root       564     2  0  75  -5 -     0 scsi_e 10:34 ?        00:00:00 [scsi_eh_0]
1 S root       565     2  0  75  -5 -     0 scsi_e 10:34 ?        00:00:00 [scsi_eh_1]
1 S root       566     2  0  75  -5 -     0 scsi_e 10:34 ?        00:00:00 [scsi_eh_2]
1 S root       567     2  0  75  -5 -     0 scsi_e 10:34 ?        00:00:00 [scsi_eh_3]
1 S root       568     2  0  75  -5 -     0 scsi_e 10:34 ?        00:00:00 [scsi_eh_4]
1 S root       569     2  0  75  -5 -     0 scsi_e 10:34 ?        00:00:00 [scsi_eh_5]
1 S root       575     2  0  75  -5 -     0 scsi_e 10:34 ?        00:00:00 [scsi_eh_6]
1 S root       576     2  0  75  -5 -     0 kjourn 10:34 ?        00:00:00 [kjournald]
1 S root       603     2  0  75  -5 -     0 kaudit 10:34 ?        00:00:00 [kauditd]
5 S root       637     1  0  76  -4 -  3234 -      10:34 ?        00:00:00 /sbin/udevd -d
1 S root      2314     2  0  75  -5 -     0 worker 10:34 ?        00:00:00 [kmpathd/0]
1 S root      2315     2  0  75  -5 -     0 worker 10:34 ?        00:00:00 [kmpathd/1]
1 S root      2316     2  0  75  -5 -     0 worker 10:34 ?        00:00:00 [kmpathd/2]
1 S root      2317     2  0  75  -5 -     0 worker 10:34 ?        00:00:00 [kmpathd/3]
1 S root      2347     2  0  75  -5 -     0 kjourn 10:35 ?        00:00:00 [kjournald]
1 S root      2348     2  0  75  -5 -     0 kjourn 10:35 ?        00:00:00 [kjournald]
1 S root      2349     2  0  75  -5 -     0 kjourn 10:35 ?        00:00:00 [kjournald]
5 R root      2753     1  0  77  -3 -  3548 stext  10:35 ?        00:00:00 auditd
0 S root      2755  2753  0  77  -3 - 29041 -      10:35 ?        00:00:00 python /sbin/audispd
1 R root      2774     1  0  80   0 -  1469 -      10:35 ?        00:00:00 syslogd -m 0
5 S root      2777     1  0  80   0 -   943 syslog 10:35 ?        00:00:00 klogd -x
5 S rpc       2830     1  0  80   0 -  2004 429496 10:35 ?        00:00:00 portmap
5 S root      2869     1  0  80   0 -  2528 -      10:35 ?        00:00:00 rpc.statd
1 R root      2909     2  0  75  -5 -     0 -      10:35 ?        00:00:00 [rpciod/0]
1 S root      2910     2  0  75  -5 -     0 worker 10:35 ?        00:00:00 [rpciod/1]
5 R root      2911     2  0  75  -5 -     0 -      10:35 ?        00:00:00 [rpciod/2]
5 S root      2912     2  0  75  -5 -     0 worker 10:35 ?        00:00:00 [rpciod/3]
1 R root      2919     1  0  80   0 - 10504 -      10:35 ?        00:00:00 rpc.idmapd
5 S dbus      2948     1  0  80   0 -  6365 -      10:35 ?        00:00:00 dbus-daemon --system
1 S root      2991     2  0  80   0 -     0 -      10:35 ?        00:00:00 [lockd]
1 S root      3041     1  0  80   0 -  2121 929750 10:35 ?        00:00:00 /usr/bin/hidd --server
5 S root      3066     1  0  80   0 - 19681 274877 10:35 ?        00:00:00 ypbind
5 S root      3097     1  0  80   0 - 23860 stext  10:35 ?        00:00:00 automount
1 S root      3121     1  0  80   0 -   943 -      10:35 ?        00:00:00 /usr/sbin/acpid
1 S root      3137     1  0  80   0 -  6294 -      10:35 ?        00:00:00 ./hpiod
1 R root      3142     1  0  80   0 - 36857 -      10:35 ?        00:00:00 python ./hpssd.py
5 S root      3159     1  0  80   0 - 31500 -      10:35 ?        00:00:00 cupsd
5 S root      3185     1  0  80   0 - 11074 -      10:35 ?        00:00:00 /usr/sbin/sshd
5 S ntp       3208     1  0  80   0 -  3936 -      10:35 ?        00:00:00 ntpd -u ntp:ntp -p /var/run/ntpd.pid
1 S root      3248     1  0  80   0 - 16621 343793 10:35 ?        00:00:00 rpc.rquotad

1 S root      3271     2  0  75  -5 -     0 worker 10:35 ?        00:00:00 [nfsd4]
1 S root      3272     2  0  80   0 -     0 -      10:35 ?        00:00:00 [nfsd]
1 S root      3273     2  0  80   0 -     0 -      10:35 ?        00:00:00 [nfsd]
1 S root      3274     2  0  80   0 -     0 -      10:35 ?        00:00:00 [nfsd]
1 S root      3275     2  0  80   0 -     0 -      10:35 ?        00:00:00 [nfsd]
1 S root      3276     2  0  80   0 -     0 -      10:35 ?        00:00:00 [nfsd]
1 S root      3277     2  0  80   0 -     0 -      10:35 ?        00:00:00 [nfsd]
1 S root      3278     2  0  80   0 -     0 -      10:35 ?        00:00:00 [nfsd]
1 S root      3279     2  0  80   0 -     0 -      10:35 ?        00:00:00 [nfsd]
1 S root      3282     1  0  80   0 -  2541 -      10:35 ?        00:00:00 rpc.mountd
5 S root      3324     1  0  80   0 -  1606 -      10:35 ?        00:00:00 gpm -m /dev/input/mice -t exps2
1 S root      3340     1  0  80   0 - 18478 -      10:35 ?        00:00:00 crond
5 S xfs       3376     1  0  80   0 -  6282 -      10:35 ?        00:00:00 xfs -droppriv -daemon
5 R root      3473     1  0  80   0 -  4670 -      10:35 ?        00:00:00 /usr/sbin/atd
5 S root      3489     1  0  80   0 - 56791 -      10:35 ?        00:00:01 /usr/bin/python /usr/sbin/yum-updatesd
5 S 68        3505     1  0  80   0 -  7868 -      10:35 ?        00:00:01 hald
0 S root      3506  3505  0  80   0 -  5408 -      10:35 ?        00:00:00 hald-runner
4 S 68        3512  3506  0  80   0 -  3069 -      10:35 ?        00:00:00 hald-addon-acpi: listening on acpid socket /var/run/acpid.socket
4 S 68        3520  3506  0  80   0 -  3069 evdev_ 10:35 ?        00:00:00 hald-addon-keyboard: listening on /dev/input/event0
0 S root      3529  3506  0  80   0 -  2545 -      10:35 ?        00:00:00 hald-addon-storage: polling /dev/hda
1 S root      3580     1  0  80   0 -  9634 stext  10:35 ?        00:00:00 /usr/bin/hptsvr
5 S root      3615     1  0  80   0 -  1024 -      10:35 ?        00:00:00 /usr/sbin/smartd -q never
4 S root      3619     1  0  80   0 - 17886 wait   10:35 ?        00:00:00 login -- root     
4 S root      3620     1  0  80   0 -   940 -      10:35 tty2     00:00:00 /sbin/mingetty tty2
4 S root      3621     1  0  80   0 -   940 -      10:35 tty3     00:00:00 /sbin/mingetty tty3
4 S root      3622     1  0  80   0 -   940 -      10:35 tty4     00:00:00 /sbin/mingetty tty4
4 S root      3624     1  0  80   0 -   940 -      10:35 tty5     00:00:00 /sbin/mingetty tty5
4 S root      3625     1  0  80   0 -   940 -      10:35 tty6     00:00:00 /sbin/mingetty tty6
4 S root      3678  3619  0  80   0 - 17013 -      10:35 tty1     00:00:00 -tcsh
1 S root      4080     1  0  80   0 - 28285 futex_ 10:38 ?        00:00:00 /usr/local/exegy/bin/exegyd
0 S root      4111  3678  0  80   0 - 20406 wait   10:40 tty1     00:00:00 /usr/bin/perl ./MagicNumbers.pl --nomkfs --devices 4 --satatype rr2340x500s --raiddev
1 S root      4152     2  0  75  -5 -     0 -      10:40 ?        00:00:02 [xfsbufd]
1 R root      4153     2  0  75  -5 -     0 -      10:40 ?        00:00:00 [xfssyncd]
1 S root      4156     2  0  75  -5 -     0 -      10:40 ?        00:00:02 [xfsbufd]
1 S root      4157     2  0  75  -5 -     0 -      10:40 ?        00:00:00 [xfssyncd]
1 S root      4160     2  0  75  -5 -     0 -      10:40 ?        00:00:03 [xfsbufd]
1 R root      4161     2  0  75  -5 -     0 -      10:40 ?        00:00:00 [xfssyncd]
1 S root      4164     2  0  75  -5 -     0 -      10:40 ?        00:00:03 [xfsbufd]
1 S root      4165     2  0  75  -5 -     0 -      10:40 ?        00:00:00 [xfssyncd]
4 S root      4416  3185  0  80   0 - 20071 -      10:52 ?        00:00:00 sshd: root@pts/0 
4 S root      4418  4416  0  80   0 - 18611 rt_sig 10:52 pts/0    00:00:00 -tcsh
1 S root      4803     2  2  80   0 -     0 pdflus 11:08 ?        00:00:09 [pdflush]
1 D root      4805     2  1  80   0 -     0 conges 11:08 ?        00:00:06 [pdflush]
1 S root      4809  4111  0  80   0 - 20406 wait   11:09 tty1     00:00:00 /usr/bin/perl ./MagicNumbers.pl --nomkfs --devices 4 --satatype rr2340x500s --raiddev
1 S root      4810  4111  0  80   0 - 20406 wait   11:09 tty1     00:00:00 /usr/bin/perl ./MagicNumbers.pl --nomkfs --devices 4 --satatype rr2340x500s --raiddev
0 S root      4811  4809 97  80   0 -  5570 futex_ 11:09 tty1     00:06:02 /usr/local/exegy/bin/ShiftGen -blockkb 128 -generate 8 -sync -file /s0/GigaData.38 -l
0 S root      4812  4810  2  80   0 -  5570 futex_ 11:09 tty1     00:00:09 /usr/local/exegy/bin/ShiftGen -blockkb 128 -generate 8 -sync -file /s1/GigaData.38 -l
1 S root      4813  4111  0  80   0 - 20406 wait   11:09 tty1     00:00:00 /usr/bin/perl ./MagicNumbers.pl --nomkfs --devices 4 --satatype rr2340x500s --raiddev

0 S root      4816  4815 97  80   0 -  5570 futex_ 11:09 tty1     00:06:04 /usr/local/exegy/bin/ShiftGen -blockkb 128 -generate 8 -sync -file /s3/GigaData.38 -l
1 D root      4822     2  0  80   0 -     0 conges 11:09 ?        00:00:00 [pdflush]
5 D root      4823  3340  0  80   0 - 29620 synchr 11:10 ?        00:00:00 crond
0 R root      4827  4418  0  80   0 - 16179 -      11:15 pts/0    00:00:00 ps -flea
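
The listing above is long, and the interesting entries are the tasks that are runnable (R) or in uninterruptible sleep (D). As a rough aid for reading it, here is a minimal sketch that filters `ps -flea` output down to those tasks. The function name `busy_tasks` and the `SAMPLE` text are illustrative assumptions; the sample rows are copied from the listing above, and the parser assumes the 15-column layout shown in its header.

```python
# Sketch: pick out R (runnable) and D (uninterruptible sleep) tasks from
# "ps -flea" output. Assumes the 15-column layout shown in the listing:
# F S UID PID PPID C PRI NI ADDR SZ WCHAN STIME TTY TIME CMD
SAMPLE = """\
F S UID        PID  PPID  C PRI  NI ADDR SZ WCHAN  STIME TTY          TIME CMD
1 R root       335     2 25  75  -5 -     0 -      10:34 ?        00:10:13 [kswapd0]
1 S root       336     2  9  75  -5 -     0 kswapd 10:34 ?        00:03:48 [kswapd1]
1 D root      4805     2  1  80   0 -     0 conges 11:08 ?        00:00:06 [pdflush]
5 D root      4823  3340  0  80   0 - 29620 synchr 11:10 ?        00:00:00 crond
"""


def busy_tasks(ps_text):
    """Return (state, pid, wchan, cmd) tuples for tasks in R or D state."""
    rows = []
    for line in ps_text.splitlines():
        # Split into at most 15 fields so CMD keeps its embedded spaces.
        fields = line.strip().split(None, 14)
        if len(fields) < 15 or not fields[3].isdigit():
            continue  # header, blank, or truncated line
        state, pid, wchan, cmd = fields[1], int(fields[3]), fields[10], fields[14]
        if state in ("R", "D"):
            rows.append((state, pid, wchan, cmd))
    return rows


if __name__ == "__main__":
    for state, pid, wchan, cmd in busy_tasks(SAMPLE):
        print(f"{state} {pid:>6} {wchan:<8} {cmd}")
```

On a live system the same function could be fed the captured output of `ps -flea` instead of `SAMPLE`. The WCHAN column is truncated by ps itself, so the values stay as abbreviated as they appear in the listing.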

cat /proc/meminfo
MemTotal:     16471592 kB
MemFree:       2201120 kB
Buffers:           944 kB
Cached:       13463208 kB
SwapCached:          0 kB
Active:          54416 kB
Inactive:     13451452 kB
SwapTotal:    33551712 kB
SwapFree:     33551560 kB
Dirty:          822408 kB
Writeback:      102280 kB
AnonPages:       41324 kB
Mapped:          12228 kB
Slab:           478412 kB
SReclaimable:   413192 kB
SUnreclaim:      65220 kB
PageTables:       4604 kB
NFS_Unstable:        0 kB
Bounce:              0 kB
CommitLimit:  41787508 kB
Committed_AS:   174504 kB
VmallocTotal: 34359738367 kB
VmallocUsed:    114264 kB
VmallocChunk: 34359598407 kB
HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
Hugepagesize:     2048 kB
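
To put the numbers above in proportion, here is a minimal sketch that parses `/proc/meminfo`-style text and reports how much memory is waiting on writeback (Dirty plus Writeback) relative to MemTotal. The helper names `parse_meminfo` and `writeback_pressure` are assumptions made for illustration, and the sample values are copied from the dump above. It only summarizes the counters; it does not explain why kswapd0 spins.

```python
# Sketch: summarize pending writeback from /proc/meminfo-style text.
# The sample values are copied from the meminfo dump in this post.
SAMPLE_MEMINFO = """\
MemTotal:     16471592 kB
MemFree:       2201120 kB
Cached:       13463208 kB
Inactive:     13451452 kB
Dirty:          822408 kB
Writeback:      102280 kB
HugePages_Total:     0
"""


def parse_meminfo(text):
    """Map each field name to its integer value (kB, or a bare count)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def writeback_pressure(info):
    """Return (pending_kb, percent_of_total) for Dirty + Writeback pages."""
    pending = info["Dirty"] + info["Writeback"]
    return pending, 100.0 * pending / info["MemTotal"]


if __name__ == "__main__":
    info = parse_meminfo(SAMPLE_MEMINFO)
    pending, pct = writeback_pressure(info)
    print(f"Dirty+Writeback: {pending} kB ({pct:.1f}% of MemTotal)")
    print(f"Inactive: {100.0 * info['Inactive'] / info['MemTotal']:.1f}% of MemTotal")
```

On a live system, `parse_meminfo(open("/proc/meminfo").read())` would replace the sample text, and calling it in a loop would show whether the Dirty and Writeback counters are still moving while kswapd0 is spinning.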

