(Please Cc me on answers, I don't follow LKML.)
Hi,
Suddenly one of our servers, a dual Opteron with 2 GB of memory (running a
32-bit userland on a 64-bit kernel), started to behave oddly:
imapd[31528]: segfault at 00000000fff00000 rip 00000000556a1a6d rsp 00000000ffffd394 error 4
imapd[31527]: segfault at 00000000fff00000 rip 00000000556a1a6d rsp 00000000ffffcbe4 error 4
sh[31530]: segfault at 00000000ffff7ff4 rip 000000005555e556 rsp 00000000ffff7ff8 error 6
sh[31531]: segfault at 00000000ffff7e5c rip 00000000555dc575 rsp 00000000ffff7e60 error 6
Unable to load interpreter /lib/ld-linux.so.2
Unable to load interpreter /lib/ld-linux.so.2
(ad infinitum)
It turned out it had some sort of memory problem:
Jun 2 11:56:02 cassarossa smbd[7171]: oplock_break: malloc fail for input buffer.
Jun 2 11:56:02 cassarossa smbd[7171]: open_mode_check: FAILED when breaking oplock (3) on file login.bat, dev = 900, inode = 110665
This does not look like faulty RAM: the machine has ECC RAM and we received no
warnings from it. We also definitely had enough swap:
cassarossa:~# free
             total       used       free     shared    buffers     cached
Mem:       2058300    2041136      17164          0      39576    1601468
-/+ buffers/cache:     400092    1658208
Swap:      3903712          0    3903712
It looks as if the kernel somehow couldn't distinguish between memory used as
cache and memory that is actually in use. Even swapoff failed:
cassarossa:~# swapoff -a
swapoff: /dev/sda5: Cannot allocate memory
swapoff: /dev/sdf5: Cannot allocate memory
However, we run with vm.overcommit_memory=2, so we figured that changing it
was worth a shot:
cassarossa:~# echo 0 > /proc/sys/vm/overcommit_memory
cassarossa:~# swapoff -a
cassarossa:~# swapon -a
cassarossa:~# free -m
             total       used       free     shared    buffers     cached
Mem:          2010       1993         16          0         39       1595
-/+ buffers/cache:        358       1651
Swap:         3812          0       3812
Suddenly everything seemed to be fine again (i.e. swapoff worked and the
programs stopped running out of memory, although the amount of cache in use
did not change), and after a quick restart of services the machine is back to
normal.
So to me, it looks like vm.overcommit_memory=2 is broken, at least on AMD64.
Any ideas why this would happen?
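For reference, my understanding of mode 2 (strict overcommit) is that the
kernel refuses new allocations once the total committed address space reaches
swap plus vm.overcommit_ratio percent of RAM. Below is a rough sketch of that
limit for this machine, using the totals from the first free output above and
assuming vm.overcommit_ratio is at its default of 50 (an assumption; the
setting is not shown in this mail). commit_limit_kb is only an illustrative
helper name, and the real kernel accounting applies some further small
adjustments:

```shell
# Approximate commit limit with vm.overcommit_memory=2:
#   limit = swap + RAM * overcommit_ratio / 100
# commit_limit_kb is a hypothetical helper, not an existing tool.
commit_limit_kb() {
    ram_kb=$1      # total RAM in kB
    swap_kb=$2     # total swap in kB
    ratio=$3       # vm.overcommit_ratio in percent
    echo $(( swap_kb + ram_kb * ratio / 100 ))
}

# RAM and swap totals from the first "free" output; ratio 50 is an assumption.
commit_limit_kb 2058300 3903712 50
```

If Committed_AS had reached that figure, malloc would start failing no matter
how much of the "used" RAM is really cache, and since removing swap lowers the
limit, that might also be why swapoff returned ENOMEM. The output above does
not include /proc/meminfo, though, so this is only a guess.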
For the record:
cassarossa:~# uname -a
Linux cassarossa 2.6.12-rc4 #1 SMP Fri May 13 18:49:40 CEST 2005 x86_64 unknown
No kernel patches except for a microscopic forward-port of the ELF fix from
2.6.11.9.
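If this happens again, the numbers probably worth capturing before touching
vm.overcommit_memory are CommitLimit and Committed_AS from /proc/meminfo. A
minimal sketch, assuming the running kernel exports both fields
(commit_headroom_kb is a made-up helper name):

```shell
# Print how many kB can still be committed before vm.overcommit_memory=2
# starts refusing allocations: CommitLimit - Committed_AS.
# Reads /proc/meminfo-style text on stdin; commit_headroom_kb is a
# hypothetical helper name.
commit_headroom_kb() {
    awk '/^CommitLimit:/  { limit = $2 }
         /^Committed_AS:/ { used = $2 }
         END              { print limit - used }'
}
```

It would be run as "commit_headroom_kb < /proc/meminfo"; a result near zero
would point at the commit accounting rather than at RAM or swap.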
/* Steinar */
--
Homepage: http://www.sesse.net/