Anton Altaparmakov <[email protected]> wrote:
>
> Hi,
>
> On Fri, 2006-01-27 at 14:39 +0000, Anton Altaparmakov wrote:
> > A colleague has a server (which does backups) that is incapable of doing
> > a backup due to the backup process being killed due to OOM after
> > anywhere between 30s and a few minutes of running... And the backup
> > process is just a simple program that does the equivalent of "dd with
> > one source but two destinations" where the source is an lvm/dm snapshot
> > and the two destinations are two different tape drives attached via
> > scsi. That is pretty critical, admittedly only to us and that system...
>
> We found a workaround for the OOM problems on above server yesterday.
>
> Add a 1MiB swap file:
>
> dd if=/dev/zero of=/var/swapfile bs=1024 count=1024
> mkswap /var/swapfile
> swapon /var/swapfile
>
> Run backup script and no problems!
>
> Note: This is a suse SLES9 system and the problem is not present on
> kernel kernel-smp-2.6.5-7.193.i586.rpm and all earlier kernels and it is
> present on kernel-smp-2.6.5-7.201.i586.rpm and all later kernels
> including the latest kernel (2.6.5-7.244).
>
> Seems like a definite VM bug... Interestingly on the .244 kernel the
> OOM conditions print out a lot of debug information to dmesg about the
> memory use in the system and AFAICS none of the memory is exhausted! So
> it seems the system goes OOM without it actually being OOM because it
> detects that "free swap == 0" or something along those lines...
It does sound like that. Does it still happen if there's 1MB of swap
online and it's all full?
> Or do we nowadays require swap to be present?
Shouldn't be the case.
> The machine has 6GiB RAM so swap was turned off on it. (In our
> experience if a machine with a lot of concurrent connections starts
> swapping the system goes down the drain (it becomes too slow) so swap is
> not something we want on servers with 40000+ users...)
1MB of swap isn't likely to cause a lot of swapping.
> If the above is not enough information to find/fix the problem please
> let me know what more you would like to know...
It'd be nice to see the oom-killer output.
I don't recall a problem like this. I wonder if there are any suse changes
which might have triggered it.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]