On Mon, 2005-06-20 at 12:33 +0200, Marco Colombo wrote:
> Hi,
> today I found a server in an OOM condition; the funny thing is that
> after some investigation I found no process that had the memory
> allocated to it. I even switched to single-user mode; here's what I found:
[...]
>              total       used       free     shared    buffers     cached
> Mem:       1035812     898524     137288          0       3588      16732
> -/+ buffers/cache:     878204     157608
> Swap:      1049248        788    1048460
> sh-2.05b# uptime
> 12:13:28 up 35 days, 1:48, 0 users, load average: 0.00, 0.59, 16.13
> sh-2.05b# uname -a
> Linux xxxx.example.org 2.6.10-1.12_FC2.marco #1 Mon Feb 7 14:53:42 CET 2005
> i686 athlon i386 GNU/Linux
>
> I know this is an old Fedora Core 2 kernel; eventually I'll bring the
> issue up on their lists. An upgrade has already been scheduled for this
> host, so I'm not in a rush to track down this specific bug (unless it
> occurs on the new system, of course).
>
> Anyway, I just wonder whether there's a general way to find out where
> those 850+ MB are allocated. Since there are no big user processes, I'm
> assuming it's a memory leak in kernel space. I'm curious; this is the
> first time I've seen something like this. Any suggestions on what to
> look at besides 'ps' and 'free'?
>
> The server has been mainly running PostgreSQL at a fairly high load for
> the last 35 days, BTW.
>
> TIA,
> .TM.
Thanks to everybody who replied to me. Here's more data:
sh-2.05b# sort -rn +1 /proc/slabinfo | head -5
biovec-1      7502216 7502296  16 226 1 : tunables 120 60 0 : slabdata  33196  33196 0
bio           7502216 7502262  96  41 1 : tunables 120 60 0 : slabdata 182982 182982 0
size-64          4948    5307  64  61 1 : tunables 120 60 0 : slabdata     87     87 0
buffer_head      3691    3750  52  75 1 : tunables 120 60 0 : slabdata     50     50 0
dentry_cache     2712    2712 164  24 1 : tunables 120 60 0 : slabdata    113    113 0
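Those two caches alone account for the missing memory. In the 2.6 slabinfo
format the second and third columns are active and total objects and the
fourth is the object size in bytes, so bio holds about 7502262 x 96 bytes
(~687 MiB) and biovec-1 about 7502296 x 16 bytes (~114 MiB): roughly
800 MiB, i.e. the bulk of the ~850 MB that 'free' could not attribute to
any process. A quick way to rank all caches by the same estimate (just a
sketch: it assumes the two header lines of the 2.6 slabinfo layout and
ignores per-slab padding):

sh-2.05b# awk 'NR > 2 { printf "%-18s %10.1f MiB\n", $1, $3 * $4 / (1024*1024) }' \
              /proc/slabinfo | sort -rn -k2 | head -5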
I found no way to free that memory, so I decided to reboot.
In the following days I monitored the system, after upgrading to
kernel-2.6.10-1.770_FC2. Here are the results I got, day by day (a sketch
of how such snapshots can be taken follows the listing):
bio            115333  115333 96  41 1 : tunables 120 60 0 : slabdata  2813  2813 0
biovec-1       115322  115486 16 226 1 : tunables 120 60 0 : slabdata   511   511 0

biovec-1       325006  325440 16 226 1 : tunables 120 60 0 : slabdata  1440  1440 0
bio            324987  325212 96  41 1 : tunables 120 60 0 : slabdata  7930  7932 0

bio            538535  538535 96  41 1 : tunables 120 60 0 : slabdata 13135 13135 0
biovec-1       538528  538784 16 226 1 : tunables 120 60 0 : slabdata  2384  2384 0

bio            749870  750218 96  41 1 : tunables 120 60 0 : slabdata 18296 18298 0
biovec-1       749886  750772 16 226 1 : tunables 120 60 0 : slabdata  3322  3322 0

bio            960630  960630 96  41 1 : tunables 120 60 0 : slabdata 23430 23430 0
biovec-1       960642  960726 16 226 1 : tunables 120 60 0 : slabdata  4251  4251 0

bio           1170079 1170345 96  41 1 : tunables 120 60 0 : slabdata 28543 28545 0
biovec-1      1170066 1170906 16 226 1 : tunables 120 60 0 : slabdata  5181  5181 0

bio           1379857 1380019 96  41 1 : tunables 120 60 0 : slabdata 33658 33659 0
biovec-1      1379854 1380408 16 226 1 : tunables 120 60 0 : slabdata  6108  6108 0
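Snapshots like these take nothing more than a daily cron job; a minimal
sketch (the script name, log path and schedule are my own choice, not
necessarily what ran here):

#!/bin/sh
# Append a timestamped snapshot of the two suspect slab caches to a log.
# Run once a day from cron, e.g.:  0 6 * * *  /usr/local/sbin/slabsnap
{
    date
    grep -E '^(bio|biovec-1) ' /proc/slabinfo
} >> /var/log/slabsnap.log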
Clearly, something was going on. So I decided to run a vanilla kernel
instead.
I'm running 2.6.12.1 right now, and after about one day of uptime:
bio               345     369 96  41 1 : tunables 120 60 0 : slabdata     9     9 0
biovec-1          376     678 16 226 1 : tunables 120 60 0 : slabdata     3     3 0
which look like sane values to me (and they stay that way as far as I
can see): no more daily increases of 200,000+ objects. At 96 + 16 bytes
per bio/biovec-1 pair, that rate amounts to roughly 20 MB leaked per day,
which over 35 days is consistent with the ~800 MB found missing on the
original server. I'll keep an eye on it over the next few days, but I
think 2.6.12.1 is not affected.
.TM.
--
Marco Colombo
Technical Manager
ESI s.r.l.
[email protected]