Re: Tracking down a memory leak

On Tue, 2005-06-28 at 04:02 -0700, Andrew Morton wrote:
> Marco Colombo <[email protected]> wrote:
> >
> > Thanks to everybody who replied to me. Here's more data:
> > 
> >  sh-2.05b# sort -rn +1 /proc/slabinfo | head -5
> >  biovec-1          7502216 7502296     16  226    1 : tunables  120   60    0 : slabdata  33196  33196      0
> >  bio               7502216 7502262     96   41    1 : tunables  120   60    0 : slabdata 182982 182982      0
> 
> Are you using RAID?


Thanks Andrew,
yes I'm using RAID:

# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [multipath]
md1 : active raid1 hdc7[1] hda7[0]
      524544 blocks [2/2] [UU]

md2 : active raid1 hdc8[1] hda8[0]
      524544 blocks [2/2] [UU]

md3 : active raid1 hdc9[1] hda9[0]
      3145856 blocks [2/2] [UU]

md4 : active raid1 hdc10[1] hda10[0]
      1048192 blocks [2/2] [UU]

md5 : active raid1 hdc11[1] hda11[0]
      2097024 blocks [2/2] [UU]

md6 : active raid1 hdc12[1] hda12[0]
      25961408 blocks [2/2] [UU]

md7 : active raid1 hdc13[1] hda13[0]
      25961408 blocks [2/2] [UU]

md0 : active raid1 hdc5[1] hda5[0]
      262400 blocks [2/2] [UU]

unused devices: <none>

This is a pretty standard configuration for me here, I use it almost
everywhere as a means to protect data from simple disk failures (which are
way too common these days). I have at least 5-6 servers at hand, all with
disk mirroring. Some of them match the version exactly, and the hardware
is very similar, but there was only one host showing the leak. Two
things are different on the affected server:
- it uses the dc395x driver (only a DAT device is attached);
- it's a database (PostgreSQL) server (that is, with some shm usage).

The dc395x sometimes spits some error messages:
Jun 28 05:37:47 xxxx kernel: dc395x: Interrupt from DMA engine: 0x63!
Jun 28 05:37:47 xxxx kernel: dc395x: Ignoring DMA error (probably a bad thing) ...

I have to live with those for now. They seem to be harmless. The same
messages occur with 2.6.12.1, but the bio leak is gone (or so it seems
after 1 day of uptime), so I guess the two are unrelated (the dc395x
driver used to be my suspect #1).
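
To put the slabinfo figures quoted at the top into perspective, here is
a minimal sketch that converts a slabinfo line into the memory held by
that cache. It assumes 4 KiB pages and the 2.6-era slabinfo column
layout; the function names are made up for illustration and are not part
of any existing tool.

```python
# Rough sketch: estimate the memory pinned by a slab cache from one line of
# /proc/slabinfo. Assumes the 2.6-era column layout
#   name active_objs num_objs objsize objperslab pagesperslab
#   : tunables ... : slabdata active_slabs num_slabs sharedavail
# and 4 KiB pages (an assumption, not something stated in the thread).

PAGE_SIZE = 4096

# The two lines quoted earlier in this message.
SAMPLE_LINES = [
    "biovec-1          7502216 7502296     16  226    1 : tunables  120   60    0 : slabdata  33196  33196      0",
    "bio               7502216 7502262     96   41    1 : tunables  120   60    0 : slabdata 182982 182982      0",
]


def parse_slabinfo_line(line):
    """Return the numeric fields of one slabinfo line as a dict."""
    head, _tunables, slabdata = line.split(":")
    name, active_objs, num_objs, objsize, objperslab, pagesperslab = head.split()
    # slabdata.split() -> ['slabdata', active_slabs, num_slabs, sharedavail]
    slab_fields = slabdata.split()
    return {
        "name": name,
        "active_objs": int(active_objs),
        "num_objs": int(num_objs),
        "objsize": int(objsize),
        "objperslab": int(objperslab),
        "pagesperslab": int(pagesperslab),
        "num_slabs": int(slab_fields[2]),
    }


def slab_bytes(entry, page_size=PAGE_SIZE):
    """Memory held by the cache: slabs * pages per slab * page size."""
    return entry["num_slabs"] * entry["pagesperslab"] * page_size


if __name__ == "__main__":
    for line in SAMPLE_LINES:
        entry = parse_slabinfo_line(line)
        mib = slab_bytes(entry) / (1024 * 1024)
        print(f"{entry['name']:<10} {entry['active_objs']:>9} objs  {mib:8.1f} MiB")
```

Under those assumptions the two caches together hold about 844 MiB
(roughly 715 MiB for bio and 130 MiB for biovec-1). Running the same
calculation at intervals is one way to tell a steady leak from a cache
that merely grew once.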

> From: Neil Brown <[email protected]>
> 
> insert a missing bio_put when writing the md superblock.
> 
> Without this we have a steady growth in the "bio" slab.
> 
> Signed-off-by: Neil Brown <[email protected]>
> Signed-off-by: Andrew Morton <[email protected]>
> ---
> 
>  drivers/md/md.c |    1 +
>  1 files changed, 1 insertion(+)
> 
> diff -puN drivers/md/md.c~md-bio-leak-fix drivers/md/md.c
> --- 25/drivers/md/md.c~md-bio-leak-fix	2005-06-27 22:29:13.000000000 -0700
> +++ 25-akpm/drivers/md/md.c	2005-06-27 22:29:13.000000000 -0700
> @@ -338,6 +338,7 @@ static int super_written(struct bio *bio
>  
>  	if (atomic_dec_and_test(&rdev->mddev->pending_writes))
>  		wake_up(&rdev->mddev->sb_wait);
> +	bio_put(bio);
>  	return 0;
>  }
>  

I'm sorry, but I'm missing something: I'm not able to locate that piece
of code in either the (FC2) linux-2.6.10 tree or the (unpatched)
linux-2.6.12.1 tree. There's no super_written() function at all, BTW.
The code looks completely different.

[...google digging...]

Is it related to this patch?

http://marc.theaimsgroup.com/?l=linux-raid&m=111155701929593&w=2

If so, I don't think I'm using it.
But the symptoms look the same.

Anyway, it seems solved in 2.6.12.1, so I'm not that worried. I'll wait
and see if after the upgrade to FC4 the problem is still there (with the
official FC4 kernel). If it is, I'll bring the issue to the Fedora lists.
As far as I can see, the problem is already solved in mainline. Thanks
again.

.TM.
-- 
      ____/  ____/   /
     /      /       /                   Marco Colombo
    ___/  ___  /   /                  Technical Manager
   /          /   /                      ESI s.r.l.
 _____/ _____/  _/                      [email protected]
