Re: Linux 2.4.33-rc2

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Grant,

On Thu, Jul 06, 2006 at 05:42:17PM +1000, Grant Coady wrote:
> On Wed, 5 Jul 2006 22:51:37 +0200, Willy Tarreau <[email protected]> wrote:
(...)
> >after a full day of stress-test of multiple parallel tar xUf, and ffsb at
> >full CPU load, I could not reproduce the problem on the exact same kernel
> >I first saw it on. So I think I had bad luck and the problem is not related
> >to the vfs_unlink() patch, so unless anyone else reports a problem or tells
> >us why it is right or wrong, it would seem reasonable to keep it as it is
> >in -rc2.
> 
> Hi Willy,
> 
> Got this with unpatched -rc2, tosh is NFS server, niner is client:
> 
> grant@niner:/home/nfstest$ ls -l
> total 228474
> drwxr-xr-x  19 grant wheel       680 2006-03-20 16:53 linux-2.6.16/
> -rw-r--r--   1 grant wheel 233953280 2006-07-05 18:27 linux-2.6.16.tar
> drwxr-xr-x  19 grant wheel       680 2006-03-20 16:53 linux-2.6.16b/
> grant@niner:/home/nfstest$ x=0; while [ ! $(diff -rq linux-2.6.16 linux-2.6.16b) ]; do ((x++)); echo "trial $x"; rm -rf linux-2.6.16b; mv linux-2.6.16 linux-2.6.16b; tar xf linux-2.6.16.tar; done
> trial 1
> ...
> trial 29
> rm: cannot remove directory `linux-2.6.16b/drivers/cdrom': Directory not empty
> -bash: [: too many arguments
> grant@niner:/home/nfstest$ ls -l
> total 228474
> drwxr-xr-x  19 grant wheel       680 2006-03-20 16:53 linux-2.6.16/
> -rw-r--r--   1 grant wheel 233953280 2006-07-05 18:27 linux-2.6.16.tar
> drwxr-xr-x   4 grant wheel       104 2006-07-06 11:01 linux-2.6.16b/
> grant@niner:/home/nfstest$ rm -rf linux-2.6.16b/
> 
> The 'rm -rf linux-2.6.16b' completed okay, a mystery?  

you might have had a '.nfs0000*' file inthe directory which prevented rmmod
from working, but it was finally removed by the rm -rf.

> This is with two slow (500MHz) boxen with -rc2.
> Only idea I get from logs is during the test:
> 
> Jul  5 19:01:19 niner kernel: nfs: server tosh not responding, still trying
> Jul  5 19:01:19 niner kernel: nfs: server tosh OK
> 
> ... about one pair each 2 to 5 mins
> 
> Jul  6 11:16:08 niner kernel: nfs: server tosh not responding, still trying
> Jul  6 11:16:08 niner kernel: nfs: server tosh OK
> Jul  6 11:26:57 niner -- MARK --
> Jul  6 11:46:57 niner -- MARK --

I get this if the server spends too much time writing data back to the disks.
Doing this on the server fixed the problem for me :

# echo 50 25000 0 0 100 100 60 45 0 >/proc/sys/vm/bdflush

> Other pair of boxen with patched -rc2 completed 146 trials overnight along 
> with compiling 2.4 kernel over NFS as well since morning, 64 completed. 
> No 'server not responding messages' logged.

Was it on the same server and while other clients saw the server disappear ?

> I'll change the two running boxen to straight -rc2 and see if catch 
> anything.  

OK, similarly, it might be interesting to apply your patch to niner to see
if the rmmod error happens again.

> Grant.

Thanks for your tests,
Willy

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux