NFS client hangs under certain circumstances on SMP machine

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi

I have already sent this message on the NFS mailing-list, but I had no reaction there. May be you kernel hackers have an idea?



I have a strange problem since a few months on some Linux clients.

I have a file server accessed through:
  - NFS from Linux clients (autofs, but direct mount causes same effect)
  - Samba from Windows clients

This works since several years like a charm, but as I said there is a
strange problem that appeared recently:

I have a directory, to which I generate code from Windows (\\server\dir)
I can see it under Linux (/mount/dir) where I can access (compile) the
files.

However, when I regenerate the file under Windows again (ie. I overwrite
the old files), and I try to compile the files again under Linux, "make"
hangs simply in D state:

# ps aux | grep make
user 7177 0.0  0.0 1984 760 pts/1 D+ 16:13 0:00 make -f myMakefile

The load average goes up one unit each time I reproduce this test (apparently, processes in non-interruptible state are considered as running).

From then, the following actions does NOT unblock the process:
  - stopping or restarting the NFS service on the server
  - restarting the server
  - restarting autofs on the client
  - trying to unmount the NFS mount

If I reboot the client, all goes back to normal, until I repeat the
process below (ie. overwriting and compiling).
Typically, "shutdown -r" does not work, I have to "reboot -f".

There is nothing interesting in /var/log on the server nor on the
client.

Versions used on the server:
  - SuSE 9.3
  - kernel-default-2.6.11.4-21.11
  - nfs-utils-1.0.7-3
  - samba-3.0.13-1.1
  - filesystem: reiserfs

On the client:
  - SuSE 9.3
  - kernel-smp-2.6.11.4-21.10
  - nfs-utils-1.0.7-3
  - mounts:
automount on /mount type autofs (rw,fd=4,pgrp=6529,minproto=2,maxproto=4)
    serv:/dir on /mount/dir type nfs (rw,addr=*IP*)
  - CPU: P4 with hyper threading (2 virtual CPUs)
Note: maxcpus=0 does not make any difference regarding this issue. I
could not test yet with kernel compiled without SMP at all.


On the following clients with the very same server, network, and mount tables I could not reproduce the problem:

  - SuSE 9.1
  - kernel-default-2.6.5-7.202.7
  - nfs-utils-1.0.6-103
  - CPU: P4 single core

  - SuSE 10.0
  - Kernel: 2.6.14.3-default (from kernel.org)
  - nfs-utils-1.0.7-13


Any idea?
Seems to me as it is related to the SMP. What do you think?
How can I debug further?


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux