Re: NFS on loopback locks up entire system(2.6.23-rc6)?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Thanks Trond, for clarifying this for me.

I have seen similar behavior when a remote NFS server is not
available. Many processes wait end up waiting in nfs_release_page. So,
what will happen if the remote server is not available,
nfs_release_page cannot free the memory since it waits on rpc request
to complete, which never completes and processes wait in there for
ever?

And unfortunately in my case, I cannot use "mount --bind". I want to
use the same file system from two different nodes, and I want file &
record locking to be consistent. The only way to make sure locking is
consistent is to use loopback NFS on 1 host and NFS mount the same
file system on other nodes, so that NFS server ensures file & record
locking to be consistent. Is there any alternative to this?

Is it possible or any efforts to integrate ext3 or other local file
systems locking & network file system locking, so that user can use
"mount --bind" on local host and NFS mount on remote nodes, but file &
record locking will be consistent between both the nodes?

Thanks
--Chakri

On 9/20/07, Trond Myklebust <[email protected]> wrote:
> On Thu, 2007-09-20 at 17:22 -0700, Chakri n wrote:
> > Hi,
> >
> > I am testing NFS on loopback locks up entire system with 2.6.23-rc6 kernel.
> >
> > I have mounted a local ext3 partition using loopback NFS (version 3)
> > and started my test program. The test program forks 20 threads
> > allocates 10MB for each thread, writes & reads a file on the loopback
> > NFS mount. After running for about 5 min, I cannot even login to the
> > machine. Commands like ps etc, hang in a live session.
> >
> > The machine is a DELL 1950 with 4Gig of RAM, so there is plenty of RAM
> > & CPU to play around and no other io/heavy processes are running on
> > the system.
> >
> > vmstat output shows no buffers are actually getting transferred in or
> > out and iowait is 100%.
> >
> > [root@h46 ~]# vmstat 1
> > procs -----------memory---------- ---swap-- -----io---- --system--
> > -----cpu------
> >  r  b    swpd       free       buff       cache   si   so   bi   bo
> > in     cs us sy id wa st
> >  0 24    116 110080  11132 3045664    0    0     0     0   28  345  0
> > 1  0 99  0
> >  0 24    116 110080  11132 3045664    0    0     0     0    5  329  0
> > 0  0 100  0
> >  0 24    116 110080  11132 3045664    0    0     0     0   26  336  0
> > 0  0 100  0
> >  0 24    116 110080  11132 3045664    0    0     0     0    8  335  0
> > 0  0 100  0
> >  0 24    116 110080  11132 3045664    0    0     0     0   26  352  0
> > 0  0 100  0
> >  0 24    116 110080  11132 3045664    0    0     0     0    8  351  0
> > 0  0 100  0
> >  0 24    116 110080  11132 3045664    0    0     0     0   23  358  0
> > 1  0 99  0
> >  0 24    116 110080  11132 3045664    0    0     0     0   10  350  0
> > 0  0 100  0
> >  0 24    116 110080  11132 3045664    0    0     0     0   26  363  0
> > 0  0 100  0
> >  0 24    116 110080  11132 3045664    0    0     0     0    8  346  0
> > 1  0 99  0
> >  0 24    116 110080  11132 3045664    0    0     0     0   26  360  0
> > 0  0 100  0
> >  0 24    116 110080  11140 3045656    0    0     8     0   11  345  0
> > 0  0 100  0
> >  0 24    116 110080  11140 3045664    0    0     0     0   27  355  0
> > 0  2 97  0
> >  0 24    116 110080  11140 3045664    0    0     0     0    9  330  0
> > 0  0 100  0
> >  0 24    116 110080  11140 3045664    0    0     0     0   26  358  0
> > 0  0 100  0
> >
> >
> > The following is the backtrace of
> > 1. one of the threads of my test program
> > 2. nfsd daemon and
> > 3. a generic command like pstree, after the machine hangs:
> > -------------------------------------------------------------
> > crash> bt 3252
> > PID: 3252   TASK: f6f3c610  CPU: 0   COMMAND: "test"
> >  #0 [f6bdcc10] schedule at c0624a34
> >  #1 [f6bdcc84] schedule_timeout at c06250ee
> >  #2 [f6bdccc8] io_schedule_timeout at c0624c15
> >  #3 [f6bdccdc] congestion_wait at c045eb7d
> >  #4 [f6bdcd00] balance_dirty_pages_ratelimited_nr at c045ab91
> >  #5 [f6bdcd54] generic_file_buffered_write at c0457148
> >  #6 [f6bdcde8] __generic_file_aio_write_nolock at c04576e5
> >  #7 [f6bdce40] try_to_wake_up at c042342b
> >  #8 [f6bdce5c] generic_file_aio_write at c0457799
> >  #9 [f6bdce8c] nfs_file_write at f8c25cee
> > #10 [f6bdced0] do_sync_write at c0472e27
> > #11 [f6bdcf7c] vfs_write at c0473689
> > #12 [f6bdcf98] sys_write at c0473c95
> > #13 [f6bdcfb4] sysenter_entry at c0404ddf
> >     EAX: 00000004  EBX: 00000013  ECX: a4966008  EDX: 00980000
> >     DS:  007b      ESI: 00980000  ES:  007b      EDI: a4966008
> >     SS:  007b      ESP: a5ae6ec0  EBP: a5ae6ef0
> >     CS:  0073      EIP: b7eed410  ERR: 00000004  EFLAGS: 00000246
> > crash> bt 3188
> > PID: 3188   TASK: f74c4000  CPU: 1   COMMAND: "nfsd"
> >  #0 [f6836c7c] schedule at c0624a34
> >  #1 [f6836cf0] __mutex_lock_slowpath at c062543d
> >  #2 [f6836d0c] mutex_lock at c0625326
> >  #3 [f6836d18] generic_file_aio_write at c0457784
> >  #4 [f6836d48] ext3_file_write at f8888fd7
> >  #5 [f6836d64] do_sync_readv_writev at c0472d1f
> >  #6 [f6836e08] do_readv_writev at c0473486
> >  #7 [f6836e6c] vfs_writev at c047358e
> >  #8 [f6836e7c] nfsd_vfs_write at f8e7f8d7
> >  #9 [f6836ee0] nfsd_write at f8e80139
> > #10 [f6836f10] nfsd3_proc_write at f8e86afd
> > #11 [f6836f44] nfsd_dispatch at f8e7c20c
> > #12 [f6836f6c] svc_process at f89c18e0
> > #13 [f6836fbc] nfsd at f8e7c794
> > #14 [f6836fe4] kernel_thread_helper at c0405a35
> > crash> ps|grep ps
> >     234      2   3  cb194000  IN   0.0       0      0  [khpsbpkt]
> >     520      2   0  f7e18c20  IN   0.0       0      0  [kpsmoused]
> >    2859      1   2  f7f3cc20  IN   0.1    9600   2040  cupsd
> >    3340   3310   0  f4a0f840  UN   0.0    4360    816  pstree
> >    3343   3284   2  f4a0f230  UN   0.0    4212    944  ps
> > crash> bt 3340
> > PID: 3340   TASK: f4a0f840  CPU: 0   COMMAND: "pstree"
> >  #0 [e856be30] schedule at c0624a34
> >  #1 [e856bea4] rwsem_down_failed_common at c04df6c0
> >  #2 [e856bec4] rwsem_down_read_failed at c0625c2a
> >  #3 [e856bedc] call_rwsem_down_read_failed at c0625c96
> >  #4 [e856bee8] down_read at c043c21a
> >  #5 [e856bef0] access_process_vm at c0462039
> >  #6 [e856bf38] proc_pid_cmdline at c04a1bbb
> >  #7 [e856bf58] proc_info_read at c04a2f41
> >  #8 [e856bf7c] vfs_read at c04737db
> >  #9 [e856bf98] sys_read at c0473c2e
> > #10 [e856bfb4] sysenter_entry at c0404ddf
> >     EAX: 00000003  EBX: 00000005  ECX: 0804dc58  EDX: 00000062
> >     DS:  007b      ESI: 00000cba  ES:  007b      EDI: 0804e0e0
> >     SS:  007b      ESP: bfa3afe8  EBP: bfa3d4f8
> >     CS:  0073      EIP: b7f64410  ERR: 00000003  EFLAGS: 00000246
> > ----------------------------------------------------------
> >
> > Any ideas what could potentially trigger this?
>
> This is pretty much expected: the NFS server is trying to allocate
> memory. The VM then tries to satisfy that demand by freeing up resources
> from the NFS client by telling the client to write out cached pages. The
> client again is in a congested state in which it is waiting on the NFS
> server to finish writing out what it already sent.
>         Quod Erat Deadlocked...
>
> This is why 'mount --bind' is a good idea, and 'mount -t nfs localhost:'
> is generally a bad one...
>
> Cheers
>   Trond
>
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux