rpc.mountd crashes when extensively using netgroups

Hi all,

we are seeing rpc.mountd crashes on our Red Hat EL4 systems.
We have tracked down the bug and it seems to be still present
in the current nfs-utils source.

We are making extensive use of netgroups for NFS exports. On
a large file server with hundreds of home directories we export
every directory to a unique netgroup. Member netgroups are used
to export to sets of machines. The following example illustrates
what we do:

# cat /etc/exports
/export/home/jane @jane(async,rw,no_subtree_check,fsid=10000)
/export/home/joe @joe(async,rw,no_subtree_check,fsid=10001)
# cat /etc/netgroup
lab_1 (workstation1,,) (workstation2,,) (workstation3,,)
offices_1 (workstation4,,) (workstation5,,)
jane lab_1 offices_1
joe offices_1 (joeslaptop,,)

We do this on a much larger scale, though. The bug we ran into is
at line 96 of utils/mountd/auth.c. The strcpy there can corrupt
memory when it copies the string returned by client_compose() into
my_client.m_hostname, which has a fixed size of 1024 bytes.
For our example above, client_compose() returns "@joe,@jane"
for any machine in the offices_1 netgroup. Unfortunately, we have
a machine that belongs to roughly 150 netgroups like @joe or
@jane. For that machine client_compose() returns a string over
1300 bytes long, and rpc.mountd promptly segfaults.
 
Preventing the crash is of course trivial: inserting a simple
'if (strlen(n) > 1024) return NULL;' before line 96 does the job.
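
A minimal sketch of such a guard, written as a standalone helper
rather than as a patch to auth.c. The names copy_client_name and
CLIENT_NAME_MAX are hypothetical and do not exist in nfs-utils. The
sketch assumes the destination buffer is exactly 1024 bytes, so it
also rejects a 1024-character name, because the terminating NUL
would not fit:

```c
#include <stddef.h>
#include <string.h>

/* Assumed buffer size, matching the 1024 bytes mentioned above. */
#define CLIENT_NAME_MAX 1024

/*
 * Hypothetical helper, not part of nfs-utils: copy a composed
 * netgroup list into a fixed-size buffer, refusing names that do
 * not fit. Returns 0 on success, -1 if the name plus its
 * terminating NUL would overflow dst.
 */
static int copy_client_name(char dst[CLIENT_NAME_MAX], const char *name)
{
    size_t len = strlen(name);

    if (len >= CLIENT_NAME_MAX)
        return -1;              /* would write past the end of dst */
    memcpy(dst, name, len + 1); /* len + 1 includes the NUL */
    return 0;
}
```

With a check like this the caller has to handle the failure case,
which here means refusing the mount for that client.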

There are however two issues for which we could not find an easy
solution:

 1. For every client rpc.mountd and the kernel seem to exchange
    and use lists with _all_ netgroups used in exports that are
    relevant for granting permission to some share for a particular
    client. We could imagine two optimizations here:

       * Resolve netgroups and only put the (member) netgroups that
         contained the host name that would be used to authorize
         a mount in the list.

       * Use the list of mounted paths per client and only put the
         netgroup(s) used to export paths that are actually mounted
         on a client. 

    This also caused us severe performance problems because
    rpc.mountd queries all these netgroups. We were initially using
    LDAP, and mounting a directory took up to ten seconds,
    during which rpc.mountd was busily querying the LDAP server.
    We got this down to two seconds using file-based netgroups.
 
 2. Using a fixed size for NFSCLNT_IDMAX does not scale. Mounting
    shares on a client for which the 'if' clause of the quick fix
    becomes true will not be possible. We thought about enlarging
    NFSCLNT_IDMAX and using a custom kernel but dropped the idea. 
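
The second optimization under issue 1 could be sketched roughly as
follows. This is only an illustration of the idea, not nfs-utils
code: struct export_entry, demo_exports and compose_mounted are
hypothetical names, and the sketch assumes the caller already has
the list of paths a client has mounted.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical export record: one path exported to one netgroup. */
struct export_entry {
    const char *path;
    const char *netgroup;
};

/* Example data mirroring the /etc/exports excerpt above. */
static const struct export_entry demo_exports[] = {
    { "/export/home/jane", "jane" },
    { "/export/home/joe",  "joe"  },
};

/*
 * Build a comma-separated "@group" list that contains only the
 * netgroups whose export path appears in mounted[].
 * Returns the length of the list, or -1 if out is too small
 * (the contents of out are then unspecified).
 */
static int compose_mounted(const struct export_entry *exports,
                           size_t nexports,
                           const char *const *mounted, size_t nmounted,
                           char *out, size_t outlen)
{
    size_t used = 0;

    if (outlen == 0)
        return -1;
    out[0] = '\0';
    for (size_t i = 0; i < nexports; i++) {
        int is_mounted = 0;

        for (size_t j = 0; j < nmounted; j++) {
            if (strcmp(exports[i].path, mounted[j]) == 0) {
                is_mounted = 1;
                break;
            }
        }
        if (!is_mounted)
            continue;

        size_t len = strlen(exports[i].netgroup);
        /* optional ',' separator + '@' + group name */
        size_t need = (used ? 1 : 0) + 1 + len;

        if (used + need + 1 > outlen)   /* +1 for the NUL */
            return -1;
        if (used)
            out[used++] = ',';
        out[used++] = '@';
        memcpy(out + used, exports[i].netgroup, len + 1);
        used += len;
    }
    return (int)used;
}
```

For a client that has mounted only /export/home/joe, this yields
"@joe" instead of the full list, so the string grows with the number
of mounts on that client rather than with the number of exports.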

Our ultimate goal is to get Red Hat to fix the code in nfs-utils 1.0.6
that is used in RHEL4. A first step would be to get a suitable fix into
the current nfs-utils.

Is there somebody on the mailing list who could see an easy fix or
would have an opinion on how to best address the issues we see?

Thanks in advance and best regards,

Stefan

