Fedora Users — Re: What linux lacks most

Cameron Simpson wrote:

On 26Mar2008 15:45, Bob Kinney <bc98kinney@xxxxxxxxx> wrote:
| --- Ian Chapman <packages@xxxxxxxxxxxxxxxxxx> wrote:
| > Neal Becker wrote:
| > > I used unix/linux for many years.  In the past we've used nfs.  But nfsv3
| > > has no (useful) authentication.  Anyone can setup a rogue machine and
| > > pretend to be any uid/gid.

| >
| > What I'd like to see is a way to forcibly unmount broken hard NFS
| > mounts. umount -f seems to do squat.
|
| I thought that hard NFS mounts were a thing of the past--like the mid '90s.

Not if you want reliable batch behaviour in the face of NFS server
downtime. My previous workplace routinely ran jobs that took weeks.
With a hard mount the job just stalls until the server's back, then
continues. Which means you can do maintenance that requires downtime.

"hard,intr" is the common flag pair, allowing you to at least interrupt
a stalled IO to a down server, getting your job back.

Those were the options I used too, and the above were the reasons.
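
As a rough illustration of checking which of these options a client is
actually using, here is a minimal Python sketch. It assumes mount-table text
in the Linux /proc/mounts layout and assumes that the "hard" or "soft" flag
shows up in the options field for NFS entries; the function names are made up
for this example.

```python
def nfs_mount_options(mounts_text):
    """Return {mount_point: set_of_options} for NFS entries in mount-table text."""
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        result[fields[1]] = set(fields[3].split(","))
    return result


def soft_mounts(mounts_text):
    """List NFS mount points whose options include "soft" (hypothetical helper)."""
    options = nfs_mount_options(mounts_text)
    return sorted(mp for mp, opts in options.items() if "soft" in opts)
```

Feeding it the contents of /proc/mounts on a client should list any NFS
mounts that were set up with "soft", which is the case warned against further
down in this thread.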

| Isn't it preferred to set them up with an automounter to prevent panic
| when communication falters?
| I've looked into it a little bit, and it seems like it can be done, but for
| the frequency that I use NFS, I took the quick-and-dirty route.

Autofs isn't enough. If you run it with a smallish idle timeout (to
umount when a remote fs is unused long enough) it reduces your exposure
to down servers, particularly handy when you want to reboot a client,
but also handy for those processes that walk the mount table to find
stuff out - avoiding a stall on a down mount point.

I actually set the timeouts quite high; otherwise several jobs starting at
the same time caused mount storms (you need a lot of machines trying to
mount at the same time to get this), which resulted in some mounts timing
out and jobs failing to start.

However, it only reduces the problem. There's no magic in autofs, and a
stalled mount point is still a stalled mount point. And of course autofs
introduces its own collection of issues (mostly rare and minor).

If it is mounted (through any method) and the server goes down it is trouble.

I don't have very much trouble at all with NFS. Most of the people I have
helped who had issues had read about some option that was suggested several
years ago and started using it on a current NFS server, with less than good
results.

I know people who decided that the "soft" option was a good choice, used it,
and bitched about NFS being horrible until we told them never, ever to use
the soft option: it results in the application aborting where a "hard" mount
would just get an annoying timeout warning in the messages file and
eventually carry on once things were fixed.

The umount -l option behaves badly in certain cases, so it is best avoided
in any environment where reliability is important. It is better to kill all
of the jobs accessing the filesystem and do a proper unmount (I believe a
hung NFS mount will unmount with hard,intr once all of the jobs accessing it
have been killed and given time to receive the signal and die), or take the
easy way out and reboot. Having the mount removed from the table (-l) while
processes are still alive and accessing that unmounted NFS filesystem *WILL*
result in funny things happening when the NFS server comes back up: those
applications in the background *WILL* continue even though the filesystem is
not showing in the mount table on the given client machine.
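
To make the "kill all of the jobs accessing the filesystem" step concrete,
here is a hedged Python sketch that lists the processes whose working
directory, root, or open file descriptors point under a given mount point. It
assumes the Linux /proc layout and only reads symlink targets there. It does
not look at memory-mapped files, so treat the result as a starting point
rather than a complete list. The function name and the proc_root parameter
are invented for this example.

```python
import os


def pids_using(mount_point, proc_root="/proc"):
    """Map pid -> sorted paths under mount_point held as cwd, root, or open fd."""
    top = mount_point.rstrip("/")
    prefix = top + "/"
    users = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        base = os.path.join(proc_root, entry)
        links = [os.path.join(base, "cwd"), os.path.join(base, "root")]
        fd_dir = os.path.join(base, "fd")
        try:
            links += [os.path.join(fd_dir, fd) for fd in os.listdir(fd_dir)]
        except OSError:
            pass  # process exited, or we may not list its descriptors
        held = set()
        for link in links:
            try:
                target = os.readlink(link)
            except OSError:
                continue  # link vanished or is not readable by this user
            if target == top or target.startswith(prefix):
                held.add(target)
        if held:
            users[int(entry)] = sorted(held)
    return users
```

Run as root, something like pids_using("/mnt/nfs") (the path is only a
placeholder) gives the PIDs to signal before attempting a normal umount.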

                                 Roger
