Hi Terry,
I tend to disagree with the others who have replied so far. I've found
NFS to be 100% reliable for many years, with large clusters of clients
using many flavors of Unix. Whenever things have failed I've always
been able to find the root cause.
I'd suggest that you look at your messages file for indications of the
problem. Also, one tool you can use is nfsstat (man nfsstat); it should
indicate NFS-related bad calls.
On any recent Linux, it would be very rare for there to be "no
indication", so your log files are your friend.
If you really cannot find any message or indication, it stands to reason
that the files in question may have been opened/updated by another user
or process during the gzip process. Is that possible?
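One rough way to test that theory is to record each file's size and
modification time before and after copying it, and flag any file that
changed in between. The sketch below assumes Python 3 is available on
the machine doing the copy; the function names are made up for
illustration, and it will not catch a writer that resets the mtime.

```python
import os
import shutil


def snapshot(path):
    """Return (size, mtime in ns) for path, without following symlinks."""
    st = os.lstat(path)
    return (st.st_size, st.st_mtime_ns)


def copy_if_stable(src, dst):
    """Copy src to dst. Return True only if src looked unchanged
    (same size and mtime) before and after the copy."""
    before = snapshot(src)
    shutil.copy2(src, dst)  # copies data plus timestamps/permissions
    after = snapshot(src)
    return before == after
```

A False result would only mean the source was touched while it was being
copied, so that file is worth re-copying and re-checking.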
I would agree that rsync is a good choice for this task (you could run
"rsync --dry-run --stats" to show any differences that exist).
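If you want an independent check alongside rsync, something along these
lines could compare the two trees by checksum without following
symlinks. It is only a sketch, assuming Python 3 and that both trees are
mounted on the same machine; compare_trees and file_digest are names I
have made up, and SHA-256 is just one reasonable choice of hash.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def compare_trees(src_root, dst_root):
    """Return sorted relative paths under src_root whose copy under
    dst_root is missing or different. Symlinked directories are not
    descended into; symlinks to files are compared by target only."""
    mismatches = []
    for dirpath, _dirnames, filenames in os.walk(src_root, followlinks=False):
        for name in filenames:
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, src_root)
            dst = os.path.join(dst_root, rel)
            if os.path.islink(src):
                # Compare where the links point, not what they point at.
                if not os.path.islink(dst) or os.readlink(src) != os.readlink(dst):
                    mismatches.append(rel)
                continue
            if not os.path.isfile(src):
                continue  # skip FIFOs, sockets and device nodes
            if (os.path.islink(dst) or not os.path.isfile(dst)
                    or file_digest(src) != file_digest(dst)):
                mismatches.append(rel)
    return sorted(mismatches)
```

Calling compare_trees("/path/to/source", "/path/to/copy") (both paths
are placeholders) returns an empty list when every regular file matches.
Note that it only walks the source side, so extra files in the copy are
not reported.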
Albert.
T. Horsnell wrote:
I'm in the process of moving stuff from our Alpha fileserver
onto a Linux replacement. I've been using gnu-tar to copy filesystems
from the Alpha to the Linux NFS-exported disks over a 1Gbit LAN,
followed by diff -r to check that they have copied correctly (I wish
diff had an option to not follow symlinks..). I've so far transferred
about 3 TiB of data (spread over several weeks) and am concerned
that during this process, 3 files were mis-copied without any
apparent hardware-errors being flagged. There was nothing unusual
about these files, and re-copying them (with cp) fixed the problem.
Are occasional undetected errors like this to be expected?
I thought there were sufficient stages of checksumming/parity
(both boxes have ECC memory) etc to render the probability
of this to be vanishingly small.
On all 3 files, multiple retries of the diff still resulted
in a compare error, which was then fixed by a re-copy. This
suggests that the problem occurs during the 'gtar' phase, rather
than the 'diff -r' phase.
Does anyone know of a network-exercise utility I can use
to check the LAN component of the data-path?
Cheers,
Terry.