Mikkel L. Ellertson wrote:
> You may want to look at the fdups and fslint packages.
That would be fdupes. Both are available in Fedora 8:
$ yum list fdupes fslint
Loading "downloadonly" plugin
Loading "skip-broken" plugin
Installed Packages
fdupes.i386 1.40-10.fc8 installed
fslint.noarch 2.24-1.fc8 installed
With fdupes: move the folder that you think contains duplicates
underneath the original folder, then run fdupes on the original:
$ fdupes --recurse --delete /home/myhome/mypath_to_dedupe/
This sorts files by size and, where sizes match, compares contents. For
each set of dupes it lists the path to every copy (note that the file
names do not need to match, only the content), each with an associated
[number]. You then type one or more numbers to indicate which copies to
keep.
If you don't care which copy to keep, you can use a trick like:
$ yes 1|fdupes --recurse --delete /home/myhome/mypath_to_dedupe/
It runs as before, and every time fdupes stops to wait for input, yes
passes in a 1; thus the first item in each set is kept. Make sure you
have tried the command without the yes beforehand, so you get an idea of
what will be deleted automatically.
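For anyone unsure what yes actually sends down the pipe: it repeats its
argument on a line of its own, over and over, until the reader goes
away. A small sketch, with head standing in for fdupes as the reader
(the function name show_answers is invented for illustration):

```shell
# show_answers ANSWER COUNT (hypothetical helper):
# print the first COUNT lines that `yes ANSWER` would feed
# to whatever program is reading the other end of the pipe.
show_answers() {
    yes "$1" | head -n "$2"
}

show_answers 1 3
```

This prints 1 on three separate lines, which is what fdupes would read
at its first three prompts.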
I have found fslint's GUI useful for finding and erasing empty folders
across the disk. This saves the time of going into folders just to check
whether they have any contents.
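If you would rather not start the GUI, find can produce a similar list
from the command line. This is only a sketch of the same idea, not what
fslint runs internally. list_empty_dirs is a made-up name, and -empty is
an extension offered by GNU and BSD find rather than plain POSIX.

```shell
# list_empty_dirs DIR (hypothetical helper, not part of fslint):
# print every directory under DIR (DIR included) that has no entries.
list_empty_dirs() {
    find "$1" -type d -empty
}
```

With GNU find, appending -delete to that find command would remove the
directories as well, but look over the list first.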
With a recent version of rsync (--remove-source-files needs 2.6.9 or
later), you can also:
$ rsync --dry-run --remove-source-files -a \
    different_machine:/home/stuff_i_think_is_dupes/ /home/the_primary_copy/
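which is useful if the two folders are largely identical and you want to
end up with a single merged copy. With --dry-run, rsync only reports
what it would do; drop that option once the output looks right. You
would probably then run fdupes afterwards to tidy up dupes that sit in
differently named folders.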
DaveT.