Hi there,

Try 'fslint'; it does hash-sum comparisons on files in different dirs.
Although the front-end 'fslint-gui' isn't exactly built for automation, it
at least lets you do big sweeps of dupe deletion.

--Mike

On Tue, Feb 23, 2010 at 5:31 PM, Marko Vojinovic <vvmarko@xxxxxxxxx> wrote:
>
> Hi folks! :-)
>
> I have the following task: there are two directories on the disk, say a/
> and b/, with various subdirectories and files inside. I need to find and
> erase all *duplicate* files, and after that all empty directories. The
> files may reside in different directories and may have different names,
> but if two files have identical *contents*, the copy in the b/ branch
> should be deleted.
>
> Now, the directories I have are rather large, and I wouldn't want to hunt
> for duplicates manually. Is there a tool that can at least identify and
> list duplicate files in a directory structure?
>
> I can imagine an algorithm like this:
>
> 1) list all files in all subdirectories of a/, along with their sizes
> 2) do the same for the files in b/
> 3) sort and compare the lists, looking for pairs of files with identical
>    size
> 4) test each pair to see whether the file contents are the same, and if
>    so, list them in the output
>
> I could probably write a bash script to do this, but I guess this problem
> is common and there are already tools available that would do it for me.
> Any suggestions?
>
> Thanks, :-)
> Marko
>
> --
> users mailing list
> users@xxxxxxxxxxxxxxxxxxxxxxx
> To unsubscribe or change subscription options:
> https://admin.fedoraproject.org/mailman/listinfo/users
> Guidelines: http://fedoraproject.org/wiki/Mailing_list_guidelines
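For what it's worth, the four-step algorithm quoted above can be sketched as a short shell script. This is a minimal sketch, not fslint's actual method: the file names and directory layout below are made up for the demo, and hashing every file with md5sum stands in for the size-then-compare pre-filter of steps 1-3 (a matching checksum is taken as "identical contents"). It assumes GNU coreutils/findutils.

```shell
#!/bin/sh
set -e

# work in a scratch directory so the demo doesn't touch real files
work=$(mktemp -d)
cd "$work"

# demo setup (hypothetical files): a/ holds the originals, b/ the copies
mkdir -p a/docs b/old/docs b/old/empty
printf 'hello\n'  > a/docs/report.txt
printf 'hello\n'  > b/old/docs/copy.txt   # same contents, different name
printf 'unique\n' > b/old/docs/keep.txt   # only under b/, must survive

# steps 1+2: checksum every file under a/ (one hash per line, sorted)
sums=$(mktemp)
find a -type f -exec md5sum {} + | awk '{print $1}' | sort -u > "$sums"

# steps 3+4: delete each file under b/ whose checksum appears in the list
# (note: a read loop breaks on file names containing newlines)
find b -type f | while read -r f; do
    sum=$(md5sum "$f" | awk '{print $1}')
    if grep -qx "$sum" "$sums"; then
        rm -- "$f"
    fi
done

# finally, remove any directories under b/ that are now empty
find b -depth -type d -empty -delete
rm -f "$sums"
```

After the run, b/old/docs/copy.txt and the empty directory b/old/empty are gone, while a/ and the b/-only file keep.txt are untouched. Deleting by content hash alone is the design trade-off here: it avoids the pairwise byte-for-byte comparison of step 4 at the (vanishingly small) risk of an md5 collision.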