Fedora Users — Re: How to compare two text files

On Thu, May 6, 2010 at 8:57 AM, W.H. Kalpa Pathum <callkalpa@xxxxxxxxx> wrote:
> hi,
>
> I've got two text files containing email addresses, one per line. The
> two files have different numbers of lines. The email addresses in one
> file are all present in the other file (which also has some more).
> What I want to do is extract the list of addresses that are not in
> the other file.
>
> elaboration
>
> File A has 100 email addresses and file B has 15 email addresses. The
> 15 email addresses in file B are already in file A. I want to
> extract the remaining 85 addresses (excluding the 15 from file B)
> from file A.
>
> Any idea on how to accomplish this?

There are several ways, depending on how clean your data is. If neither
file contains duplicates, sort both files and then diff them; the lines
that show up only in file A are the ones you want.
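
As a rough illustration of the sort-then-compare idea, here is a small
Python sketch. It does not call diff; it sorts both lists and walks them
side by side, keeping the lines found only in the first one. The function
name only_in_first and the sample addresses are made up for the example,
and it assumes neither list contains duplicates.

```python
def only_in_first(a_lines, b_lines):
    """Return the lines of a_lines that do not occur in b_lines.

    Hypothetical helper; assumes neither list contains duplicates.
    """
    a = sorted(a_lines)
    b = sorted(b_lines)
    result = []
    i = j = 0
    while i < len(a):
        if j >= len(b) or a[i] < b[j]:
            # a[i] sorts before everything left in b, so b cannot contain it.
            result.append(a[i])
            i += 1
        elif a[i] == b[j]:
            # Present in both lists: skip it.
            i += 1
            j += 1
        else:
            # b[j] is smaller than a[i]; advance in b.
            j += 1
    return result


# Made-up sample data standing in for file A and file B.
file_a = ["carol@example.com", "alice@example.com", "bob@example.com"]
file_b = ["bob@example.com"]
print(only_in_first(file_a, file_b))
```

The result comes back in sorted order, not in the original order of
file A.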

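If the data needs cleaning up first, pass it through tr, sort, uniq, or
other filters to normalize it (consistent case, no stray whitespace, no
duplicates). Note that uniq only removes adjacent duplicates, so sort
first.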
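
A minimal Python sketch of that kind of cleanup might look like this. The
function name normalize is invented for the example, and lowercasing is an
assumption about what should count as the same address; drop it if case
matters in your data.

```python
def normalize(lines):
    """Strip whitespace, lowercase, and drop blank lines and repeats.

    Hypothetical helper; keeps the first occurrence of each address
    and preserves the original order.
    """
    seen = set()
    cleaned = []
    for line in lines:
        addr = line.strip().lower()
        if addr and addr not in seen:
            seen.add(addr)
            cleaned.append(addr)
    return cleaned


# Made-up input with mixed case, stray whitespace, and a blank line.
raw = ["  Alice@Example.com\n", "alice@example.com\r\n", "\n", "bob@example.com\n"]
print(normalize(raw))
```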

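If your input sets are very small, you could use grep with the -v option
and give it one file as the list of patterns (the -f option), so that it
prints only the lines of the other file that do not match. Adding -F and
-x makes it compare whole lines as fixed strings, which matters because
the dots in email addresses are regex metacharacters.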

If you anticipate much larger input sets, clean up the data, load the
first list into a hash table, and delete every entry that matches your
second list. Then print what is left in the table.
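
The hash table approach could be sketched in Python roughly as follows,
using a dict as the table. The function name subtract and the sample
addresses are made up for illustration; this is one possible way to do it,
not the only one.

```python
def subtract(a_lines, b_lines):
    """Return entries of a_lines that are not in b_lines, in original order.

    Hypothetical helper; surrounding whitespace is stripped and blank
    lines are ignored.
    """
    # Load the first list into a hash table (dict keys keep insertion order).
    table = dict.fromkeys(
        line.strip() for line in a_lines if line.strip()
    )
    # Delete every entry that also appears in the second list.
    for line in b_lines:
        table.pop(line.strip(), None)
    # Whatever is left was only in the first list.
    return list(table)


# Made-up sample data standing in for file A and file B.
file_a = ["alice@example.com\n", "bob@example.com\n", "carol@example.com\n"]
file_b = ["bob@example.com\n"]
for addr in subtract(file_a, file_b):
    print(addr)
```

Because iterating over an open file yields its lines, the same function
should also accept two open file objects in place of the lists.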