On Thu, May 6, 2010 at 8:57 AM, W.H. Kalpa Pathum <callkalpa@xxxxxxxxx> wrote:
> hi,
>
> I've got two text files containing email addresses one at a row. The
> number of rows in one file is different from the number of the other
> file. email addresses in one file is already there in the other file
> (there are some more also). What I want to do is extract the list
> which is not in the other file.
>
> elaboration
>
> File A has 100 email addresses and file B has 15 email addresses. 15
> email addresses in file B are already there in file A. I want to
> extract the 85 mails (excluding the 15 from file B) from file A.
>
> Any idea on how to accomplish this?

Several ways, depending on how clean your data is.

If there are no duplicates, sort both files and then diff them. If the
data needs cleaning up first, you may need to pass it through tr, uniq,
or other filters to normalize it.

If your input sets are very small, you could use grep or egrep with the
-v option and pass the entire list from one file as the patterns to
exclude from the other.

If you anticipate having much larger input sets, you could clean up the
data, load the first list into a hash table, delete all the entries that
match your second list, and then print what is left in the table.

-- 
users mailing list
users@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe or change subscription options:
https://admin.fedoraproject.org/mailman/listinfo/users
Guidelines: http://fedoraproject.org/wiki/Mailing_list_guidelines
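
For what it's worth, here is a rough sketch of the hash table approach in
Python. The function name addresses_only_in_a is made up for this example,
and the normalization step (trim whitespace, lowercase) is only an
assumption about what "cleaning up" means for your data; adjust it to
taste.

```python
def addresses_only_in_a(lines_a, lines_b):
    """Return the addresses from lines_a that do not appear in lines_b."""

    def normalize(line):
        # Assumption: addresses are equal if they match after trimming
        # whitespace and ignoring case.
        return line.strip().lower()

    # Load list A into a hash table. A dict keeps first-seen order and
    # silently drops duplicate addresses.
    table = {}
    for line in lines_a:
        addr = normalize(line)
        if addr:  # skip blank lines
            table[addr] = True

    # Delete every entry that also appears in list B.
    for line in lines_b:
        table.pop(normalize(line), None)

    # Whatever is left was only in list A.
    return list(table)


# Small in-memory demo; real use would pass two open file objects instead.
a = ["ann@example.com", "Bob@Example.com ", "cy@example.com", "ann@example.com"]
b = ["bob@example.com"]
print(addresses_only_in_a(a, b))  # ['ann@example.com', 'cy@example.com']
```

Since the function only iterates over lines, you can hand it the two files
directly, e.g. open("fileA.txt") and open("fileB.txt"), and print each
returned address on its own line.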