On 22Mar2009 11:09, Steven W. Orr <steveo@xxxxxxxxxxx> wrote:
| On Saturday, Mar 21st 2009 at 22:41 -0000, quoth Cameron Simpson:
| =>On 21Mar2009 16:47, Daniel B. Thurman <dant@xxxxxxxxx> wrote:
| =>> When I tried `sort -un', the data was truncated, i.e.
| =>> there is data loss. So, when I went back to my original
| =>> code using 'sort -n | uniq', there is no data loss. There
| =>> seems to be a problem using the `sort -un' method.
| =>
| =>Well, they do mean slightly different things.
| =>
| =>"sort -un" sorts and returns the first row of each set of rows that
| =>sorted equal. (i.e. "1 foo" and "1 bah" sort equal (numeric) and only "1
| =>foo" is returned. (See "man sort" for the details, and "man 1p sort" for
| =>what you may portably expect on multiple UNIX platforms.)
[...]
| =>It is often correct to replace "sort -n | uniq" with "sort -un", but I was
| =>clearly wrong to do so here.
|
| Fascinating. I never noticed that behaviour.

Neither had I until Daniel reported his problem. Then I read the manual
page more closely.

[...]
| I guess I'm convinced that the only safe way is to *never* use -u.

For plenty of data the -u option is just fine. It's only where you have
extra data on the line not related to the sort criteria that it matters.

Clearly, almost all the data I've sorted has been unique (columns of
numbers, passwd and group files, etc), or I've not cared about which
line I end up with (sorting lists of words).

So from my point of view, Daniel's use case is uncommon. I intend using
-u as frequently as before. Like anything, I must keep its real
semantics in the back of my mind for when I do have data not unique on
the sort keys, which must be kept.

Cheers,
-- 
Cameron Simpson <cs@xxxxxxxxxx> DoD#743
http://www.cskk.ezoshosting.com/cs/

If I haven't seen further, it is by standing in the footprints of giants.
        - ~kzm <ketil@xxxxxxxxx>
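
For anyone who wants to see the difference discussed above for themselves, here is a minimal sketch. The three sample lines and the variable names (data, by_key, by_line) are made up for illustration; only the sort and uniq invocations come from the thread. Two of the lines share the numeric key 1 but carry different trailing text, which is exactly the case where the two pipelines diverge.

```shell
# Hypothetical sample data: "1 foo" and "1 bah" compare equal under -n.
data='2 baz
1 foo
1 bah'

# sort -un: uniqueness is decided on the sort key only, so the two
# "1 ..." lines collapse into one and a line of data is dropped.
by_key=$(printf '%s\n' "$data" | sort -un)

# sort -n | uniq: uniq compares whole lines, so both "1 ..." lines survive.
by_line=$(printf '%s\n' "$data" | sort -n | uniq)

echo "sort -un gives:"
printf '%s\n' "$by_key"      # 2 lines
echo "sort -n | uniq gives:"
printf '%s\n' "$by_line"     # 3 lines
```

Which of the two "1 ..." lines survives "sort -un" is best treated as an implementation detail and not something to rely on. If every distinct line has to be kept, stay with "sort -n | uniq"; if only the keys matter, "sort -un" is fine.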