Dan Track wrote:
Hi
I've got the following output
Col1 Col2 Col3 Col5
1 000 001 Yes
2 000 001
3 000 001
4 Yes Yes
4 000 001
4 000 001
5 000 001
5 Yes 001
6 000 001 Yes
As you can see, the column widths vary in size. What I need to do is
find the number in Col1 that is associated with each "Yes"
occurrence in Col5. How can I do this?
I've tried the following
cat file | tr -s ' ' ' ' | tr -s '\t' ' ' | cut -d ' ' -f 6
But I get a result like this
Col1 Col2 Col3 Col5
1 000 001 Yes
2 000 001
3 000 001
4 Yes Yes
4 000 001
4 000 001
5 000 001
5 Yes 001
6 000 001 Yes
As you can see one of the "Yes" statements has moved into the third
column, so that's a wrong move.
Any help would be appreciated
I think the problem here is that some of your columns are empty, so for
instance:
Col1 Col2 Col3 Col5
4    Yes       Yes
appears the same as:
Col1 Col2 Col3 Col5
4         Yes  Yes
to most Unix text-processing tools that separate fields based on whitespace.
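A quick way to see this for yourself, as a rough sketch: the two sample lines below are made up for illustration (one with "Yes" under Col2, one with it under Col3), and awk's default whitespace splitting reports exactly the same fields for both.

```shell
# Two hypothetical lines that differ only in where the first "Yes" is placed.
sample=$(printf '4    Yes       Yes\n4         Yes  Yes\n')

# Print the field count and the first three fields of each line.
# With default splitting, runs of blanks count as a single separator,
# so awk cannot tell which column was empty.
printf '%s\n' "$sample" | awk '{ print NF ": " $1 "|" $2 "|" $3 }'
```

Both lines come out as three fields, so the position of the empty column is lost before any later tool in the pipeline gets to see it.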
If you're actually looking for lines where the last field is "Yes", you
could just do:
$ awk '$NF == "Yes"' file
If all you want is the number in the first field, you'd have:
$ awk '$NF == "Yes" { print $1 }' file
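For what it's worth, here is a small self-contained run of that last command, using the rows from your post. The temporary file created with mktemp is just a stand-in for your real file, and the spacing in the data is simplified.

```shell
# Stand-in for the real input file (an assumption for this sketch).
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
Col1 Col2 Col3 Col5
1 000 001 Yes
2 000 001
3 000 001
4 Yes Yes
4 000 001
4 000 001
5 000 001
5 Yes 001
6 000 001 Yes
EOF

# Keep only lines whose last field is "Yes" and print their first field.
result=$(awk '$NF == "Yes" { print $1 }' "$tmp")
printf '%s\n' "$result"

rm -f "$tmp"
```

On this data it prints 1, 4 and 6. Note that this relies on Col5 being the last thing on the line: a row with an empty Col5 whose last non-empty column happens to be "Yes" would also match, so it is worth checking whether your real data can contain such rows.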
Paul.