Re: Text processing

Dan Track wrote:
Hi

I've got the following output

Col1    Col2   Col3       Col5
1         000    001        Yes
2         000    001
3         000    001
4         Yes                 Yes
4         000    001
4         000    001
5         000    001
5         Yes    001
6         000    001        Yes

As you can see, the column widths vary in size. What I need to do is
find the number in Col1 that is associated with each "Yes"
occurrence in Col5. How can I do this?
I've tried the following:
cat file | tr -s ' ' ' ' | tr -s '\t' ' ' | cut -d ' ' -f 6

But I get a result like this:

Hi

I've got the following output

Col1 Col2 Col3 Col5
1 000 001 Yes
2 000 001
3 000 001
4 Yes Yes
4 000 001
4 000 001
5 000 001
5 Yes 001
6 000 001 Yes

As you can see one of the "Yes" statements has moved into the third
column, so that's a wrong move.

Any help would be appreciated.

The problem here, I think, is that some of your columns are empty. For instance:

Col1    Col2   Col3       Col5
4         Yes                 Yes

appears the same as:

Col1    Col2   Col3       Col5
4       Yes    Yes

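to most Unix text-processing tools that separate fields based on whitespace, because they treat any run of spaces or tabs as a single delimiter and so never see the empty column.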
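
A quick way to check this is to count the fields awk sees in each of the two lines. This is only a sketch, assuming awk's default field splitting; the variable names are made up for illustration:

```shell
# The two example lines from above: one with an empty Col3, one without the gap.
wide='4         Yes                 Yes'
narrow='4       Yes    Yes'

# With the default field separator, awk treats any run of blanks as one
# delimiter, so both lines should split into the same number of fields.
wide_nf=$(printf '%s\n' "$wide" | awk '{ print NF }')
narrow_nf=$(printf '%s\n' "$narrow" | awk '{ print NF }')

echo "$wide_nf $narrow_nf"
```

Both counts come out as 3, so the empty Col3 leaves no trace once the line has been split.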

If you're actually looking for lines where the last field is "Yes", you could just do:

$ awk '$NF == "Yes"' file

If all you want is the number in the first field, you'd have:

$ awk '$NF == "Yes" { print $1 }' file
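
To try that against the data from the original post, something like the following should work. It is a sketch only: the temporary file and the variable names are assumptions for illustration, and the sample rows are copied from the message above.

```shell
# Write the sample table from the original post to a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Col1    Col2   Col3       Col5
1         000    001        Yes
2         000    001
3         000    001
4         Yes                 Yes
4         000    001
4         000    001
5         000    001
5         Yes    001
6         000    001        Yes
EOF

# Print the first field of every line whose last field is "Yes".
# The header line is skipped naturally, since its last field is "Col5".
result=$(awk '$NF == "Yes" { print $1 }' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

On this sample it prints 1, 4 and 6, one per line. Note the limit of the approach: it tests the last field that is present, not Col5 as such, so a row with an empty Col5 whose last filled column happens to be "Yes" would also match.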

Paul.




