Re: Text Manipulation/Replacement

On 22Sep2008 14:57, Ubence Quevedo <r0d3nt@xxxxxxxxxxx> wrote:
| I've used pdftotext to convert a pdf document to text and then used
| a combination of grep and awk to single out data and replace formatting
| that I didn't need.
| 
| The output data eventually looks like this:
| 12,123456789
| ,0987654321
| 
| But I want it to look like this:
| 12,123456789,0987654321
| 
| I've tried many different things with awk, but I can't get it to replace
| \r, with just a ,

Do you want to only do this when the following line starts with a comma?

A little sed loop that peeks at the next line should do (untested against
your real data):

  :again
  $!{           # not on the last line? peek ahead
    N           # append the next line to the pattern space
    /\n,/{      # next line starts with a comma? do this stuff
      s/\n//    # remove the embedded newline, joining the two lines
      b again
    }
    P           # otherwise print the first line...
    D           # ...and restart the script with the second
  }

Put that in a file called "sedf" and try:

  sed -f sedf <olddata >newdata

and see how it goes. The last line needs no special handling: once the
input runs out, sed prints whatever is left in the pattern space.
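
Since you were already working in awk, the same join could be sketched there
too. Treat this as a rough sketch rather than a tested answer: the function
name join_continuations is made up for illustration, and stripping a trailing
\r is only a guess that the pdftotext output might carry DOS line endings,
which nothing in your message confirms.

```shell
# join_continuations: hypothetical helper name, not from the original thread.
# Reads stdin and glues every line starting with "," onto the line before it.
join_continuations() {
  awk '
    { sub(/\r$/, "") }                 # assumption: drop a trailing CR if present
    /^,/ { buf = buf $0; next }        # continuation: append to the pending line
    NR > 1 { print buf }               # ordinary line: flush the pending line first
    { buf = $0 }                       # ...then start a new pending line
    END { if (NR > 0) print buf }      # flush whatever is still pending
  '
}

# Sample data taken from the question.
printf '12,123456789\n,0987654321\n' | join_continuations
```

The pending line is only printed once the next ordinary line (or end of input)
shows up, so several comma lines in a row all land on the same output line.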
-- 
Cameron Simpson <cs@xxxxxxxxxx> DoD#743
http://www.cskk.ezoshosting.com/cs/

Heaven could change from chocolate to vanilla without violating perfection.
        - arromdee@xxxxxxxxxxxxxxxxxxxxx (Ken Arromdee)
