Fedora Users — Re: dd question

On Fri, 10 Dec 2010 03:11:25 +0000 (UTC)
"Amadeus W.M." <amadeus84@xxxxxxxxxxx> wrote:

> I have a binary file with data. Each block of 48 bytes is a record. I 
> want to extract the first 8 bytes within each record. I'm thinking
> this should be possible with dd, but gawk, perl - anything goes. It
> just has to be fast, because the data files are ~ 1Gb.
> 
> I can do this in C++ but I was just wondering if it can be done with 
> existing well tested tools.

The binary aspect makes it tricky.  If they were EOL delimited records,
lots of tools could do this.

Here's a Python function, though I haven't tested it.  It requires
enough memory to hold the whole file at once.  Put it in a file, edit
the filenames, and run it as python <filename>.  I guess it should take
less than a minute, but I'm not sure; it should be fine for a one-off.

def extract (filename1 = None, filename2 = None):
  if filename1 is None or filename2 is None:
    return  # nothing to do without both filenames
  infile = open (filename1, "rb")
  slurp = infile.read ()  # needs at least as much memory as the file size
  infile.close ()
  outfile = open (filename2, "wb")
  # Step through the data by offset, one 48-byte record at a time, so
  # the remainder of the file is never re-copied on each pass.
  for offset in range (0, len (slurp), 48):
    outfile.write (slurp [offset:offset + 8])  # first 8 bytes, no separator
  outfile.close ()

extract (filename1 = "your input filename with path",
         filename2 = "your output filename with path")
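
If holding ~1 GB in memory is a problem, the same idea can be done in
chunks.  This is only a sketch: the name extract_streaming and the
batch size RECORDS_PER_CHUNK are my own choices for illustration, and
it assumes the input is a regular file made of whole 48-byte records.

```python
RECORD_SIZE = 48           # bytes per record, from the original question
KEEP = 8                   # leading bytes to keep from each record
RECORDS_PER_CHUNK = 65536  # assumed batch size (about 3 MB per read)


def extract_streaming(src_path, dst_path):
    """Write the first KEEP bytes of every RECORD_SIZE-byte record."""
    chunk_bytes = RECORD_SIZE * RECORDS_PER_CHUNK
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_bytes)
            if not chunk:
                break  # end of file
            # Each chunk is a whole number of records (except possibly
            # the last), so record boundaries line up with offset 0.
            # A short trailing record contributes whatever bytes it has.
            dst.write(b"".join(chunk[i:i + KEEP]
                               for i in range(0, len(chunk), RECORD_SIZE)))
```

Call it as extract_streaming("input path", "output path").  Memory use
stays at roughly one chunk regardless of file size, and the output has
no separators, same as the function above.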