On Fri, 10 Dec 2010 03:11:25 +0000 (UTC)
"Amadeus W.M." <amadeus84@xxxxxxxxxxx> wrote:

> I have a binary file with data. Each block of 48 bytes is a record. I
> want to extract the first 8 bytes within each record. I'm thinking
> this should be possible with dd, but gawk, perl - anything goes. It
> just has to be fast, because the data files are ~ 1Gb.
>
> I can do this in C++ but I was just wondering if it can be done with
> existing well tested tools.

The binary aspect makes it tricky. If they were EOL-delimited records,
lots of tools could do this.

Here's a Python function, not checked though. It does require that you
have enough memory to slurp the whole file. Put it in a file, edit the
filenames, and run it as python <filename>. I guess it should take less
than a minute, but I'm not sure; it should be fine for a one-off.

def extract(filename1=None, filename2=None):
    if filename1 is not None and filename2 is not None:
        infile = open(filename1, "rb")
        slurp = infile.read()  # needs at least as much memory as the file size
        infile.close()
        outfile = open(filename2, "wb")
        # Step through by offset; re-slicing the remainder each time
        # would copy the rest of the data on every pass.
        for start in range(0, len(slurp), 48):      # one 48-byte record per step
            outfile.write(slurp[start:start + 8])   # first 8 bytes, no separator
        outfile.close()

extract(filename1="your input filename with path",
        filename2="your output filename with path")
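If memory is tight, the same idea can probably be done without slurping the whole file. The sketch below is not from the original post and is only lightly reasoned through: it reads the input in chunks that are a whole number of 48-byte records and writes the first 8 bytes of each. The names extract_streaming, RECORD_SIZE, KEEP and records_per_chunk are made up for illustration, and it assumes the input is a regular file on disk, so every read except the last returns a full chunk.

```python
RECORD_SIZE = 48  # bytes per record, as stated in the question
KEEP = 8          # leading bytes to keep from each record


def extract_streaming(src_path, dst_path, records_per_chunk=65536):
    """Write the first KEEP bytes of every RECORD_SIZE-byte record to dst_path."""
    # Chunk size is a multiple of the record size, so records never
    # straddle two chunks (assumption: regular file, full reads until EOF).
    chunk_size = RECORD_SIZE * records_per_chunk
    with open(src_path, "rb") as infile, open(dst_path, "wb") as outfile:
        while True:
            chunk = infile.read(chunk_size)
            if not chunk:
                break  # end of file
            out = bytearray()
            for start in range(0, len(chunk), RECORD_SIZE):
                out += chunk[start:start + KEEP]
            outfile.write(out)  # one write per chunk, no separators
```

With the default records_per_chunk this holds about 3 MB of input in memory at a time, whatever the file size. A trailing partial record is handled the same way as in the slurping version: up to 8 bytes of it are written.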