perl seek and remove at varying offsets in binmode

Question

This is the script I am writing.

#usr/bin/perl
use warnings;


open(my $infile, '<', "./file1.bin") or die "Cannot open file1.bin: $!";
binmode($infile);
open(my $outfile, '>', "./extracted data without 00's.bin") or die "Cannot create extracted data without 00's.bin: $!";
binmode($outfile);

local $/; $infile = <STDIN>;
   print substr($infile, 0, 0x840, '');
   $infile =~ s/\0{16}//;
   print $outfile;

I'm loading a binary file in Perl. I have been able to seek and patch at certain offsets, but now I would like to find any instance of "00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00" (16 bytes) and remove it from the file, but only runs of at least 16 bytes. Anything shorter than that I want to leave alone. The offset where the 00's start differs from file to file, but if I am thinking correctly, I can just search for 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 and remove every instance of it, and then it won't matter what offset the 00's are at. I would first extract the data from specific offsets, then search the extracted file and prune the 00's from it. I can already extract the specific offsets I need; I just need to open the extracted file and shave off 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

EF 39 77 5B 14 9D E9 1E 94 A9 97 F2 6D E3 68 05
6F 7B 77 BB C4 99 67 B5 C9 71 12 30 9D ED 31 B6 
AB 1F 81 66 E1 DD 29 4E 71 8D 54 F5 6C C8 86 0D 
5B 72 AF A8 1F 26 DD 05 AF 78 13 EF A5 E0 76 BB 
8A 59 9B 20 C5 58 95 7C E0 DB 44 6A EC 7E D0 10 
09 42 B1 12 65 80 B3 EC 58 1A 2F 92 B9 32 D9 07 
96 DE 32 51 4B 5F 3B 50 9A D1 09 37 F4 6D 7C 01 
01 4A A4 24 04 DC 83 08 17 CB 34 2C E5 87 26 C1 
35 38 F4 C4 E4 78 FE FC A2 BE 99 48 C9 CA 69 90 
33 87 09 A8 27 BA 91 FC 4B 77 FA AB F5 1E 4E C0        I want to leave everything from
F2 78 6E 31 7D 16 3B 53 04 8A C1 A8 4B 70 39 22 <----- here up
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <----- I want to prune everything
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00        from here on
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00<---- this IS the end of the file, and
                                                     just need to prune these few rows
                                                     of 00's

Say that "F2 78 6E" from the example above is at offset 0x45000, BUT in another file the 00 00's start at a different offset. How could I code it so the 00 00's get pruned in any file that I open? If I need to be more specific, just ask. It seems like I would peek into the file until I hit a long 00 00 string, then prune any remaining lines. Does that make sense at all? All I want to do is search the file for any instances of 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 and delete/prune/truncate them. I want to save everything but the 00's.

EDIT #2 this did it:

open($infile, '<', './file1') or die "cannot open file1: $!";
binmode $infile;
open($outfile, '>', './file2') or die "cannot open file2: $!";
binmode $outfile;

local $/; $file = <$infile>;
$file =~ s/\0{16}//g;
print $outfile $file;


close ($infile);
close ($outfile);

Thank you ikegami for all your help and patience :)

You opened `$infile`, but read from `STDIN`. You overwrite your file handle with the content of the file. (Goes to show your variable names are substandard. Use `..._fh` for file handles.) You didn't say what to print. — ikegami, Feb 09 '13 at 23:56
trust me i been giving it a good go man. i think i am gonna give up tonight and do some more tomorrow. thanks for all your help so far. but to make myself clear, all i wanna do is open the extracted 59kb file, and delete the few rows of 00's off the end lol. sounds easy enough. i can open HxD and do it manually, but a nice script to just double click makes it a lot easier for me. i work with these files daily and have to manually check them, and it would be nice to have a tool that separated each part of the file. the 59kb file is just part of a 16MB file, but a lot of the other data is static, – james28909, Feb 10 '13 at 02:58
and i dont have any trouble at all extracting that. what i am doing is extracting a little more than 59 kb, that way every file will be padded at the end with 00's, then i want to add this script to delete those 00's. anyways, thanks for all your help so far man. – james28909, Feb 10 '13 at 03:00

ikegami · Accepted Answer · 2013-02-09T21:46:14.997

4

No such thing as removing from a file. You have to either

1. copy the file without the undesired bits, or
2. read the rest of the file, seek back, print over the undesired bits, then truncate the file.

I went with option 1.

$ perl -e'
   binmode STDIN;
   binmode STDOUT;
   local $/; $file = <STDIN>;
   $file =~ s/\0{16}//;
   print $file;
' <file.in >file.out

I'm loading the entire file into memory. Either option can be done in chunks, but it complicates things because your NULs could span two chunks.
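
The chunked variant mentioned above could look roughly like the sketch below. It is not part of the original answer: the sub name `strip_nuls_chunked`, the chunk size, and the in-memory demo data are all assumptions made for illustration. It removes every run of 16 NULs (the `g` behaviour discussed in the comments) and deals with the spanning problem by holding back any NULs at the end of the buffer until the next chunk has been read.

```perl
use strict;
use warnings;

# Hypothetical helper: copy $in_fh to $out_fh, dropping every run of
# 16 NUL bytes, while reading only $chunk_size bytes at a time.
sub strip_nuls_chunked {
    my ($in_fh, $out_fh, $chunk_size) = @_;
    $chunk_size ||= 64 * 1024;    # assumed default, tune as needed
    my $buf = '';
    while (1) {
        # Append the next chunk to whatever was held back last time.
        my $n = read($in_fh, $buf, $chunk_size, length $buf);
        die "read failed: $!" unless defined $n;
        last if $n == 0;
        $buf =~ s/\0{16}//g;
        # Fewer than 16 NULs can remain at the end of the buffer. Hold
        # them back, because the next chunk may complete a 16-byte run.
        my $held = $buf =~ s/(\0+)\z// ? $1 : '';
        print {$out_fh} $buf;
        $buf = $held;
    }
    print {$out_fh} $buf;    # leftover NULs (fewer than 16) at end of input
}

# Demo on in-memory handles: 4 data bytes, 40 NULs, 2 data bytes,
# read 7 bytes at a time so the NUL run spans several chunks.
my $data = "ABCD" . ("\0" x 40) . "EF";
my $out  = '';
open(my $in_fh,  '<', \$data) or die "Cannot open in-memory input: $!";
open(my $out_fh, '>', \$out)  or die "Cannot open in-memory output: $!";
binmode $in_fh;
binmode $out_fh;
strip_nuls_chunked($in_fh, $out_fh, 7);
close $out_fh;
```

In the demo, 40 NULs shrink to 8 (40 minus two runs of 16), the same result `s/\0{16}//g` gives on the whole string.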

In a poorly phrased update, you seem to have asked to avoid changes in the first 0x840 bytes. Two solutions:

$ perl -e'
   binmode STDIN;
   binmode STDOUT;
   local $/; $file = <STDIN>;
   substr($file, 0x840) =~ s/\0{16}//;
   print $file;
' <file.in >file.out

$ perl -e'
   binmode STDIN;
   binmode STDOUT;
   local $/; $file = <STDIN>;
   print substr($file, 0, 0x840, '');
   $file =~ s/\0{16}//;
   print $file;
' <file.in >file.out
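
For completeness, option 2 from the top of this answer (rewrite the file in place, then truncate) might be sketched as follows. This is an illustration under assumptions, not code from the original answer: the sub name `strip_nuls_in_place` and the temporary demo file are made up, the whole file is still read into memory, and every run of 16 NULs is removed.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: remove every run of 16 NUL bytes from the file
# at $path, modifying the file in place.
sub strip_nuls_in_place {
    my ($path) = @_;
    open(my $fh, '+<', $path) or die "Cannot open $path: $!";
    binmode $fh;
    my $file = do { local $/; <$fh> };
    $file = '' unless defined $file;

    # Nothing to do if there is no run of 16 NULs at all.
    return 0 unless $file =~ /\0{16}/;

    my $start = $-[0];                   # offset of the first unwanted byte
    my $rest  = substr($file, $start);   # everything from there to the end
    $rest =~ s/\0{16}//g;

    # Seek back, print over the undesired bits, then cut off the tail.
    seek($fh, $start, 0)      or die "Cannot seek in $path: $!";
    print {$fh} $rest         or die "Cannot write to $path: $!";
    truncate($fh, tell($fh))  or die "Cannot truncate $path: $!";
    close($fh)                or die "Cannot close $path: $!";
    return 1;
}

# Demo on a temporary file: 4 data bytes followed by 35 NULs.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
binmode $tmp_fh;
print {$tmp_fh} "DATA" . ("\0" x 35);
close($tmp_fh) or die "Cannot close $tmp_path: $!";

strip_nuls_in_place($tmp_path);
```

After the call the demo file holds `DATA` plus the 3 NULs that do not make up a full run of 16. Bytes before the first match are never rewritten, which is the main attraction of this option for large files.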

edited Feb 09 '13 at 21:46

answered Feb 09 '13 at 20:20

ikegami



You should have told me why my solution wasn't good enough. I only noticed your update as a fluke. – ikegami Feb 09 '13 at 21:46
original post updated. when i run this, it tried to run, but hangs with a blinking cursor. – james28909 Feb 09 '13 at 22:36
please have patience with me, this is a learning experience for me. i can program, but i am not intermediate man, so please have patience ;) im learning, lol – james28909 Feb 09 '13 at 22:46
might i add this file is only 59kb – james28909 Feb 10 '13 at 00:22
the variable offsets dont vary no more than a few kb from file to file. is there not a way i could load a hex string say for instance **/0x00, /0x00, /0x00, /0x00,/0x00, /0x00, /0x00, /0x00,/0x00, /0x00, /0x00, /0x00,/0x00, /0x00, /0x00, /0x00** into a variable, then search and remove any exact instances of that exact variable? then i could just extract the section i want, and simply delete the 00 hex strings off the end of it, which isnt but 4 or 5 lines of hex string – james28909 Feb 10 '13 at 01:59
Sure, change `$file =~ s/\0{16}//;` to `$str = "\0" x 16; $file =~ s/\Q$str//;` – ikegami Feb 10 '13 at 02:06
ok i got it sort of working m8, but it only deletes one line of zero's – james28909 Feb 11 '13 at 06:13
i set up a test file with 5 lines of 00's in a hex file named "file1". when i run it through the code, it deletes one line of the 00's. – james28909 Feb 11 '13 at 06:14
i just need it to recursively search again and again until it cant find any instance of "00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00" – james28909 Feb 11 '13 at 06:15
by the way, thanks for your help so far man, it is really appreciated. also its the first option that i got working, just need it to search over and over. would i use a loop of a sort? – james28909 Feb 11 '13 at 06:16
oh, you want to delete all instances of 16 NULs? Add the "g" flag to the substitution, `s/\0{16}//g`. – ikegami Feb 11 '13 at 07:12
dont kill me, but how would i strip "FF" the same way? i tried to input `$file =~ s/\F{16}//g;` but it does nothing. – james28909 Feb 11 '13 at 08:22
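
On the `FF` question in the last comment, which the thread leaves unanswered: `\F` is not a hex escape, so it never matches the byte 0xFF. A byte is written `\xFF`, in the same position where `\0` was used for NUL. The sample data below is made up for illustration.

```perl
use strict;
use warnings;

# Made-up sample: a run of 32 FF bytes in the middle, 5 FF bytes at the end.
my $file = "HEAD" . ("\xFF" x 32) . "TAIL" . ("\xFF" x 5);

# Remove every run of exactly 16 FF bytes; shorter runs are left alone.
$file =~ s/\xFF{16}//g;

# Same idea with the pattern kept in a variable, as suggested earlier
# for the NUL case.
my $copy = "HEAD" . ("\xFF" x 32) . "TAIL" . ("\xFF" x 5);
my $str  = "\xFF" x 16;
$copy =~ s/\Q$str\E//g;
```

Both substitutions leave `HEADTAIL` followed by the 5 trailing FF bytes, since 32 FF bytes are two full runs of 16 and the last 5 are too short to match.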
