I am doing some forensics learning, and got a .str file that has an entire .xsl file:

Content of the xsl file

I need to extract all that .xsl file from the .str file. I have used something like:

cat pc1.str | grep "<From>" > talk.txt

The problem is that I get almost all of the text, but not in a readable format. I think I am only getting the lines that contain <From>.

Can you help me to get the text from <?xml version="1.0"?> to </log>?

Edit for clarity: I want to get all text, beginning from the xml until the /log.

The .str file is created by strings.

Here is the actual file I am using: https://www.dropbox.com/s/j02elywhkhpbqvg/pc1.str?dl=0

From line 20893696 to 20919817.

Sergio Calderon
  • What is a `.str` file? Is that first column part of the file contents (and presumably you don't want that)? Other than that first column is there any other non-XSLT data in the file that you need to ignore? – Etan Reisner Jun 24 '15 at 19:03
  • Thanks for answering. I do not want to ignore anything, I have to get all text, beginning from the `<?xml version="1.0"?>` until `</log>`. – Sergio Calderon Jun 24 '15 at 19:07
  • So there isn't any binary data in the file content block then? Ok. What about that first column? If that's in the file then you need to ignore that if you are expecting to get just the XSLT file as output. Also is that data raw (for newlines and control characters)? – Etan Reisner Jun 24 '15 at 19:09
  • Created by `strings` with what command/arguments? `-t`/`--radix` at the very least it looks like. Also realize that `strings` may have **already** lost you data from the original source. – Etan Reisner Jun 24 '15 at 19:11
  • I created the .str file from a .img file. I used this command: `strings -a -t -d pc1.img > pc1.str` When I tried the command I showed in the first question, I get a text like this: 20893777 – Sergio Calderon Jun 24 '15 at 19:17
  • I'm assuming you meant `strings -a -t d pc1.img` but ok. Then yes, you need to ignore the radix in that output and you may or may not have the complete xslt file from start to finish in order. – Etan Reisner Jun 24 '15 at 19:21
  • The question is: How can I get the complete xslt file then? – Sergio Calderon Jun 24 '15 at 19:23
  • Depends on what the `pc1.img` file is and whether you have tools that can understand it and extract the entire file. – Etan Reisner Jun 24 '15 at 19:30

2 Answers

I'd probably use perl:

#!/usr/bin/perl

use strict;
use warnings;

while ( <> ) {
     print if m,<\?xml version, .. m,</log>,;
}

This makes use of the 'range' operator, which is true for every record from the one matching the first pattern through the one matching the second. By default, records are split on the record separator $/, which is a newline. If your data has newlines this is easy; otherwise you can read fixed-size records instead. (Just bear in mind that a marker may then straddle a record boundary.)

E.g.

$/ = \80; 

Will read 80 bytes at a time.
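
For comparison, the same range idea can be sketched with sed's two-address form, assuming line-oriented input such as strings output. The function name extract_range is hypothetical. As with Perl's range operator, the range starts again if a later line matches the opening pattern.

```shell
# Hypothetical helper: print every line from the XML declaration through
# the next line that contains </log>. Reads the named files, or standard
# input when no file is given.
extract_range() {
    # In a basic regular expression "?" is an ordinary character,
    # so "<?xml" matches the literal text "<?xml".
    sed -n '/<?xml version/,/<\/log>/p' "$@"
}
```

It could be called as extract_range pc1.str > talk.txt.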

Sobrique

If you want all the lines of your .str file from the line that contains <?xml version="1.0"?> to the first line that contains </log> then this should work.

awk '/<\?xml version="1\.0"\?>/{p=1} p; /<\/log>/{exit}' pc1.str

Match the opening line and set p=1. If p is truth-y then print the current line. Match the line with the closing tag and exit.
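
One edge case worth noting: a line containing </log> that appears before the XML declaration makes the one-liner exit before printing anything. A minimal sketch that guards against this, assuming the markers sit on separate lines (the function name extract_first_log is hypothetical):

```shell
# Hypothetical helper: print from the XML declaration through the first
# </log> that follows it, ignoring any </log> seen earlier in the input.
extract_first_log() {
    awk '
        /<\?xml version="1\.0"\?>/ { p = 1 }   # opening marker: start printing
        p                                      # print while the flag is set
        p && /<\/log>/ { exit }                # stop only once printing has begun
    ' "$@"
}
```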

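If you want the output without the offset (radix) field from the file then something like this should work.

sed 's/^ *[0-9][0-9]* //' pc1.str | awk '/<\?xml version="1\.0"\?>/{p=1} p; /<\/log>/{exit}'

This adds sed to strip the leading offset field (awk isn't as good at field ranges). With GNU strings the offset is padded with spaces and followed by a space rather than a tab, so a tab-delimited cut -f 2 would not remove it.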

If you also want to ignore anything before the opening xml marker and after the closing </log> tag something like this should work (untested).

sed 's/^ *[0-9][0-9]* //' pc1.str | awk '/<\?xml version="1\.0"\?>/{p=1; $0=substr($0, index($0, "<?xml version=\"1.0\"?>"))} p && /<\/log>/{sub(/<\/log>.*/, "</log>"); print; exit} p'

This uses substr and sub to remove parts of lines that aren't desired.
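
Because strings may not have preserved the whole file, it can help to know whether the closing tag was actually found. The following is a sketch under the assumption that each line starts with a space-padded decimal offset, as GNU strings -t d writes it. The function name carve_log is hypothetical, and it does not trim partial lines.

```shell
# Hypothetical helper: strip the offset column, print from the XML
# declaration through the first following </log>, and return a non-zero
# status if the closing tag was never seen.
carve_log() {
    sed 's/^ *[0-9][0-9]* //' "$@" |
    awk '
        /<\?xml version="1\.0"\?>/ { p = 1 }
        p { print }
        p && /<\/log>/ { done = 1; exit }
        END { exit (done ? 0 : 1) }   # status reflects whether </log> was reached
    '
}
```

For example, carve_log pc1.str > talk.xml || echo "closing tag not found" >&2 would flag an incomplete extraction.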

Etan Reisner
  • I tried the first command, but received this: `awk: line 1: syntax error at or near <` I have pc1.str in a folder called 'PornCase'. Did I do something wrong? – Sergio Calderon Jun 24 '15 at 19:39
  • No. Typo on my part. Fixing. – Etan Reisner Jun 24 '15 at 19:40
  • Thanks a lot. I would appreciate it if you let me know when you have fixed it. Should I get the result from the console itself? – Sergio Calderon Jun 24 '15 at 19:49
  • Already fixed. Those commands will all send the output to standard output. You can do with that whatever you want. (If you include an actual snippet of the file's contents in your question then people, like myself, can actually test our answers on it more easily.) – Etan Reisner Jun 24 '15 at 19:54
  • Sadly, I get nothing as a result. I am uploading the full .str file so you will be able to test if you can. It is a little bit heavy. – Sergio Calderon Jun 24 '15 at 20:00
  • File added in the first post. – Sergio Calderon Jun 24 '15 at 20:28