
Essentially what I'm trying to do is parse through some files on my system and pull out a few different things from each file. Here is how I'm currently doing this:

grep -oP "((?<=set_kb_item\(name:)(.*?)(?=, value:))" *.nasl >> /tmp/set_kb_items.txt && 
grep -oP "((?<=user = )(.*?)(?=;))" *.nasl >> /tmp/usernames.txt && 
grep -oP "((?<=dependencies\()(.*?)(?=\)))" *.nasl >> /tmp/dependencies.txt && 
grep -oP "((?<=script_set_attribute\(attribute:\"plugin_type\", value:)(.*?)(?=\)))" *.nasl >> /tmp/plugin_type.txt && 
grep -oP "((?<=script_require_ports\()(.*?)(?=\)))" *.nasl >> /tmp/required_ports.txt 

This works perfectly for me, and it finishes in about two minutes (70k files). However, I'm curious whether I can chain these together a different way. My end goal is to take this command chain, do the equivalent in Python, and then send these values to a database, but I'm not quite there yet. Any input would be appreciated, thanks!

Chad D
  • I think you'll do best going straight to Python from here. You can read each line and then match it against each of the regexes, arranging to write the relevant output to the correct files. I'd nominate Perl if you hadn't mentioned Python — they're equivalent (or close enough to equivalent) in terms of ability to handle the problem. I'm feeling too lazy to write the Python code. You wouldn't necessarily have to deal with leading and trailing context in the same way — you could probably alter (simplify) the regexes a little. – Jonathan Leffler Jan 24 '16 at 00:59
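The single-pass approach described in the comment above could look roughly like the sketch below. It is only an illustration, not tested against real .nasl files: the patterns are adapted from the grep commands in the question, with a capture group standing in for each lookbehind/lookahead pair, and the names `PATTERNS`, `extract`, and `scan_directory` are hypothetical. Unlike `grep -o`, `findall` also returns empty captures, so the output may differ slightly in edge cases.

```python
import re
from pathlib import Path

# Patterns adapted from the grep commands in the question. A capture group
# replaces each lookbehind/lookahead pair. The keys are illustrative names.
PATTERNS = {
    "set_kb_items": re.compile(r"set_kb_item\(name:(.*?), value:"),
    "usernames": re.compile(r"user = (.*?);"),
    "dependencies": re.compile(r"dependencies\((.*?)\)"),
    "plugin_type": re.compile(
        r'script_set_attribute\(attribute:"plugin_type", value:(.*?)\)'
    ),
    "required_ports": re.compile(r"script_require_ports\((.*?)\)"),
}


def extract(lines):
    """Return {pattern_name: [captured strings]} for an iterable of lines."""
    found = {name: [] for name in PATTERNS}
    for line in lines:
        # Each line is read once and tested against every pattern.
        for name, pattern in PATTERNS.items():
            found[name].extend(pattern.findall(line))
    return found


def scan_directory(directory, glob="*.nasl"):
    """Scan each matching file once; return {file_name: extract() result}."""
    results = {}
    for path in sorted(Path(directory).glob(glob)):
        # errors="replace" is an assumption: it avoids crashing on odd bytes.
        with path.open(encoding="utf-8", errors="replace") as handle:
            results[path.name] = extract(handle)
    return results
```

Each file is opened once instead of five times, and the per-file dictionaries returned by `scan_directory` can be written to text files or passed on to whatever database layer you end up using.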

1 Answer


What about not chaining them, since there are no dependencies between the commands?

grep _yourstuff_ *.nasl >> _yourfile1_ & grep _yourotherstuff_ *.nasl >> _yourfile2_ &

These will be executed as separate processes in parallel.
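Since the end goal is Python anyway, the same "start everything, then wait" idea can be sketched with the standard `subprocess` module. This is a minimal sketch under stated assumptions: `run_parallel` and its `jobs` argument are hypothetical names, and each job is just an argument list plus a file to append to.

```python
import subprocess


def run_parallel(jobs):
    """Start every job at once, then wait for all; return the exit codes.

    jobs is a list of (argv, output_path) pairs. Output is appended to the
    file, like '>>' in the shell.
    """
    running = []
    for argv, output_path in jobs:
        out = open(output_path, "a")
        # Popen returns immediately, so all processes run concurrently.
        running.append((subprocess.Popen(argv, stdout=out), out))

    codes = []
    for proc, out in running:
        codes.append(proc.wait())  # block until this process has finished
        out.close()
    return codes
```

Note that no shell is involved here, so a pattern such as `*.nasl` is not expanded for you; you would have to build the file list yourself (for example with `glob.glob`) before putting it into `argv`.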

g24l