
I am currently using dynamic analysis for malware detection. I have a list of all the files accessed by each malware and benign executable. My aim is to build classifiers on the information extracted from the analysis reports.

As of now I am using each file path string, such as c:\hvtqk\modules\packages\reboot.py, as a separate dimension in my classifier. Are there any other innovative techniques that can be used to featurize the path strings?

Pranjul Ahuja

1 Answer


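You can use the hash of the lowercased path (Windows paths are case-insensitive, so this merges different spellings of the same location). You can also consider only the directory and drop the file name, since many malware samples write files with random names but write them to common directories.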
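As a rough sketch of what I mean (not a tuned implementation): lowercase each path, keep only the directory part, hash it, and use the hash to pick a slot in a fixed-length count vector. The function names, the choice of MD5, and the 1024-bucket size are all assumptions made for illustration; any stable hash and any vector size would do, and smaller sizes cause more collisions.

```python
import hashlib
import ntpath  # parses Windows-style paths on any host OS

N_BUCKETS = 1024  # assumed vector length, chosen arbitrarily


def directory_of(path):
    """Lowercase the path and drop the (possibly random) file name."""
    return ntpath.dirname(path.lower())


def bucket_of(directory, n_buckets=N_BUCKETS):
    """Map a directory string to a stable bucket index via MD5."""
    digest = hashlib.md5(directory.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_buckets


def featurize(paths, n_buckets=N_BUCKETS):
    """Count accessed files per hashed directory for one executable."""
    vector = [0] * n_buckets
    for path in paths:
        vector[bucket_of(directory_of(path), n_buckets)] += 1
    return vector
```

Two files with different random names in the same directory then land in the same slot, regardless of how the path was capitalized in the report. I use hashlib rather than Python's built-in hash() because the latter is randomized per process for strings, which would make the features unstable between runs.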

Moustafa Saleh
  • So you are suggesting taking a count of how many files are touched in the most common directories? – Pranjul Ahuja Apr 09 '17 at 10:47
  • Yes, you can do that. Or just treat the directory path as a string, so that each directory seen in the training set becomes one feature. The classifier will then use that feature whenever the same directory is encountered in the test set. – Moustafa Saleh Apr 10 '17 at 17:42
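
The two options discussed in the comments can be combined in one small sketch: build a vocabulary of directories from the training set, then represent each executable as a vector of per-directory file counts. The helper names here are hypothetical, and ignoring directories that were never seen in training is an assumption of this sketch, not something stated in the thread.

```python
from collections import Counter
import ntpath  # parses Windows-style paths on any host OS


def dir_counts(paths):
    """Count how many accessed files fall in each lowercased directory."""
    return Counter(ntpath.dirname(p.lower()) for p in paths)


def build_vocabulary(training_samples):
    """Collect every directory seen in the training set, in sorted order."""
    directories = set()
    for paths in training_samples:
        directories.update(dir_counts(paths))
    return sorted(directories)


def to_vector(paths, vocabulary):
    """One count per known directory; unseen directories are ignored."""
    counts = dir_counts(paths)
    return [counts[d] for d in vocabulary]
```

Replacing each count with `min(count, 1)` gives the plain "directory as a string feature" variant, where only the presence of a directory matters.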