Different Representation of Full file access paths by malware

Question

I am currently using dynamic analysis for malware detection. I have a list of all the files accessed by malware and benign executables. My aim is to build classifiers on the information extracted from the analysis reports.

At the moment I am using each file path string, such as c:\hvtqk\modules\packages\reboot.py, as a separate dimension in my classifier. Are there any other innovative techniques that can be used to featurize the path strings?

score 0 · Answer 1 · answered Apr 09 '17 at 01:48


You can use a hash of the lower-cased path. You can also consider only the directory and ignore the file name, since many malware samples write files with random names but write them to common directories.
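As a minimal sketch of that idea, the snippet below lower-cases a Windows path, drops the file name, and hashes the remaining directory into a fixed number of buckets. The function name `directory_feature`, the bucket count, and the choice of MD5 as a stable hash are assumptions made for illustration, not part of the suggestion itself.

```python
import hashlib
import ntpath  # Windows path rules, usable on any OS


def directory_feature(path: str, n_buckets: int = 2**20) -> int:
    """Return a bucket index derived from the lower-cased directory of a path.

    n_buckets is a hypothetical feature-space size; tune it for your data.
    """
    # Normalize separators, strip the file name, and ignore case.
    directory = ntpath.dirname(ntpath.normpath(path.strip())).lower()
    # MD5 is used here only as a stable, non-cryptographic hash
    # (Python's built-in hash() is randomized per process for strings).
    digest = hashlib.md5(directory.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_buckets


# Two files with different names in the same directory share one feature.
a = directory_feature(r"c:\hvtqk\modules\packages\reboot.py")
b = directory_feature(r"C:\HVTQK\modules\packages\x8f3a.tmp")  # hypothetical random name
```

Hashing into a fixed range keeps the feature space bounded even when new directories appear at test time, at the cost of occasional collisions.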

answered Apr 09 '17 at 01:48

Moustafa Saleh


So you are suggesting counting how many files are touched in the most common directories? – Pranjul Ahuja Apr 09 '17 at 10:47
Yes, you can do that. Or just treat the directory path as a string, so that it becomes one of the features you get from the training set. The classifier will then consider the path as a feature when it is encountered later in the test set. – Moustafa Saleh Apr 10 '17 at 17:42
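
The counting variant discussed in the comments could look roughly like the sketch below: one count per lower-cased directory for each analysed sample. The helper name `directory_counts` and the second example directory are hypothetical; the resulting mapping is the kind of input a dictionary-based vectorizer would turn into a feature matrix.

```python
from collections import Counter
import ntpath  # Windows path rules, usable on any OS


def directory_counts(paths):
    """Count how many accessed files fall in each lower-cased directory."""
    counts = Counter()
    for path in paths:
        directory = ntpath.dirname(ntpath.normpath(path)).lower()
        counts[directory] += 1
    return counts


# File accesses from one analysis report (second directory is hypothetical).
report = [
    r"c:\hvtqk\modules\packages\reboot.py",
    r"C:\hvtqk\modules\packages\a1b2c3.tmp",
    r"c:\windows\temp\x.dat",
]
features = directory_counts(report)
```

Directories seen only in the test set simply get no weight from a model trained on the training-set directories, which matches the behaviour described in the comment above.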
