I'm trying to visualize malware executables to test a visual classification approach. I'm using the .bytes files from the Microsoft Malware Classification Challenge dataset, so my input looks like this:
00401000 56 8D 44 24 08 50 8B F1 E8 1C 1B 00 00 C7 06 08
00401010 BB 42 00 8B C6 5E C2 04 00 CC CC CC CC CC CC CC
00401020 C7 01 08 BB 42 00 E9 26 1C 00 00 CC CC CC CC CC
00401030 56 8B F1 C7 06 08 BB 42 00 E8 13 1C 00 00 F6 44
00401040 24 08 01 74 09 56 E8 6C 1E 00 00 83 C4 04 8B C6
00401050 5E C2 04 00 CC CC CC CC CC CC CC CC CC CC CC CC
00401060 8B 44 24 08 8A 08 8B 54 24 04 88 0A C3 CC CC CC
00401070 8B 44 24 04 8D 50 01 8A 08 40 84 C9 75 F9 2B C2
00401080 C3 CC CC CC CC CC CC CC CC CC CC CC CC CC CC CC
00401090 8B 44 24 10 8B 4C 24 0C 8B 54 24 08 56 8B 74 24
004010A0 08 50 51 52 56 E8 18 1E 00 00 83 C4 10 8B C6 5E
You can find an example file here. I want the output to be a .bmp or another lossless image format. I'm using this Python function to convert the bytes to an image file:
import imageio
import numpy as np
from binascii import unhexlify
from scipy import ndimage

bytes_dir = r'.\dataSample'
images_dir = r'.\imageData'
undecodedByte = '00'  # placeholder for bytes the dump shows as '??'

def bytes2png(f, height, width):
    file = bytes_dir + '\\' + f
    b_data = b''
    with open(file, 'r') as fh:
        for line in fh:
            # Skip the 8-character address column, then read the hex bytes
            for byte in line.rstrip()[8:].split():
                if byte == '??':
                    byte = undecodedByte
                b_data += unhexlify(byte)
    image_buffer = np.zeros((height, width)).astype(np.ubyte)
    for i in range(height):
        for j in range(width):
            image_buffer[i,j] = b_data[i+j]
    # Naming image
    image_name = images_dir + '\\' + f.split('.')[0] + str(height) + "x" + \
        str(width) + '.bmp'
    # image_buffer = ndimage.rotate(image_buffer, -45.0, reshape=False, prefilter=False)
    imageio.imwrite(image_name, image_buffer)
I can successfully extract the bytes and write them to a valid .bmp file, but the resulting image looks rotated by 45 degrees. When I opened it in a hex editor and compared it with the original file, I saw that it has the same byte sequence as the .bytes file.
I don't know image file formats well, so I tried the commented-out line to rotate the numpy array by 45 degrees. Because of the rotation, I then get a hexagon-shaped picture, as you can see:
Is there a way (preferably an efficient one) to produce images in which the patterns appear along rows, in the same order as the original bytes?
P.S. The uploaded images are .png because the site doesn't accept .bmp files.
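For reference, the diagonal look is consistent with the `b_data[i+j]` indexing in the function above: pixel `(i, j)` and pixel `(i+1, j-1)` read the same byte, so values are constant along anti-diagonals. A minimal sketch of a row-major layout follows, assuming that each image row should hold `width` consecutive bytes. The helper name `bytes_to_image` and the two-line `sample` (taken from the dump above) are illustrative, not part of the original code.

```python
import numpy as np

def bytes_to_image(byte_values, width):
    """Lay a flat byte sequence out row by row as a 2-D uint8 array."""
    data = np.frombuffer(bytes(byte_values), dtype=np.uint8)
    height = len(data) // width              # drop an incomplete last row
    return data[:height * width].reshape(height, width)

# Hypothetical sample: the first two lines of the dump above, address column removed
sample = bytes.fromhex(
    "568D442408508BF1E81C1B0000C70608"
    "BB42008BC65EC20400CCCCCCCCCCCCCC"
)

# Row-major layout: row i holds bytes i*width .. i*width + width - 1
img = bytes_to_image(sample, width=16)

# The i + j indexing from the question, for comparison: each row is the
# previous row shifted by one byte, which reads as a 45-degree skew
skewed = np.array(
    [[sample[i + j] for j in range(16)] for i in range(2)], dtype=np.uint8
)
```

The array returned by `bytes_to_image` can be passed to `imageio.imwrite(image_name, img)` as in the original function. The reshape avoids the per-pixel Python loop, so it should also be considerably faster on large files.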