-1

I have an encrypted text dataset and I want to classify it using a neural network. I know that there is a pattern in the encrypted data. Example of my input data:

diss%^ghghE(t dffd$#KL*vb xod@#:n>did ....

My question is: should I treat the encrypted data as if it were normal text, build a vocabulary, and transform my data into sequences of indices? Should I first clean my data of all special characters?

What I tried: I removed all special characters from the data, then created a vocabulary and transformed my data into sequences. However, I am getting very low accuracy, even though my model works well when the data is in natural language.
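The preprocessing described above can be sketched roughly as follows. This is only a minimal illustration, assuming whitespace tokenization and ASCII-only cleaning; the helper names `clean`, `build_vocab`, and `encode` and the `<pad>`/`<unk>` entries are hypothetical and not taken from the actual pipeline.

```python
import re

def clean(text):
    # Drop every character that is not an ASCII letter, digit, or whitespace.
    return re.sub(r"[^A-Za-z0-9\s]", "", text)

def build_vocab(texts):
    # Assumption: index 0 is reserved for padding and 1 for unknown tokens.
    vocab = {"<pad>": 0, "<unk>": 1}
    for text in texts:
        for token in text.split():
            # Assign the next free index to each token not seen before.
            vocab.setdefault(token, len(vocab))
    return vocab

def encode(text, vocab):
    # Map each whitespace-separated token to its index, or to <unk>.
    return [vocab.get(token, vocab["<unk>"]) for token in text.split()]

samples = ["diss%^ghghE(t dffd$#KL*vb xod@#:n>did"]
cleaned = [clean(s) for s in samples]
vocab = build_vocab(cleaned)
sequences = [encode(s, vocab) for s in cleaned]
```

Note that `clean` discards every special character, so if those characters carry part of the pattern, that information is gone before the model ever sees the data.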

Any help is appreciated.

user780975

1 Answer

2

By definition, a good encryption algorithm will not allow you to learn anything[*] from the encrypted data.

So, unless you suspect that the encryption algorithm is weak, I suggest you abandon this idea.

[*] apart from the approximate size of the original text

Roman Cheplyaka
  • Thanks for your answer, but I know that there is a pattern in this encrypted data. It just hides the normal words with random characters. – user780975 Jul 09 '17 at 07:52
  • Technically, the fact that a good encryption algorithm doesn't allow you to learn about the plaintext doesn't mean you can't run algorithms on the ciphertext. But that is theoretical. – MSalters Jul 09 '17 at 07:52