I am doing a small university project where I have to create a console-based text editor with some features, and making files password-protected is one of them. As I said, it's a university project for an introductory OOP course, so it doesn't need to be the most secure thing on the planet. I am planning to use a simple Caesar cipher to encrypt my file.

The only problem is the password. I'll use the password as the encryption key and it will work, but the problem is handling the case where the password is wrong. If no checks are placed then it would just show gibberish, but I want to make it so that it displays a message in case of a wrong password.

The idea I have come up with is to somehow store the hash of the unencrypted file in that text file (but the hash shouldn't be visible when I open the file in Notepad). After decrypting with the provided password, I can hash the contents and check whether the result matches the hidden hash stored in the file. Is this possible?

I am using Windows, by the way, and portability is not an issue.

Remy Lebeau
Saad Ahmed
  • You could use an [Alternative Data Stream](https://learn.microsoft.com/en-us/windows/win32/fileio/file-streams), which is a feature specific to NTFS. But know that if the file is ever copied to a non-NTFS file system, the ADS will be lost. – Remy Lebeau May 05 '22 at 15:28
  • A simpler option would be to instead just append the hash to the beginning/ending of the encrypted file content, and then ignore the hash bytes when decrypting the file. The hash won't be in the decrypted content, and this allows the hash to be preserved regardless of how the file is copied. Also, hashing the whole file content is unnecessary, and lengthy if the file is large. You could simply store a hash of the correct password instead, and then compare that to a hash of the user's input during decryption. The chances of a wrong password hashing to the correct value will be negligible. – Remy Lebeau May 05 '22 at 15:41
  • @RemyLebeau yeah but I want my file to be clean if I open it up with any other text editor. The hash should be invisible. – Saad Ahmed May 05 '22 at 15:44
  • Opening an encrypted file in a text editor would display garbage anyway, so what does it matter if the hash is present as extra garbage? Your requirement makes no sense and is unnecessary. – Remy Lebeau May 05 '22 at 15:45
  • What's the codepage of the editor? ASCII, ANSI, some single byte one, full Unicode? – Seva Alekseyev May 05 '22 at 15:47
  • @SevaAlekseyev codepage is not quite relevant here, because editors will open the whole text file anyway, and the encrypted content will be shown although not in a readable way. There's no way to hide data in a text file – phuclv May 05 '22 at 16:04
  • Tell that to the Unicode BOM :) – Seva Alekseyev May 05 '22 at 16:04
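
The suggestion in the comments (store a hash of the password in front of the encrypted content and compare it before decrypting) could look roughly like the sketch below. It is only an illustration for a coursework setting, not a secure design: the names `protect`, `unprotect`, and `shift_from_password`, the byte-wise Caesar shift, and the way the shift is derived from the password are all assumptions made for this example.

```python
import hashlib

HASH_LEN = 64  # length of a SHA-256 hex digest in ASCII characters


def shift_from_password(password: str) -> int:
    # Hypothetical key derivation: sum of the password bytes modulo 256.
    # (A sum that is a multiple of 256 would give a shift of 0.)
    return sum(password.encode("utf-8")) % 256


def caesar(data: bytes, shift: int) -> bytes:
    # Byte-wise Caesar shift; a negative shift undoes a positive one.
    return bytes((b + shift) % 256 for b in data)


def password_digest(password: str) -> bytes:
    # Hex digest of the password, stored as a fixed-length ASCII header.
    return hashlib.sha256(password.encode("utf-8")).hexdigest().encode("ascii")


def protect(plaintext: str, password: str) -> bytes:
    # File layout: [64-byte password hash][Caesar-shifted content]
    body = caesar(plaintext.encode("utf-8"), shift_from_password(password))
    return password_digest(password) + body


def unprotect(blob: bytes, password: str) -> str:
    stored, body = blob[:HASH_LEN], blob[HASH_LEN:]
    if password_digest(password) != stored:
        # Wrong password: report it instead of showing gibberish.
        raise ValueError("wrong password")
    return caesar(body, -shift_from_password(password)).decode("utf-8")
```

With this layout the editor can catch the `ValueError` and print a "wrong password" message; the header is visible in other editors, but only as 64 extra characters in front of content that is already unreadable.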

1 Answer

In general, you can't design a data format in which a file containing nothing but plain text is valid, yet which can also carry metadata (a hash or anything else). Just think about it: how would you store something other than text (i.e. metadata) in a file where every single byte is to be interpreted as text?

That said, there are some tricks to hide the metadata in plain sight. With Unicode, the palette of tricks is wider. For example, you can use invisible, space-like characters to encode metadata, or to indicate its presence, in a way that the user won't notice. Consider the Unicode BOM: it's the "zero-width no-break space" character (U+FEFF). It won't be seen in Notepad, yet it serves as metadata. You could do something similar.
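
As a rough illustration of that trick, the sketch below appends metadata to a string as a run of zero-width characters, one per bit. The choice of U+200B and U+200C as the two symbols and the helper names `hide` and `reveal` are assumptions made for this example. Whether the characters are truly invisible depends on the editor and font, the file has to be saved in a Unicode encoding such as UTF-8, and the scheme assumes the visible text does not itself end with either character.

```python
from typing import Tuple

ZERO = "\u200b"  # ZERO WIDTH SPACE, used here to encode a 0 bit
ONE = "\u200c"   # ZERO WIDTH NON-JOINER, used here to encode a 1 bit


def hide(visible: str, metadata: bytes) -> str:
    # Turn every metadata byte into 8 bits, then every bit into one
    # zero-width character appended after the visible text.
    bits = "".join(f"{byte:08b}" for byte in metadata)
    return visible + "".join(ONE if bit == "1" else ZERO for bit in bits)


def reveal(text: str) -> Tuple[str, bytes]:
    # Split off the trailing run of zero-width characters and decode it.
    visible = text.rstrip(ZERO + ONE)
    hidden = text[len(visible):]
    bits = "".join("1" if ch == ONE else "0" for ch in hidden)
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return visible, data
```

For example, `hide("hello", digest)` yields text that should display as `hello` in an editor that renders these characters with zero width, while `reveal` returns both the visible text and the original bytes.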

The comments already mentioned alternative data streams. While one of those could work to keep the metadata, an alternative data stream doesn't survive archival, e-mailing, uploading to Google Drive/OneDrive/Dropbox, copying with a program that is not aware of it, or copying to a filesystem that doesn't support it (e.g. a CD or a flash drive with FAT).

Seva Alekseyev