
I am trying to encrypt a file in place using the cryptography module, so that I don't have to buffer the whole ciphertext of the file (which can be memory intensive) and then replace the original file with its encrypted version. My solution is to encrypt a chunk of plaintext and then overwrite it with its ciphertext, 16 bytes at a time (AES-CTR mode). The problem is that the loop seems to be infinite.

  • How can I fix this?
  • What other methods do you suggest?
  • What are the side effects of using the method below?
pointer = 0
with open(path, "r+b") as file:
    print("...ENCRYPTING")
    while file:
        file_data = file.read(16)
        pointer += 16
        ciphertext = aes_enc.update(file_data)
        file.seek(pointer-16)
        file.write(ciphertext)
    print("...Complete...")
KMG
  • It's not the best idea, because if the encryption process is terminated partway through, it will be hard to restore the original file. – Olvin Roght Aug 30 '20 at 10:50
  • Since you're using CTR mode you **need** a unique IV if you use the same key over and over again. That IV is not secret. Usually we put it in front of the ciphertext. If you do that, you have to buffer one AES block, since the ciphertext then starts one block later than the corresponding plaintext. You could instead put the IV at the end of the file and keep your implementation simpler. – Artjom B. Aug 30 '20 at 19:54
  • @ArtjomB. Yeah, thanks very much, this is what I was going to do, but I was trying to solve a bigger problem: what to do if I have to encrypt a big database, for example. I think everybody advises against the approach I took. I would really appreciate it if you could tell me what your advice is here. – KMG Aug 30 '20 at 19:58
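
The suggestion above of storing the IV at the end of the file could be sketched roughly as follows. This is only an illustration under stated assumptions: `encrypt_in_place`, `decrypt_in_place`, `make_encryptor` and `make_decryptor` are hypothetical names that do not come from the question, and the cipher is passed in as a factory that takes the nonce and returns an object with an `update()` method.

```python
import os

NONCE_SIZE = 16         # AES block size; CTR mode takes a 16-byte initial counter block
CHUNK_SIZE = 64 * 1024  # assumed chunk size, any positive value works for a stream mode


def _transform_in_place(file, length, ctx):
    """Run ctx.update() over the first `length` bytes of `file`, overwriting them."""
    file.seek(0)
    remaining = length
    while remaining > 0:
        chunk = file.read(min(CHUNK_SIZE, remaining))
        if not chunk:
            break
        file.seek(-len(chunk), os.SEEK_CUR)  # step back over the bytes just read
        file.write(ctx.update(chunk))
        remaining -= len(chunk)


def encrypt_in_place(path, make_encryptor):
    """Encrypt `path` in place and append the fresh nonce after the ciphertext."""
    nonce = os.urandom(NONCE_SIZE)  # must be unique for every use of the same key
    length = os.path.getsize(path)
    with open(path, "r+b") as file:
        _transform_in_place(file, length, make_encryptor(nonce))
        file.seek(length)
        file.write(nonce)


def decrypt_in_place(path, make_decryptor):
    """Read the trailing nonce, cut it off, and decrypt the rest in place."""
    length = os.path.getsize(path) - NONCE_SIZE
    if length < 0:
        raise ValueError("file is too short to contain a nonce")
    with open(path, "r+b") as file:
        file.seek(length)
        nonce = file.read(NONCE_SIZE)
        file.truncate(length)
        _transform_in_place(file, length, make_decryptor(nonce))
```

With the cryptography module, the factory would presumably be something like `lambda nonce: Cipher(algorithms.AES(key), modes.CTR(nonce)).encryptor()` (and `.decryptor()` for the reverse direction). The sketch does nothing about the interruption problem mentioned in the first comment: if the process dies midway, the file is left partly encrypted.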

2 Answers

  • How can I fix this?

As Cyril Jouve already mentions, check for `if not file_data` and break out of the loop when it is true.

  • What other methods do you suggest?
  • What are the side effects of using the method below?

Reading in blocks of 16 bytes is relatively slow. You most likely have enough memory to read larger blocks, such as 4096 or 8192 bytes.

Unless you have very large files and limited disk space, I think there is no benefit in reading from and writing to the same file. If an error occurs after the OS has already written data to disk, you will have lost the original data and will be left with a partially encrypted file, without knowing which part is encrypted.

It's easier and safer to create a new encrypted file and then, if there were no errors, delete the original and rename the new file.

Encrypt to a new file, catch exceptions, check the existence and size of the encrypted file, and delete the source and rename the encrypted file only if everything is OK.

import os

# aes_enc is assumed to be the AES-CTR encryptor object created as in the question.

path = r'D:\test.dat'

input_path = path
encrypt_path = path + '_encrypt'

try:
    with open(input_path, "rb") as input_file:
        with open(encrypt_path, "wb") as encrypt_file:

            print("...ENCRYPTING")

            while True:

                file_data = input_file.read(4096)
                if not file_data:
                    break

                ciphertext = aes_enc.update(file_data)
                encrypt_file.write(ciphertext)

            print("...Complete...")

    if os.path.exists(encrypt_path):
        if os.path.getsize(input_path) == os.path.getsize(encrypt_path):
            print(f'Deleting {input_path}')
            os.remove(input_path)
            print(f'Renaming {encrypt_path} to {input_path}')
            os.rename(encrypt_path, input_path)

except Exception as e:
    print(f'EXCEPTION: {str(e)}')
Mace
  • Oké, thanks. It should be "if not file_data:" like in the example. – Mace Aug 30 '20 at 13:11
  • Yes, or `while file_data := input_file.read(4096):`. – superb rain Aug 30 '20 at 13:12
  • Well at least the example code was correct. I have edited the comment above it. Thanks again. – Mace Aug 30 '20 at 13:14
  • `if os.path.exists` is not really necessary (otherwise the code above would have raised an exception), and you do not need the remove before the rename. Bonus: rename is atomic, so in case of failure you won't end up in an inconsistent state where input_path no longer exists. – Cyril Jouve Aug 30 '20 at 20:32
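
The simplification suggested in the last comment might look like the sketch below. It is an assumption-laden variant, not the answer's code: `encrypt_via_temp_file` and the `.enc.tmp` suffix are made-up names, the encryptor is assumed to offer `update()` and `finalize()` as the cryptography module's cipher contexts do, and `os.replace` is used instead of remove plus rename because it overwrites an existing destination (atomically on POSIX when both paths are on the same filesystem).

```python
import os


def encrypt_via_temp_file(path, encryptor, chunk_size=64 * 1024):
    """Write the ciphertext to a sibling file, then swap it over the original."""
    tmp_path = path + ".enc.tmp"  # hypothetical name; kept in the same directory
    try:
        with open(path, "rb") as src, open(tmp_path, "wb") as dst:
            while True:
                chunk = src.read(chunk_size)
                if not chunk:  # read() returns b"" at EOF
                    break
                dst.write(encryptor.update(chunk))
            dst.write(encryptor.finalize())
            dst.flush()
            os.fsync(dst.fileno())  # push the data to disk before the swap
        # Only reached if everything above succeeded.
        os.replace(tmp_path, path)
    except BaseException:
        # The original file is still untouched; discard the partial output.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If anything fails before `os.replace`, the original file is left as it was and only the temporary file is removed.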

A file object has no useful "truthiness": it is always truthy, even at the end of the file, so you can't use it as the condition for your loop.

The file is at EOF when read() returns an empty bytes object (https://docs.python.org/3/library/io.html#io.BufferedIOBase.read)
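
A small self-contained check of both statements, using a throwaway temporary file (the file name `demo.bin` is arbitrary):

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo.bin")
    with open(path, "wb") as f:
        f.write(b"abc")

    with open(path, "rb") as f:
        first = f.read(16)      # the whole content, since it is shorter than 16 bytes
        at_eof = f.read(16)     # empty bytes object once EOF is reached
        still_truthy = bool(f)  # the file object itself never becomes falsy

print(first, at_eof, still_truthy)  # b'abc' b'' True
```

This is why `while file:` never terminates, whereas testing the result of `read()` does.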

import os

with open(path, "r+b") as file:
    print("...ENCRYPTING")
    while True:
        file_data = file.read(16)
        if not file_data:
            # read() returned b"", so we are at EOF
            break
        ciphertext = aes_enc.update(file_data)
        # Step back over the bytes just read and overwrite them.
        file.seek(-len(file_data), os.SEEK_CUR)
        file.write(ciphertext)
    print("...Complete...")
Cyril Jouve