
I'm implementing a web-based file storage service (C#). The files will be encrypted when stored on the server, but the challenge is how to implement the decryption functionality.

The files can be of any size, from a few KB to several GB. The data transfer is done in chunks, so users download the data from, say, offset 50000, then 75000, and so on. This works fine for unencrypted files, but with encryption the entire file has to be decrypted before each chunk can be read from its offset.

So I am looking at how to solve this. My research so far shows that ECB and CBC can be used. ECB is the most basic (and most insecure) mode, with each block encrypted separately. The way ECB works is pretty much what I am looking for, but security is a concern. CBC is similar, but you need the previous block decrypted before decrypting the current block. This is OK as long as the file is read from beginning to end and you keep the data as you decrypt it on the server, but at the end of the day it's not much better than decrypting the entire file server-side before transferring.

Does anyone know of any other alternatives I should consider? I have no code at this point, because I am still only doing theoretical research.

Andreas
  • Do you only need random read access and sequential write access (i.e. the file is written in one go)? – CodesInChaos Sep 06 '12 at 11:52
  • And do you need integrity checking? I *strongly* recommend using MACs. There is a lot of weird stuff you can do when there is no authentication. – CodesInChaos Sep 06 '12 at 11:54
  • "CBC is similar, but you need the previous block decrypted before decrypting the current block." Wrong. Having the ciphertext of the previous block is enough; you don't need the plaintext. Check out CBC on Wikipedia. They have some nice diagrams. – CodesInChaos Sep 06 '12 at 11:58
  • @CodesInChaos: yes, only random read access. The entire file is encrypted only when fully uploaded and is therefore written in one go. I was not aware that the ciphertext of the previous block was enough in CBC - you learn something every day! :) – Andreas Sep 07 '12 at 06:15

1 Answer


Do not use ECB (Electronic Code Book): any patterns in your plaintext will appear as patterns in the ciphertext. CBC (Cipher Block Chaining) does allow random read access (the calling code knows the key, and the IV for any given block is simply the previous ciphertext block), but writing a block requires rewriting all following blocks.
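For illustration, here is a minimal sketch of random-access CBC decryption, assuming AES with 16-byte blocks and assuming the file's original IV is stored alongside it. `DecryptCbcBlockAt` is a hypothetical helper name, not a framework API:

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

static class CbcRandomRead
{
    const int BlockSize = 16; // AES block size in bytes

    // Decrypts the single 16-byte block at blockIndex of a CBC-encrypted file.
    // Interior blocks carry no padding, so PaddingMode.None is used; the final
    // block of a real file would still need its padding handled separately.
    public static byte[] DecryptCbcBlockAt(string path, byte[] key, byte[] fileIv, long blockIndex)
    {
        byte[] iv;
        byte[] cipherBlock;

        using (var fs = File.OpenRead(path))
        using (var reader = new BinaryReader(fs))
        {
            if (blockIndex == 0)
            {
                iv = (byte[])fileIv.Clone(); // the first block chains off the file's IV
            }
            else
            {
                // The "IV" for block n is simply ciphertext block n - 1.
                fs.Seek((blockIndex - 1) * BlockSize, SeekOrigin.Begin);
                iv = reader.ReadBytes(BlockSize);
            }

            fs.Seek(blockIndex * BlockSize, SeekOrigin.Begin);
            cipherBlock = reader.ReadBytes(BlockSize);
        }

        using (var aes = Aes.Create())
        {
            aes.Key = key;
            aes.IV = iv;
            aes.Mode = CipherMode.CBC;
            aes.Padding = PaddingMode.None;

            using (var decryptor = aes.CreateDecryptor())
            {
                return decryptor.TransformFinalBlock(cipherBlock, 0, BlockSize);
            }
        }
    }
}
```

The flip side is the write behaviour mentioned above: changing block n changes its ciphertext, which changes the effective IV of block n + 1, and so on to the end of the file.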

A better mode is Counter (CTR). In effect, each block is encrypted with the same key, and the counter for each block is an initial value plus that block's offset from a defined start; for example, the counter for block n is IV + n. This means any block can be encrypted or decrypted independently. CTR mode is described in detail on page 15 of NIST SP 800-38A. Guidance on key and IV derivation can be found in NIST SP 800-108.
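.NET's `Aes` class does not expose a CTR cipher mode directly, so a common approach is to build it yourself: encrypt the counter blocks with AES in ECB mode and XOR the keystream with the data. Below is a minimal sketch under that assumption; `DecryptCtrChunk` and `AddToCounter` are hypothetical helpers, and the counter layout (a 16-byte big-endian value) is an illustrative choice:

```csharp
using System;
using System.Security.Cryptography;

static class CtrRandomRead
{
    const int BlockSize = 16; // AES block size in bytes

    // Decrypts a chunk of ciphertext whose first byte sits at AES block number
    // firstBlockIndex within the file. Works for any chunk that starts on a
    // 16-byte boundary; decryption and encryption are the same operation in CTR.
    public static byte[] DecryptCtrChunk(byte[] key, byte[] initialCounter, byte[] cipherChunk, long firstBlockIndex)
    {
        byte[] plain = new byte[cipherChunk.Length];

        using (var aes = Aes.Create())
        {
            aes.Key = key;
            aes.Mode = CipherMode.ECB;     // ECB here is only used to generate the keystream
            aes.Padding = PaddingMode.None;

            using (var encryptor = aes.CreateEncryptor())
            {
                byte[] counter = new byte[BlockSize];
                byte[] keystream = new byte[BlockSize];

                for (int i = 0; i < cipherChunk.Length; i += BlockSize)
                {
                    // Counter for block n is initialCounter + n.
                    Array.Copy(initialCounter, counter, BlockSize);
                    AddToCounter(counter, firstBlockIndex + i / BlockSize);

                    encryptor.TransformBlock(counter, 0, BlockSize, keystream, 0);

                    int len = Math.Min(BlockSize, cipherChunk.Length - i);
                    for (int j = 0; j < len; j++)
                        plain[i + j] = (byte)(cipherChunk[i + j] ^ keystream[j]);
                }
            }
        }

        return plain;
    }

    // Adds a non-negative value to a 16-byte big-endian counter in place.
    static void AddToCounter(byte[] counter, long value)
    {
        ulong carry = (ulong)value;
        for (int i = counter.Length - 1; i >= 0 && carry > 0; i--)
        {
            ulong sum = counter[i] + (carry & 0xFF);
            counter[i] = (byte)sum;
            carry = (carry >> 8) + (sum >> 8);
        }
    }
}
```

Because the counter is derived from the block number, each counter value (and therefore each key/counter pair) must only ever be used once per key, which is exactly what the SP 800-108 key derivation guidance helps with.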

There are a few other similar modes, such as GCM (Galois/Counter Mode), which combines CTR-style encryption with built-in authentication.
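As a rough illustration of the per-chunk GCM approach suggested in the comments below, here is a minimal sketch assuming a runtime where `AesGcm` is available (.NET Core 3.0 or later, so newer than what existed when this question was asked). The class name, chunk layout and nonce scheme are assumptions for illustration only:

```csharp
using System;
using System.Security.Cryptography;

static class ChunkedGcm
{
    const int TagSize = 16;    // GCM authentication tag length in bytes
    const int NonceSize = 12;  // nonce length required by AesGcm

    // Derives the "implicit IV": a 12-byte nonce containing the chunk index.
    // The same key/nonce pair must never be reused, so in practice a per-file
    // key (or a per-file identifier mixed into the nonce) is also needed.
    static byte[] NonceForChunk(long chunkIndex)
    {
        byte[] nonce = new byte[NonceSize];
        for (int i = 0; i < 8; i++)
            nonce[NonceSize - 1 - i] = (byte)(chunkIndex >> (8 * i));
        return nonce;
    }

    // Encrypts one plaintext chunk; the stored record is ciphertext followed by
    // its tag, so every record is exactly plaintext.Length + TagSize bytes and
    // the offset of chunk n in the encrypted file is easy to compute.
    public static byte[] EncryptChunk(byte[] key, long chunkIndex, byte[] plaintext)
    {
        byte[] record = new byte[plaintext.Length + TagSize];
        using (var gcm = new AesGcm(key))
        {
            gcm.Encrypt(NonceForChunk(chunkIndex),
                        plaintext,
                        record.AsSpan(0, plaintext.Length),
                        record.AsSpan(plaintext.Length, TagSize));
        }
        return record;
    }

    // Decrypts and verifies one stored record; throws if the chunk was tampered with.
    public static byte[] DecryptChunk(byte[] key, long chunkIndex, byte[] record)
    {
        int cipherLength = record.Length - TagSize;
        byte[] plaintext = new byte[cipherLength];
        using (var gcm = new AesGcm(key))
        {
            gcm.Decrypt(NonceForChunk(chunkIndex),
                        record.AsSpan(0, cipherLength),
                        record.AsSpan(cipherLength, TagSize),
                        plaintext);
        }
        return plaintext;
    }
}
```

Because each chunk carries its own tag, a single chunk can be read, decrypted and verified without touching the rest of the file; the price is 16 extra bytes per chunk and the obligation never to reuse a key/nonce combination across chunks or files.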

akton
  • I'd probably divide the file into blocks of a few kilobytes each, and encrypt each block with GCM, using an implicit IV. – CodesInChaos Sep 06 '12 at 14:38
  • @akton: yes, I am aware of the ECB weaknesses and will definitely be checking out your other suggestions and get back to you! – Andreas Sep 07 '12 at 06:25
  • @CodesInChaos: from what you wrote, I gathered you were suggesting I manually divide the file into chunks and encrypt each one with an implicit IV. This works, but is painfully slow. I also believe I could achieve this using any "fixed size" encryption algorithm. I expect GCM block cipher encryption to allow me to encrypt the entire file in one go and, by knowing the size of each encrypted chunk, to be able to calculate the offset, read a chunk and decrypt it individually. So far I have not found any code samples for this approach. – Andreas Sep 24 '12 at 13:03
  • Encrypting in chunks should not be slow. The problem is with your code, not with the approach. – CodesInChaos Sep 24 '12 at 13:05
  • @CodesInChaos: I ran a few tests and realised that my approach wasn't slow - it was just a much slower process than I expected, so you are totally right. But is it really necessary to split the source data up into chunks and encrypt each one manually? Shouldn't the algorithm encrypt the data in chunks of X bytes anyway, so that as long as I know how big these chunks are I can go to an offset, read X bytes (the encrypted chunk size) and decrypt? – Andreas Sep 25 '12 at 09:32
  • The problem is that in that case GCM will only add a MAC at the very end of the file. So you need to read the whole file to verify a small part. – CodesInChaos Sep 25 '12 at 09:45
  • @CodesInChaos It may be worth adding an answer to this question (1) so Andreas can see it in more detail and mark it as the preferred answer and (2) to ensure the details are accessible and not embedded in comments. – akton Sep 25 '12 at 10:40