iOS - How do I AES decrypt a large file if the file is too large to load all of it into memory?

Question

I know how to AES encrypt and decrypt an NSData, but that requires loading the whole file into memory first.

Say I have a 50 MB encrypted file called data.dat.enc. How can I decrypt it to a file data.dat without first loading it all into memory?

Why not split the data before encrypting, then unencrypt and put it back together? — PRNDL Development Studios, Mar 15 '12 at 17:22
@PRNDLDevelopmentStudios Yes, I suppose I will do that if I have to, but I have multiple large files, and it would be harder to manage a bunch of split up files. — Kyle, Mar 15 '12 at 17:25
You could try compressing the data before the encryption, but I don't know if the compression ratio would be high enough to really matter. Maybe open an SSL connection to a server, upload the encrypted data, server decrypts, and sends back? — PRNDL Development Studios, Mar 15 '12 at 17:28

Rob Napier · Accepted Answer · 2012-04-05T21:33:05.067

15

EDIT: This code has since been expanded into RNCryptor: http://github.com/rnapier/RNCryptor.

RNCryptManager is a good example of how to do this. It comes from the Chapter 11 sample code of iOS5:PTL (iOS 5 Programming Pushing the Limits). Look at:

+ (BOOL)decryptFromStream:(NSInputStream *)fromStream
                 toStream:(NSOutputStream *)toStream
                 password:(NSString *)password
                    error:(NSError **)error;

It assumes that the salt and IV have been prepended to the stream (this is all explained in the book). For some more general discussion on AES encryption, see Properly encrypting with AES with CommonCrypto.

For an example of its use, see CPCryptController.m in the same project.

If there's sufficient interest, I could pull this object out and support it as a stand-alone project rather than just as a piece of sample code. It seems reasonably useful to people. But it's not that difficult to integrate as-is.

The more general answer is that you create a cryptor with CCCryptorCreate and then make calls to CCCryptorUpdate for each block. Then you call CCCryptorFinal to finish things up.
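
The create/update/final pattern described above can be sketched in a language-neutral way. This is only an illustration of the loop shape, written in Python rather than against CommonCrypto: `StreamCryptor`, `process_stream`, and `BlockBufferingPassthrough` are hypothetical names, and the passthrough class performs no encryption at all. It only mimics how a block-mode cryptor may hold back a partial block until the final call.

```python
import io
from typing import BinaryIO, Protocol


class StreamCryptor(Protocol):
    """Hypothetical stand-in for a cryptor handle: update() plays the role of
    CCCryptorUpdate and finalize() the role of CCCryptorFinal."""

    def update(self, chunk: bytes) -> bytes: ...

    def finalize(self) -> bytes: ...


def process_stream(cryptor: StreamCryptor, src: BinaryIO, dst: BinaryIO,
                   chunk_size: int = 64 * 1024) -> int:
    """Push src through cryptor into dst one chunk at a time.

    At most chunk_size bytes of input are held in memory at once.
    Returns the total number of bytes written to dst.
    """
    written = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        out = cryptor.update(chunk)  # may return fewer bytes than it was given
        dst.write(out)
        written += len(out)
    tail = cryptor.finalize()  # flush whatever the cryptor was still buffering
    dst.write(tail)
    return written + len(tail)


class BlockBufferingPassthrough:
    """Toy cryptor for demonstration only: it performs NO encryption.

    It copies data through in whole 16-byte blocks and holds back the
    remainder until finalize(), as a block-mode cryptor might.
    """

    BLOCK = 16

    def __init__(self) -> None:
        self._pending = b""

    def update(self, chunk: bytes) -> bytes:
        data = self._pending + chunk
        cut = len(data) - (len(data) % self.BLOCK)
        self._pending = data[cut:]
        return data[:cut]

    def finalize(self) -> bytes:
        tail, self._pending = self._pending, b""
        return tail
```

With real files, `src` and `dst` would be opened in binary mode (for example `open("data.dat.enc", "rb")` and `open("data.dat", "wb")`), and the toy class would be replaced by an object wrapping an actual AES cryptor.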

edited Apr 05 '12 at 21:33

answered Mar 15 '12 at 21:45

Rob Napier


+1 Can't fault that answer. Could have guessed that there would be an update function somewhere somehow. – Maarten Bodewes Mar 16 '12 at 00:09
Just finished testing it. Thanks so much! It works perfectly :) – Kyle Mar 16 '12 at 03:05
@Rob In my testing, I've found that decrypting an encrypted file with the wrong password can sometimes return `true` from this function (about 0.5% of the time). The resulting "decrypted" file isn't really decrypted, it's just junk data, but the function still returns `true`. Is this an intended behavior? If so, how can I detect if the decryption was truly successful? – Kyle Mar 16 '12 at 19:57
The fact that you're getting a ~0.5% miss suggests that your data is 1 byte smaller than a multiple of 16, right? Without padding, AES-CBC can't detect a decrypt error (AES-CBC does not provide integrity). With 1 byte of padding, there's a 1/256 chance that the padding will just happen to decrypt correctly anyway. The solution is to add and verify an HMAC. I'm reworking this code right now anyway to post on github; this would probably be a nice built-in feature to add. – Rob Napier Mar 16 '12 at 20:46
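
A rough way to see where a false-accept rate of this order comes from is the simulation below. It is a sketch under a simplifying assumption that is mine, not the thread's: decrypting with the wrong key is modelled as producing a uniformly random final block. The function names are hypothetical. Under that model the chance of the block ending in well-formed PKCS#7 padding is roughly 1/256 (about 0.4%), dominated by the single-byte `0x01` case.

```python
import random


def has_valid_pkcs7_padding(block: bytes) -> bool:
    """Return True if a 16-byte block ends in well-formed PKCS#7 padding."""
    n = block[-1]
    if n < 1 or n > 16:
        return False
    # The last n bytes must all carry the value n.
    return block[-n:] == bytes([n]) * n


def false_accept_rate(trials: int, seed: int = 0) -> float:
    """Fraction of random final blocks that pass the padding check.

    Random blocks model (as an assumption) what CBC decryption with a
    wrong key yields for the last block.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        block = rng.getrandbits(128).to_bytes(16, "big")
        if has_valid_pkcs7_padding(block):
            hits += 1
    return hits / trials
```

Because the padding check passes by chance this often, it cannot serve as a password check, which is why a separate integrity check such as an HMAC is needed.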
@RobNapier Yes, some of the files are only 1 byte less than a multiple of 16. Having the HMAC creation and verification built in would be great! Let me know when you post it :) – Kyle Mar 17 '12 at 00:02
I'm doing some heavy refactoring on it, adding tests, etc. You can follow it here: https://github.com/rnapier/RNCryptor. It should be mostly stable in a few days. If there are specific use cases you're looking for, let me know. I'm still refining the API. – Rob Napier Mar 17 '12 at 00:54
FYI, the github code is rapidly approaching semi-stable. The API is firming up, and most of the open issues are documentation and additional tests. Comments, review, testing, and any usage issues all welcome. – Rob Napier Mar 19 '12 at 03:35
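
The "add and verify an HMAC" suggestion from the comments above can also be done in a streaming fashion, so the large file never has to sit in memory. The sketch below is an assumption-laden illustration, not the format used by any particular library: it assumes HMAC-SHA256 computed over the ciphertext, a tag appended to the end of the stream, and a MAC key that is separate from the encryption key. `append_hmac` and `verify_hmac` are hypothetical names.

```python
import hashlib
import hmac
import io
from typing import BinaryIO

TAG_LEN = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def append_hmac(key: bytes, src: BinaryIO, dst: BinaryIO,
                chunk_size: int = 64 * 1024) -> None:
    """Copy ciphertext from src to dst, then append its HMAC-SHA256 tag."""
    mac = hmac.new(key, digestmod=hashlib.sha256)
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        mac.update(chunk)
        dst.write(chunk)
    dst.write(mac.digest())


def verify_hmac(key: bytes, src: BinaryIO, total_len: int,
                chunk_size: int = 64 * 1024) -> bool:
    """Check a stream laid out as ciphertext followed by a 32-byte tag.

    total_len is the full length of the stream in bytes (assumed known,
    e.g. from the file size).
    """
    if total_len < TAG_LEN:
        return False
    remaining = total_len - TAG_LEN
    mac = hmac.new(key, digestmod=hashlib.sha256)
    while remaining > 0:
        chunk = src.read(min(chunk_size, remaining))
        if not chunk:
            return False  # stream ended earlier than total_len promised
        mac.update(chunk)
        remaining -= len(chunk)
    tag = src.read(TAG_LEN)
    # Constant-time comparison avoids leaking how many tag bytes matched.
    return hmac.compare_digest(mac.digest(), tag)
```

Verifying the tag before trusting any decrypted output means a wrong password or a corrupted file is reported as a failure instead of yielding junk data.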

SquareRootOfTwentyThree · Answer 2 · 2012-03-16T07:42:31.170

0

You have two options (and here I describe the encryption process only, but decryption is similar):

Use a stream cipher (like AES-CTR)

You initialize the cipher with a 16 byte key and truly random 16 byte nonce, write the nonce, load the first piece, encrypt it, write the result, load a second piece and so on. Note that you must initialize the cipher only once. The size of the piece can be arbitrary; it does not even need to be the same each time.

Use a block cipher with a one pass chaining mode, for instance AES128-CBC

You initialize the cipher with the 16 byte key, generate a random 16 byte IV, write the IV, write the total length of the file, load the first piece, encrypt it together with the IV, write the result, load a second piece, encrypt using the last 16 bytes of the previous encrypted block as IV, write the result, and so on. The size of the piece must be a multiple of 16 bytes; again, it does not even need to be the same each time. You may need to pad the last block with zeroes.

In both cases

You must compute the cryptographic hash of the original unencrypted file (e.g. using SHA-256) and write it when the encryption is finished. That is pretty easy: you initialize the hash at the very beginning, and feed each block to it as soon as it is loaded (including the nonce/IV and possibly the length field). On the decryption side, you do the same. Finally, you must verify that the computed digest matches the one that came with the encrypted file.
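
The incremental hashing step described here could look roughly like the following. It is a minimal sketch of feeding a hash piece by piece, with the nonce/IV hashed first as the answer suggests; `digest_stream` and its parameters are hypothetical names. It illustrates only the mechanics of incremental hashing. Whether a plain hash of the plaintext is an adequate integrity check is disputed in the comments under this answer, which argue for a keyed check instead.

```python
import hashlib
import io
from typing import BinaryIO


def digest_stream(header: bytes, src: BinaryIO,
                  chunk_size: int = 64 * 1024) -> str:
    """Return the SHA-256 hex digest of header followed by all of src.

    header stands for the nonce/IV (and optional length field); src is
    read one chunk at a time, so memory use stays bounded by chunk_size.
    """
    h = hashlib.sha256()
    h.update(header)  # initialize once, feed the nonce/IV first
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        h.update(chunk)  # feed each piece as soon as it is loaded
    return h.hexdigest()
```

The result is identical to hashing the whole input at once; only the memory footprint differs.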

How can that be done on iOS? I am afraid I am not familiar with the platform, but CCCrypt seems to fit the bill.

EDIT: nonce/IV and length are hashed too.

edited Mar 16 '12 at 07:42

answered Mar 15 '12 at 18:19

SquareRootOfTwentyThree


First of all, there are many issues with using RC4; read the Wikipedia page. RC4 does not use an IV, although some implementations may call some element an IV. The second paragraph, using AES128-CBC, is absolutely fine as a solution, except for the padding with zeros. Just calculating a hash over the plain text is not a good idea at all, if only because you would leak information about said plain text (encrypt the same plain text twice and you can compare the results, to name one issue). – Maarten Bodewes Mar 15 '12 at 21:37
If you post your CBC solution as a separate answer and propose PKCS#7 padding instead of zero padding, I'll be happy to vote for it. This one gets voted down because of the many additional mistakes. – Maarten Bodewes Mar 15 '12 at 21:48
Note that the OP's original code almost certainly uses CBC. It's the default in the iOS libraries. There's almost never a reason to use ECB (the other mode), regardless of whether you're encrypting/decrypting all at once or "as you go." PKCS#7 padding is also the only padding supported by the iOS libraries, so it's likely that it's already in use as well (unless the payload doesn't require padding). – Rob Napier Mar 15 '12 at 22:03
RC4 does indeed have many issues, but if implemented correctly it is still secure. I mentioned it because it is probably the most commonly available decent stream cipher in libraries (more so than CTR mode, which I have now changed the answer to use). I also changed IV to nonce (which, again, is commonly used in RC4). I don't see the problem with zero padding, since I explicitly say that the file length is saved with the IV/nonce. PKCS#7 adds no value (and theoretically exposes you to oracle attacks; check Vaudenay's paper from 2002). – SquareRootOfTwentyThree Mar 15 '12 at 22:33
When pointing to RC4, you should advise people how to use it correctly. Padding oracle attacks don't work without a padding oracle, and to avoid them some kind of integrity check/authentication is required. Just a hash won't work, and your method of performing hashing over the plain text is plain wrong. PS. I have implemented the Vaudenay padding Oracle attack, and used it on XML encryption before those vulnerabilities in webservices were exposed. – Maarten Bodewes Mar 15 '12 at 23:59
I would argue that any crypto primitive can be used incorrectly, be it RC4 or the strongest stream cipher (e.g. what if the nonce is not really random?). In what sense is hashing wrong (it's there as an integrity check)? And what value does PKCS#7 add if the file has the expected length as a header? – SquareRootOfTwentyThree Mar 16 '12 at 00:30
Of course, anything can be used incorrectly, but you cannot point to RC4, a cipher with known weaknesses, and expect a first time user to get it right. Hashing is not wrong, but exposing information about the text you want to keep confidential is. PKCS#7 has the advantage that it is more or less the standard, creating your own scheme with a length prepended (which format? big endian? which size?) is not a good idea when it is not needed. What happens if you get a stream instead of a file with known file size? You are however correct in assuming that cryptographically speaking, it is not wrong. – Maarten Bodewes Mar 16 '12 at 01:18
Even AES has its own weaknesses, and CBC mode even more. As I said, I don't know the platform so I preferred to be as generic as possible to allow a secure enough implementation even with the simplest of the libraries. Rolling your own PKCS#7 may be even trickier than having the length header included. Apart from that, the OP explicitly mentions a file and no need for interoperability. – SquareRootOfTwentyThree Mar 16 '12 at 07:40
