13

I'm thinking of using AES256 CBC + HMAC SHA-256 as a building block for messages that ensures both confidentiality and authentication.

In particular, consider this scenario:

  • Alice is in possession of a public key belonging to Bob (the key exchange and algorithm are outside the scope of this question). Alice has an identifying key K, also shared with Bob, that she can use to identify herself. Only Alice and Bob know the key K.
  • Alice encrypts (nonce || K) using Bob's public key.
  • Bob decrypts the packet and now has K and nonce.
  • Bob uses SHA-256 with SHA256(K || nonce) to yield a K(e) of 256 bits.
  • Bob uses SHA-256 with SHA256(K || nonce + 1) to yield a K(s) of 256 bits.
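
A minimal sketch of the two derivation steps above, using only Python's standard library. The function name `derive_keys` is illustrative, and the reading of "nonce + 1" (the nonce taken as a big-endian integer, incremented, and wrapped to the same length) is an assumption, since the text does not pin it down.

```python
import hashlib
import os


def derive_keys(k: bytes, nonce: bytes) -> tuple[bytes, bytes]:
    """Return (K(e), K(s)) as SHA256(K || nonce) and SHA256(K || nonce + 1)."""
    # Assumption: "nonce + 1" is the nonce read as a big-endian integer plus one,
    # wrapped around so it keeps the same byte length.
    n = int.from_bytes(nonce, "big")
    nonce_plus_1 = ((n + 1) % (1 << (8 * len(nonce)))).to_bytes(len(nonce), "big")
    k_e = hashlib.sha256(k + nonce).digest()         # encryption key, 256 bits
    k_s = hashlib.sha256(k + nonce_plus_1).digest()  # HMAC key, 256 bits
    return k_e, k_s


k = os.urandom(32)      # stand-in for the shared secret K
nonce = os.urandom(16)  # stand-in for the nonce chosen by Alice
k_e, k_s = derive_keys(k, nonce)
```

Both sides run the same function on the same inputs, so Alice and Bob end up with identical K(e) and K(s).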

Now for every packet Bob wishes to send Alice he performs the following:

  • Create a new random 128 bit IV
  • Encrypts the message using the IV and K(e) as the key.
  • Creates a SHA-256 HMAC with K(s) as key and (IV || Encrypted message) as data.
  • Finally sends (IV || HMAC || Ciphertext) to Alice
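
The MAC and framing part of Bob's steps could look roughly like the sketch below. AES-256-CBC is not in Python's standard library, so the ciphertext is treated as already-encrypted bytes coming from whatever crypto package is in use; `build_packet` and the constants are illustrative names, not part of any library.

```python
import hashlib
import hmac
import os

IV_LEN = 16   # 128-bit IV
TAG_LEN = 32  # HMAC-SHA256 output size in bytes


def build_packet(k_s: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    """Frame a message as IV || HMAC || ciphertext, with the HMAC over IV || ciphertext."""
    tag = hmac.new(k_s, iv + ciphertext, hashlib.sha256).digest()
    return iv + tag + ciphertext


k_s = os.urandom(32)         # stand-in for the derived K(s)
iv = os.urandom(IV_LEN)      # fresh random IV for this packet
ciphertext = os.urandom(48)  # stand-in for AES-256-CBC(K(e), IV, message)
packet = build_packet(k_s, iv, ciphertext)
```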

Alice has also calculated K(e) and K(s), and follows the following procedure when receiving data from Bob:

  • Split the message into IV, ciphertext and HMAC.
  • Calculate the HMAC using K(s), IV and ciphertext.
  • Compare HMAC with the HMAC sent. If this matches, Alice considers this message authenticated as a message sent by Bob, otherwise it is discarded.
  • Alice decrypts the message using K(e)
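
Alice's side, up to the point where decryption would start, might be sketched as follows. It assumes the IV || HMAC || ciphertext layout described above; `open_packet` is an illustrative name, and the comparison uses `hmac.compare_digest` so that the check runs in constant time.

```python
import hashlib
import hmac
import os

IV_LEN = 16   # 128-bit IV
TAG_LEN = 32  # HMAC-SHA256 output size in bytes


def open_packet(k_s: bytes, packet: bytes) -> tuple[bytes, bytes]:
    """Verify the HMAC and return (iv, ciphertext); raise ValueError if it fails."""
    if len(packet) < IV_LEN + TAG_LEN:
        raise ValueError("packet too short")
    iv = packet[:IV_LEN]
    tag = packet[IV_LEN:IV_LEN + TAG_LEN]
    ciphertext = packet[IV_LEN + TAG_LEN:]
    expected = hmac.new(k_s, iv + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("HMAC mismatch, packet discarded")
    return iv, ciphertext  # only now would Alice decrypt with K(e)


# Build a packet the way Bob would, then check it the way Alice would.
k_s = os.urandom(32)         # stand-in for the derived K(s)
iv = os.urandom(IV_LEN)
ciphertext = os.urandom(32)  # stand-in for the AES-256-CBC output
packet = iv + hmac.new(k_s, iv + ciphertext, hashlib.sha256).digest() + ciphertext
checked_iv, checked_ciphertext = open_packet(k_s, packet)
```

The point of the ordering is that nothing is decrypted unless the HMAC check passes first.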

Does this protocol ensure that Alice only decrypts messages from Bob, assuming that no one other than Bob can read the encrypted message that Alice sends him encrypted using his public key?

I.e., do messages constructed in this manner ensure both confidentiality and authentication?

Note: If the protocol requires Bob to send multiple messages, this scheme needs a slight modification to avoid replay attacks.

P.S. I am aware of AES-GCM/CCM, but this scheme would work with the basic AES, SHA and HMAC algorithms that are found in most crypto packages. This solution might also be slower, but that too is outside the scope of the question.

Nuoji
  • I don't believe there's any need to include the IV in the HMAC. And it sounds like you're using a packet protocol rather than a stream-based one; beware of replay attacks. – Nick Johnson Mar 08 '11 at 17:05
  • @Nick: You MUST include the IV in the HMAC, if the HMAC is over the encrypted message. Otherwise, you are not protecting the integrity: an attacker could alter the IV, which in effect would alter the first block of the decrypted message (furthermore, this can be done with precise bit flipping abilities). – Thomas Pornin Mar 08 '11 at 17:25
  • @Thomas You're right; I didn't pay attention to the fact that the HMAC was on the encrypted message rather than the plaintext. – Nick Johnson Mar 08 '11 at 18:25
  • As an amusing fact, I added a unit test where I flip a bit on the IV that creates exactly one bit flip on the plaintext. The HMAC validation catches that of course. – Nuoji Mar 09 '11 at 08:19
  • @ThomasPornin yes this happened to me when not including the iv in the hmac! – GGGforce Sep 28 '19 at 16:44
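
The bit-flipping effect discussed in these comments follows from how CBC decrypts the first block: the plaintext block is the block-cipher decryption of the first ciphertext block XORed with the IV. The sketch below models the block-cipher output as fixed opaque bytes (an assumption standing in for real AES, which is not in the standard library) and shows that one flipped IV bit flips exactly the same plaintext bit, and that an HMAC over IV || ciphertext changes as a result.

```python
import hashlib
import hmac
import os


def xor(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


block_decrypt_output = os.urandom(16)  # stand-in for AES_decrypt(K(e), C1)
iv = os.urandom(16)
p1 = xor(block_decrypt_output, iv)     # first plaintext block under CBC

# The attacker flips the lowest bit of the first IV byte.
tampered_iv = bytes([iv[0] ^ 0x01]) + iv[1:]
tampered_p1 = xor(block_decrypt_output, tampered_iv)

# The two plaintext blocks differ in exactly that one bit.
diff = xor(p1, tampered_p1)

# With the IV covered by the HMAC, the tag no longer matches.
k_s = os.urandom(32)
ciphertext = os.urandom(32)  # stand-in for the CBC ciphertext
tag = hmac.new(k_s, iv + ciphertext, hashlib.sha256).digest()
tampered_tag = hmac.new(k_s, tampered_iv + ciphertext, hashlib.sha256).digest()
detected = not hmac.compare_digest(tag, tampered_tag)
```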

3 Answers

19

Basically you are recreating SSL/TLS. This implies the usual caveats about building your own protocol, and you are warmly encouraged to use TLS with an existing library instead of rewriting your own.

That being said, using AES with CBC for encryption, and HMAC for integrity, is sound. There are combined encryption+integrity modes (that you are aware of), and CBC+HMAC is kind of "old school", but it cannot hurt. You are doing things in the "science-approved" way: encrypt, then MAC the encrypted string (and you do not forget the IV: forgetting the IV is the classical mistake).

Your key derivation may be somewhat weak. It is perfect if SHA-256 behaves like a perfect random oracle, but it is known that SHA-256 does not behave like a random oracle (because of the so-called length-extension attack). It is similar to the reason why HMAC is HMAC, with two nested hash function invocations, instead of simply hashing (once) the concatenation of the MAC key and the data. TLS uses a specific key derivation function (which is called "the PRF" in the TLS specification) which should avoid any trouble. That function is built over SHA-256 (actually, over HMAC/SHA-256) and can be implemented around any typical SHA-256 implementation.

(I am not saying that I know how to attack your key derivation process; only that this is a tricky thing to make properly, and that its security may be assessed only after years of scrutiny from hundreds of cryptographers. Which is why reusing functions and protocols which have already been thoroughly examined is basically a good idea.)
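
To give an idea of what an HMAC-based derivation looks like, here is a sketch of HKDF (RFC 5869) with SHA-256. Note that this is not the TLS PRF itself, just another standardized KDF built on HMAC/SHA-256 that needs nothing beyond a plain HMAC implementation. Using the nonce as salt and the labels `b"encryption"` and `b"authentication"` are illustrative choices, not something the question or TLS prescribes.

```python
import hashlib
import hmac
import os

HASH_LEN = 32  # SHA-256 output size in bytes


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract: PRK = HMAC(salt, input keying material)."""
    return hmac.new(salt, ikm, hashlib.sha256).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand: T(i) = HMAC(PRK, T(i-1) || info || i), concatenated and truncated."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested output too long for HKDF")
    okm = b""
    t = b""
    counter = 1
    while len(okm) < length:
        t = hmac.new(prk, t + info + bytes([counter]), hashlib.sha256).digest()
        okm += t
        counter += 1
    return okm[:length]


k = os.urandom(32)      # stand-in for the shared secret K
nonce = os.urandom(16)  # stand-in for Alice's nonce, used here as the salt
prk = hkdf_extract(nonce, k)
k_e = hkdf_expand(prk, b"encryption", 32)      # hypothetical label
k_s = hkdf_expand(prk, b"authentication", 32)  # hypothetical label
```

Different `info` labels give independent-looking keys from the same master secret, which is the property the two-key scheme needs.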

In TLS there are two nonces, called the "client random" and the "server random". In your proposal you only have the "client random". What you lose here, security-wise, is kind of unclear. A cautious strategy would be to include a server random (i.e. another nonce chosen by Bob). The kind of things we want to avoid is when Alice and Bob run the protocol in both directions, and an attacker feeds messages from Alice to Alice herself. Complete analysis of what an attacker could do is complex (it is a whole branch of cryptography); generally speaking, nonces in both directions tend to avoid some issues.

If you send several packets, then you may have some issues about lost packets, duplicated packets ("replay attacks"), and packets arriving out of order. In the context of TLS, this should not "normally" happen because TLS is used over a medium which already ensures (under normal conditions, not counting active attacks) that data is transferred in strict order. Thus, TLS includes a sequence number into the data which goes in the MAC. This would detect any alteration from an attacker, including replay, lost records and record reordering. If possible, you should also use a sequence number.
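
A rough sketch of the sequence-number idea, assuming an implicit 64-bit counter that both sides keep in step and that is fed into the HMAC ahead of IV || ciphertext. The names `tag_for` and `Receiver` are made up for illustration, and this is a simplification rather than the exact TLS record format.

```python
import hashlib
import hmac
import os
import struct


def tag_for(k_s: bytes, seq: int, iv: bytes, ciphertext: bytes) -> bytes:
    """HMAC over (64-bit big-endian sequence number || IV || ciphertext)."""
    return hmac.new(k_s, struct.pack(">Q", seq) + iv + ciphertext, hashlib.sha256).digest()


class Receiver:
    """Accepts each packet only at the sequence number it was authenticated for."""

    def __init__(self, k_s: bytes) -> None:
        self.k_s = k_s
        self.next_seq = 0

    def accept(self, iv: bytes, tag: bytes, ciphertext: bytes) -> bool:
        expected = tag_for(self.k_s, self.next_seq, iv, ciphertext)
        if not hmac.compare_digest(tag, expected):
            return False  # replayed, reordered, lost-predecessor or tampered packet
        self.next_seq += 1
        return True


k_s = os.urandom(32)         # stand-in for the derived K(s)
iv = os.urandom(16)
ciphertext = os.urandom(32)  # stand-in for the CBC ciphertext
tag = tag_for(k_s, 0, iv, ciphertext)  # sender's first packet uses sequence number 0

receiver = Receiver(k_s)
first_try = receiver.accept(iv, tag, ciphertext)  # True: expected sequence number
replay = receiver.accept(iv, tag, ciphertext)     # False: receiver now expects 1
```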

Thomas Pornin
  • Sure ssl/tls provide both secrecy and authentication, but a full PKI has a lot more overhead than an hmac or CMAC mode. (+1 this is solid advice.) – rook Mar 09 '11 at 05:33
  • Is it better for Alice to generate two entirely random keys for encryption instead? I.e. she sends (nonce || K || Ks || Ke) where Ks and Ke are two random 256 bit keys generated by Alice and only used for one single exchange like this. – Nuoji Mar 09 '11 at 12:56
  • @Rook: TLS does not necessarily imply a full PKI. In TLS, the server sends his public key as an X.509 certificate. The client should use the server public key after having made sure that this is the correct key; this _may_ be through validating the certificate, but could also be done in many other ways, e.g. simply using a hardcoded public key without even looking at the certificate that the server sent. Usual TLS implementations separate X.509 validation from the tunnel implementation. – Thomas Pornin Mar 09 '11 at 13:11
  • @Nuoji: what is good is using "independent" keys for the encryption and for the MAC. Generating the two keys randomly is one way to achieve independence. Another is to derive both keys from the same "master key" using a random oracle -- which is what TLS does with its PRF. TLS actually derives _four_ keys, two for each tunnel direction. The PRF accommodates more easily the restrictions on encryption size (with a 1024-bit RSA key, you encrypt at most 117 bytes of data) and also the use of Diffie-Hellman key exchange (where the key is not chosen arbitrarily). – Thomas Pornin Mar 09 '11 at 13:16
  • @Thomas Pornin yeah, you're right. This would be the best way to have privacy and authentication. The one caveat is that he said "If the Alice and Bob does not know each other.", which would imply a PKI and mean that CMAC or OMAC1 wouldn't really apply. – rook Mar 09 '11 at 16:38
  • @Rook - did I write that Alice and Bob do not know each other? I'm pretty sure I wrote that they already share a (private) secret K and that Alice knows Bob's public key. – Nuoji Mar 10 '11 at 08:15
  • @Thomas: let's consider the case where a message consists of some non-encrypted data (header) and encrypted content. Is it ok to calculate and store HMACs for both header and the payload? The reason for 2 HMACs is because the header can be modified (with recalculating its HMAC), but data should be intact. Is it safe to use the same secret key for both HMACs? – mistika Feb 16 '16 at 17:06
2

The answer to the question as stated is no, there is no guarantee that Alice only decrypts messages from Bob, but that's only because you didn't stipulate that only Bob knows K. If Alice and Bob are the only two people who know K, then the crux of the question is whether your key generation protocol is sound. (We can ignore the rest, I believe, because you're just using HMAC-SHA256 and AES256 as they are intended to be used.)

The generation protocol isn't bad, but it can be improved. The accepted way to create keys from shared secrets is to use a "key derivation function". These functions use a hash in a similar way to what you have done here, but they are also purposely slow to inhibit brute force attacks. PBKDF2 seems to be what you want, as it a) can derive 512 bits of key data (or more), and b) can be made up of the primitives you have available; namely, SHA256 and HMAC-SHA256.
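
For illustration, deriving 512 bits with PBKDF2 and splitting them into the two keys could look like the sketch below, using `hashlib.pbkdf2_hmac` from Python's standard library. Treating K as the password and the nonce as the salt, and the iteration count shown, are assumptions made for the example; the right count depends on how much entropy K has and how much time each connection can afford.

```python
import hashlib
import os

ITERATIONS = 100_000  # hypothetical work factor, tune for your setting

k = os.urandom(32)      # stand-in for the shared secret K
nonce = os.urandom(16)  # stand-in for Alice's nonce, used here as the salt

# Derive 64 bytes (512 bits) with PBKDF2-HMAC-SHA256, then split into two keys.
key_material = hashlib.pbkdf2_hmac("sha256", k, nonce, ITERATIONS, dklen=64)
k_e = key_material[:32]  # encryption key K(e)
k_s = key_material[32:]  # HMAC key K(s)
```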

ladenedge
  • OP called _K_ a key, so a KDF should be superfluous. – aaz Mar 08 '11 at 17:09
  • K is a random number shared between Alice and Bob. If the nonce sent by Alice is a random number as well - do I really need to harden K(e) and K(s) against an attack? Is it a bit hard to get at K, considering the random nonce (not a fixed salt) is added? – Nuoji Mar 08 '11 at 17:19
  • A KDF reduces the need to make assumptions about *K*. How many bits of information are in a "key"? Anyway, you've already got a key derivation function - it's just not a standard KDF. Why not use one that has been vetted and has a bit more resistance to brute force attacks? – ladenedge Mar 08 '11 at 17:38
  • Efficiency: no point spending time on PBKDF2 for every connection if it doesn't improve security. (Though using it to get _K_ in the first place does make sense.) – aaz Mar 08 '11 at 17:57
  • Edited the question to clarify that K is shared only between Alice and Bob. – Nuoji Mar 09 '11 at 12:53
1

If you don't want to use PKI, take a look at TLS-PSK. It would seem to solve the exact problem you are solving yourself. See RFC 4279 (and 5487 for additional ciphersuites).

PerGN