RFC 7539 defines its AEAD construction as follows:
    chacha20_aead_encrypt(aad, key, iv, constant, plaintext):
        nonce = constant | iv
        otk = poly1305_key_gen(key, nonce)
        ciphertext = chacha20_encrypt(key, 1, nonce, plaintext)
        mac_data = aad | pad16(aad)
        mac_data |= ciphertext | pad16(ciphertext)
        mac_data |= num_to_8_le_bytes(aad.length)
        mac_data |= num_to_8_le_bytes(ciphertext.length)
        tag = poly1305_mac(mac_data, otk)
        return (ciphertext, tag)
On the other hand, libsodium implements it as follows:
    chacha20_aead_encrypt(aad, key, iv, constant, plaintext):
        nonce = constant | iv
        otk = poly1305_key_gen(key, nonce)
        ciphertext = chacha20_encrypt(key, 1, nonce, plaintext)
        mac_data = aad
        mac_data |= num_to_8_le_bytes(aad.length)
        mac_data |= ciphertext
        mac_data |= num_to_8_le_bytes(ciphertext.length)
        tag = poly1305_mac(mac_data, otk)
        return (ciphertext, tag)
Basically, libsodium does not use padding, and it interleaves data and metadata (the lengths) in its Poly1305 pass. This is very unfriendly to optimization because of block alignment: Poly1305 consumes its input in 16-byte blocks, and once the additional data and its length have been absorbed, the ciphertext that follows does not necessarily start on a block boundary, so you cannot use a highly optimized, interleaved ChaCha20-Poly1305 construct.
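To make the alignment point concrete, here is a small Python sketch that only builds the Poly1305 input for both layouts and reports where the ciphertext starts. It does no encryption or MAC computation. The helper names (`pad16`, `le64`, `mac_data_rfc`, `mac_data_interleaved` and the two offset functions) are my own, not taken from either code base, and I encode the lengths as 8-byte little-endian values in both layouts, following the RFC's prose description of 64-bit lengths.

```python
import struct

BLOCK = 16  # Poly1305 consumes its input in 16-byte blocks


def pad16(data: bytes) -> bytes:
    """Zero bytes needed to bring len(data) up to a multiple of 16."""
    return b"\x00" * (-len(data) % BLOCK)


def le64(n: int) -> bytes:
    """Encode n as an 8-byte little-endian integer."""
    return struct.pack("<Q", n)


def mac_data_rfc(aad: bytes, ciphertext: bytes) -> bytes:
    # Padded layout: aad, padding, ciphertext, padding, then both lengths.
    return (aad + pad16(aad) + ciphertext + pad16(ciphertext)
            + le64(len(aad)) + le64(len(ciphertext)))


def mac_data_interleaved(aad: bytes, ciphertext: bytes) -> bytes:
    # Unpadded layout: each field is immediately followed by its length.
    return aad + le64(len(aad)) + ciphertext + le64(len(ciphertext))


def ciphertext_offset_rfc(aad: bytes) -> int:
    # Offset of the first ciphertext byte inside the padded MAC input.
    return len(aad) + len(pad16(aad))


def ciphertext_offset_interleaved(aad: bytes) -> int:
    # Offset of the first ciphertext byte inside the unpadded MAC input.
    return len(aad) + len(le64(len(aad)))


if __name__ == "__main__":
    aad = bytes(12)          # 12 bytes of associated data (arbitrary example)
    ciphertext = bytes(100)  # 100 bytes of ciphertext (arbitrary example)

    print(ciphertext_offset_rfc(aad) % BLOCK)          # 0: block aligned
    print(ciphertext_offset_interleaved(aad) % BLOCK)  # 4: mid-block
```

With 12 bytes of additional data, the padded layout puts the ciphertext at offset 16, while the unpadded one puts it at offset 20, so the first Poly1305 block that touches ciphertext also contains part of the length field. The ciphertext is aligned in the unpadded layout only when the additional data length is congruent to 8 modulo 16.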
What is the reason behind this decision?