I have an Android (Java) app that uses ChaCha20-Poly1305 to encrypt some strings, and a Python script that decrypts the data.

Here's the code to encrypt data:

byte[] plainText = "This is a sample test".getBytes(StandardCharsets.UTF_8);
byte[] key = "11111111111111111111111111111111".getBytes(StandardCharsets.UTF_8); // 32-byte key
byte[] nonce = "nnnnnnnnnnnn".getBytes(StandardCharsets.UTF_8); // 12-byte nonce

IvParameterSpec ivSpec = new IvParameterSpec(nonce);
SecretKeySpec secretKeySpec = new SecretKeySpec(key, "ChaCha20");
Cipher cipher = Cipher.getInstance("ChaCha20-Poly1305");
cipher.init(Cipher.ENCRYPT_MODE, secretKeySpec, ivSpec);

// Base64-encode the nonce and the encryption result for transport
String nt = java.util.Base64.getEncoder().encodeToString(ivSpec.getIV());
String ct = java.util.Base64.getEncoder().encodeToString(cipher.doFinal(plainText));

Here's the Python script that decrypts it:

from Crypto.Cipher import ChaCha20_Poly1305

# ciphertext holds the Base64-decoded output (ct) of the Java code
chacha_key = '11111111111111111111111111111111'.encode()
nonce = 'nnnnnnnnnnnn'.encode()
chacha_cipher = ChaCha20_Poly1305.new(key=chacha_key, nonce=nonce)
plain_text = chacha_cipher.decrypt(ciphertext)

The problem is that the decrypted plaintext always has garbage bytes at the end:

b'This is a sample test\xcbn\x1c\xbd\xa4/\xfc\xc6X\xf4\x93\xb4\xeb3\xcf\xf5'

By default the string is UTF-8 encoded, and I can't see an issue in the mechanism either. Any suggestions as to why I'm getting these garbage bytes?

KTB
    I'm not saying it's the problem but always provide the full "*algorithm/mode/padding*" specification to Java's `Cipher.getInstance()` method. Try `Cipher.getInstance("ChaCha20-Poly1305/None/NoPadding")` and see if the result is different. – President James K. Polk Jun 26 '23 at 16:57

1 Answer

The Java code returns the ciphertext concatenated with the 16-byte MAC (the authentication tag): ciphertext|MAC.
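
To make that layout concrete, here is a minimal sketch that uses only the standard library. It takes the Base64 string produced by the Java code (the same value used in the decryption example in this answer) and slices it into the two parts. The name `TAG_LEN` is introduced here for illustration only.

```python
from base64 import b64decode

TAG_LEN = 16  # length of the Poly1305 MAC in bytes

# Base64 output of the Java code (ciphertext followed by the MAC)
java_output = b64decode('N/Dbissn5UeWkK0Z9H3sYc0B+uMQHWxE47+rk8zFmAjEKXT+wQ==')

# Everything except the last 16 bytes is ciphertext, the rest is the MAC
ciphertext, tag = java_output[:-TAG_LEN], java_output[-TAG_LEN:]

print(len(java_output))  # 37
print(len(ciphertext))   # 21, the length of b'This is a sample test'
print(len(tag))          # 16
```

Because ChaCha20 is a stream cipher, the ciphertext part has the same length as the plaintext, so the 16 extra bytes are the MAC.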

In PyCryptodome, decryption is done with decrypt_and_verify(), to which the ciphertext and the tag must be passed separately:

from base64 import b64decode
from Crypto.Cipher import ChaCha20_Poly1305

ciphertext = b64decode('N/Dbissn5UeWkK0Z9H3sYc0B+uMQHWxE47+rk8zFmAjEKXT+wQ==') # from the Java Code
chacha_key = '11111111111111111111111111111111'.encode()
nonce = 'nnnnnnnnnnnn'.encode()

chacha_cipher = ChaCha20_Poly1305.new(key=chacha_key, nonce=nonce)
try:
    plain_text = chacha_cipher.decrypt_and_verify(ciphertext[:-16], ciphertext[-16:]) # authenticate and (if successful) decrypt 
    print(plain_text.decode('utf-8')) # This is a sample test
except ValueError:
    print('Authentication failed')

In contrast, decrypt() only decrypts without authentication:

chacha_cipher = ChaCha20_Poly1305.new(key=chacha_key, nonce=nonce)
plain_text = chacha_cipher.decrypt(ciphertext[:-16]) # decrypt without authentication
print(plain_text.decode('utf-8')) # This is a sample test

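The garbage bytes in your result appear because the MAC was not separated from the ciphertext: decrypt() treated the 16 appended MAC bytes as additional ciphertext and decrypted them as well.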
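
This can be checked directly against the output posted in the question. The sketch below only compares byte strings and needs no crypto library. The variable names are illustrative.

```python
expected = b'This is a sample test'
# Output of decrypt() as posted in the question
observed = b'This is a sample test\xcbn\x1c\xbd\xa4/\xfc\xc6X\xf4\x93\xb4\xeb3\xcf\xf5'

# The correct plaintext is there, followed by the surplus bytes
extra = observed[len(expected):]

print(observed.startswith(expected))  # True
print(len(extra))                     # 16, the size of the MAC
```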

Note that in practice, of course, authentication should not be omitted, i.e. decrypt_and_verify() should be applied.

Regarding the Java code, the change suggested in the comment, Cipher.getInstance("ChaCha20-Poly1305/None/NoPadding"), should also be applied to avoid provider-dependent default values (which may differ from the intended ones).

Topaco