How to prevent MITM attacks when implementing E2EE?

Question

I'm working on a project where two clients can send files to each other via web sockets (using Socket.IO). Each chunk is encrypted with AES.

Currently, both clients connect to the server and each generates an RSA public/private key pair on its own device. Each client announces its public key to the server, which forwards it to the other client, where it is stored. Before data is sent, it is encrypted with AES using a random key and a random IV, and the AES key is then encrypted with the other client's public key. The recipient decrypts the AES key with their RSA private key, uses it to decrypt the content, and saves the result to a file on disk.

The issue is that the server could easily replace one client's public key with its own and steal the data. The only solution I can think of is for the clients to contact one another and manually verify their public keys, but I'm not sure how I'd go about automating this process. Services that provide E2EE seem to generate a matching code on each device, but I'm having trouble finding any information about how this is actually implemented. How would two devices generate matching codes without talking to a server or each other in between? And if they do talk, then the server knows the code anyway, right?

I've considered using WebRTC to send the public key from one client to the other without having the data go through the server, but I'd appreciate alternative approaches.

The "matching codes" are not sent by the server; they're computed by each end independently. You must then have some tamper-proof way of verifying the codes. One way is to read the code for your key out loud to your peer over some non-secure voice channel. The peers have to recognize each other's voices for this to be effective. You can also leverage previously authenticated keys to pass new keys. It's a tricky problem. There are other ways, of course. — President James K. Polk, Oct 20 '21 at 20:56
@CherryDT Thank you, that's very helpful. So I'm essentially generating a key that both clients would know because they're the only ones that know how it was derived, and doesn't actually get transferred via the server? The mixed colors example explained it very well. — Xtrendence, Oct 20 '21 at 21:05
@PresidentJamesK.Polk Yes, it's tricky indeed. The bad thing is that you can't just make up an approach either, like they always say to use methods that are already proven otherwise you will 100% mess it up. Ideally I'd want the clients to not have to know each other, and this verification to be as automated as possible. The other person that replied mentioned the Diffie–Hellman key exchange, which seems promising. Haven't looked into it enough, but could the server not interfere with the key exchange and replace one client's key generation method with its own? — Xtrendence, Oct 20 '21 at 21:09
The WebRTC method I mentioned in the original post might be the best approach for me then. If the two clients exchange keys without using the server to send this data across, then the server won't have the opportunity to replace the key with its own or do anything else. Although I could be wrong honestly. — Xtrendence, Oct 20 '21 at 21:21
You're of course welcome to use WebRTC this way, and it may provide some benefits, but I don't think this constitutes what most people consider end-to-end encryption, since WebRTC itself does not provide E2EE nor does it claim to. — President James K. Polk, Oct 20 '21 at 21:28
WebRTC would only be for the key exchange though. From there, the actual data would be encrypted with AES, the AES key would be encrypted with the RSA key, and transferred through the server. Then the client on the other side would decrypt the AES key using their private key, and use the AES key to decrypt the data. Since only the two clients can decrypt the data with their private keys, that is E2EE right? — Xtrendence, Oct 20 '21 at 21:33
Please edit the question to limit it to a specific problem with enough detail to identify an adequate answer. — Community, Oct 21 '21 at 04:36
Using webrtc here would be a chicken-egg problem since WebRTC itself is not authenticated so an attacker controlling the signalling server can do an active MITM — Philipp Hancke, Oct 21 '21 at 06:37
@PhilippHancke That's a good point. By replacing the other client's IP with its own for example? — Xtrendence, Oct 21 '21 at 12:46
IPs and DTLS fingerprints (which are hashes of self-signed certificates) — Philipp Hancke, Oct 21 '21 at 12:48

score 1 · Answer 1 · answered Oct 21 '21 at 11:50

To prevent MITM, users are supposed to "manually compare public key fingerprints through an outside channel", as explained in this article regarding the Signal Protocol.

Usually, this means checking a hexadecimal string over a trusted channel: face to face, by phone, and so on. Depending on your requirements, you might also assume that an attacker cannot access both your tool and the users' email at the same time, and treat email as your trusted channel.
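
To make the comparison concrete, here is a minimal sketch of how each client could derive such a string locally, assuming the public keys are available as serialized bytes. The function names `fingerprint` and `matching_code`, the choice of SHA-256, and the 12-digit length are illustrative assumptions, not the Signal Protocol's actual safety-number algorithm.

```python
import hashlib


def fingerprint(public_key: bytes) -> str:
    """Return a SHA-256 fingerprint of a serialized public key as grouped hex."""
    digest = hashlib.sha256(public_key).hexdigest()
    # Group into 4-character blocks so the string is easier to read aloud.
    return " ".join(digest[i:i + 4] for i in range(0, len(digest), 4))


def matching_code(own_key: bytes, peer_key: bytes, digits: int = 12) -> str:
    """Derive a short numeric code that both peers can compute independently.

    Hypothetical helper: each side hashes both public keys in a fixed order,
    so the codes match only if both sides hold the same pair of keys.
    """
    # Sort so that both peers hash the keys in the same order.
    first, second = sorted([own_key, peer_key])
    # Length-prefix each key so the concatenation is unambiguous.
    material = b"".join(len(k).to_bytes(4, "big") + k for k in (first, second))
    digest = hashlib.sha256(material).digest()
    number = int.from_bytes(digest, "big") % 10 ** digits
    return f"{number:0{digits}d}"
```

Each client calls `matching_code(my_public_key, received_public_key)` and the users compare the results over the outside channel. If the server swapped in its own key for one side, the two clients would hash different key pairs and the codes would almost certainly differ. The code never needs to pass through the server, and a server that computes it anyway gains nothing, because the check depends on the users comparing it over a channel the server cannot tamper with. A 12-digit code is short for illustration; a real deployment would likely want a longer one.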
