This is a tricky situation. I hope you'll eventually be able to replace the entire application with something that's better designed.
There isn't much you can do at the TUN/TAP level, because that layer is too low in the stack to know anything about retransmissions.
There are, however, things you could do in the IP-over-TCP implementation to largely mitigate the retransmission issue. Be aware that I haven't needed to implement such a thing myself, so there may be problems I haven't realized yet. I can, however, explain how the ideas work in theory.
The problem is that once the outer TCP connection loses a single packet, the receiving side is blocked until the lost packet has been retransmitted. This delays the inner packets, which the inner layer may interpret as packet loss, triggering retransmissions at the inner level as well and needlessly consuming extra bandwidth.
## On the receiving side
My best idea for dealing with this is to tweak the receiving side so that it partially bypasses the kernel TCP stack. You still set up the TCP connection through the kernel TCP implementation, just as in the normal case, but on the receiving side you don't actually use the data you receive from the TCP socket. Instead you run a thread or process which constantly reads from the TCP socket and discards everything it receives.
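A minimal sketch of that drain loop in Python, assuming a hypothetical tunnel endpoint address and port; its only job is to keep the kernel's TCP state machine happy (ACKs flowing, window opening) while the data itself is thrown away:

```python
import socket
import threading

def drain(sock):
    # Read and discard everything so the kernel keeps ACKing and the receive
    # window keeps opening; the payload is never used from this socket.
    while sock.recv(65536):
        pass

# Hypothetical outer tunnel endpoint.
outer = socket.create_connection(("192.0.2.10", 4711))
threading.Thread(target=drain, args=(outer,), daemon=True).start()
```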
In order to have packets to deliver to the TUN/TAP interface, you use a raw socket that receives the TCP segments as they appear on the wire. This process can use in-kernel filters to see only the packets it cares about, and it can ignore any excess packets itself if the kernel cannot filter accurately enough. Your process then has to do enough of the TCP reassembly on its own to extract the inner packets, which it can deliver to the TUN/TAP interface.
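As a sketch of the capture side on Linux (needs root): a raw `IPPROTO_TCP` socket hands you a copy of every incoming TCP segment, IP header included. The peer address, port, and `handle_segment()` are assumptions for illustration, the filtering here is done crudely in user space, and IP fragmentation is ignored:

```python
import socket
import struct

TUNNEL_PEER = "192.0.2.10"   # hypothetical address of the other tunnel endpoint
TUNNEL_PORT = 4711           # hypothetical port of the outer TCP connection

def handle_segment(seq, payload):
    # Placeholder: in practice this would feed the reassembly sketch shown
    # after the next paragraph.
    print("got %d bytes at seq %d" % (len(payload), seq))

# Raw socket (Linux, requires CAP_NET_RAW): receives every inbound TCP segment
# with its IP header, independently of what the normal TCP socket does.
raw = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_TCP)

while True:
    pkt = raw.recv(65535)
    ihl = (pkt[0] & 0x0F) * 4                        # IP header length
    src = socket.inet_ntoa(pkt[12:16])               # source address
    sport, dport, seq = struct.unpack_from("!HHI", pkt, ihl)
    if src != TUNNEL_PEER or sport != TUNNEL_PORT:
        continue    # crude user-space filter; a BPF filter could do this in-kernel
    doff = (pkt[ihl + 12] >> 4) * 4                  # TCP header length
    handle_segment(seq, pkt[ihl + doff:])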
What's important here is that when an outer packet is lost, only the inner packets it carried are lost or delayed. Your process can keep reassembling segments after the lost one, extracting the inner packets and delivering them to the TUN/TAP interface. The inner TCP stack may still retransmit a few packets, but nowhere near as many as when the outer TCP connection stalls completely.
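To make that more concrete, here is a minimal reassembly sketch. Everything protocol-specific in it is an assumption: it pretends the tunnel frames each inner packet with a 2-byte big-endian length prefix, it skips ahead once `skip_after` out-of-order segments have piled up behind a hole, it takes the file descriptor of an already-configured TUN device, and it does not handle overlapping or duplicate segments:

```python
import os

class Reassembler:
    """Sketch: reassemble the outer TCP stream, but keep delivering inner
    packets past a hole instead of stalling the way the kernel's TCP would."""

    def __init__(self, tun_fd, skip_after=8):
        self.tun_fd = tun_fd      # fd of an already-configured TUN device
        self.skip_after = skip_after
        self.expected = None      # next outer sequence number we hope to see
        self.buf = {}             # out-of-order segments keyed by sequence number
        self.stream = b""         # contiguous bytes not yet cut into inner packets

    def feed(self, seq, data):
        """Called for every captured segment, e.g. from handle_segment() above."""
        if self.expected is None:
            self.expected = seq
        if data:
            self.buf[seq] = data
        self._drain()

    def _drain(self):
        while True:
            # Consume whatever is contiguous.
            while self.expected in self.buf:
                chunk = self.buf.pop(self.expected)
                self.expected = (self.expected + len(chunk)) & 0xFFFFFFFF
                self.stream += chunk
                self._cut_frames()
            # A hole is blocking progress; once enough later segments have piled
            # up, give up on the missing bytes and resynchronise after the gap,
            # losing only the inner packets that straddled it.
            if len(self.buf) >= self.skip_after:
                self.expected = min(self.buf,
                                    key=lambda s: (s - self.expected) & 0xFFFFFFFF)
                self.stream = b""   # framing is lost across the gap
            else:
                return

    def _cut_frames(self):
        # Assumed framing: 2-byte big-endian length before each inner packet.
        # The real tunnel protocol may differ (see the caveats below).
        while len(self.stream) >= 2:
            length = int.from_bytes(self.stream[:2], "big")
            if len(self.stream) < 2 + length:
                return
            os.write(self.tun_fd, self.stream[2:2 + length])  # deliver inner packet
            self.stream = self.stream[2 + length:]
```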
There are a couple of caveats worth pointing out, which may or may not be obvious:
- If the receive window or congestion window fills up, TCP on the sending side will stall. You cannot prevent that, but you can reduce the risk by ensuring the outer TCP connection supports selective acknowledgements (SACK); see the sketch after this list.
- Depending on the specifics of the tunnel protocol, it may be hard or even impossible to accurately identify packet boundaries after a lost packet. If that turns out to be the case for the protocol you need to implement, you may be out of luck. I would have suggested modifying the protocol, but I understand that's not an option for you.
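Regarding the SACK point: on Linux, SACK is negotiated per connection but enabled host-wide via a sysctl rather than a per-socket option, so a sanity check is about all the application can do. A minimal sketch, assuming the standard Linux sysctl path:

```python
# SACK is controlled host-wide on Linux via net.ipv4.tcp_sack; make sure it
# has not been switched off on the hosts terminating the outer TCP connection.
with open("/proc/sys/net/ipv4/tcp_sack") as f:
    if f.read().strip() != "1":
        raise SystemExit("SACK disabled; enable with: sysctl -w net.ipv4.tcp_sack=1")
```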
## On the sending side
Being able to work around the problem on the receiving side isn't sufficient for packets going in the other direction, where you are the sending side. You cannot prevent the outer TCP connection from stalling at the receiver when a packet is lost.
Instead your best bet is to try to avoid unnecessary retransmissions on the inner connections. If possible, tweak the retransmission timers of the inner TCP connections so that an inner connection waits at least two round-trip times before retransmitting a packet.
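One Linux-specific way to do that, assuming the inner traffic is routed through a tun0 device and that a route for the (hypothetical) inner network 10.8.0.0/24 already exists, is to raise the minimum retransmission timeout on that route with iproute2; the device name, network, and timeout here are all assumptions for the sketch:

```python
import subprocess

# Raise the minimum RTO for inner TCP connections routed through the tunnel,
# so they hold off retransmitting long enough for the outer TCP to recover.
subprocess.run(
    ["ip", "route", "change", "10.8.0.0/24", "dev", "tun0", "rto_min", "2s"],
    check=True,
)
```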
Completely disabling retransmits on the inner TCP connections wouldn't be a good idea, as packets can also be lost before or after the tunnel, and in that case the outer TCP connection cannot retransmit them for you.
A theoretical possibility, though likely a lot of work to implement, is to use the raw socket mentioned above to snoop on the ACK packets coming back from the peer. That way you can deduce which inner packets are still in flight on the outer connection. Every outgoing inner TCP packet would then be checked against the packets in flight, and if it is a retransmit of data the outer TCP connection has not yet had acknowledged, you silently drop the inner retransmit.
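If you do go down that path, the bookkeeping might look roughly like the following sketch. All of it is assumption: inner packets are fingerprinted with a crude whole-packet hash (a real implementation would ignore the fields that change between retransmits), the outer connection's initial sequence number is assumed to have been learned by snooping the handshake, `framed_len` must include the tunnel's framing overhead, and sequence-number wrap-around on very long-lived connections is ignored:

```python
import hashlib

class InFlightTracker:
    """Sketch of the ACK-snooping idea: remember which inner packets are still
    unacknowledged inside the outer TCP stream, and drop inner retransmits."""

    def __init__(self, isn):
        self.isn = isn            # our side's ISN on the outer connection (snooped)
        self.next_offset = 0      # bytes written to the outer socket so far
        self.in_flight = {}       # inner-packet fingerprint -> end offset in stream

    def _fingerprint(self, inner_pkt):
        # Crude stand-in: hash the whole inner packet. A retransmit may not be
        # byte-identical (ACK field, timestamps), so real code must be smarter.
        return hashlib.sha1(inner_pkt).digest()

    def should_send(self, inner_pkt):
        """False if this looks like a retransmit of data the outer TCP
        connection is still delivering; in that case just drop it."""
        return self._fingerprint(inner_pkt) not in self.in_flight

    def sent(self, inner_pkt, framed_len):
        """Record an inner packet after writing framed_len bytes for it."""
        end = self.next_offset + framed_len
        self.in_flight[self._fingerprint(inner_pkt)] = end
        self.next_offset = end

    def acked(self, ack_seq):
        """Call with the ACK number snooped from the peer's segments."""
        acked = (ack_seq - self.isn - 1) & 0xFFFFFFFF   # -1 accounts for the SYN
        self.in_flight = {fp: end for fp, end in self.in_flight.items()
                          if end > acked}
```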
## Ignoring the problem
Chances are that the current application does none of this. It probably just does the TCP-over-TCP part and hopes for the best. And if that hasn't been a problem for you so far, it probably won't become one once you replace one end of the connection with a new implementation of the same protocol.
As such, it may be more productive to just go ahead with the known suboptimal protocol and only fix it if it turns out to cause real problems. That of course depends on what the consequences would be of deploying the reimplementation and running into problems later.