We are currently creating a new Azure Synapse Analytics environment to house our data. For this, we need to connect from Synapse to our on-premise SQL Server databases.
These databases are not reachable from the open internet, they are only accessible from within our own network. For our other applications on Azure, we have setup a hub-spoke model network, where the hub connects to our internal network, and VNets in the spoke are peered to the hub. This way we have been able to connect to on-premise database servers in other applications like AKS.
However, this doesn't seem to work for Azure Synapse. So how to do this then?
What I tried so far
The Azure Data Factory docs contain the following page outlining a guide for making this connection, using a private link service, a load balancer and a couple of Virtual Machines configured to forward traffic on specific ports towards the server using iptables
.
I have implemented this solution, but I don't really trust this. Most of all, the iptables configuration doesn't seem to persist when a VM reboot. This doesn't inspire a lot of trust in this solution, since this doesn't feel like something that should be used in production as is.
At the same time, I can't imagine that connecting to an on-premise database server is such a stretch, that there are no better solutions around. We can't be the only one stumbling onto this problem.
So my question, is there really no other, better, way to set up this connection?