
I have issues connecting a KafkaIO source to brokers available only through a Cloud VPN tunnel.

The tunnel is set up to allow traffic from a specific subnetwork (secure) and routes are set up and working for compute engines in that subnetwork.

When executing the pipeline with the DirectRunner, KafkaIO is able to connect to the brokers, whether through the VPN on a standard compute engine in the secure subnetwork or from a local machine with SSH tunnels set up by sshuttle.

When running the pipeline with the DataflowRunner, connections to the brokers fail with: org.apache.kafka.common.errors.TimeoutException: Timeout expired while fetching topic metadata. The pipeline is executed within the secure subnetwork.

After connecting to the compute engine instance spawned by the job, the following routes are visible:

jgrabber@REDACTED-harness-REDACTED ~ $ ip r
default via 10.74.252.1 dev eth0 proto dhcp src 10.74.252.3 metric 1024                      
default via 10.74.252.1 dev eth0 proto dhcp metric 1024
10.74.252.1 dev eth0 proto dhcp scope link src 10.74.252.3 metric 1024    
10.74.252.1 dev eth0 proto dhcp metric 1024                                   
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown

The IPv4 addresses of the brokers are within a 172.17.0.0/16 (remote) network. The VPN is configured with a remote network range of 172.16.0.0/12.

Could the remote 172.17.0.0/16 network be shadowed by the virtual network that Docker sets up and uses on the worker?
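To make the suspected overlap concrete, here is a minimal sketch that replays longest-prefix route selection over the routes from the `ip r` output above. It only illustrates the question and does not prove that this is the cause of the timeout. The broker address `172.17.0.10` and the helper `select_device` are hypothetical, chosen purely for illustration.

```python
import ipaddress

# Routes copied from the `ip r` output of the Dataflow worker (duplicates dropped).
WORKER_ROUTES = [
    ("0.0.0.0/0", "eth0"),         # default via 10.74.252.1
    ("10.74.252.1/32", "eth0"),    # link-scope route to the gateway
    ("172.17.0.0/16", "docker0"),  # Docker bridge network
]

VPN_REMOTE_RANGE = ipaddress.ip_network("172.16.0.0/12")
DOCKER_BRIDGE = ipaddress.ip_network("172.17.0.0/16")


def select_device(destination, routes=WORKER_ROUTES):
    """Return the device of the most specific route that contains destination.

    Hypothetical helper: a simplified model of longest-prefix matching,
    ignoring metrics, policy routing, and link state.
    """
    addr = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), device)
        for prefix, device in routes
        if addr in ipaddress.ip_network(prefix)
    ]
    _, device = max(matches, key=lambda match: match[0].prefixlen)
    return device


if __name__ == "__main__":
    broker = "172.17.0.10"  # hypothetical broker address inside 172.17.0.0/16
    print(DOCKER_BRIDGE.subnet_of(VPN_REMOTE_RANGE))  # True
    print(select_device(broker))                      # docker0
    print(select_device("172.18.0.10"))               # eth0
```

Under this simplified model, an address in 172.17.0.0/16 matches the docker0 route rather than the default route via eth0, while other addresses in the VPN range 172.16.0.0/12 still leave through eth0.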

  • Hi, have you tried specifying the proper subnetwork using PipelineOptions when starting the Dataflow job: https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/options/DataflowPipelineWorkerPoolOptions.java#L164 – chamikara Jan 29 '18 at 19:02
  • What is 'advertised.listeners' set to on the Kafka servers? Can you telnet to that IP and port from the worker? Also, please post the full stack trace from the worker logs if you can. – Raghu Angadi Jan 31 '18 at 04:41
  • I guess you are mainly asking about the issue with the routing table ('linkdown' for 172.17.0.0/16). I am not very familiar with it. – Raghu Angadi Jan 31 '18 at 04:57
  • @chamikara yes, the pipeline gets executed in the correct environment. – Jonas Grabber Jan 31 '18 at 11:04
