
I have a Flink cluster running on a remote machine, say with IP 10.XX.XX.XX and port 6123. I would like to connect to it from another machine using PyFlink's remote execution environment. I saw the docs https://ci.apache.org/projects/flink/flink-docs-stable/dev/python/table-api-users-guide/table_environment.html but it's not clear. Any pointers, please?

2 Answers


There seems to be no built-in method for doing this programmatically, but I was able to get it working with:

from pyflink.datastream import StreamExecutionEnvironment
from pyflink.java_gateway import get_gateway

gateway = get_gateway()
string_class = gateway.jvm.String
string_array = gateway.new_array(string_class, 0)  # empty array: no extra jar files to ship
stream_env = gateway.jvm.org.apache.flink.streaming.api.environment.StreamExecutionEnvironment
j_stream_execution_environment = stream_env.createRemoteEnvironment(
    "localhost",
    8081,
    string_array
)

# Wrap the Java environment in the Python API wrapper
env = StreamExecutionEnvironment(j_stream_execution_environment)

I believe it should be enough to do this:

./bin/flink run \
      --jobmanager <jobmanagerHost>:8081 \
      --python examples/python/table/batch/word_count.py

See Submitting PyFlink Jobs, which is where I found this example.

David Anderson
  • I shall try. Is there any example like this in Python, similar to the Java one here: https://ci.apache.org/projects/flink/flink-docs-release-1.1/apis/cluster_execution.html ? – Arun Chandramouli Apr 08 '21 at 08:56
  • I don't think you can explicitly force a remote execution environment as you can in Java. But it will use one if that is what is supplied. See the API docs for details of what's possible: https://ci.apache.org/projects/flink/flink-docs-master/api/python/pyflink.datastream.html – David Anderson Apr 08 '21 at 09:21