Update: I spun up an EC2 instance and was able to get the example below to work, which confirms that this is a connectivity issue with Docker on Mac.

Update: I still get this error even when I bring down the Flink server container and Kafka, which leads me to believe this is a connectivity issue.

I recently tried processing a Kafka stream with Python, Apache Beam, and Apache Flink by following a tutorial. Based on the tutorial, I set up Flink with the following command:

docker run --net=host apache/beam_flink1.13_job_server:latest

Doing so results in the following:

Jul 14, 2021 8:40:47 PM org.apache.beam.runners.jobsubmission.JobServerDriver createArtifactStagingService
INFO: ArtifactStagingService started on localhost:8098
Jul 14, 2021 8:40:47 PM org.apache.beam.runners.jobsubmission.JobServerDriver createExpansionService
INFO: Java ExpansionService started on localhost:8097
Jul 14, 2021 8:40:47 PM org.apache.beam.runners.jobsubmission.JobServerDriver createJobServer
INFO: JobService started on localhost:8099
Jul 14, 2021 8:40:47 PM org.apache.beam.runners.jobsubmission.JobServerDriver run
INFO: Job server now running, terminate with Ctrl+C

When running my script with python main.py (shown below), I get the following error:

grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
        status = StatusCode.UNAVAILABLE
        details = "failed to connect to all addresses"
        debug_error_string = "{"created":"@1626301362.091496000","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3009,"referenced_errors":[{"created":"@1626301362.091494000","description":"failed to connect to all addresses","file":"src/core/ext/filters/client_channel/lb_policy/pick_first/pick_first.cc","file_line":398,"grpc_status":14}]}"

Does anyone know of a quick workaround for this? I should note I found this

main.py

import apache_beam as beam
from apache_beam.io.kafka import ReadFromKafka
from apache_beam.options.pipeline_options import PipelineOptions

if __name__ == '__main__':
    options = PipelineOptions([
        "--runner=PortableRunner",
        "--job_endpoint=localhost:8099",
        "--environment_type=LOOPBACK",
    ])

    pipeline = beam.Pipeline(options=options)

    result = (
        pipeline
        | "Read from kafka" >> ReadFromKafka(
            consumer_config={
                "bootstrap.servers": 'localhost:9092',
            },
            topics=['demo'],
            expansion_service='localhost:8097',
        )

        | beam.Map(print)
    )

    pipeline.run()

1 Answer

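--net=host is not supported on Docker Desktop for Mac. Docker Desktop runs containers inside a Linux VM, so host networking attaches the container to the VM's network stack rather than to macOS, and the job server ports (8097, 8098, 8099) are not reachable at localhost from the Mac. This is consistent with the same example working on an EC2 instance, where Docker runs directly on Linux.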
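
One way to confirm this from the host is to check whether the job server ports accept TCP connections at all before running the pipeline. The following is only a minimal sketch using the standard library: the port numbers are taken from the log output in the question, while the helper names port_is_open and check_job_server are made up for illustration and are not part of Beam or Flink.

```python
import socket

# Ports taken from the job server log output in the question.
JOB_SERVER_PORTS = {
    "ArtifactStagingService": 8098,
    "ExpansionService": 8097,
    "JobService": 8099,
}


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


def check_job_server(host: str = "localhost") -> dict:
    """Map each service name to whether its port accepts connections."""
    return {
        name: port_is_open(host, port)
        for name, port in JOB_SERVER_PORTS.items()
    }


if __name__ == "__main__":
    for name, reachable in check_job_server().items():
        status = "reachable" if reachable else "NOT reachable"
        print(f"{name}: {status}")
```

If all three ports report as not reachable while the container log says the services have started, the problem lies between the host and the container rather than in the Beam pipeline itself. An open port only shows that something is listening there; it does not prove that the gRPC services behind it are healthy.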