
I'm trying to set up a Spark standalone cluster on a set of Docker containers in a private cloud. The executor processes, which run on nodes different from the driver's node, cannot connect back to the driver because the host port that is exposed (a randomly assigned port in this case) differs from the port number the driver process advertises. The same problem occurs with the BlockManager service started in each executor / driver: the advertised port is not reachable from outside the container.
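The mismatch is easy to reproduce with plain Docker; the image name and the container port 7078 below are placeholders, not anything from my actual setup:

```shell
# -P publishes every exposed port to a random host port, which is what
# the platform does for us. "my-spark-image" is a placeholder.
docker run -d -P --name spark-driver my-spark-image

# Ask Docker which host port was mapped to the container's port 7078:
docker port spark-driver 7078
# Prints something like 0.0.0.0:49161 -- but the driver still advertises
# 7078, so executors on other hosts dial the wrong port.
```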

Please suggest if there is a configuration in Spark that lets the advertised port number differ from the actual bind port (I couldn't find one so far). The host name / port number visible from the outside can be obtained from environment variables that are set automatically when the container starts, so if such a configuration existed (say, spark.driver.advertisePort / spark.blockManager.advertisePort), I could populate it with those values.
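For illustration, this is roughly how I would wire such a setting up at submit time. `ADVERTISED_HOST` / `ADVERTISED_PORT` stand in for whatever env variables the platform injects, and the two `advertisePort` properties are hypothetical, which is exactly the point of the question:

```shell
# spark.driver.advertisePort and spark.blockManager.advertisePort do NOT
# exist in Spark -- they are the knobs being asked for. The other
# properties (spark.driver.host, spark.driver.port,
# spark.blockManager.port) are real Spark configuration.
spark-submit \
  --conf spark.driver.host="$ADVERTISED_HOST" \
  --conf spark.driver.port=7078 \
  --conf spark.driver.advertisePort="$ADVERTISED_PORT" \
  --conf spark.blockManager.port=7079 \
  --conf spark.blockManager.advertisePort="$ADVERTISED_PORT" \
  ...
```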

Note: we could get around this by using host networking, or by port mappings whose internal and external ports are identical. But neither is allowed on the private cloud infrastructure that I'm trying to deploy the cluster on.
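For comparison, the disallowed workarounds would look like this (7077/7078/4040 are the usual Spark master / driver / UI ports; the image name is a placeholder):

```shell
# Workaround 1 (disallowed): host networking -- the container shares the
# host's network namespace, so every advertised port is reachable as-is.
docker run -d --network host my-spark-image

# Workaround 2 (disallowed): identical internal and external ports, so
# the port the driver advertises is the one actually exposed on the host.
docker run -d -p 7077:7077 -p 7078:7078 -p 4040:4040 my-spark-image
```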

francotirador
  • Have you looked at [spark.driver.port](https://spark.apache.org/docs/latest/configuration.html#networking)? – o_O Sep 10 '22 at 20:42
  • @o_O Yes I have. The issue is, if I set spark.driver.port, the driver will advertise this as the port on which it is running, which is not reachable from the executors because this port is mapped to a random external port (which can be obtained via an env variable). If I set this random port as the value for spark.driver.port, it will fail trying to bind to it, because only the ports that were defined while bringing up the container are allowed for binding. – francotirador Sep 11 '22 at 03:18

0 Answers