I'm trying to set up a Spark standalone cluster on a set of Docker containers in a private cloud. The executor processes, which run on nodes other than the driver's node, cannot connect back to the driver because the host port that is exposed (a randomly assigned port in this case) differs from the port number the driver process advertises. The same problem occurs with the BlockManager service started in each executor and in the driver: the advertised port is not reachable from outside the container.
Is there a Spark configuration that allows the advertised port number to differ from the actual bind port? I couldn't find one so far. The host name and port number that are visible from outside the container are available in environment variables set automatically when the container starts, so if such a configuration exists (say, spark.driver.advertisePort / spark.blockManager.advertisePort) I can populate it from those variables.
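For illustration, here is a minimal sketch of what I have in mind, assuming such settings existed. spark.driver.bindAddress, spark.driver.host, spark.driver.port and spark.driver.blockManager.port are real Spark settings; spark.driver.advertisePort and spark.blockManager.advertisePort are the hypothetical ones I'm asking about, and the EXTERNAL_* environment variable names and the master URL are just placeholders for what my container platform injects:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Minimal sketch, assuming hypothetical "advertise" settings existed.
// EXTERNAL_HOST / EXTERNAL_RPC_PORT / EXTERNAL_BLOCKMGR_PORT are placeholder
// names for the environment variables the container platform sets at startup.
object AdvertisedDriverExample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setMaster("spark://spark-master:7077")   // assumed master URL
      .setAppName("advertise-port-example")
      // Existing settings: bind inside the container, advertise the external host name
      .set("spark.driver.bindAddress", "0.0.0.0")
      .set("spark.driver.host", sys.env("EXTERNAL_HOST"))
      // Existing settings: pin the internal ports so they can be exposed
      .set("spark.driver.port", "35000")
      .set("spark.driver.blockManager.port", "35001")
      // Hypothetical settings (they do not exist in Spark, as far as I can tell):
      // advertise the externally mapped ports instead of the bind ports
      .set("spark.driver.advertisePort", sys.env("EXTERNAL_RPC_PORT"))
      .set("spark.blockManager.advertisePort", sys.env("EXTERNAL_BLOCKMGR_PORT"))

    val sc = new SparkContext(conf)
    // ... job code ...
    sc.stop()
  }
}
```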
Note: we could get around this by using host networking, or by port mapping with identical internal and external ports in the Docker containers, but neither is allowed in the private cloud infrastructure I'm trying to deploy the cluster on.