When I try to connect to a Spark cluster on port 7077 via pyspark in Python, I get a "Connection refused" error.

Running `nmap server_ip` from my local machine (Ubuntu 20.04) shows 4 open ports (80, 8080, 22, 9000).

Running `nc -zv server_ip 7077` gives the output:

nc: connect to server_ip port 7077 (tcp) failed: Connection refused

Then I SSH into the SLES server (this requires being connected to a VPN) and run `ss -tulw`. The output for port 7077 is:

Netid  State      Recv-Q Send-Q Local Address:Port                 Peer Address:Port
tcp    LISTEN     0      128     *:7077                            *:* 

If I understand correctly, this means port 7077 is listening on all addresses. Why am I then getting a "Connection refused" error?
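To make the reading of that `ss` row explicit, here is a minimal Python sketch that splits the row into columns and classifies the bind address. The helper names `parse_listen_line` and `listen_scope` are made up for illustration, and the six-column layout is assumed from the header shown above. Note that this only describes the address the socket is bound to; it says nothing about filtering or forwarding on the path to it.

```python
def parse_listen_line(line: str) -> dict:
    """Split one `ss -tul` row, assuming the columns
    Netid, State, Recv-Q, Send-Q, Local Address:Port, Peer Address:Port."""
    netid, state, recv_q, send_q, local, peer = line.split()
    # The port is whatever follows the last colon of the local address column.
    host, _, port = local.rpartition(":")
    return {"netid": netid, "state": state, "host": host, "port": int(port)}


def listen_scope(host: str) -> str:
    """Classify the bind address of a listening socket."""
    if host in ("*", "0.0.0.0", "::", "[::]"):
        return "all interfaces"
    if host.startswith("127.") or host in ("::1", "[::1]"):
        return "loopback only"
    return "one specific address"


# The row from the output above.
row = "tcp    LISTEN     0      128     *:7077                            *:* "
info = parse_listen_line(row)
print(info["port"], listen_scope(info["host"]))
```

A result of "loopback only" would explain a refused remote connection by itself; "all interfaces", as here, means the cause has to be looked for elsewhere.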

There is no firewall blocking port 7077 on the VPN connection.

Edit:

Output from `iptables -L`:

Chain INPUT (policy ACCEPT)
target     prot opt source               destination         
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:7077
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:7077

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination         
DOCKER-USER  all  --  anywhere             anywhere            
DOCKER-INGRESS  all  --  anywhere             anywhere            
DOCKER-ISOLATION-STAGE-1  all  --  anywhere             anywhere            
ACCEPT     all  --  anywhere             anywhere             ctstate RELATED,ESTABLISHED
DOCKER     all  --  anywhere             anywhere            
ACCEPT     all  --  anywhere             anywhere            
ACCEPT     all  --  anywhere             anywhere            
ACCEPT     all  --  anywhere             anywhere             ctstate RELATED,ESTABLISHED
DOCKER     all  --  anywhere             anywhere            
ACCEPT     all  --  anywhere             anywhere            
DROP       all  --  anywhere             anywhere            

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination         

Chain DOCKER (2 references)
target     prot opt source               destination         
ACCEPT     tcp  --  anywhere             another_ip           tcp dpt:9870
ACCEPT     tcp  --  anywhere             another_ip           tcp dpt:cslistener
ACCEPT     tcp  --  anywhere             another_ip           tcp dpt:7077

Chain DOCKER-INGRESS (1 references)
target     prot opt source               destination         
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:http-alt
ACCEPT     tcp  --  anywhere             anywhere             state RELATED,ESTABLISHED tcp spt:http-alt
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:http
ACCEPT     tcp  --  anywhere             anywhere             state RELATED,ESTABLISHED tcp spt:http
RETURN     all  --  anywhere             anywhere            

Chain DOCKER-ISOLATION-STAGE-1 (1 references)
target     prot opt source               destination         
DOCKER-ISOLATION-STAGE-2  all  --  anywhere             anywhere            
DOCKER-ISOLATION-STAGE-2  all  --  anywhere             anywhere            
RETURN     all  --  anywhere             anywhere            

Chain DOCKER-ISOLATION-STAGE-2 (2 references)
target     prot opt source               destination         
DROP       all  --  anywhere             anywhere            
DROP       all  --  anywhere             anywhere            
RETURN     all  --  anywhere             anywhere            

Chain DOCKER-USER (1 references)
target     prot opt source               destination         
RETURN     all  --  anywhere             anywhere        
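With a listing this long it is easy to miss which chains mention the port. As a rough sketch, the Python below scans `iptables -L` text and reports every rule whose match details contain `dpt:<port>`. The function name `rules_for_port` is made up for illustration, and `SAMPLE` is a shortened excerpt of the output above. It matches the port exactly as printed, so it assumes numeric ports (as `iptables -L -n` prints them); without `-n` some ports appear as service names such as `cslistener`.

```python
import re


def rules_for_port(listing: str, port: str) -> list:
    """Return (chain, target, destination) for each rule matching dpt:<port>."""
    chain = None
    matches = []
    for line in listing.splitlines():
        header = re.match(r"Chain (\S+) \(", line)
        if header:
            chain = header.group(1)
            continue
        fields = line.split()
        # Rule rows: target prot opt source destination [match details...]
        if len(fields) > 5 and f"dpt:{port}" in fields[5:]:
            matches.append((chain, fields[0], fields[4]))
    return matches


SAMPLE = """\
Chain INPUT (policy ACCEPT)
target     prot opt source               destination
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:7077

Chain DOCKER (2 references)
target     prot opt source               destination
ACCEPT     tcp  --  anywhere             another_ip           tcp dpt:9870
ACCEPT     tcp  --  anywhere             another_ip           tcp dpt:7077
"""

for chain, target, destination in rules_for_port(SAMPLE, "7077"):
    print(f"{chain}: {target} -> {destination}")
```

On the excerpt this reports one rule in INPUT and one in the DOCKER chain, the latter with a destination other than the host itself.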
Snow
  • OK, the INPUT rules were not needed (the default policy is ACCEPT), but it looks like you are running Docker on that machine, and that is heavily interfering with your routing; however, this should not be a problem if you are trying to connect to the server itself... – Massimo Nov 26 '20 at 15:55
  • What is the server's IP? – Massimo Nov 26 '20 at 15:56
  • @Massimo are you asking what's the `server_ip`? I don't understand why that matters – Snow Nov 26 '20 at 16:12
  • Because you have an iptables rule forwarding TCP port 7077 to IP address 172.18.0.4, and I'd like to understand what that is and if it can conflict with the same port listening on the server itself. – Massimo Nov 26 '20 at 17:24
  • Your networking seems to be a lot more complex than expected, due to Docker being involved. – Massimo Nov 26 '20 at 17:27
  • @Massimo ah, the server IP is different from that one. The Spark master I am trying to connect to runs on `server_ip` itself. However, there are other Spark workers on different servers. `server_ip` hosts only one worker. – Snow Nov 26 '20 at 19:42
  • Is it possible that your server (whatever you have listening on TCP port 7077) is actively refusing connections from your machine? You said you get a "connection refused" error, while a firewall or routing issue would most likely result in a timeout due to the connection not being answered at all. – Massimo Nov 26 '20 at 20:40
  • @Massimo but the server listening on TCP port 7077 is the one whose `iptables -L` output I posted, i.e. `server_ip`. According to that information, it is listening on that port. Or am I misunderstanding your question? – Snow Nov 26 '20 at 21:37
  • I *think* this is not a firewall issue (although your networking is more complex than it seems); but a server, even if it's listening on all addresses and can actually receive a connection, can still choose to refuse it. When a firewall drops a connection, it usually times out (no answer). When the server chooses to explicitly *refuse* the connection, this results in a TCP RST and thus in a "connection refused" error. This looks like the second case. – Massimo Nov 26 '20 at 21:42
  • @Massimo I understand. If the server is _explicitly refusing_ a connection, is there a way I can verify or debug this? – Snow Nov 26 '20 at 22:16
  • The server should generate logs. And it should have config settings where it's defined which connections it should accept or reject. I'm not familiar with this software, but you should be able to look at its docs. – Massimo Nov 26 '20 at 22:32
  • @Massimo, I'll try that, thank you! – Snow Nov 26 '20 at 22:35
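The distinction drawn in the comments above (a refusal comes back immediately, a silent drop ends in a timeout) can be checked from the client side. The following is a minimal Python sketch, not a definitive diagnostic; `probe` is a made-up helper name and the 3-second timeout is an arbitrary assumption. The demonstration at the bottom uses a throwaway listener on 127.0.0.1 so it runs anywhere; for the case in the question the call would be `probe("server_ip", 7077)`.

```python
import socket


def probe(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt one TCP connection and classify the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The connection attempt was answered with a reset.
        return "refused"
    except socket.timeout:
        # No answer at all within the timeout, typical of silently dropped packets.
        return "timeout"
    except OSError as exc:
        # Anything else, e.g. name resolution failure or no route to host.
        return f"error: {exc}"


if __name__ == "__main__":
    # Local demonstration only: probe a listening port, then the same port after closing it.
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    print(probe("127.0.0.1", port))
    listener.close()
    print(probe("127.0.0.1", port))
```

A "refused" result points at whatever finally receives the packet (no listener at that address, or something actively rejecting), while "timeout" points more towards packets being dropped along the way.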

1 Answer

It looks like the local firewall on the target machine is not allowing incoming connections on TCP port 7077.

This should solve the issue:

iptables -A INPUT -p tcp --dport 7077 -j ACCEPT

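Depending on the existing rules, you might need to use `-I` (insert at the top of the chain) instead of `-A` (append to the end), so that the rule is evaluated before any earlier rule that blocks the traffic: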

iptables -I INPUT -p tcp --dport 7077 -j ACCEPT
Massimo
  • When I run `firewall-cmd --list-ports` I get the message `FirewallD is not running`. Wouldn't this mean that there's no firewall? – Snow Nov 26 '20 at 15:20
  • I added the rule, but the error is still the same – Snow Nov 26 '20 at 15:25
  • FirewallD is a firewall management service, not the actual firewall. Please add the output of `iptables -L` to your question. – Massimo Nov 26 '20 at 15:30
  • Also, please check that on both the source and the destination machine; it's also possible that *your* machine doesn't allow outbound connections on TCP port 7077. – Massimo Nov 26 '20 at 15:34
  • I checked for my machine and outbound connections like so: `time nmap -p 7077 portquiz.net`. `State` shows as `open`. – Snow Nov 26 '20 at 15:49