
I have a Django/Nginx/Gunicorn web server ("web02") and an Nginx file server ("fs02") that is used to store user images. When a user uploads images through the web site, they are saved to the file server via a directory that is cross-mounted from the file server via NFS. I build my servers using an Ansible playbook that provisions each server and then configures the file server first and the web server second. When I initially build my servers, NFS works perfectly. However, if I rebuild and reconfigure just the file server (for example, if it crashes and I need to rebuild and restore it), NFS doesn't work. In that situation my web server is unable to see the exported directory on the file server. I have confirmed this two ways:

# From web02
$ sudo rpcinfo -u fs02 mountd
rpcinfo: RPC: Unable to receive; errno = Connection refused
program 100005 version 0 is not available

$ sudo showmount -e fs02
rpc mount export: RPC: Unable to receive; errno = Connection refused

If I turn off the firewall on my file server and rerun both of the above commands, they run successfully and I can mount the file system. But if I re-enable the firewall, both commands fail again. What is baffling is that the firewall rules I enable when I rebuild the file server are identical to the rules enabled when the file server is initially built, since the rules file is generated by my Ansible playbook. Here are those rules:

*filter
# Allow all loopback (lo0) traffic and reject traffic
# to localhost that does not originate from lo0.
-A INPUT -i lo -j ACCEPT
-A INPUT ! -i lo -s 127.0.0.0/8 -j REJECT

# Allow ping.
-A INPUT -p icmp -m state --state NEW --icmp-type 8 -j ACCEPT

# Allow SSH connections.
-A INPUT -p tcp --dport 22 -m state --state NEW -j ACCEPT

# Allow HTTP and HTTPS connections from anywhere
# (the normal ports for web servers).
-A INPUT -p tcp --dport 80 -m state --state NEW -j ACCEPT
-A INPUT -p tcp --dport 443 -m state --state NEW -j ACCEPT

# Allow rsync from the web server
-A INPUT -p tcp -s <web-server-ip-addr> --dport 873 -m state --state NEW,ESTABLISHED -j ACCEPT
-A OUTPUT -p tcp --sport 873 -m state --state ESTABLISHED -j ACCEPT

# Allow NFS from the web server
-A INPUT -s <web-server-ip-addr> -p tcp -m multiport --dport 111,2049 -j ACCEPT
-A INPUT -s <web-server-ip-addr> -p udp -m multiport --dport 111,2049 -j ACCEPT

# Allow inbound traffic from established connections.
# This includes ICMP error returns.
-A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT

# Log what was incoming but denied (optional but useful).
-A INPUT -m limit --limit 5/min -j LOG --log-prefix "iptables_INPUT_denied: " --log-level 7

# Reject all other inbound.
-A INPUT -j REJECT

# Log any traffic which was sent to you
# for forwarding (optional but useful).
-A FORWARD -m limit --limit 5/min -j LOG --log-prefix "iptables_FORWARD_denied: " --log-level 7

# Reject all traffic forwarding.
-A FORWARD -j REJECT

COMMIT

Here is the result of doing "iptables -L" on the file server after applying the above rules:

Chain INPUT (policy ACCEPT)
target     prot opt source               destination
ACCEPT     all  --  anywhere             anywhere
REJECT     all  --  loopback/8           anywhere             reject-with icmp-port-unreachable
ACCEPT     icmp --  anywhere             anywhere             state NEW icmp echo-request
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:ssh state NEW
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:http state NEW
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:https state NEW
ACCEPT     tcp  --  li470-156.members.linode.com  anywhere             tcp dpt:rsync state NEW,ESTABLISHED
ACCEPT     tcp  --  li470-156.members.linode.com  anywhere             multiport dports sunrpc,nfs
ACCEPT     udp  --  li470-156.members.linode.com  anywhere             multiport dports sunrpc,nfs
ACCEPT     all  --  anywhere             anywhere             state RELATED,ESTABLISHED
LOG        all  --  anywhere             anywhere             limit: avg 5/min burst 5 LOG level debug prefix "iptables_INPUT_denied: "
REJECT     all  --  anywhere             anywhere             reject-with icmp-port-unreachable

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination
LOG        all  --  anywhere             anywhere             limit: avg 5/min burst 5 LOG level debug prefix "iptables_FORWARD_denied: "
REJECT     all  --  anywhere             anywhere             reject-with icmp-port-unreachable

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
ACCEPT     tcp  --  anywhere             anywhere             tcp spt:rsync state ESTABLISHED

The file server knows to export the shared directory to the web server:

# /etc/exports on fs02
/var/www/mysite.com <web-server-ip-addr>/32(rw,no_root_squash,subtree_check)

The web server knows how to mount the shared directory from the file server:

# /etc/fstab on web02
/dev/sda     /               ext4    errors=remount-ro 0       1
/dev/sdb     none            swap    sw              0       0
<file-server-ip-addr>:/var/www/mysite.com /var/www/mysite.com nfs rw 0 0

Is there some other step I'm not aware of?

UPDATE

Here are the firewall rules if I run the command "sudo iptables -L -n -v":

Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 ACCEPT     all  --  lo     *       0.0.0.0/0            0.0.0.0/0           
    0     0 REJECT     all  --  !lo    *       127.0.0.0/8          0.0.0.0/0            reject-with icmp-port-unreachable
    1    84 ACCEPT     icmp --  *      *       0.0.0.0/0            0.0.0.0/0            state NEW icmptype 8
   10   592 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp dpt:22 state NEW
    3   144 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp dpt:80 state NEW
    0     0 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp dpt:443 state NEW
    0     0 ACCEPT     tcp  --  *      *       45.79.66.59          0.0.0.0/0            tcp dpt:873 state NEW,ESTABLISHED
    6   364 ACCEPT     tcp  --  *      *       45.79.66.59          0.0.0.0/0            multiport dports 111,2049
   15  1196 ACCEPT     udp  --  *      *       45.79.66.59          0.0.0.0/0            multiport dports 111,2049
 2508 1101K ACCEPT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            state RELATED,ESTABLISHED
   41  1941 REJECT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            reject-with icmp-port-unreachable

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 LOG        all  --  *      *       0.0.0.0/0            0.0.0.0/0            limit: avg 5/min burst 5 LOG flags 0 level 7 prefix "iptables_FORWARD_denied: "
    0     0 REJECT     all  --  *      *       0.0.0.0/0            0.0.0.0/0            reject-with icmp-port-unreachable

Chain OUTPUT (policy ACCEPT 1983 packets, 259K bytes)
 pkts bytes target     prot opt in     out     source               destination         
    0     0 ACCEPT     tcp  --  *      *       0.0.0.0/0            0.0.0.0/0            tcp spt:873 state ESTABLISHED

They show the correct IP address for web02.

I also disabled the log limit directive and checked /var/log/syslog and /var/log/messages on fs02, and I don't see anything pertinent. However, I checked /var/log/messages on web02 and I see this message:

<timestamp> debian kernel: nfs: server 45.79.65.48 not responding, timed out

This message repeats every five minutes. I suspected this was the IP address of an earlier build of the file server, since I can see it's a linode.com address if I run "whois" on it. More interesting, if I do a "grep -Rn 45.79.65.48 /etc" I see this address in the /etc/mtab file. I see now that this is the previous file server's IP address: I neglected to unmount the file server's directory before I destroyed and rebuilt the file server. I did "sudo umount -l /var/www/mysite.com" to unmount it. I then did "sudo mount -a" on the web server, and now I can see that the file server directory is mounted on the web server. However, if I re-run the "sudo rpcinfo -u fs02 mountd" command on the web server, I still get the "Connection refused" message. I don't see how I can get that message if I'm now seeing the cross-mounted directory. I've been up all night working on this, so maybe I'm tired and missing something.
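
The stale-mount check described above can be sketched as a small shell helper. This is only an illustration: the function name `stale_nfs_mounts` is hypothetical, and it assumes the mount table uses the usual `server:/path mountpoint fstype ...` layout with an IPv4 address or hostname as the server.

```shell
#!/bin/sh
# Hypothetical helper: reads mount-table lines (the /etc/mtab or /proc/mounts
# format) on stdin and prints NFS mounts whose server is not the expected one.
stale_nfs_mounts() {
    expected="$1"   # current file server address, e.g. the one in /etc/fstab
    awk -v want="$expected" '
        # Fields: device mountpoint fstype options dump pass
        $3 ~ /^nfs4?$/ {
            split($1, dev, ":")          # device is "server:/exported/path"
            if (dev[1] != want) print dev[1], $2
        }
    '
}

# Usage on web02 (the address is a placeholder, as in the fstab above):
#   stale_nfs_mounts "<file-server-ip-addr>" < /proc/mounts
```

Any line it prints is a mount that still points at an old server address and would need a "umount -l" followed by "mount -a", as was done by hand here.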

Jim
  • What do you see with `showmount -e nfs_server_ip`? If I were going to automate nfs config and rebuilding of servers, I would specify setting the fsid in the export and the mount just in case you have more than one export, but that is just an issue I run into a lot. Anything in syslog? – Aaron Mar 17 '17 at 15:34
  • I showed in the first code block above what happens if I run the showmount command. It just confirms what rpcinfo said. There's nothing useful in syslog. I thought that perhaps there might be some type of stale cache issue with the fs02 firewall since turning the firewall off resolves the problem. But doing "sudo iptables -F" on fs02 and then re-enabling the firewall rules didn't fix things. – Jim Mar 17 '17 at 15:40
  • Are you sure `li470-156.members.linode.com` is resolving to the IP you need it to? – iwaseatenbyagrue Mar 17 '17 at 15:42
  • Yes. Although I didn't mention it in my question I did ping that DNS name and confirmed that it resolved to the IP address of web02. – Jim Mar 17 '17 at 15:53
  • `iptables` will be happy to tell you what it's not letting through. Remove that limiting code from the log line, then retry the `rpcinfo` from the client, and see what turns up in the logs. You're going to have a problem with the client talking to `mountd`, but we haven't even got that far yet; making `rpcinfo` work is step 1. And in future, `iptables -L -n -v` produces much more usable output (and doesn't omit interface information which can be crucial, though in this case I'm not expecting that to be an issue). – MadHatter Mar 17 '17 at 15:57
  • Verify that the NFS server process is running and listening on a TCP port? – Shane Madden Mar 17 '17 at 16:53
  • @MadHatter Please see my update above. I'm going to research this mtab. Shane Madden, I ran "sudo service nfs-kernel-server status" and confirmed it's "active (running)" on the file server. – Jim Mar 17 '17 at 17:52
  • I'm beginning to suspect name resolution. Can you verify on the client that `ping fs02` returns the right address (I don't care whether you can ping it, I'm just using that as a quick way to test practical name resolution). – MadHatter Mar 17 '17 at 22:13
  • Yes, ping returns the correct address. – Jim Mar 18 '17 at 03:18
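
The comment above about the client talking to `mountd` can be checked with a rough sketch like the following. The rules in the question open only ports 111 and 2049 to web02, while `mountd` normally registers with the portmapper on a port of its own unless it has been pinned to a fixed one. The helper name `unlisted_rpc_ports` and the `ALLOWED_PORTS` variable are hypothetical; the script only filters the output of `rpcinfo -p` and assumes its usual five-column layout (program, vers, proto, port, service).

```shell
#!/bin/sh
# Hypothetical helper: reads `rpcinfo -p` output on stdin and prints every
# registered service/proto/port whose port is NOT in the allowed list.
ALLOWED_PORTS="111 2049"   # the ports opened for web02 in the rules above

unlisted_rpc_ports() {
    awk -v allowed="$ALLOWED_PORTS" '
        BEGIN {
            n = split(allowed, a, " ")
            for (i = 1; i <= n; i++) ok[a[i]] = 1
        }
        # Data rows look like: program vers proto port service
        # (the header row is skipped because its first field is not numeric)
        $1 ~ /^[0-9]+$/ && !($4 in ok) { print $5, $3, $4 }
    ' | sort -u
}

# Usage, run on fs02 itself or from web02 while the firewall is off:
#   rpcinfo -p fs02 | unlisted_rpc_ports
```

If `mountd` shows up in that list, `rpcinfo -u fs02 mountd` and `showmount -e fs02` would be rejected by the final REJECT rule even though ports 111 and 2049 are open. One possible explanation for the mount itself still working, stated here as an assumption rather than something confirmed in the thread, is that the client negotiates NFSv4, which needs only port 2049.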
