
I've set up a bare-metal cluster and want to provide different types of shared storage to my applications. One of them is an S3 bucket that I mount via goofys in a pod, which then exports it via NFS. I then use the NFS client provisioner to mount that share and automatically provision volumes for pods.
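To make the setup concrete, this is roughly the chain (the bucket, paths, namespace and names below are placeholders, not my exact manifests):

```sh
# Inside the exporter pod: FUSE-mount the bucket with goofys, then export it over NFS
goofys my-bucket /exports/s3
exportfs -o rw,no_root_squash,insecure '*:/exports/s3'

# In the cluster: point the NFS client provisioner at the exporter's Service
helm install stable/nfs-client-provisioner \
  --set nfs.server=nfs-exporter.storage.svc.cluster.local \
  --set nfs.path=/exports/s3
```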

Leaving performance aside, the issue is that the NFS client provisioner mounts the NFS share via the node's OS, so when I set the server name to the NFS pod's service, that name is passed on to the node, and the mount fails because the node has no route to the service/pod.

The only solution so far has been to expose the service as a NodePort, block external connections to that port with ufw on the node, and configure the client provisioner to connect to 127.0.0.1:<nodeport>.
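Roughly what that workaround looks like (the port number, namespace and names are examples):

```sh
# Expose the exporter's Service on a node port (say it lands on 32049)
kubectl -n storage patch svc nfs-exporter -p '{"spec":{"type":"NodePort"}}'

# Block the node port from outside with ufw; traffic from the node itself still gets through
ufw deny in on eth0 to any port 32049 proto tcp

# Point the provisioner at localhost; I pass the non-default port via nfs.mountOptions
# (assuming the chart exposes that value)
helm upgrade nfs-client stable/nfs-client-provisioner \
  --set nfs.server=127.0.0.1 \
  --set nfs.path=/exports/s3 \
  --set nfs.mountOptions="{port=32049}"
```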

Is there a way for the node to reach a cluster service using the service's DNS name?

khc
Assis Ngolo
  • "because it has no route to the service/pod": It should be possible to access the pod using the pod's cluster IP, or you should be able to create a service in front of that pod, and use the service to connect to your pod. I don't know the calico specifics though. It should also be possible to access an individual pod using the pod name. Do you have dns installed on this cluster? – Burak Serdar Nov 11 '19 at 18:14
  • I do have a service in front of the pod. In fact, I can mount the NFS share in other pods that I launch in the cluster, but it seems like the [NFS provisioner](https://github.com/helm/charts/tree/master/stable/nfs-client-provisioner) mounts the share by running systemd-run on the node, and it times out. If I try the mount directly in the node, it also times out, so: `mount -t nfs service.namespace.svc.cluster.local:/ temp` works from a pod, but does not work from the node or with the nfs client provisioner – Assis Ngolo Nov 11 '19 at 19:11
  • Looks like even though I can see all the calico interfaces on the node, there is no route to the calico network from the node – Assis Ngolo Nov 11 '19 at 19:20
  • Ugly, but maybe you can add an alias for that name into /etc/hosts? If that works, then you can start searching for a more permanent solution. – Burak Serdar Nov 11 '19 at 19:22
  • k8s services of type ClusterIP don't get sticky IPs, so the dns name will point to a different IP whenever it's rescheduled. The only solution that works now is the NodePort service type, which makes the service accessible on every node at the same port, so I can always reach it via localhost:port on any node and I can secure it by only allowing internal traffic to that port. It looks like the solution might be in the [calico docs](https://docs.projectcalico.org/v2.6/usage/external-connectivity), in the Inbound connectivity/orchestrator specific section, but it ends there, so I'll keep looking – Assis Ngolo Nov 11 '19 at 19:39
  • The service IP won't change unless you recreate it, right? – Burak Serdar Nov 11 '19 at 19:42
  • Yes, just realized that's true; however, it does mean I'll need to set up that alias on every node, and that won't scale. – Assis Ngolo Nov 11 '19 at 19:46
  • You *can* try setting resolv.conf to point to the k8s dns. Then the node will resolve the name. – Burak Serdar Nov 11 '19 at 19:52
  • So it turns out it's a DNS resolution issue. It seems I can actually access the services by their IP, so if I use the IP in the NFS client config it does work, but the service name does not. This is good in a way, because I don't have to do any manual node config, so it's more scalable. Now I'll just have to figure out why the DNS name isn't resolving from the node (a sketch of the checks is below these comments). Thanks for the help! – Assis Ngolo Nov 12 '19 at 09:59
  • @AssisNgolo Have you solved your problem? If you found an answer, please post it as an answer to your question and mark it as accepted, so that anyone from the community with the same question will find the answer there. – Jakub Nov 12 '19 at 11:48
  • I wouldn't say I found an answer; it's a workaround for an issue with my cluster, which shouldn't be a long-term solution, and I'm still going to come back to it and fix it properly. But I'll add what I found. – Assis Ngolo Nov 12 '19 at 17:35
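A sketch of the checks discussed in the comments above (10.96.0.10 stands in for the cluster DNS service IP, and the service name is an example):

```sh
# Find the cluster DNS service IP
kubectl -n kube-system get svc kube-dns

# From the node: querying cluster DNS directly resolves the name,
# but the node's own resolver knows nothing about *.cluster.local
dig @10.96.0.10 nfs-exporter.storage.svc.cluster.local +short

# Burak's resolv.conf suggestion, assuming systemd-resolved on the node
resolvectl dns eth0 10.96.0.10
resolvectl domain eth0 '~cluster.local'
```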

2 Answers


I've managed to get around my issue by configuring the NFS client provisioner to use the service's ClusterIP instead of the DNS name: the node is unable to resolve the name, but it does have a route to the IP. Since the IP stays allocated unless I delete the service, this scales, but of course it can't be automated easily, because redeploying the NFS server helm chart will change the service's IP.
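In practice that amounts to something like the following (release, namespace and service names are examples from my description above, not exact values):

```sh
# Look up the exporter Service's ClusterIP and feed it to the provisioner
NFS_IP=$(kubectl -n storage get svc nfs-exporter -o jsonpath='{.spec.clusterIP}')

helm upgrade --install nfs-client stable/nfs-client-provisioner \
  --set nfs.server="$NFS_IP" \
  --set nfs.path=/exports/s3
```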

Assis Ngolo

I'd suggest you configure a domain name for the NFS service IP on your external DNS server, then point your nodes at that domain name to reach the NFS service. As for the cluster IP of the NFS service, you can pin it in your helm chart with a customized values file.
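A sketch of the pinning part (the IP must lie inside your cluster's service CIDR; the chart path, key names and addresses below are assumptions, so adjust them for the chart you actually deploy the NFS server with):

```sh
# Hypothetical values file pinning the NFS Service's ClusterIP
cat > nfs-values.yaml <<'EOF'
service:
  clusterIP: 10.96.100.10   # example address inside the service CIDR
EOF

helm upgrade --install nfs-server ./my-nfs-server-chart -f nfs-values.yaml

# Then add an A record for that IP on the external DNS server your nodes use, e.g.
#   nfs.internal.example.com. IN A 10.96.100.10
```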

Kun Li