0

Q: I wish to SSH into a DC/OS public agent to mount my file share with Docker credentials, so I can deploy Docker via Marathon. How can I ssh directly into this agent node? Without going through master?


Backstory: I did a vanilla DC/OS installation on Azure. I got two nodes provisioned (a master and an agent). I installed Marathon on the master.

First, I tried to deploy a container from an image/repo I created on Azure Container Registry, through Marathon. It failed because of CPU resource not being satisfied; that's partly understandable because it seems like Marathon sucks up the entire CPU of the master node. But I couldn't figure out how to make Marathon notice that there was another node around - the public agent node. The public agent node is running nothing.

Second, I figured I can just use the "Service" interface provided on the DC/OS layer itself (which I believe is just a UI layer for marathon or similar).

This time, it accurately recognizes the agent node and that there is compute available on it. But to make it pull from my private registry, I need to put my Docker credentials on this node. Here's where I get stuck: I can't SSH to the agent node to mount the shared storage (which is already mounted on the master). Since this node is provisioned through the virtual machine scale set, I really can't figure out the right inbound NAT rules and network security configuration to map to this node and get me a reliable FQDN and port that will allow me to SSH in and run cifs. Honestly, DC/OS should have taken care of this for me, since I am doing the most standard thing.

I tried this, but it isn't sufficient/correct (even though it creates the rule):

az network lb inbound-nat-rule create --resource-group production --lb-name <lb-name> --name NATRule --protocol TCP --frontend-port 2200 --backend-port 22

(All elaborate VMSS videos from Microsoft are for the old interface, and this idea of port range mapping, which I can't seem to figure out from the CLI. Plus, the portal is still in progress when it comes to inbound NAT rules)

I am new to the Azure and DC/OS world (moving resources from AWS), so I'd appreciate the help.


UPDATE: Fwiw, turns out I tried the in-preview DC/OS on Azure service, as opposed to DC/OS on Azure Container Service, which is slightly unstable still. Launch containers through the "Services" interface on main DC/OS instead of on Marathon.

Varun Arora

2 Answers

1

I really can't figure out the right inbound NAT rules and network security configuration to map to this node and get me a reliable FQDN and port that will allow me to SSH in and run cifs.

For now, we can add an inbound NAT rule to the VMSS load balancer via CLI 2.0, but CLI 2.0 can't specify the target NIC, so we can't use a NAT rule to SSH into VMSS instances.

If you have only one instance in this VMSS, you can add a load balancer rule to SSH into it: add a probe on port 22 and a load balancer rule for port 22. After that, you can SSH to the VMSS public IP address on port 22.
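The probe and rule described above could be created roughly as follows. This is only a sketch: the names `sshProbe` and `sshRule` are made up for illustration, the resource group and load balancer name are placeholders, and it assumes the load balancer has a single frontend IP and a single backend pool (otherwise `--frontend-ip-name` and `--backend-pool-name` would be needed). The helper only prints the `az` commands so they can be reviewed before running them.

```shell
# Print the two az commands for a TCP/22 probe and a 22 -> 22 load balancer rule.
# $1 = resource group, $2 = load balancer name (both placeholders).
lb_ssh_cmds() {
  echo "az network lb probe create --resource-group $1 --lb-name $2 --name sshProbe --protocol tcp --port 22"
  echo "az network lb rule create --resource-group $1 --lb-name $2 --name sshRule --protocol tcp --frontend-port 22 --backend-port 22 --probe-name sshProbe"
}

lb_ssh_cmds production "<lb-name>"
```

The network security group attached to the scale set probably also has to allow inbound TCP 22 for this to work.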

Another way to log in to a DC/OS node is via the master: SSH to the master first, then SSH from there to the public agent. Here is a case that discusses how to log in to a DC/OS agent via the master; please refer to it.
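If the master hop is acceptable, the two steps can be collapsed into one command with OpenSSH's `-J` (ProxyJump) option, available in OpenSSH 7.3 and later. The sketch below is an assumption-laden example: the user name `azureuser`, the master port `2200`, and both host values are placeholders to replace with the values from your own deployment. The helper prints the command rather than running it.

```shell
# Placeholders (assumptions): replace with your own user, hosts and port.
MASTER="azureuser@<master-fqdn>"
MASTER_PORT=2200
AGENT="azureuser@<agent-private-ip>"

# Build a one-hop SSH command: -J takes [user@]host[:port] of the jump host.
# $1 = master user@host, $2 = master SSH port, $3 = agent user@host
jump_cmd() {
  echo "ssh -J $1:$2 $3"
}

jump_cmd "$MASTER" "$MASTER_PORT" "$AGENT"
```

The agent address here is its private IP inside the virtual network, which is reachable from the master even when no inbound NAT rule exists for the scale set.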

After that, we can follow this article to mount the Azure file share on your cluster nodes.
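Once logged in to the agent, mounting an Azure file share over CIFS usually looks something like the command below. This is a sketch, not the linked article's exact steps: the storage account, share name, mount point and `<storage-key>` are placeholders, and the mount point is assumed to exist already (for example via `sudo mkdir -p /mnt/share`). The helper prints the command so the values can be checked first.

```shell
# Print the CIFS mount command for an Azure file share.
# $1 = storage account name, $2 = share name, $3 = local mount point
mount_cmd() {
  echo "sudo mount -t cifs //$1.file.core.windows.net/$2 $3 -o vers=3.0,username=$1,password=<storage-key>,dir_mode=0777,file_mode=0777"
}

mount_cmd "<storage-account>" "<share-name>" /mnt/share
```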

By the way, we can also create containers via the DC/OS UI; please refer to it.

Jason Ye
  • You can modify NAT rules used for a scale set in CLI with "az network lb inbound-nat-pool update" – sendmarsh Jun 12 '17 at 17:40
  • Thank you so much for your thoughtful reply, @Jason. So the container creation via DC/OS UI fails for the same `insufficient resources` reason again. I followed all steps verbatim from your link. – Varun Arora Jun 13 '17 at 04:59
  • @VarunArora can you show me your docker image settings about CPU? – Jason Ye Jun 13 '17 at 05:13
  • Sure @Jason, here: `"cpus": 1, "mem": 64, "disk": 0, "instances": 1,` – Varun Arora Jun 13 '17 at 05:16
  • @VarunArora As far as I know, `"cpus": 1` means 100% CPU, please try 0.7 or less. – Jason Ye Jun 13 '17 at 05:25
  • @JasonYe-MSFT, I tried a bunch of things, from 0.1 to 0.8, nothing worked. They were all more than the available amount `0.0` (because the master is taking 100% of the CPU usage). The public agent is lying unused. – Varun Arora Jun 13 '17 at 05:28
  • @VarunArora can you create images on agent? we can't create images on master by default. – Jason Ye Jun 13 '17 at 05:31
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/146479/discussion-between-jason-ye-msft-and-varun-arora). – Jason Ye Jun 13 '17 at 05:40
  • @JasonYe-MSFT Oh that is my goal, to create it on the agent :) But I don't know how to make Marathon notice that I want it on my agent. It is probably asking DC/OS: "do you have any resources for me?", and DC/OS is probably saying, not on this node, son. It isn't saying to Marathon: "got another node chilling, here you go" – Varun Arora Jun 13 '17 at 05:41
1

Can you please describe what you are trying to achieve? You do describe what you have done and it all seems rather. There seem to be two things here:

1) Use ACR with DC/OS - https://learn.microsoft.com/en-us/azure/container-service/container-service-dcos-acr

2) Provide access to a public container - see https://learn.microsoft.com/en-us/azure/container-service/container-service-enable-public-access

If these don't give you what you need please edit your question to describe what you are trying to achieve.

rgardler
  • Thank you very much, @rgardler. I updated the question. With regard to link (1), yes, I am able to mount the file share on the master node. I want to do it on the public agent. And for link (2), it gives access to the container application port, not the agent node VM itself. I need to SSH into the agent to mount a file share, which will allow me to launch containers there. (Not relevant to my question, but I am trying to open port 8983 for a Solr container based on your link (2), and despite following the steps on a DC/OS service container exposing the port w/ hostPort, it isn't working) – Varun Arora Jun 13 '17 at 05:39
  • I take back the 8983 comment about exposing a port. It takes a while before the update completes, but it eventually works. Thanks for that! Can I just do that with port 22 to SSH to the VM? – Varun Arora Jun 13 '17 at 06:13