
Note: we use Puppet standalone (master-less) i.e. puppet apply.

Usually, when deploying a web application, there are a number of back-end services and applications that run behind the front-facing application: things like the database, search server, caching server, other internal services, etc. These services need not listen on the public network interfaces. Instead, they can listen on a private network interface, and all the applications can communicate over that securely. This is something that I already do.

The issue arises when you want to deploy these services automatically. We use Puppet for infra provisioning. When these services are deployed, we depend on facts to pick up things like ipaddress and hostname. Depending on where your machine is hosted, the names of the interfaces differ. For example, the interfaces on machines provided by SoftLayer are bond0, bond1, etc., and those on machines provided by DigitalOcean are eth0, eth1, etc. Out of these, let's say bond0 and eth0 are public interfaces and bond1 and eth1 are private.

Ideally, we should use the same Puppet scripts for provisioning infrastructure no matter where we are provisioning. We also use hiera for picking up default values for classes. So I would like to have facts like ipaddress_public and ipaddress_private available, which I can then use however I want in any Puppet class. The fact should hide the gory details of figuring out where the machine is (SoftLayer, DigitalOcean, AWS, etc.) and give me the right value for the job. Alternatively, I could create a hierarchy level for the infrastructure provider in hiera and have different defaults for different providers.

The issue is that I don't know how to figure out the provider for a particular machine. For example, if I give you a machine to run Puppet on, is there a reliable way to figure out whether it is running on SoftLayer, DigitalOcean, AWS, etc.? How do you guys solve problems like these?

vaidik

1 Answer


It's not as easy as it seems at first. In the case of AWS, there are custom facts that tell you that you are on AWS, for example:

# facter -p | grep ^ec2 |wc -l
33

The public IP is stored in the 'ec2_public_ipv4' fact. So, it's easy to detect AWS.

But on DigitalOcean, there is nothing inside the VM itself that indicates it's running on DigitalOcean. The only interesting fact I see is:

# facter -p | grep kvm
virtual => kvm

Amazon uses xenhvm. If SoftLayer uses something other than xen/kvm, then you can use that fact as a starting point. Of course, this method isn't very robust, because any of these providers can change its virtualization technology at some point, which may render all your VMs on that provider inoperable.
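As a rough sketch of that heuristic, the detection could be wrapped in a small Ruby custom fact. The fact name `cloud_provider`, the helper `guess_provider`, and the provider labels are all made up for illustration, and treating every KVM guest as DigitalOcean is only an assumption that holds if no other provider you use runs KVM. Note also that `ec2_public_ipv4` may be missing on EC2 instances without a public address.

```ruby
# Hypothetical helper: guess the provider from two existing fact values.
# The labels and the "kvm means DigitalOcean" rule are assumptions.
def guess_provider(ec2_public_ipv4, virtual)
  # An EC2 metadata fact being present suggests AWS.
  return 'aws' unless ec2_public_ipv4.nil? || ec2_public_ipv4.to_s.empty?
  # Assumption: among the providers in use, only DigitalOcean reports kvm.
  return 'digitalocean' if virtual == 'kvm'

  'unknown'
end

# Register the fact only when the file is loaded by Facter.
if defined?(Facter)
  Facter.add(:cloud_provider) do
    setcode do
      guess_provider(Facter.value(:ec2_public_ipv4), Facter.value(:virtual))
    end
  end
end
```

With puppet apply, a file like this would typically live under a module's lib/facter directory so that it is picked up as a custom fact.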

What I would suggest is to write your own custom fact, which takes into account all the knowledge you have about the different cloud providers you use, and then decides which IPs to expose to your scripts. Unfortunately, there is no other way.
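A minimal sketch of such a fact, using the interface names from the question (bond0/bond1 on SoftLayer, eth0/eth1 on DigitalOcean). It assumes a hypothetical `cloud_provider` fact that already resolves to a provider label; the table `INTERFACES` and the helper `interface_for` are illustrative names, not part of Facter. The per-interface `ipaddress_<interface>` facts are the ones Facter already provides.

```ruby
# Assumed mapping of provider label to interface roles, taken from the
# interface names mentioned in the question. Extend it as needed.
INTERFACES = {
  'softlayer'    => { 'public' => 'bond0', 'private' => 'bond1' },
  'digitalocean' => { 'public' => 'eth0',  'private' => 'eth1' }
}.freeze

# Hypothetical helper: interface name for a provider and role, or nil.
def interface_for(provider, role)
  INTERFACES.fetch(provider, {})[role]
end

# Register ipaddress_public and ipaddress_private when loaded by Facter.
if defined?(Facter)
  %w[public private].each do |role|
    Facter.add("ipaddress_#{role}") do
      setcode do
        # cloud_provider is a hypothetical fact, not shipped with Facter.
        iface = interface_for(Facter.value(:cloud_provider), role)
        # Returning nil leaves the fact unresolved for unknown providers.
        iface && Facter.value("ipaddress_#{iface}")
      end
    end
  end
end
```

Classes and hiera data can then refer to ipaddress_private without caring which provider the machine runs on.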

Jakov Sosic