I'm using the distributed compute framework Bacalhau[0]. The pattern for setting up a cluster is the following:

$ curl -sL https://get.bacalhau.org/install.sh | bash
[...output...]
$ bacalhau serve
To connect another node to this private one, run the following command in your shell:
bacalhau serve --node-type compute --private-internal-ipfs --peer /ip4/10.158.0.2/tcp/1235/p2p/QmeEoVj8wyxMxhcUSr6p7EK1Dcie7PvNeXCVQny15Htb1W --ipfs-swarm-addr /ip4/10.158.0.2/tcp/46199/p2p/QmVPFmHmruuuAcEmsGRapB6yDDaPxhf2huqa9PhPVEHK8F

(Doing this in a production-friendly way involves systemd; I have left that out here.)

What I'd like to do is have a Google managed instance group that watches a Cloud Pub/Sub topic (not covered here) and creates a new instance when a message arrives. The problem is that the peering string is only known after the first instance starts. My initial thought is to start one instance, capture the output, and write it to a common location that every other instance can read from.
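A minimal sketch of the "capture the output" step, assuming the connect command is printed on a single line exactly as in the sample output above. The function names `extract_connect_command` and `extract_flag` are hypothetical helpers, not part of Bacalhau:

```python
import re

# Assumption: the connect command appears on one line and contains --peer,
# as in the sample `bacalhau serve` output shown above.
CONNECT_RE = re.compile(r"^\s*(bacalhau serve .*--peer\s+\S+.*)$", re.MULTILINE)


def extract_connect_command(serve_output):
    """Return the full 'bacalhau serve ... --peer ...' line, or None if absent."""
    match = CONNECT_RE.search(serve_output)
    return match.group(1).strip() if match else None


def extract_flag(command, flag):
    """Return the value that follows `flag` in the command, or None."""
    parts = command.split()
    if flag in parts:
        idx = parts.index(flag)
        if idx + 1 < len(parts):
            return parts[idx + 1]
    return None
```

The extracted command (or just the `--peer` and `--ipfs-swarm-addr` values) is what would then be written to whichever shared location is chosen.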

I've thought about the following patterns:

  1. Create an instance template that checks a central endpoint (KV store?) for this information
  2. Create an instance template that reads from a GCS bucket for this information
  3. Something else?

I've read this piece[1] about leader election using GCS, but can I force GCS as the locking mechanism? Or do I need to use a whole library[2]? Or is there another solution? I can use any managed service on GCP to accomplish this.

My preference would be NOT to use Go, but rather a non-compiled language (e.g. Python) to accomplish this.

[0] https://docs.bacalhau.org/quick-start-pvt-cluster

[1] https://cloud.google.com/blog/topics/developers-practitioners/implementing-leader-election-google-cloud-storage

[2] https://pkg.go.dev/github.com/hashicorp/vault/physical/gcs

aronchick

2 Answers

One approach is to store the peering string in the instance metadata server (see the GCP documentation).

GCP provides an instance metadata server that allows you to store and retrieve metadata for your instances. When you create a new instance, you can set the peering string as metadata on the instance using the gcloud command-line tool or the Google Cloud API:

gcloud compute instances add-metadata INSTANCE_NAME --metadata PEERING_STRING=VALUE

To read the metadata from within your Bacalhau startup script:

curl -H "Metadata-Flavor: Google" "http://metadata.google.internal/computeMetadata/v1/instance/attributes/PEERING_STRING"

If you specifically want a Python script, make the same HTTP request with the requests library, sending the same Metadata-Flavor: Google header to the same URL.
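A minimal Python sketch of the same lookup. It uses the standard library's `urllib` in place of `requests` so that it runs without extra packages; the helper names are hypothetical. Note that this URL returns the attributes of the instance making the request, so the value has to be set on the instance that reads it:

```python
import urllib.request

# Same endpoint as the curl command above; only reachable from inside a GCE instance.
METADATA_BASE = (
    "http://metadata.google.internal/computeMetadata/v1/instance/attributes/"
)


def build_metadata_request(key):
    """Build the GET request for one metadata attribute, with the required header."""
    return urllib.request.Request(
        METADATA_BASE + key,
        headers={"Metadata-Flavor": "Google"},
    )


def read_metadata(key, timeout=5):
    """Fetch the attribute value as text; raises urllib.error.URLError on failure."""
    request = build_metadata_request(key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()
```

In a startup script, `read_metadata("PEERING_STRING")` would return the value set with the `gcloud compute instances add-metadata` command above.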

Jishan Shaikh

An alternative to the first answer would be to use GCP Secret Manager.

In essence, the initial node would write the needed information to a Secret and the code processing the Pub/Sub message would pull that information from the Secret in order to use it to add new nodes.

The reason I suggest this is that, as far as I can tell, this information would let an attacker join your cluster maliciously, so I would treat it as protected. Storing it in the metadata of the first instance means anybody with fairly low-level permissions can read it, and therefore potentially add compromised nodes to your cluster.
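The write-then-read flow described above could be sketched in Python roughly as follows. This assumes the `google-cloud-secret-manager` package is installed, that the secret itself already exists, and that the nodes' service accounts are allowed to add and access versions; the project ID, secret ID, and helper names are placeholders of my own:

```python
def make_client():
    """Create a Secret Manager client (needs: pip install google-cloud-secret-manager)."""
    # Imported lazily so the helpers below can also be exercised with a stub client.
    from google.cloud import secretmanager

    return secretmanager.SecretManagerServiceClient()


def secret_path(project_id, secret_id):
    """Resource name of the secret that holds the peering string."""
    return f"projects/{project_id}/secrets/{secret_id}"


def publish_peering_string(client, project_id, secret_id, peering_string):
    """Run on the first node: store the peering string as a new secret version."""
    # Assumption: the secret was created beforehand (e.g. with gcloud or Terraform).
    return client.add_secret_version(
        request={
            "parent": secret_path(project_id, secret_id),
            "payload": {"data": peering_string.encode("utf-8")},
        }
    )


def fetch_peering_string(client, project_id, secret_id):
    """Run on a joining node: read the latest version of the secret."""
    name = f"{secret_path(project_id, secret_id)}/versions/latest"
    response = client.access_secret_version(request={"name": name})
    return response.payload.data.decode("utf-8")
```

The first node would call `publish_peering_string(make_client(), ...)` once its peering string is known, and each new node would call `fetch_peering_string(make_client(), ...)` before running `bacalhau serve`.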

Mac