I'm using the distributed compute framework Bacalhau[0]. The pattern for setting up a cluster is the following:
$ curl -sL https://get.bacalhau.org/install.sh | bash
[...output...]
$ bacalhau serve
To connect another node to this private one, run the following command in your shell:
bacalhau serve --node-type compute --private-internal-ipfs --peer /ip4/10.158.0.2/tcp/1235/p2p/QmeEoVj8wyxMxhcUSr6p7EK1Dcie7PvNeXCVQny15Htb1W --ipfs-swarm-addr /ip4/10.158.0.2/tcp/46199/p2p/QmVPFmHmruuuAcEmsGRapB6yDDaPxhf2huqa9PhPVEHK8F
(Doing this in a production-friendly format involves using systemd; I've excluded that here.)
What I'd like to do is have a Google managed instance group that watches a Cloud Pub/Sub topic (not covered here) and creates a new instance when a message arrives. The problem is that the peering string is only known after the first instance starts. My initial thought is to start one instance, capture its output, and write it to a common location that every subsequent instance can read from.
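For the capture step, the join command can be scraped straight out of the `bacalhau serve` output. A rough sketch (the sample output is copied from the transcript above; the exact output format may change between Bacalhau versions):

```python
import re
from typing import Optional

# Sample output copied from the `bacalhau serve` run above.
SAMPLE_OUTPUT = """\
To connect another node to this private one, run the following command in your shell:
bacalhau serve --node-type compute --private-internal-ipfs --peer /ip4/10.158.0.2/tcp/1235/p2p/QmeEoVj8wyxMxhcUSr6p7EK1Dcie7PvNeXCVQny15Htb1W --ipfs-swarm-addr /ip4/10.158.0.2/tcp/46199/p2p/QmVPFmHmruuuAcEmsGRapB6yDDaPxhf2huqa9PhPVEHK8F
"""


def extract_peer_command(serve_output: str) -> Optional[str]:
    """Pull the ready-to-run join command out of the serve output."""
    match = re.search(r"^bacalhau serve --node-type compute.*$",
                      serve_output, re.MULTILINE)
    return match.group(0) if match else None
```

The first instance would run `bacalhau serve`, pipe its output through this, and publish the extracted command to whatever common location the rest of the group reads from.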
I've thought about the following patterns:
- Create an instance template that checks a central endpoint (KV store?) for this information
- Create an instance template that reads from a GCS bucket for this information
- Something else?
I've read this piece[1] about leader election using GCS, but can I use GCS itself as the locking mechanism? Or do I need to pull in a whole library[2]? Or is there another solution? I can use any managed GCP service to accomplish this.
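On using GCS itself as the lock: GCS supports generation preconditions, and `if_generation_match=0` turns an object upload into an atomic create-if-absent, which is enough for a one-shot election with no extra library. A minimal sketch, assuming the google-cloud-storage client is installed and the instances' service account can write to the bucket (the bucket and object names here are hypothetical placeholders):

```python
# Sketch only: a GCS object as a create-once lock. Assumes the
# google-cloud-storage package; bucket/object names are hypothetical.
from typing import Optional


def try_become_leader(bucket_name: str, peer_command: str) -> bool:
    """Atomically create the lock object; only the first caller succeeds.

    if_generation_match=0 tells GCS to accept the write only if the
    object does not already exist, so exactly one instance "wins".
    """
    # Imports kept local so the sketch loads without the package installed.
    from google.api_core.exceptions import PreconditionFailed
    from google.cloud import storage

    blob = storage.Client().bucket(bucket_name).blob("bacalhau/peer-command")
    try:
        blob.upload_from_string(peer_command, if_generation_match=0)
        return True   # won the election: run the first `bacalhau serve`
    except PreconditionFailed:
        return False  # lost the race: join as a compute node instead


def wait_for_peer_command(bucket_name: str,
                          timeout_s: float = 300.0) -> Optional[str]:
    """Poll the bucket until the leader has published the join command."""
    import time
    from google.cloud import storage

    blob = storage.Client().bucket(bucket_name).blob("bacalhau/peer-command")
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if blob.exists():
            return blob.download_as_text()
        time.sleep(5)
    return None
```

Each instance in the template would call `try_become_leader`; the winner serves as the first node and publishes the command extracted from its output, while the losers call `wait_for_peer_command` and execute what they get back.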
My preference would be NOT to use Go, but a non-compiled language (e.g. Python).
[0] https://docs.bacalhau.org/quick-start-pvt-cluster
[2] https://pkg.go.dev/github.com/hashicorp/vault/physical/gcs