I am using Terraform to create a Dataproc cluster that uses a GCP Cloud SQL instance as the Hive metastore. The Terraform project creates the cluster and all its prerequisites (network, service account, Cloud SQL instance and user, etc.).
cloud-sql-proxy.sh is provided to assist with this; however, I can't get it to work. When the cluster is created, cloud-sql-proxy.sh fails with the error:
nc: connect to localhost port 3306 (tcp) failed: Connection refused
I've banged my head against the wall trying to figure out why but can't get to the bottom of it so am hoping someone here can help.
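The error itself says that nothing accepted a TCP connection on localhost:3306 at the moment nc tried. As a minimal sketch for repeating that check by hand (for example over SSH on a cluster node), something like the following could be used. The function names port_open and wait_for_port are hypothetical helpers made up for illustration, not part of cloud-sql-proxy.sh, and the /dev/tcp redirection assumes the script is run with bash.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from cloud-sql-proxy.sh): succeeds if something
# accepts a TCP connection on host:port. Relies on bash's /dev/tcp redirection.
port_open() {
  local host="$1" port="$2"
  # The subshell opens the connection on fd 3 and closes it again on exit.
  (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null
}

# Hypothetical helper: poll the port a fixed number of times before giving up.
# Arguments: host, port, attempts (default 5), delay in seconds (default 1).
wait_for_port() {
  local host="$1" port="$2" attempts="${3:-5}" delay="${4:-1}"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    if port_open "$host" "$port"; then
      echo "port ${port} on ${host} is accepting connections"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "nothing is listening on ${host}:${port} after ${attempts} attempts" >&2
  return 1
}
```

Calling wait_for_port localhost 3306 10 2 would show whether anything ever starts listening on that port, which helps separate "the proxy is slow to start" from "the proxy never started at all".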
I've hosted the Terraform project at https://github.com/jamiekt/democratising-dataproc. Reproducing the problem is very easy; follow these steps:
- Install terraform if you haven't already
- Install gcloud if you haven't already
- Create a new GCP project
- Enable the Cloud Dataproc API for your new project
gcloud auth application-default login #creates a file containing credentials that terraform will use
git clone git@github.com:jamiekt/democratising-dataproc.git && cd democratising-dataproc
export GCP_PROJECT=name-of-project-you-just-created
make init
make apply
That should successfully spin up a network, a subnetwork, a Cloud SQL instance, a couple of storage buckets (one of them containing cloud-sql-proxy.sh), a service account, and a firewall, and then fail when attempting to create the Dataproc cluster.
If anyone could take a look and tell me why this is failing, I'd be very grateful.