1

I'm trying to connect to a Cloud SQL instance with a public IP from a GKE cluster using cloud-sql-proxy. I created my cluster with the following commands:

gcloud services enable compute.googleapis.com
gcloud services enable container.googleapis.com
gcloud container clusters create my-cluster \
  --disk-size=10GB \
  --machine-type=e2-small \
  --node-locations=us-central1-b,us-central1-c,us-central1-f \
  --num-nodes=1 \
  --preemptible \
  --release-channel=regular \
  --workload-pool=my-production.svc.id.goog \
  --zone=us-central1-f \
  --no-enable-master-authorized-networks \
  --enable-ip-alias \
  --enable-private-nodes \
  --master-ipv4-cidr 172.16.0.32/28

I created my Cloud SQL with the following commands:

gcloud sql instances create my-db \
  --database-version=POSTGRES_12 \
  --region=us-central1 \
  --storage-auto-increase \
  --storage-size=10 \
  --storage-type=SSD \
  --tier=db-f1-micro

I also set up a service account with these commands:

gcloud iam service-accounts create my-service-account
gcloud iam service-accounts add-iam-policy-binding \
  --role=roles/iam.workloadIdentityUser \
  --member="serviceAccount:my-production.svc.id.goog[default/my-service-account]" \
  my-service-account@my-production.iam.gserviceaccount.com
gcloud projects add-iam-policy-binding my-production \
  --member serviceAccount:"my-service-account@my-production.iam.gserviceaccount.com" \
  --role "roles/cloudsql.client"

The sidecar container for cloud-sql-proxy in the pod is set up like this:

      - name: cloud-sql-proxy
        image: gcr.io/cloudsql-docker/gce-proxy:1.20.2
        command:
          - "/cloud_sql_proxy"
          - "-instances=my-production:us-central1:my-db=tcp:5432"
          - "-term_timeout=20s"

Despite all that, when my app tries to connect to the Cloud SQL instance, I can see the following error in cloud-sql-proxy logs:

2021/03/19 21:30:02 couldn't connect to "my-production:us-central1:my-db": dial tcp MY_DB_PUBLIC_IP:3307: connect: connection timed out

I checked and the pod has Internet access (I can access www.google.com) so it should be able to connect to Cloud SQL's public IP. I can use cloud-sql-proxy without problems on my laptop and connect to the instance there. What am I missing? What else can I check?

I found "GKE private cluster and cloud sql proxy connection", but I already have the SQL Admin API enabled. "Connection between Private GKE and Cloud SQL" only covers a GKE cluster that has no Internet access.

Juliusz Gonera

4 Answers

1

It seems like you are doing everything correctly. Do you have a firewall rule that might be blocking outbound traffic to your Cloud SQL instance on port 3307?

kurtisvg
  • I did not set up any custom firewall rules, but perhaps GCP sets something up automatically that gets in the way in this particular case? – Juliusz Gonera Mar 21 '21 at 14:33
  • This is what `gcloud compute firewall-rules list` shows me: https://gist.github.com/jgonera/94b5da681048ac5a8e7d8c22d963d9ad – Juliusz Gonera Mar 21 '21 at 15:33
  • I tried creating a cluster without `--enable-private-nodes` (and `--master-ipv4-cidr`) and then everything works (connecting to the same Cloud SQL instance). However, I'd love to use `--enable-private-nodes` so that I'm not charged for external IPs for nodes. I don't need them because my pods are web servers and I use Google Cloud Load Balancer to route traffic to them. – Juliusz Gonera Mar 22 '21 at 03:05
  • Glad you were able to figure it out! It seems like you didn't actually have external access, but were perhaps still able to resolve a host for google.com? Taking a look at [this page](https://cloud.google.com/kubernetes-engine/docs/concepts/private-cluster-concept), it says "If you want to provide outbound internet access for certain private nodes, you can use Cloud NAT or manage your own NAT gateway." Otherwise, I'd recommend just using private IP with Cloud SQL since you are already using a private cluster. – kurtisvg Mar 22 '21 at 16:25
  • I did have external access. I was able to open a socket connection and fetch `/` from www.google.com. It's still unclear for me why I can't even open a socket connection to my Cloud SQL instance on its public IP. – Juliusz Gonera Mar 22 '21 at 16:47
  • Are you able to fetch non-`google.com` resources? From the page I linked above, it sounds like it's intended that your nodes won't have external access if you use `--enable-private-nodes`. It's possible the `google.com` domain is still reachable via the VPC. – kurtisvg Mar 22 '21 at 20:30
  • This was exactly the problem. I wrongly assumed that fetching www.google.com meant Internet access. Trying to fetch any other website fails so it seems that www.google.com is considered "internal" by GCP. Thank you! – Juliusz Gonera Mar 23 '21 at 18:42
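For readers who want to keep the public IP route, the Cloud NAT option quoted in the comments above could be sketched roughly as follows. This is an illustration, not a tested setup: `my-router` and `my-nat` are hypothetical names, and the `default` network is an assumption based on the cluster having been created without a `--network` flag.

```shell
# Sketch only: give private nodes outbound Internet access through Cloud NAT.
# "my-router" and "my-nat" are hypothetical names; "default" is an assumed network.
gcloud compute routers create my-router \
  --network=default \
  --region=us-central1

gcloud compute routers nats create my-nat \
  --router=my-router \
  --region=us-central1 \
  --auto-allocate-nat-external-ips \
  --nat-all-subnet-ip-ranges
```

With outbound NAT in place, the proxy should be able to reach the instance's public IP on port 3307 without the nodes having external IPs of their own.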
0

The conclusion is:

  • Don't use www.google.com to check if a node has Internet access on GCP's infrastructure. It seems that www.google.com is viewed as internal traffic on GCP. Trying to fetch www.amazon.com fails on a private GKE cluster.
  • To connect to Cloud SQL from a private GKE cluster use a private IP for Cloud SQL (remember to add -ip_address_types=PRIVATE to cloud-sql-proxy).
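As a rough illustration of the first point, a connectivity check can compare a Google host with a non-Google host and with the database endpoint. The sketch below assumes the container has bash (for `/dev/tcp`) and the coreutils `timeout` command; `can_connect` is a hypothetical helper name, and the hosts in the usage comments are placeholders.

```shell
#!/usr/bin/env bash
# Sketch: report whether a TCP connection to HOST:PORT can be opened.
# Assumes bash (for /dev/tcp) and coreutils `timeout` are available.
# Usage: can_connect HOST PORT [TIMEOUT_SECONDS]
can_connect() {
  local host="$1" port="$2" secs="${3:-5}"
  # Host and port are passed as positional arguments to the child shell
  # so that they are not re-parsed as shell code.
  timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

# Example usage (hosts are placeholders):
#   can_connect www.google.com 443   && echo "Google reachable"
#   can_connect example.com 443      || echo "no general Internet egress"
#   can_connect MY_DB_PUBLIC_IP 3307 || echo "Cloud SQL public IP unreachable"
```

If the Google host succeeds while the other two fail, the pod most likely has no general outbound Internet access, which matches what was observed on the private cluster.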
Juliusz Gonera
0

I had the same issue and was banging my head against it. My GKE cluster was private and my Cloud SQL instance had a public IP. I could connect to it from my local computer using the auth proxy, but the same setup did not work from the private GKE cluster. After two days I found this post, and the answer above worked for me: I enabled private IP on the Cloud SQL instance and added the flag to my deployment. I would say this is the correct answer:

To connect to Cloud SQL from a private GKE cluster use a private IP for Cloud SQL (remember to add -ip_address_types=PRIVATE to cloud-sql-proxy).
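For reference, enabling private IP on an existing instance might look roughly like the following. This is a sketch under assumptions rather than the exact steps used here: it assumes the cluster is on the `default` VPC network, the range name `google-managed-services-default` and the `/16` prefix length are illustrative choices, and private services access has not been configured yet.

```shell
# Sketch only: enable private IP for the Cloud SQL instance from the question.
# Assumes the "default" network; the range name and prefix length are illustrative.
gcloud services enable servicenetworking.googleapis.com

# Reserve an internal range for Google-managed services and peer it with the VPC.
gcloud compute addresses create google-managed-services-default \
  --global \
  --purpose=VPC_PEERING \
  --prefix-length=16 \
  --network=default

gcloud services vpc-peerings connect \
  --service=servicenetworking.googleapis.com \
  --ranges=google-managed-services-default \
  --network=default

# Attach the instance to the VPC so that it receives a private IP.
gcloud sql instances patch my-db \
  --network=projects/my-production/global/networks/default

# Then add "-ip_address_types=PRIVATE" to the cloud-sql-proxy command
# in the sidecar container, as described in the answer above.
```

The instance and the cluster need to be on the same VPC network (or otherwise routable to each other) for the private IP to be reachable from the pods.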

0

I think you got the ports wrong.
Inside your application you are using MY_DB_PUBLIC_IP:3307, i.e. port 3307, but in the sidecar config you have specified my-production:us-central1:my-db=tcp:5432, i.e. port 5432.

Tyler2P
  • This was not the issue. I described what the issue was in my own answer. Also see https://github.com/GoogleCloudPlatform/cloudsql-proxy/issues/395 – Juliusz Gonera Jul 28 '22 at 11:25