How to link Google Kubernetes Engine (GKE) and Cloud SQL database?

Question

I'm using Terraform (HashiCorp Configuration Language, or HCL) as 'infrastructure as code' to automate creating the GCP server stack, and Helm charts for the Magento 2 (M2) application. So far, Terraform creates the GKE cluster, node pool, and Cloud SQL database, along with the VPC network and VPC peering, and it's all secured with service accounts, etc. The database is on a private IP address with SSL applied, and I can confirm it is on the same private VPC network as the GKE cluster (which has a public endpoint and an Ingress set up with an external load balancer). The next step is to enable workloads on GKE to connect to the Cloud SQL instance.

In the Google Guides section, they describe various methods to connect GKE and Cloud SQL, here at this link:

https://cloud.google.com/sql/docs/postgres/connect-kubernetes-engine

The Cloud SQL Auth Proxy is mentioned, although in my research I heard that option is the more complex route, and the Google guide says:

"If you are using Google Kubernetes Engine, the preferred method is to use GKE's Workload Identity feature."

Which seems to indicate that using GKE's Workload Identity to link GKE to Cloud SQL is a separate option from the Cloud SQL Auth Proxy, no?

For example, in my Terraform file, in creating the primary cluster, I have:

// GKE: GOOGLE KUBERNETES ENGINE

// Creates the primary cluster
resource "google_container_cluster" "primary" {
  name     = var.cluster
  project  = var.project_id
  location = var.zones

  ... ... ...

  // This is required to have workloads and to enable gke metadata on the node pods.
  workload_identity_config {
    workload_pool = "${var.project_id}.svc.id.goog"
  }

  ... ... ...

I'm using YAML files for my Magento 2 application, but I run that as a separate deployment from the Terraform code that manages the GCP server stack; the app is deployed only once Terraform has configured the server stack on GCP. I currently do not have any YAML files managed from the Terraform code base; however, if there is a way to apply YAML from a Terraform configuration, I would be eager to learn more about that process.

I've added the Workload Identity to the GKE cluster, but so far my testing hasn't resulted in GKE successfully linking to the Cloud SQL database.

However, there is a new, 'simplified' operator, configured through a YAML file, for connecting GKE and Cloud SQL. It is described here:

https://cloud.google.com/blog/products/databases/public-preview-for-cloud-sql-auth-proxy-kubernetes-operator/

Per the above link, what is the difference between the Cloud SQL Proxy Operator and the Cloud SQL Auth Proxy?
Does the Cloud SQL Proxy Operator work along with Google's Workload Identity (per the link below) to connect GKE and Cloud SQL, or is Workload Identity not necessary? My next test for Workload Identity will be to bind the GCP service account used for the database to the Kubernetes service account, but I don't know whether that will be sufficient to let GKE workloads connect directly to the Cloud SQL database, or whether the Cloud SQL Proxy Operator or the Cloud SQL Auth Proxy would also need to be added.

https://medium.com/@daphneyigwe/deploying-an-application-on-google-kubernetes-engine-with-a-cloud-sql-database-using-terraform-afaf11a072c1

Is the Cloud SQL Proxy Operator only intended to be packaged in YAML files (or run directly from the gcloud CLI)? I would prefer to use Terraform for every part of the main GCP server stack, since that's what I've already custom coded. Since the Cloud SQL Proxy Operator seems fairly new, unless a Terraform resource or module becomes available, I don't know how else to add it to the Terraform code that I've already been working on (and my Terraform code repository is quite extensive already).

score 2 · Answer 1 · answered May 17 '23 at 20:25

There are several parts to this question. As I read this question, I think it might be more of a Magento question than a Kubernetes/Cloud SQL question. I don't have much experience with PHP and I have no experience with Magento, so I'm not sure how helpful I can be.

At some point, your actual application code will need to connect to the database (send queries, etc.) and I don't know how that code should look in your application.

Because I imagine you will have to edit PHP code at some point, asking a more specific question in the Magento Stack Exchange could be a good place to go next: https://magento.stackexchange.com/

While I don't think I can help with the core of your problem, here are a few answers I can provide:

Which seems to indicate that using GKE's Workload Identity to link GKE to Cloud SQL is a separate option from the Cloud SQL Auth Proxy, no?

The Cloud SQL Auth Proxy is a tool to connect your application to your database. On GKE it can obtain its Google credentials through Workload Identity, and the database login itself can use either service account IAM authentication or a traditional username and password.

The Cloud SQL Auth Proxy and Workload Identity are separate things.
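
To make the Workload Identity side concrete, here is a minimal Terraform sketch of the bindings it typically needs. It is an illustration under stated assumptions, not your actual setup: the account ID `magento-sql`, the Kubernetes namespace `magento`, and the Kubernetes service account name `magento` are all hypothetical placeholders, and `var.project_id` is assumed to be the same variable used in the cluster resource from the question.

```hcl
# Hypothetical names throughout: "magento-sql" and the "magento/magento"
# namespace/service-account pair are placeholders for illustration.

# Google service account that the proxy (and the app pod) will act as.
resource "google_service_account" "sql_client" {
  project      = var.project_id
  account_id   = "magento-sql"
  display_name = "Cloud SQL client for Magento"
}

# Grant that account permission to connect to Cloud SQL instances in the project.
resource "google_project_iam_member" "sql_client" {
  project = var.project_id
  role    = "roles/cloudsql.client"
  member  = "serviceAccount:${google_service_account.sql_client.email}"
}

# Allow the Kubernetes service account "magento" in namespace "magento"
# to impersonate the Google service account through Workload Identity.
resource "google_service_account_iam_member" "workload_identity" {
  service_account_id = google_service_account.sql_client.name
  role               = "roles/iam.workloadIdentityUser"
  member             = "serviceAccount:${var.project_id}.svc.id.goog[magento/magento]"
}
```

The Kubernetes service account used by the pods then needs the annotation `iam.gke.io/gcp-service-account` set to the Google service account's email. None of this replaces the proxy; it only determines which identity the proxy runs as.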

Per the above link, what is the difference between the Cloud SQL Proxy Operator and the Cloud SQL Auth Proxy?

The Cloud SQL Proxy Operator for Kubernetes uses the Cloud SQL Auth Proxy.

Jonathan Hess · Answer 2 · 2023-05-18T22:31:24.180

The Cloud SQL Auth Proxy Operator configures the Cloud SQL Auth Proxy on workloads in GKE. You can configure it to automatically add a Cloud SQL Auth Proxy container to your Magento deployment's pods.

I would recommend against using Terraform to manage Kubernetes resources like Deployments. Terraform is good for configuring cloud resources, but there are better tools for managing Kubernetes clusters. Consider using Google Cloud Config Sync or ArgoCD instead.

radiorider · Answer 3 · 2023-05-19T23:59:42.373

Nice! That's good to learn. It sounds like the Cloud SQL Proxy Operator is a great new improvement and kudos to your team for figuring that out in YAML!

I agree that Terraform is limited for handling Kubernetes resources for Deployments. It's good for testing, but not for a production environment. Terraform is really good for handling the server stack though. That's my focus for this inquiry. I already have a separate Helm charts code base for the Magento 2 application and its deployment.

As a result, if I understand correctly, the GKE-Cloud SQL connection could either be added to the Magento 2 Helm/YAML files, or it could be added to the server stack, which for my project is already written quite extensively in Terraform. If I had heard of Config Sync or ArgoCD when I began my preliminary research on which path to take for the server stack, I might have tested those as well. Since I've already invested significantly in the Terraform code, though, I'm likely too far along to consider switching to an alternative. That, and I think the Terraform language is very clearly written and defined for managing everything within GCP, at least for server stack management from an automation standpoint.
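
For reference, the Terraform route could be sketched roughly as follows. This is a hedged illustration, not a tested setup. It assumes the hashicorp/kubernetes provider is configured against the GKE cluster, that the Cloud SQL Proxy Operator and its CRDs are already installed in the cluster, and that the operator serves the `cloudsql.cloud.google.com/v1` API version (preview releases may serve a different version, so check the installed one). The resource name, namespace, Deployment name, environment variable names, and the `sql_connection_name` variable are hypothetical.

```hcl
# Hypothetical variable: the instance connection name, for example wired from
# the connection_name attribute of the google_sql_database_instance resource.
variable "sql_connection_name" {
  type        = string
  description = "Cloud SQL instance connection name (project:region:instance)"
}

# Asks the operator to inject a Cloud SQL Auth Proxy container into the
# pods of the selected Deployment. All names below are placeholders.
resource "kubernetes_manifest" "magento_auth_proxy" {
  manifest = {
    apiVersion = "cloudsql.cloud.google.com/v1"
    kind       = "AuthProxyWorkload"
    metadata = {
      name      = "magento-auth-proxy"
      namespace = "magento"
    }
    spec = {
      # Which workload receives the proxy container.
      workloadSelector = {
        kind = "Deployment"
        name = "magento"
      }
      # Which Cloud SQL instance to proxy, and the environment variables
      # through which the app container learns the local host and port.
      instances = [
        {
          connectionString = var.sql_connection_name
          hostEnvName      = "DB_HOST"
          portEnvName      = "DB_PORT"
        }
      ]
    }
  }
}
```

One caveat with this approach: `kubernetes_manifest` needs to reach the cluster API during planning and needs the CRD to exist already, so it generally cannot live in the same apply that creates the cluster from scratch. Putting the same manifest in the Helm chart avoids that ordering problem.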

For example, if I try to put the Cloud SQL database on the same virtual private network as my GKE cluster and inadvertently omit a minor setting on the database instance, I can just run 'terraform destroy' and it wipes everything in the GCP testing account. Then I update that setting in the Terraform code and run 'terraform apply', and it automatically re-creates the entire GKE cluster, Cloud SQL, virtual private network, everything in GCP, all over again. That is especially handy for a dev testing environment, but not so much for live production (where things aren't intended to be destroyed).
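
For the production case, one guard worth knowing about is the `deletion_protection` argument on `google_sql_database_instance`. The following is a minimal sketch, assuming hypothetical variables `environment`, `region`, and `database_version`, plus a placeholder instance name and tier; `var.project_id` is assumed to be the same variable used elsewhere in the configuration.

```hcl
# Hypothetical variables, not part of the original configuration.
variable "environment" {
  type    = string
  default = "dev"
}

variable "region" {
  type = string
}

variable "database_version" {
  type = string
}

resource "google_sql_database_instance" "main" {
  name             = "magento-db" # placeholder name
  project          = var.project_id
  region           = var.region
  database_version = var.database_version

  # While this is true, any plan that would delete the instance fails,
  # including 'terraform destroy'. Here it is enabled only for production,
  # so dev environments can still be torn down and rebuilt freely.
  deletion_protection = var.environment == "production"

  settings {
    tier = "db-f1-micro" # placeholder tier
  }
}
```

With `environment` left at `dev`, the destroy-and-recreate workflow keeps working as before; setting it to `production` makes Terraform refuse to delete the instance until the flag is switched off in a separate apply.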

I don't even use Terraform modules since the Google-provided Terraform code is so well constructed (and I'm concerned about the possibility of modules breaking down with future updates).

When I review the links per your reply, for Config Sync, it says the following:

"Integrated with Anthos: platform admins can install Config Sync using a few clicks in the Google Cloud console, using Terraform, or by using Google Cloud CLI on any cluster connected to your Anthos fleet."

So does Config Sync work with Terraform? I'm not using Anthos, since it's just a single cluster, and my understanding was that Anthos is meant for multi-cluster environments. As far as automating the M2 app deployment onto Kubernetes, I was thinking of going the route of Cloud Build. That's perhaps a better conversation for another thread, though, unless Config Sync may be a better alternative for CI/CD onto Kubernetes than Cloud Build?

It sounds like my GKE-and-Cloud-SQL connection setup will require Cloud SQL Auth Proxy regardless, so I'll test adding that along with the Workload Identity per the Medium article as a next step. And then, I'll report back once I've got it working.

Thank y'all for your expertise and advice!

score 0 · Answer 4 · answered May 20 '23 at 00:11

The Cloud SQL Auth Proxy is the recommended way to connect to Cloud SQL, even when using private IP. This is because the Cloud SQL Auth Proxy provides strong encryption and authentication using IAM, which can help keep your database secure.

Connect to Cloud SQL from Google Kubernetes Engine

Ronnie Royston

I actually referenced the exact same link per my initial post inquiry, when I mentioned the 'In the Google Guides section'. :) – radiorider May 22 '23 at 16:34
