
I've made a DAG which connects to Cloud SQL (MySQL) through a Cloud SQL Proxy installed on a GCE instance. It reads a list of tables and generates a number of tasks based on these. I've run this DAG in Airflow locally on my machine with success, but once I try to deploy it to a Cloud Composer instance, the DAG doesn't seem to load properly into the Airflow web UI. The only options available for the DAG are refresh and delete; all the others are missing.
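
For context, the tasks are generated by looping over the query result at DAG parse time. A simplified, self-contained sketch of that pattern (the DAG id and the hardcoded table list are placeholders, not my actual code):

from datetime import datetime

from airflow import DAG
from airflow.operators.bash_operator import BashOperator

# In the real DAG this list comes from the Cloud SQL query shown further down;
# hardcoded here only to keep the sketch self-contained.
tables = ["table_a", "table_b", "table_c"]

dag = DAG(
    "comp_etl_runner_example",  # placeholder DAG id
    start_date=datetime(2019, 9, 1),
    schedule_interval="@daily",
)

# One task is created per table returned by the query.
for table in tables:
    BashOperator(
        task_id="process_{}".format(table),
        bash_command="echo processing {}".format(table),
        dag=dag,
    )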

The DAG is found by the scheduler, and I can see in the logs that a connection is made to Cloud SQL and the tables are retrieved, but for some reason the Airflow web UI doesn't like it. There are no errors in the logs.

I am aware of the architecture of Composer as depicted here: https://cloud.google.com/composer/docs/concepts/overview, and I'm wondering if it has something to do with the Airflow web UI living in a tenant project. I have, however, briefly opened up the firewall to all connections from everywhere to rule out a firewall issue, but no luck. So I'm thinking it might be a routing issue.

The code which connects to the Cloud SQL Proxy looks like this:

import pymysql

# The connection object is created beforehand with pymysql.connect(), pointing
# at the Cloud SQL Proxy's private IP on the GCE instance.
with connection.cursor(pymysql.cursors.DictCursor) as cursor:
    sql = "select <redacted>"
    cursor.execute(sql)
    result = cursor.fetchall()

I create the Composer environment like this:

gcloud composer environments create comp-etl-runner \
--disk-size="30GB" --location="europe-west1" --zone="europe-west1-b" \
--machine-type="n1-standard-1" --node-count=3 \
--service-account="<redacted>" \
--python-version=3 --image-version="composer-1.7.2-airflow-1.10.2" --network="dev-network-1" \
--subnetwork="dev-subnet-3"

I've tried enabling IP aliasing and specifying the IP ranges like this:

gcloud beta composer environments create comp-etl-runner \
--disk-size="30GB" --location="europe-west1" --zone="europe-west1-b" \
--machine-type="n1-standard-1" --node-count=3 \
--service-account="<redacted>" \
--python-version=3 --image-version="composer-1.7.2-airflow-1.10.2" --network="dev-network-1" \
--subnetwork="dev-subnet-3" \
--enable-ip-alias \
--cluster-ipv4-cidr="10.207.0.0/19" \
--services-ipv4-cidr="10.207.32.0/19"

But that didn't make a difference.

I also tried adding these two parameters:

--enable-private-environment \
--master-ipv4-cidr="10.207.64.0/19"

but then the environment creation just fails.

I'm tearing my hair out as my DAG is working perfectly in Airflow on my machine, but not in Cloud Composer. So any ideas would be greatly appreciated.

    By "not loading properly", do you mean that the DAG appears in the UI, but doesn't become "ready"? Additionally, does the instance your SQL proxy is running on have a public IP address? Do you use any service discovery mechanisms? If the GCE instance doesn't have a public IP, do you connect to the SQL proxy using VPC peering or anything of the sort? – hexacyanide Sep 29 '19 at 07:25
  • Yes, it appears in the UI but doesn't become ready. No errors though. The SQL proxy does have a public IP, but I connect to the private one. That works from the k8s cluster. I'm not using any service discovery mechanisms. I haven't set any VPC peering up no. Bonus info: I can run the DAG manually through the CLI with success. It's just the UI which doesn't work. – Bjoern Sep 29 '19 at 15:24
  • Hmm... a few more questions just to be sure: is this a private IP environment? Is the GCE instance (with the SQL proxy on it) in the same VPC as your Composer GKE cluster? – hexacyanide Oct 01 '19 at 03:51
  • Yes, it is a private IP environment, and yes, the SQL Proxy GCE is on the same VPC as the Composer GKE cluster. The GKE cluster can connect to the SQL Proxy, but as I understand the documentation, the Airflow Web UI lives in a tenant project, and therefore not in my VPC. And I think that is why there are issues. – Bjoern Oct 01 '19 at 12:12

1 Answer


I believe that the most suitable workaround is to deploy a self-managed webserver (as described here) in the same GKE cluster, so that the webserver can also reach the Cloud SQL proxy.

Another option is to use a public IP for your Cloud SQL instance and whitelist everything, making it accessible on the public internet. I am not sure you can afford this in your use case, though. If you choose this option, you should configure your instance to use SSL to maximize security.
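
Roughly, that would look like this with gcloud (the instance name is a placeholder, and 0.0.0.0/0 opens the instance to all addresses):

# Assign a public IP to the Cloud SQL instance (instance name is a placeholder)
gcloud sql instances patch my-cloudsql-instance --assign-ip

# Whitelist all addresses (0.0.0.0/0 makes it reachable from anywhere)
gcloud sql instances patch my-cloudsql-instance --authorized-networks="0.0.0.0/0"

# Require SSL for connections over the public internet
gcloud sql instances patch my-cloudsql-instance --require-ssl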
