
I specified 3 nodes when creating a Cloud Composer environment. I tried to connect to the worker nodes via SSH, but I cannot find an airflow directory in /home. Where exactly is it located?

1 Answer

Cloud Composer runs Airflow on GKE, so you won't find data directly on any of the host GCE instances. Instead, Airflow processes are run within Kubernetes-managed containers, which either mount or sync data to the /home/airflow directory. To find the directory you will need to look within a running container.

Since each environment stores its Airflow data in a GCS bucket, you can alternatively inspect files by using Cloud Console or gsutil. If you really want to view /home/airflow with a shell, you can use kubectl exec which allows you to run commands/open a shell on any pod/container in the Kubernetes cluster. For example:
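For instance, you can look up the bucket that backs the environment and list its contents (a sketch; $ENV_NAME and $LOCATION stand in for your environment's name and region, and config.dagGcsPrefix is the field on the environment description that holds the DAG folder's GCS path):

# Look up the GCS path of the environment's DAG folder
$ gcloud composer environments describe $ENV_NAME \
    --location $LOCATION \
    --format="get(config.dagGcsPrefix)"

# List the files stored there (using the path printed above)
$ gsutil ls $(gcloud composer environments describe $ENV_NAME \
    --location $LOCATION \
    --format="get(config.dagGcsPrefix)")

The same bucket also holds the environment's plugins and logs folders, so gsutil gives you most of what you would otherwise go looking for in /home/airflow.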

# Obtain the name of the Composer environment's GKE cluster
$ gcloud composer environments describe $ENV_NAME --location $LOCATION

# Fetch Kubernetes credentials for that cluster
$ gcloud container clusters get-credentials $GKE_CLUSTER_NAME --zone $GKE_ZONE

Once you have Kubernetes credentials, you can list the running pods and open a shell inside one of them:

# List running pods
$ kubectl get pods

# Open a shell inside a pod
$ kubectl exec -it $POD_NAME -- bash
airflow-worker-a93j$ ls /home/airflow
hexacyanide