I specified 3 nodes when creating a Cloud Composer environment. I tried to connect to the worker nodes via SSH, but I am not able to find the airflow directory in /home. So where exactly is it located?
- Try `echo $AIRFLOW_HOME`. – Maroun Mar 28 '19 at 06:12
1 Answer
Cloud Composer runs Airflow on GKE, so you won't find data directly on any of the host GCE instances. Instead, Airflow processes run within Kubernetes-managed containers, which either mount or sync data to the /home/airflow directory. To find the directory, you will need to look inside a running container.
Since each environment stores its Airflow data in a GCS bucket, you can alternatively inspect files using the Cloud Console or gsutil.
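As a minimal sketch (here $ENV_NAME, $LOCATION, and $BUCKET_NAME are placeholders; config.dagGcsPrefix is the environment config field that holds the bucket path):
# Print the gs:// path of the environment's DAGs folder
$ gcloud composer environments describe $ENV_NAME --location $LOCATION --format="value(config.dagGcsPrefix)"
# List the files in the bucket's DAGs folder
$ gsutil ls gs://$BUCKET_NAME/dags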
If you really want to view /home/airflow with a shell, you can use kubectl exec, which allows you to run commands in (or open a shell on) any pod/container in the Kubernetes cluster. For example:
# Obtain the name of the Composer environment's GKE cluster
$ gcloud composer environments describe $ENV_NAME --location $LOCATION
# Fetch Kubernetes credentials for that cluster (pass --zone/--region if no default is set)
$ gcloud container clusters get-credentials $GKE_CLUSTER_NAME --zone $GKE_ZONE
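If you want to script the cluster lookup, the describe output exposes the cluster as a full resource path under config.gkeCluster (a sketch under that assumption; the last path segment is the cluster name):
# config.gkeCluster looks like projects/PROJECT/zones/ZONE/clusters/CLUSTER
$ GKE_CLUSTER_NAME=$(basename $(gcloud composer environments describe $ENV_NAME --location $LOCATION --format="value(config.gkeCluster)"))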
Once you have Kubernetes credentials, you can list the running pods and open a shell inside them:
# List running pods
$ kubectl get pods
# Open a shell in a pod (note the -- separating the command)
$ kubectl exec -it $POD_NAME -- bash
airflow-worker-a93j$ ls /home/airflow
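Depending on the Composer version, the Airflow pods may also live in a dedicated namespace rather than default (see the comments below), in which case kubectl needs to be pointed at it. A minimal sketch, with $COMPOSER_NAMESPACE standing in for whatever namespace you find:
# Locate the Airflow pods across all namespaces
$ kubectl get pods --all-namespaces
# Open a shell in a pod within that namespace
$ kubectl exec -it $POD_NAME -n $COMPOSER_NAMESPACE -- bash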

hexacyanide
- I just tried this and I can't see any of the workers when I list pods? – szeitlin Jun 24 '20 at 16:58
- Check for the Composer namespace: `kubectl get ns | grep composer`. – hexacyanide Jun 25 '20 at 03:42
- Ended up doing `kubectl get pods --all-namespaces` and that worked. Did not realize it wouldn't do all namespaces by default. – szeitlin Jun 25 '20 at 16:45