Questions tagged [google-cloud-composer]

Google Cloud Composer is a fully managed workflow orchestration service, built on Apache Airflow, that lets you author, schedule, and monitor pipelines spanning clouds and on-premises data centers.

Cloud Composer is a product of Google Cloud Platform (GCP). It is essentially "hosted/managed Apache Airflow."

The product allows you to create, schedule, and monitor jobs, each job being represented as a DAG (directed acyclic graph) of operators. You can use Airflow's built-in operator definitions and/or define your own in pure Python.
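
For orientation, a minimal sketch of what a DAG file looks like (Airflow 2 import paths; the DAG id, commands, and schedule are illustrative):

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash import BashOperator
    from airflow.operators.python import PythonOperator

    def _transform():
        # Placeholder for real task logic.
        print("transforming")

    with DAG(
        dag_id="example_pipeline",            # hypothetical DAG id
        start_date=datetime(2023, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        extract = BashOperator(task_id="extract", bash_command="echo extract")
        transform = PythonOperator(task_id="transform", python_callable=_transform)

        # ">>" declares an edge of the graph: transform runs after extract.
        extract >> transform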

While technically you can do data processing directly within a task (an instantiated operator), more often you will want a task to invoke some sort of processing in another system (which could be anything: a container, BigQuery, Spark, and so on). Often you will then wait for that processing to complete using an Airflow sensor, and launch further dependent tasks from there.
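
A hedged sketch of that submit-then-wait pattern, assuming the apache-airflow-providers-google package is installed (the query, bucket, and object path below are made up):

    from datetime import datetime

    from airflow import DAG
    from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator
    from airflow.providers.google.cloud.sensors.gcs import GCSObjectExistenceSensor

    with DAG(
        dag_id="submit_then_wait",
        start_date=datetime(2023, 1, 1),
        schedule_interval=None,
        catchup=False,
    ) as dag:
        # The task only submits a job; BigQuery does the heavy lifting.
        run_query = BigQueryInsertJobOperator(
            task_id="run_query",
            configuration={"query": {"query": "SELECT 1", "useLegacySql": False}},
        )

        # A sensor task polls until the expected artifact appears.
        wait_for_export = GCSObjectExistenceSensor(
            task_id="wait_for_export",
            bucket="my-bucket",               # hypothetical bucket
            object="exports/result.csv",      # hypothetical object
        )

        run_query >> wait_for_export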

While Cloud Composer is managed, you can apply a variety of customizations, such as which pip packages to install, hardware configuration, and environment variables. Cloud Composer allows overriding some, but not all, Airflow configuration settings.
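
For instance, an environment variable set on the environment (through the Cloud Console or gcloud composer environments update) shows up in DAG code as an ordinary process environment variable; ETL_TARGET_DATASET below is a made-up name:

    import os

    # Falls back to a default when the variable is not set on the environment.
    TARGET_DATASET = os.environ.get("ETL_TARGET_DATASET", "analytics_dev")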

Further technical details: Cloud Composer creates a Kubernetes cluster for each Airflow environment you create. This is where your tasks run, but you don't have to manage the cluster. You place your code in a specific Cloud Storage bucket, and Cloud Composer syncs it from there.
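
On Composer workers that bucket is mounted with gcsfuse under /home/airflow/gcs, so a file uploaded next to your DAG can usually be read with plain file I/O. A sketch, assuming a hypothetical queries/report.sql uploaded to the dags folder:

    import os

    # __file__ resolves inside the mounted dags/ folder on Composer workers.
    DAG_DIR = os.path.dirname(os.path.abspath(__file__))

    with open(os.path.join(DAG_DIR, "queries", "report.sql")) as f:
        REPORT_SQL = f.read()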

1225 questions
5 votes · 1 answer

How can I read a config file from an Airflow packaged DAG?

Airflow packaged DAGs seem like a great building block for a sane production Airflow deployment. I have a DAG with dynamic subDAGs, driven by a config file, something like: config.yaml: imports: - project_foo - project_bar which yields subdag…
Jake Biesinger • 5,538 • 2 • 23 • 25
5 votes · 2 answers

Apache Airflow - How to retrieve dag_run data outside an operator in a flow triggered with TriggerDagRunOperator

I set up two DAGs; let's call the first one orchestrator and the second one worker. The orchestrator's job is to retrieve a list from an API and, for each element in this list, trigger the worker DAG with some parameters. The reason why I separated the…
Mikael Gibert • 355 • 1 • 3 • 8
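
For orientation, the Airflow 2 form of the pattern this question describes looks roughly like the sketch below; the DAG ids and conf payload are illustrative, and the worker side reads the payload back from dag_run.conf:

    from airflow.operators.trigger_dagrun import TriggerDagRunOperator

    # In the orchestrator DAG: one trigger task per element retrieved from the API.
    trigger = TriggerDagRunOperator(
        task_id="trigger_worker",
        trigger_dag_id="worker",          # hypothetical worker DAG id
        conf={"element_id": 42},          # payload handed to the worker run
    )

    # In the worker DAG, the payload is available as dag_run.conf, e.g. in a
    # template field: "{{ dag_run.conf['element_id'] }}".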
5 votes · 1 answer

Airflow Scheduler keeps crashing, DB connection error (Google Composer)

I've been using Google Composer for a while (composer-0.5.2-airflow-1.9.0) and have had some problems with the Airflow scheduler. The scheduler container sometimes crashes, and it can get into a locked state in which it cannot start any new tasks…
Dalar • 135 • 1 • 7
5 votes · 1 answer

Migrate existing Airflow DB to Cloud Composer

Is there any way to migrate an existing Airflow instance onto Google Cloud Composer? We're currently running our own instance of Airflow using Postgres for the database. Ideally we'd be able to preserve the existing history of the DAGs, which I believe…
5 votes · 2 answers

Can you get a static external IP address for Google Cloud Composer / Airflow?

I know how to assign a static external IP address to a Compute Engine, but can this be done with Google Cloud Composer (Airflow)? I'd imagine most companies need that functionality since they'd generally be writing back to a warehouse that is…
5 votes · 3 answers

Delete a DAG in Google Composer - Airflow UI

I want to delete a DAG from the Airflow UI that's no longer available in the GCS dags folder. I know that Airflow has a "new" way to remove DAGs from the DB using the airflow delete_dag my_dag_id command, seen in…
Pablo • 3,135 • 4 • 27 • 43
5 votes · 1 answer

How do I read a file in the Airflow Cloud Composer bucket?

To separate BigQuery queries from the actual code, I want to store the SQL in a separate file and then read it from the Python code. I have tried adding the file to the same bucket as the DAGs and also to a subfolder, but it seems like I can't read…
Tomas Jansson • 22,767 • 13 • 83 • 137
5 votes · 4 answers

Airflow DAG dependencies not available to DAGs when running Google's Cloud Composer

Airflow allows you to put dependencies (external Python code that the DAG code relies on) in the dag folder. This means any components, members, or classes in that external Python code are available for use in the DAG code. When doing this (in…
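
A sketch of the layout this question refers to, with made-up names (dags/common/transforms.py, clean_rows); it works because Airflow puts the dags folder on sys.path:

    # dags/common/transforms.py -- helper module (give dags/common an __init__.py)
    def clean_rows(rows):
        return [r for r in rows if r]

    # dags/my_dag.py -- the DAG file can then import it directly:
    from common.transforms import clean_rows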
5 votes · 2 answers

How can I setup Cloud Composer to send email?

I want to get email notifications with Cloud Composer but I am unsure how to do that. How can I configure a Composer environment to send email notifications?
James • 2,321 • 14 • 30
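
For context, the DAG-side half of this is plain Airflow default_args (the address below is made up); separately, the environment itself needs an outbound mail provider configured, and Composer's documented route for that uses SendGrid:

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash import BashOperator

    # Airflow sends an alert when a task fails, per default_args below.
    default_args = {
        "email": ["alerts@example.com"],   # hypothetical recipient
        "email_on_failure": True,
        "email_on_retry": False,
    }

    with DAG(
        dag_id="email_on_failure_example",
        start_date=datetime(2023, 1, 1),
        schedule_interval=None,
        default_args=default_args,
        catchup=False,
    ) as dag:
        BashOperator(task_id="may_fail", bash_command="exit 1")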
4 votes · 1 answer

Cloud Composer 2: prevent eviction of worker pods

I am currently planning to upgrade our Cloud Composer environment from Composer 1 to Composer 2. However, I am quite concerned about disruptions that could occur in Cloud Composer 2 due to the new autoscaling behavior inherited from GKE Autopilot. In…
4 votes · 0 answers

How to fix dependency 'Task Instance Not Running' FAILED: Task is in the running state

I sometimes get this log for tasks: [2022-10-03, 00:34:03 UTC] {taskinstance.py:1034} INFO - Dependencies not met for , dependency 'Task Instance Not Running' FAILED:…
4 votes · 1 answer

How to execute Cloud Run containers from an Airflow DAG?

I'm trying to run a container with Cloud Run as a task of an Airflow DAG. It seems that there is nothing like a CloudRunOperator or similar, and I can't find anything in the documentation (either Cloud Run's or Airflow's). Has anyone ever dealt…
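
One hedged workaround, in the absence of a dedicated operator: call the Cloud Run service's HTTPS endpoint from a PythonOperator task, minting an ID token with the google-auth library (the service URL and payload below are made up):

    import requests
    from google.auth.transport.requests import Request
    from google.oauth2 import id_token

    CLOUD_RUN_URL = "https://my-service-abc123-uc.a.run.app"  # hypothetical URL

    def invoke_cloud_run():
        # Mint an ID token for the service, then call it like any HTTP API.
        # Wrap this function in a PythonOperator task inside the DAG.
        token = id_token.fetch_id_token(Request(), CLOUD_RUN_URL)
        resp = requests.post(
            CLOUD_RUN_URL,
            headers={"Authorization": f"Bearer {token}"},
            json={"job": "nightly"},          # hypothetical payload
        )
        resp.raise_for_status()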
4 votes · 1 answer

Local testing for Cloud Composer

What's the best way to replicate Cloud Composer in a local environment? Before we deploy our code, we want to test it locally. For this, we already use a docker-compose setup with the image apache/airflow:1.10.14, since we use Airflow…
tty • 95 • 1 • 5
4 votes · 1 answer

How to access Google Cloud Composer's data folder from a pod launched using KubernetesPodOperator?

I have a Google Cloud Composer 1 environment (Airflow 2.1.2) where I want to run an Airflow DAG that utilizes the KubernetesPodOperator. Cloud Composer makes available to all DAGs a shared file directory for storing application data. The files in…
urig • 16,016 • 26 • 115 • 184
4 votes · 1 answer

Enable Google Drive OAuth Scopes on Cloud Composer 2.0

Is there a method to add or modify Google OAuth scopes on a Cloud Composer 2.0 environment? When installing Composer 2.0 there is no option to modify the OAuth scopes from the UI or command line. I need to add Google Drive to the OAuth scopes on the…