Questions tagged [google-cloud-composer]

Google Cloud Composer is a fully managed workflow orchestration service, built on Apache Airflow, that empowers you to author, schedule, and monitor pipelines spanning clouds and on-premises data centers.

Cloud Composer is a product of Google Cloud Platform (GCP). It is essentially "hosted/managed Apache Airflow."

The product allows you to create, schedule, and monitor jobs, each job being represented as a DAG (directed acyclic graph) of various operators. You can use Airflow's built-in operator definitions or define your own in pure Python.
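For illustration, here is a minimal sketch of such a DAG; the DAG id, schedule, and task logic are invented placeholders:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.operators.python import PythonOperator


def greet():
    # Placeholder for custom Python logic.
    print("Hello from Cloud Composer")


with DAG(
    dag_id="example_dag",              # placeholder name
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # A built-in operator...
    say_hello = BashOperator(task_id="say_hello", bash_command="echo hello")
    # ...and custom Python code wrapped in an operator.
    greet_task = PythonOperator(task_id="greet", python_callable=greet)
    say_hello >> greet_task
```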

While technically you can do data processing directly within a task (an instantiated operator), more often you will want a task to invoke processing in another system (which could be anything: a container, BigQuery, Spark, etc.). Often you will then wait for that processing to complete using an Airflow sensor, and possibly launch further dependent tasks.
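A hedged sketch of this pattern, assuming BigQuery as the external system (project, dataset, and table names are made up): one task submits a query job, and a sensor waits for the output table before downstream tasks run.

```python
from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator
from airflow.providers.google.cloud.sensors.bigquery import BigQueryTableExistenceSensor

# Kick off processing in another system (here, a BigQuery job).
run_query = BigQueryInsertJobOperator(
    task_id="run_query",
    configuration={
        "query": {
            "query": "CREATE OR REPLACE TABLE my_dataset.results AS SELECT 1 AS x",
            "useLegacySql": False,
        }
    },
)

# Wait for the output to materialize before launching dependent tasks.
wait_for_table = BigQueryTableExistenceSensor(
    task_id="wait_for_table",
    project_id="my-project",   # placeholder
    dataset_id="my_dataset",   # placeholder
    table_id="results",        # placeholder
)

run_query >> wait_for_table
```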

While Cloud Composer is managed, you can apply a variety of customizations, such as specifying which pip modules to install, hardware configurations, environment variables, etc. Cloud Composer allows overriding some but not all Airflow configuration settings.

Further technical details: Cloud Composer will create a Kubernetes cluster for each Airflow environment you create. This is where your tasks will be run, but you don't have to manage it. You will place your code within a specific bucket in Cloud Storage, and Cloud Composer will sync it from there.

1225 questions
0
votes
2 answers

I want to use Python Fabric with my custom Operator, how should I install fabric on workers?

At this point I'm thinking about calling the bash command pip install fabric2 each time my operator executes, but this does not look like a good idea.
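A common alternative is to declare fabric2 in the environment's PyPI packages (Cloud Composer installs those on all workers) rather than calling pip at runtime; the custom operator can then import it directly. A rough sketch, with a hypothetical operator name and placeholder host and command:

```python
from airflow.models.baseoperator import BaseOperator
from fabric2 import Connection  # preinstalled via the environment's PyPI packages


class FabricCommandOperator(BaseOperator):
    """Hypothetical custom operator that runs a shell command on a remote host."""

    def __init__(self, host: str, command: str, **kwargs):
        super().__init__(**kwargs)
        self.host = host
        self.command = command

    def execute(self, context):
        # fabric2's Connection.run executes the command over SSH.
        with Connection(self.host) as conn:
            result = conn.run(self.command, hide=True)
            self.log.info("stdout: %s", result.stdout)
```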
0
votes
2 answers

Airflow's BigQueryOperator not working with UDF

I'm trying to run a basic BigQuery operator task in Airflow (using Google's Composer) which uses a user-defined function (UDF). The example comes from https://cloud.google.com/bigquery/docs/reference/standard-sql/user-defined-functions and runs…
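For reference, a hedged sketch of running a query with a temporary UDF through BigQueryInsertJobOperator; the function mirrors the example in the linked documentation, and the task id is a placeholder:

```python
from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator

# Temporary SQL UDF defined inline with the query, as in the linked docs.
QUERY_WITH_UDF = """
CREATE TEMP FUNCTION AddFourAndDivide(x INT64, y INT64)
RETURNS FLOAT64
AS ((x + 4) / y);

SELECT val, AddFourAndDivide(val, 2) AS result
FROM UNNEST([2, 3, 5, 8, 12]) AS val;
"""

run_udf_query = BigQueryInsertJobOperator(
    task_id="run_udf_query",  # placeholder
    configuration={
        "query": {
            "query": QUERY_WITH_UDF,
            "useLegacySql": False,  # temp UDFs require standard SQL
        }
    },
)
```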
0
votes
0 answers

Cloud Composer: detect when a container has finished running

I am using the BashOperator to schedule a container to run on a compute instance. gcloud beta compute instances create-with-container airflow-vm --zone us-central1-a --container-image…
JY2k
  • 2,879
  • 1
  • 31
  • 60
0
votes
1 answer

Cloud Composer: running a container on a compute instance

How can I launch a compute instance and deploy a container on it? I can see that there is a python operator but to my understanding that will run the script in a pre-made container on the Airflow workers rather than on an external instance.
JY2k
  • 2,879
  • 1
  • 31
  • 60
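One way to sketch this, reusing the gcloud command from the question above (the container image below is a placeholder), is to wrap the call in a BashOperator:

```python
from airflow.operators.bash import BashOperator

# Launch a VM that boots directly into the given container image.
create_vm = BashOperator(
    task_id="create_vm_with_container",
    bash_command=(
        "gcloud beta compute instances create-with-container airflow-vm "
        "--zone us-central1-a "
        "--container-image gcr.io/my-project/my-image:latest"  # placeholder image
    ),
)
```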
-1
votes
1 answer

Composer2 v2.4.3 : Task exited with return code Negsignal.SIGKILL

I'm using DataprocSubmitJobOperator in GCP Composer 2 (Airflow DAGs) and the jobs are failing in Composer v2.4.3, while the same jobs go through in the v2.2.5 clusters. The error is as shown below: [2023-05-05, 19:10:59 PDT] {dataproc.py:1953} INFO -…
-1
votes
1 answer

Google Composer - Creation of partitioned External tables

I am trying to create an external table on top of a Google Cloud Storage bucket through a Composer DAG. My upstream provides partitioned Parquet files based on specific country, so I would like to create an external table with Source Data…
Sri Bharath
  • 115
  • 1
  • 2
  • 10
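A possible sketch using BigQueryCreateExternalTableOperator with BigQuery's hive partitioning options; all project, dataset, and bucket names are placeholders, and the partition layout is assumed:

```python
from airflow.providers.google.cloud.operators.bigquery import (
    BigQueryCreateExternalTableOperator,
)

create_external_table = BigQueryCreateExternalTableOperator(
    task_id="create_external_table",
    table_resource={
        "tableReference": {
            "projectId": "my-project",        # placeholder
            "datasetId": "my_dataset",        # placeholder
            "tableId": "my_external_table",   # placeholder
        },
        "externalDataConfiguration": {
            "sourceFormat": "PARQUET",
            # Assumes a layout such as gs://my-bucket/data/country=US/...
            "sourceUris": ["gs://my-bucket/data/*"],
            "hivePartitioningOptions": {
                "mode": "AUTO",
                "sourceUriPrefix": "gs://my-bucket/data/",
            },
        },
    },
)
```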
-1
votes
2 answers

airflow - DAGs are not being executed at the scheduled time

I want my DAGs to have their first run at 2:00 AM on the 25th and then run Tuesday to Saturday daily at 2:00 AM. The following is what my scheduling looks like. with DAG( dag_id='in__xxx__agnt_brk_com', schedule_interval='0 2 * * 2-6', …
Gaurang Shah
  • 11,764
  • 9
  • 74
  • 137
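The usual explanation is that Airflow triggers a DAG run at the end of each schedule interval, so the first run happens one interval after start_date. A minimal sketch, assuming a start_date just before the intended 25th (the date below is a placeholder):

```python
from datetime import datetime

from airflow import DAG

# Airflow runs a DAG at the END of its schedule interval: with this cron,
# the run covering an interval only executes at the next matching 2:00 AM.
with DAG(
    dag_id="in__xxx__agnt_brk_com",
    start_date=datetime(2023, 4, 24),   # placeholder; one interval before the 25th
    schedule_interval="0 2 * * 2-6",    # 2:00 AM, Tuesday through Saturday
    catchup=False,
) as dag:
    ...
```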
-1
votes
2 answers

GKE Workload from Cloud Composer

I'm trying to deploy a GKE workload (YAML file) via Cloud Composer 2. I can do it easily via the console, but from Cloud Composer I'm facing authorization issues, which led me to wonder whether this is the best method. The general idea for this pipeline is as…
-1
votes
1 answer

GCP Composer: best suitable versions of providers

Where can I find the best suitable version of any provider with respect to the Airflow version? For example, for Airflow version 2.2.5 the best suitable version of the Google provider is 4.0.0.
-1
votes
1 answer

How to deal with Negsignal.SIGSEGV in Cloud Composer

I recently created a new production Cloud Composer environment and migrated my existing DAGs to it, and have been getting the error INFO - Task exited with return code Negsignal.SIGSEGV on some tasks that work perfectly fine in my dev environment…
tacoofdoomk
  • 121
  • 1
  • 1
  • 7
-1
votes
1 answer

Composer Airflow - Cross-DAG Task Dependency

I have the following 2 DAGs and tasks: DAG A - Task1, Task2, Task3; DAG B - Task4, Task5, Task6. Task4 and Task5 of DAG B depend on Task1 of DAG A, and Task6 depends on Task3, but I want this dependency only on Monday; for the remaining days I don't want this dependency.
Rahul Wagh
  • 281
  • 6
  • 20
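One hedged way to sketch the "Monday only" condition is a BranchPythonOperator that routes through an ExternalTaskSensor on Mondays and skips it otherwise; all DAG and task ids below are placeholders:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.empty import EmptyOperator
from airflow.operators.python import BranchPythonOperator
from airflow.sensors.external_task import ExternalTaskSensor
from airflow.utils.trigger_rule import TriggerRule


def pick_path(logical_date, **_):
    # Enforce the cross-DAG dependency only on Mondays (weekday() == 0).
    return "wait_for_task1" if logical_date.weekday() == 0 else "skip_wait"


with DAG(
    dag_id="dag_b",                      # placeholder
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    branch = BranchPythonOperator(task_id="branch", python_callable=pick_path)
    wait_for_task1 = ExternalTaskSensor(
        task_id="wait_for_task1",
        external_dag_id="dag_a",         # placeholder
        external_task_id="task1",        # placeholder
    )
    skip_wait = EmptyOperator(task_id="skip_wait")
    # Downstream task proceeds after whichever branch actually ran.
    task4 = EmptyOperator(
        task_id="task4",
        trigger_rule=TriggerRule.NONE_FAILED_MIN_ONE_SUCCESS,
    )
    branch >> [wait_for_task1, skip_wait]
    [wait_for_task1, skip_wait] >> task4
```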
-1
votes
1 answer

Which is a more efficient orchestrating mechanism, chaining Databricks notebooks together or using Apache Airflow?

The data size is in the terabytes. I have multiple Databricks notebooks for incremental data load into Google BigQuery for each dimension table. Now, I have to perform this data load every two hours, i.e. run these notebooks. What is a…
-1
votes
1 answer

Issues with importing the ExternalTaskSensor module from airflow.operators.sensors.external_task and triggering an external DAG

I am trying to trigger multiple external DAG Dataflow jobs via a master DAG. I plan to use TriggerDagRunOperator and ExternalTaskSensor. I have around 10 Dataflow jobs - some are to be executed in sequence and some in parallel. For example: I want…
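Note that in Airflow 2 the sensor's import path changed; airflow.operators.sensors is the Airflow 1.x location. A minimal sketch of the two operators, with placeholder DAG and task ids:

```python
# Airflow 2.x import paths (airflow.operators.sensors is the 1.x location).
from airflow.operators.trigger_dagrun import TriggerDagRunOperator
from airflow.sensors.external_task import ExternalTaskSensor

# Trigger a downstream DAG and wait for it to finish (Airflow 2.1+).
trigger_dataflow_dag = TriggerDagRunOperator(
    task_id="trigger_dataflow_dag",
    trigger_dag_id="dataflow_job_1",   # placeholder DAG id
    wait_for_completion=True,
)

# Alternatively, wait for a specific task in another, independently
# scheduled DAG; the sensor matches runs by logical date.
wait_for_job = ExternalTaskSensor(
    task_id="wait_for_job",
    external_dag_id="dataflow_job_1",  # placeholder
    external_task_id="run_dataflow",   # placeholder
)
```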
-1
votes
1 answer

Airflow workers: how do they know what to do? + problem with

I have been working with Airflow (on Cloud Composer) for a year now and I have difficulty finding out how the (Celery) workers know what actions to perform when they receive a task to execute. From what I understand: we put some DAGs in the /dags…
Ferdi777
  • 317
  • 4
  • 14
-1
votes
1 answer

Unable to create Private Cloud Composer Environment

I'm able to create a private IP, VPC-native GKE cluster without any issue. But when I create a Cloud Composer private IP environment using the same network and secondary ranges for Pods and Services that I used for the GKE cluster, it fails with the error message below.…