Questions tagged [google-cloud-data-fusion]

Google Cloud Data Fusion is a fully managed, cloud-native data integration service that helps users efficiently build and manage ETL/ELT data pipelines. Data Fusion has a visual point-and-click interface, transformation blueprints, and connectors to make ETL pipeline development fast and easy. Cloud Data Fusion is based on the open-source CDAP project.

This tag can be added to any questions related to using/troubleshooting Google Cloud Data Fusion.

445 questions
1
vote
0 answers

Send an email when a pipeline fails in Data Fusion

When a pipeline fails during the data merge, I need it to either create a table in BigQuery that records the failure or send me an email. How can I achieve this? I know I can use a condition, but I need to capture the failure event. Does anyone have an idea how to do…
1
vote
1 answer

Cloud Data Fusion: permission denied due to datastream.connectionProfiles.discover

I am trying to create a Cloud Data Fusion replication job from Oracle to BigQuery, and I receive the error below. Failed to connect to the database due to below error : io.grpc.StatusRuntimeException: PERMISSION_DENIED:…
1
vote
1 answer

Cannot connect a public Cloud SQL PostgreSQL instance to a public Cloud Data Fusion instance

I am trying to connect a public Cloud SQL instance and a public Data Fusion instance, but I receive a 403 error. Failed to create connection to database via connection string:…
1
vote
0 answers

Send Email Alerts Using Google Cloud Data Fusion

I am trying to send alert emails after each run in Data Fusion using the Email plugin. However, I always get authentication errors, and I am not sure how to set it up with my Gmail account. Could you please help me with it? Also,…
Yuan
  • 11
  • 1
1
vote
1 answer

Update Google Cloud Data Fusion replication job to reflect the SQL Server table schema

I've created a Data Fusion replication job to replicate some tables in a test database. It works well as long as I don't change the table schema. But when I added a new column, the replication job ignored it. I guess that…
1
vote
1 answer

How to get response from HTTP Sink Plugin in Cloud Data Fusion?

Expertise: I am new to Cloud Data Fusion. What I am trying to achieve: create a data pipeline in Google Cloud Data Fusion that will: Read a file from GCS. Call an HTTP endpoint with the data parsed from GCS. Save the response received from HTTP in the GCS…
1
vote
1 answer

Send data to an HTTP endpoint using a Data Fusion real-time pipeline

I'm creating a real-time Data Fusion pipeline where the sink is an HTTP plugin call to a Vertex AI endpoint in another GCP project. The request body will be provided by a previous step in the pipeline. The HTTP sink plugin being used (HTTP v1.2.2)…
1
vote
0 answers

Google Ads API & Data Fusion error: Exception while downloading report definition :null

I am trying to use Data Fusion to get a detailed report for my Google Ads account at the MCC level. I did the following: obtained a test developer account token on my MCC, generated the refresh token, generated the client ID & client secret, authorised Google Ads API to…
1
vote
1 answer

CDAP: Truncate a long type column representing an epoch in milliseconds to year/month

I'm using CDAP and Cloud Data Fusion 6.6.0. I have a column called ts that holds a timestamp in milliseconds. The type is long. In the next step, I need to group by year and month, so I need to create two new fields year and month or…
angelcervera
  • 3,699
  • 1
  • 40
  • 68
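As a rough illustration of the conversion this question describes, here is a minimal Python sketch that derives year and month from an epoch timestamp in milliseconds. It is not a CDAP or Wrangler solution: the function name `year_month_from_epoch_ms` is hypothetical, and the choice of UTC is an assumption, since the excerpt does not state a time zone.

```python
from datetime import datetime, timezone


def year_month_from_epoch_ms(ts_ms: int) -> tuple[int, int]:
    """Return (year, month) in UTC for an epoch timestamp in milliseconds."""
    # Integer-divide down to whole seconds; the millisecond part cannot
    # affect the year or month, so it is simply dropped.
    dt = datetime.fromtimestamp(ts_ms // 1000, tz=timezone.utc)
    return dt.year, dt.month


# 1_650_000_000_000 ms falls in April 2022 (UTC).
print(year_month_from_epoch_ms(1_650_000_000_000))
```

Grouping by the returned pair is then equivalent to grouping by separate year and month fields.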
1
vote
2 answers

Error during data ingestion from SFTP to GCS or BigQuery using Cloud Data Fusion

I am trying to move CSV files from an SFTP folder to GCS using Data Fusion, but the pipeline fails with the error below: Here are the properties of both FTP and GCS plugins. Surprisingly, I could see the data in PREVIEW mode in all the stages but…
1
vote
0 answers

Cannot connect to a GCP SQL instance from Data Fusion

I have a GCP PostgreSQL instance connected to Data Studio through BigQuery. I want to do exactly the same thing with a GCP SQL Server instance, but I have found that BigQuery cannot connect to SQL Server in the same way. I'm trying to use Data Fusion to send…
1
vote
1 answer

Retry a Google Cloud Data Fusion Pipeline if it fails

We have several Cloud Data Fusion pipelines, and some fail, albeit rarely, for seemingly random reasons. Is there a way to automatically retry a pipeline if it fails, say 3 times?
ConfusedNoob
  • 9,826
  • 14
  • 64
  • 85
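The retry behaviour asked about here can be sketched generically in Python. This is only an outline of the control flow, not a Data Fusion feature or API: `start_run` is a hypothetical callable that the caller would supply to start one pipeline run and report whether it succeeded, and the limit of 3 attempts comes from the question.

```python
from typing import Callable


def run_with_retries(start_run: Callable[[], bool], max_attempts: int = 3) -> int:
    """Call start_run until it returns True; return the number of attempts used.

    Raises RuntimeError if every attempt fails.
    """
    for attempt in range(1, max_attempts + 1):
        # start_run is a caller-supplied placeholder for "run the pipeline once".
        if start_run():
            return attempt
    raise RuntimeError(f"pipeline failed after {max_attempts} attempts")
```

In practice a delay between attempts is usually worth adding, so that a transient failure has time to clear before the next run.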
1
vote
1 answer

Unable to load a CSV file in Data Fusion Wrangler

I am trying to import a file that has more than 100 columns and is 4 MB in size. After applying the operation "parse-as-csv :body ',' true", no data is shown, but if the file size is < 2 MB it works. Please suggest a solution.
1
vote
2 answers

Cloud Data Fusion ETL from Postgres to BigQuery - idempotent load

I'm trying to use Google's Cloud Data Fusion (CDF) to perform an ETL of some OLTP data in Postgres into BigQuery (BQ). We will copy the contents of a few select tables into an equivalent table in BQ every night - we will add one column with the…
ConfusedNoob
  • 9,826
  • 14
  • 64
  • 85
1
vote
0 answers

GCP Cloud Data Fusion error: Spark program 'phase-1' failed with error: canCommit()

I am receiving the following error while executing a data pipeline in GCP Cloud Data Fusion. Spark program 'phase-1' failed with error: canCommit() is called for transaction More information: So the pipeline is responsible for a lift'n'shift operation,…