Questions tagged [google-cloud-data-fusion]

Google Cloud Data Fusion is a fully managed, cloud-native data integration service that helps users efficiently build and manage ETL/ELT data pipelines. Data Fusion has a visual point-and-click interface, transformation blueprints, and connectors to make ETL pipeline development fast and easy. Cloud Data Fusion is based on the open-source CDAP project.

This tag can be added to any question about using or troubleshooting Google Cloud Data Fusion.

445 questions
2
votes
1 answer

Google Data Fusion: "Looping" over input data to then execute multiple Restful API calls per input row

I have the following challenge I would like to solve preferably in Google Data Fusion: I have one web service that returns about 30-50 elements describing an invoice in a JSON payload like this: { "invoice-services": [ { "serviceId":…
JensU
  • 21
  • 1
2
votes
4 answers

Macros in Datafusion using Argument setter

I want to make a Data Fusion pipeline reusable by supplying parameter values through the Argument Setter. As many other answers suggest, I tried implementing this using the reusable pipeline example given in the Google guide. I was not able to pass…
2
votes
1 answer

REST API for Google Cloud Fusion

I am exploring new tools for data pipelines. Trying Google's Data Fusion, I was very surprised not to see any REST connector. REST APIs seem like a very standard way to access data, and I am confused that one is not available. Am I missing…
nate-k
  • 191
  • 2
  • 10
2
votes
2 answers

Problem with creating a data pipeline from SQL Server to BigQuery using cloud data fusion

I am trying to create a data pipeline from SQL Server (on a GCP VM) to BigQuery using Cloud Data Fusion. I have completed all of the setup below: Created a new instance in Cloud Data Fusion. Added this as a service account in IAM &…
2
votes
1 answer

How does regex_path_filter work in GCSFile properties of DATA FUSION pipeline in GCP

In a Data Fusion pipeline on GCP, the GCSFile properties have a field named "Regex path filter". How does it work? I can't find proper documentation on this.
madhusudan
  • 21
  • 1
2
votes
1 answer

Setting up Datafusion instance to connect with secured Dataproc cluster

We have a secured Dataproc cluster, and we are able to SSH into it successfully with individual user IDs using the command: gcloud compute ssh cluster-name --tunnel-through-iap But when we create a profile and attach it to a Data Fusion instance and…
2
votes
3 answers

Convert to date in cloud datafusion

How do we convert a string to a date in Cloud Data Fusion? I have a column with a value such as 20191120 (format yyyyMMdd) that I want to load into a BigQuery table as a date. The table column's data type is also DATE. What I have tried so far is that…
Trishit Ghosh
  • 235
  • 3
  • 10
2
votes
1 answer

Elasticsearch to BigQuery pipeline deployment fails on cloud data fusion instance

I am deploying a Data Fusion pipeline that takes data from an Elasticsearch index and loads it into a BigQuery table. The pipeline simply consists of an Elasticsearch plugin connector feeding a BigQuery connector. When I run the pipeline it generates the…
2
votes
2 answers

Using compressed files with Datafusion

Is there a way to use compressed files with Cloud Data Fusion? I have used Google Storage as a source and placed a gzip file in the preferred location. In the Wrangler transform, I don't see a preview. When I try to select the file using select Data…
Trishit Ghosh
  • 235
  • 3
  • 10
2
votes
1 answer

Unable to upload custom Plugin

I created a custom plugin to upload to the Google Cloud Data Fusion platform, which is based on the CDAP platform. I followed the instructions for developing and deploying plugins, but the upload fails when I try to associate the corresponding JSON…
Vincenzo Maggio
  • 3,787
  • 26
  • 42
2
votes
0 answers

String to Date conversion in wrangler

My raw data is in the format '2019-10-10' in a CSV file. After reading the file, I loaded it into Wrangler for transformation. My target column has the data type DATE. I applied the transformation below: set-column TODATE…
2
votes
1 answer

Cannot connect to on-prem SQL Server with Google Cloud Data Fusion

I am trying to test a connection using Cloud Data Fusion to connect to an on-prem SQL Server. Our GCP Project does not use the default network but rather a custom VPC. It's important to note that security is very important as this database contains…
Chris
  • 31
  • 2
2
votes
1 answer

Salesforce plug-in error in Google Cloud Data Fusion

I'm testing Salesforce connectivity from Google Cloud Data Fusion. I get this error "Error: No discoverable found for request POST /v3/namespaces/system/apps/pipeline/services/studio/methods/v1/contexts/default/validations/stage HTTP/1.1" when…
AugS
  • 21
  • 1
2
votes
2 answers

Can't connect to Salesforce With Google Data Fusion

I am trying to configure the Salesforce connector to read data from Salesforce using Google Data Fusion, but I can't connect to Salesforce. I keep getting a "Connection to salesforce with plugin configurations failed" error message when hitting the "get…
Yaniv
  • 21
  • 3
2
votes
0 answers

FAILED_PRECONDITION: Cannot delete cluster 'cdap-fusionpip-462363d2-a154-11e9-869a-16df235ccdf8' while it has other pending delete operations

When deploying a Data Fusion pipeline, it keeps failing and throwing the following error: com.google.api.gax.rpc.FailedPreconditionException: io.grpc.StatusRuntimeException: FAILED_PRECONDITION: Cannot delete cluster…