Questions tagged [google-cloud-data-fusion]

Google Cloud Data Fusion is a fully managed, cloud-native data integration service that helps users efficiently build and manage ETL/ELT data pipelines. Data Fusion has a visual point-and-click interface, transformation blueprints, and connectors to make ETL pipeline development fast and easy. Cloud Data Fusion is based on the open-source CDAP project.

This tag can be added to any questions related to using/troubleshooting Google Cloud Data Fusion.

445 questions
0
votes
1 answer

Do Google Cloud Data Fusion drafts expire?

Is anyone aware whether saved CDAP drafts have any expiry? I have created a few drafts in one of the Cloud Data Fusion instances, and the same instance is being used by other folks. But after 2-3 days, when I tried to retrieve the draft, I found it…
Psen
  • 5
  • 3
0
votes
1 answer

CDAP API call failing in GCP

I am trying to create a sample pipeline in my Data Fusion instance as part of my project POC. I am using the CDAP API to automate pipeline creation. I am facing an issue while calling the CDAP API below in GCP: curl -H "Authorization: Bearer $(gcloud…
0
votes
2 answers

Unable to delete custom plugin from Data Fusion instance

I tried uploading a custom JAR as a CDAP plugin, and it has a few errors in it. I want to delete that particular plugin and upload a new one. What is the process for this? I tried looking for documentation, but it was not very informative. Thanks in…
0
votes
1 answer

Converting data types of Foreign Keys to use Joiner in Google Cloud Data Fusion Pipeline

I am building a pipeline that connects to an on-prem Oracle DB using the Database plugin, queries two tables (table_a, table_b), and joins those tables using the Joiner plugin before uploading to a BigQuery table. The problem I have now is that the…
Korean_Of_the_Mountain
  • 1,428
  • 3
  • 16
  • 40
0
votes
1 answer

Why is the Data Fusion HTTP plugin URL macro not working?

I am exploring macros in Data Fusion pipelines. I am using the HTTP Sink plugin and trying to enable the macro option for the URL field, like {URL}. When I try to deploy the pipeline, it throws the following error: Failed to configure pipeline: Stage…
0
votes
1 answer

How to run CDAP Data Fusion pipelines sequentially

I have a scenario where I have 5 pipelines that I want to run sequentially, one after the other. Is there any way to do this? I tried reading the documentation, but it wasn't clear. Thanks in advance!
code tutorial
  • 554
  • 1
  • 5
  • 17
0
votes
1 answer

GCP Data Fusion custom plugin upload is too slow

Currently I am using the Basic edition of Data Fusion and I wanted to upload a custom plugin. Uploading the JAR and JSON is too slow: it takes around 10 minutes to upload the plugin, and the entire browser hangs. Has anyone faced this…
code tutorial
  • 554
  • 1
  • 5
  • 17
0
votes
1 answer

Issue connecting a Google Data Fusion pipeline to a MySQL database

I am connecting a MySQL database to Google BigQuery through a Data Fusion pipeline. I used a JDBC driver JAR file, installed it, and entered the details in the pipeline source. When browsing data on the database (MySQL) connection, I entered the details of the host…
0
votes
1 answer

Unable to upload CDAP custom plugin in Data Fusion

I am trying to upload the http-sink plugin using the Data Fusion upload button. I cloned the http-sink repository, made a few very minimal changes, and packaged it as a JAR. Now when I try to upload the JAR and the JSON files using the upload plugin button, I…
0
votes
1 answer

Implementing SCD Type 2 in Data Fusion

I am trying to implement SCD Type 2 in Data Fusion. Can someone help with performing the inserts and updates using the pipeline transformations/actions/conditions to achieve this? I was trying to generate a hash using Wrangler for both source and target and join…
0
votes
2 answers

How to configure JDBC for Cloud Data Fusion to connect to MySQL installed on localhost:3306

I'm trying to connect my local standalone MySQL with Cloud Data Fusion to create and test a data pipeline. I have deployed the driver successfully. Also, I have configured the pipeline properties with the correct values for the JDBC string, user name and password…
0
votes
2 answers

Google Cloud Data Fusion + CI/CD for Data Pipelines

I am just getting started with both, GCP & Google Cloud Data Fusion. Just viewed the intro video. I see that pipelines can be exported. I was wondering how we might promote a pipeline from say, Dev to Prod env? My guess is that after some testing,…
user2849678
  • 613
  • 7
  • 15
0
votes
1 answer

Auto-detect nested JSON response of HTTP plugin in Data Fusion

I'm trying to call an HTTP GET API using the HTTP batch source plugin in Data Fusion. The response of the API is a complex, dynamic, nested JSON, because of which I cannot manually specify the output schema. Is there any way to overcome this problem? Thanks…
code tutorial
  • 554
  • 1
  • 5
  • 17
0
votes
0 answers

Data Fusion pipeline: Wrangler transformation not working

The Data Fusion pipeline source is a CSV file in GCS. In Wrangler, I tried to validate a few columns as below: Mobile_Number (column) --> send to error --> value matches regex expression --> ^[0]?[789]\d{9}$ When similar transformations are added,…
0
votes
0 answers

Pipeline fails to write to BigQuery sink: MapReduce program 'phase-1' failed

I am trying to build a simple pipeline that moves data from our Cloud SQL (MySQL) into BigQuery. All the JDBC driver stuff is working fine (if I use the trash can as a sink, I can see the preview data) and the schema propagated. I created the…
Dino
  • 352
  • 2
  • 8