Questions tagged [google-cloud-data-fusion]

Google Cloud Data Fusion is a fully managed, cloud-native data integration service that helps users efficiently build and manage ETL/ELT data pipelines. Data Fusion has a visual point-and-click interface, transformation blueprints, and connectors to make ETL pipeline development fast and easy. Cloud Data Fusion is based on the open-source CDAP project.

This tag can be added to any questions related to using/troubleshooting Google Cloud Data Fusion.

445 questions
0
votes
2 answers

Getting Null Pointer Exception when mapping SQL Server Database to MySQL Database with MapReduce

I am new to Cloud Data Fusion and am trying to map tables in a SQL Server database to a MySQL database. I have already faced many issues, which I managed to solve, namely: fixed permissions for the service account so it could access all the resources…
Pedromlm
  • 89
  • 1
  • 9
-1
votes
1 answer

How to create a flow to import data from MySQL/PostgreSQL/SQL Server using GCP Cloud Data Fusion

I want to create a logical flow in GCP that imports data from any RDBMS (MySQL/PostgreSQL/SQL Server/Oracle) as a source and loads it into BigQuery as a destination. At present, Cloud Data Fusion seems to be a feasible option. I…
-1
votes
1 answer

Do we need to peer every Data Fusion instance to the Shared VPC? How to avoid the 25-peering limitation

Use case: using GCP Data Fusion as an ETL tool for customers. Source and resources: my resources are on a Shared VPC (Dataproc and the runtime for Data Fusion on subnets taken from this Shared VPC). Based on the Google Data Fusion documentation, I need…
-1
votes
1 answer

How to connect Oracle with GCP Data Fusion

I want to connect my Oracle DB to GCP Data Fusion, but I can't get it to work. I couldn't find the JAR file. I have a table in Oracle and want to load its data into BigQuery, but I don't know what to use. Can you help me?
-1
votes
1 answer

How to build a multi-row formula using Cloud Data Fusion

I am trying to build a new column that has the running total of a specific column. Are there directives available to do this? Any suggestions on how to accomplish this?
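To make the goal concrete, here is a minimal sketch of what a running total over one column means, written in plain Python outside Data Fusion. It is not a Wrangler directive or a Data Fusion plugin; the function name `add_running_total`, the column name `amount`, and the output column `running_total` are hypothetical and chosen only for illustration.

```python
from itertools import accumulate


def add_running_total(rows, column, total_column="running_total"):
    """Return copies of the rows with a cumulative sum of `column` added.

    Rows are processed in the order given, so any required sort
    must happen before calling this function.
    """
    values = [row[column] for row in rows]
    return [
        {**row, total_column: total}
        for row, total in zip(rows, accumulate(values))
    ]


# Hypothetical sample data.
rows = [{"amount": 10}, {"amount": 5}, {"amount": 20}]
result = add_running_total(rows, "amount")
```

Because each output value depends on all preceding rows, a running total needs a defined row order, which is why it differs from an ordinary per-row column transformation.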
-1
votes
1 answer

GCP Data Fusion Oracle connectivity issues

We're currently having issues setting up a connection to Oracle data sources in Data Fusion (via JDBC), and I'm not sure if I am missing something. A bit of background: we initially had issues connecting to any data source, so in GCP I set up a VM…
Adam Briers
  • 101
  • 3
  • 14
-1
votes
1 answer

How to use the regex split filter in Data Fusion?

I'm using the Google Cloud Platform Data Fusion product. Is a regular expression supposed to be entered in the Regex Path Filter field in the Advanced section of the GCS properties, e.g. /[0-9]? But if I enter a value in the Regex Path Filter and run the…
Quack
  • 680
  • 1
  • 8
  • 22
-2
votes
2 answers

I am getting extra 0s when using the MD5 hash function in Data Fusion Wrangler

When I use the MD5 hash function in Wrangler like this: set-column :numbertohash "12345" hash :numbertohash MD5 true, the output has 32 0's prepended at the start. The expected MD5 for 12345 is ::…
naveen
  • 29
  • 4
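For comparison with the Wrangler output, a reference digest can be computed outside Data Fusion. The following is a minimal sketch using Python's standard `hashlib`; the helper name `md5_hex` is hypothetical, and UTF-8 encoding of the input string is an assumption. It says nothing about how the Wrangler `hash` directive encodes its result internally.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the MD5 digest of `text` as a lowercase hex string.

    Assumes the input should be hashed as UTF-8 bytes.
    """
    return hashlib.md5(text.encode("utf-8")).hexdigest()


digest = md5_hex("12345")
print(len(digest), digest)
```

An MD5 digest is 128 bits, so its hex form is always 32 characters long; a longer value indicates padding or a different encoding of the same bytes.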
-3
votes
1 answer

Is there a CDAP / Data Fusion plugin for transforming to Delta (Delta Lake) format?

I'd like to use Data Fusion on GCP as my ETL pipeline manager and store the raw data in GCS using the Delta format. Has anyone done this, or does a plugin exist?