I am looking to perform the equivalent of SQL UNION within one pipeline, in Data Fusion. I do not see a plugin named UNION.

Can I achieve this functionality using another existing plugin, perhaps one designed for a different purpose?

What are my alternatives, if this function is not directly made available within Data Fusion?

The data is all structured, extracted from CSV files and relational tables. The sources are assumed to share the same schema, which is the requirement for SQL UNION.

My current assumption is that I would stage the data in a BigQuery table using APPEND mode and then perform additional transformations in a separate pipeline. Is that the right approach?

Ravi
  • Could you give more information about your data? What type of files are you combining? Do both files have the same structure/schema? – Sakshi Gatyan Mar 02 '23 at 14:51
  • Data is all structured, extracted from CSV and relational tables. For the schema, it's assumed that they have the same schema, which is the requirement for SQL UNION. – Ravi Mar 02 '23 at 16:04
  • Did you have a look at this [thread](https://stackoverflow.com/questions/55824910/how-to-union-multiple-csv-files-with-the-same-structure-in-cloud-data-fusion)? – Sakshi Gatyan Mar 06 '23 at 08:18
  • Though there is no explicit UNION plugin, Wrangler can accept two input links and write to a single sink, which achieves UNION functionality. Please refer to the link in Sakshi Gatyan's comment. – Ravi Mar 17 '23 at 02:47

1 Answer

By default, if you read from multiple sources that have the same schema and connect them to the same downstream stage, their records are combined as a union. If the schemas differ, you can use the Wrangler plugin to make them match before writing to the sink. Hope this helps.
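
One thing worth checking: as far as I understand it, merging inputs this way keeps duplicate rows, so it behaves like SQL UNION ALL rather than a deduplicating UNION. The sketch below is plain Python, not Data Fusion code, and only illustrates the difference between the two semantics. The function names (`union_all`, `union_distinct`) and the sample records are hypothetical, made up for this example.

```python
from typing import Dict, List

Record = Dict[str, object]


def union_all(*sources: List[Record]) -> List[Record]:
    """Concatenate records from every source, keeping duplicates (UNION ALL)."""
    merged: List[Record] = []
    for source in sources:
        merged.extend(source)
    return merged


def union_distinct(*sources: List[Record]) -> List[Record]:
    """Concatenate, then drop exact duplicate rows (SQL UNION semantics).

    Assumes every field value is hashable; keeps the first occurrence.
    """
    seen = set()
    result: List[Record] = []
    for record in union_all(*sources):
        # Order-independent, hashable fingerprint of the row.
        key = tuple(sorted(record.items()))
        if key not in seen:
            seen.add(key)
            result.append(record)
    return result


# Hypothetical sample data: two sources with the same schema.
csv_rows = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
table_rows = [{"id": 2, "name": "b"}, {"id": 3, "name": "c"}]

all_rows = union_all(csv_rows, table_rows)            # 4 rows, id 2 appears twice
distinct_rows = union_distinct(csv_rows, table_rows)  # 3 rows
```

If you need strict UNION semantics in the pipeline, you would probably have to add a deduplication step after the inputs are merged.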

vinisha