Why Data Fusion? Because I need to run several more steps (such as running Dataproc clusters), insert the results into databases, and do all of it on a schedule. Also, the data volume could grow to tens of TB or shrink to tens of GB.
1 Answer
Stacking several multi-TB files into one isn't a good idea: Cloud Storage limits each object to 5 TB.
I don't know why you need to stack the files.
Maybe BigQuery can be a solution: you can load your CSV files easily and then query a subset of the data for further processing. But querying tens of TB is expensive ($5 per TB scanned)!
For more help, add more detail about what you want to achieve.
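To get a feel for the cost, here is a rough sketch of the arithmetic in Python. The function name is made up for illustration, and the default rate is just the $5 per TB figure quoted above, so check the current BigQuery pricing before relying on it.

```python
def estimate_scan_cost(tb_scanned: float, usd_per_tb: float = 5.0) -> float:
    """Estimate the on-demand cost (USD) of a query that scans `tb_scanned` TB.

    Assumption: a flat per-TB rate; the default is the figure quoted in this
    answer, not an authoritative price.
    """
    if tb_scanned < 0 or usd_per_tb < 0:
        raise ValueError("tb_scanned and usd_per_tb must be non-negative")
    return tb_scanned * usd_per_tb


if __name__ == "__main__":
    # Compare a small dataset (50 GB) with the larger volumes from the question.
    for tb in (0.05, 10, 50):
        print(f"{tb} TB scanned: ${estimate_scan_cost(tb):,.2f}")
```

At that rate a single full scan of 10 TB costs about $50, so repeated full scans on a schedule add up quickly.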

guillaume blaquiere