I tried to concatenate multiple CSV files with `gsutil cat` and `gsutil compose`, but the header row is repeated in the output file, which results in data discrepancies.
- Does this answer your question? [How can I append data to a file on google cloud storage](https://stackoverflow.com/questions/58947608/how-can-i-append-data-to-a-file-on-google-cloud-storage) – Chris32 Sep 22 '21 at 08:51
- gsutil does not process the data within files or objects. You will need to use a different tool to concatenate CSV-formatted files. This is a very simple task for a Python program. – John Hanley Sep 22 '21 at 15:57
- What about avoiding the header lines altogether? If, for example, you use `bq extract ... gs://table-name*.csv.gz` to generate multiple gz files, you can specify the `--noprint_header` option. – Ezequiel Muns Feb 15 '22 at 05:36
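The Python approach suggested in the comments could look roughly like the sketch below. It is only an illustration, not something posted in the thread: it assumes the CSV parts have already been downloaded to local disk, that every part starts with the same header row, and that the files are UTF-8 encoded. The function name `concat_csv` and its parameters are hypothetical.

```python
import csv
from typing import Iterable


def concat_csv(inputs: Iterable[str], output: str) -> int:
    """Concatenate local CSV files, writing the header row only once.

    Assumes every non-empty input starts with the same header row.
    Returns the number of data rows written.
    """
    rows_written = 0
    header = None
    with open(output, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        for path in inputs:
            with open(path, newline="", encoding="utf-8") as src:
                reader = csv.reader(src)
                first = next(reader, None)
                if first is None:
                    continue  # skip completely empty files
                if header is None:
                    # First non-empty file: keep its header for the output.
                    header = first
                    writer.writerow(header)
                elif first != header:
                    # Refuse to merge files whose columns do not match.
                    raise ValueError(f"{path}: header differs from the first file")
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

One possible workflow is to copy the parts down with `gsutil cp`, call `concat_csv(["part-0.csv", "part-1.csv"], "merged.csv")` (file names here are placeholders), and upload the merged file again. Parsing with the `csv` module instead of dropping the first text line keeps quoted fields that contain newlines intact.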
1 Answer
You can't do it directly with gsutil, but I wrote an article where I use BigQuery to (try to) solve this problem.

guillaume blaquiere