
I have a GCS bucket that gets ~10k to 20k new files daily.

I want to set up a BQ data transfer to load the new files into my table each day.

Given the large number of files, it runs up against the quotas and gives me this error: Error status: Transfer Run limits exceeded. Max size: 15.00 TB. Max file count: 10000. Found: size = 24448691 B (0.00 TB) ; file count = 19844.

Is there a way to avoid these quotas?
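For context, the transfer is set up roughly like this (a minimal sketch using the Python client; the project, dataset, table, and bucket names are placeholders, and the params reflect my understanding of the GCS transfer options):

```python
from google.cloud import bigquery_datatransfer

client = bigquery_datatransfer.DataTransferServiceClient()

# Daily GCS -> BigQuery transfer; all names and paths below are placeholders.
transfer_config = bigquery_datatransfer.TransferConfig(
    destination_dataset_id="my_dataset",
    display_name="daily-gcs-load",
    data_source_id="google_cloud_storage",
    params={
        "data_path_template": "gs://my-bucket/incoming/*.csv",
        "destination_table_name_template": "my_table",
        "file_format": "CSV",
    },
    schedule="every 24 hours",
)

transfer_config = client.create_transfer_config(
    parent=client.common_project_path("my-project"),
    transfer_config=transfer_config,
)
print(f"Created transfer config: {transfer_config.name}")
```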

A Clockwork Orange

1 Answer


According to the GCP documentation, BigQuery transfer jobs are already covered by the predefined quotas and limits on load jobs.

Based on the provided information, the "Maximum number of source URIs in job configuration" limit seems to be the most likely root cause of this reported BigQuery transfer issue, since the documented limit is 10,000 and the current file count is 19,844.

In addition to the answer posted by @Kevin Quinzel in this Stack thread, while waiting for the feature request to be resolved, I've noticed that the vendor offers a sharding whitelist feature that allows processing of more than 10,000 files: with it, the BigQuery Transfer Service can automatically launch multiple BQ load jobs, sharding the files across them to mitigate the 10,000-file limit.

To enable this feature for a particular GCP project, you may be asked to file a separate support case with the vendor.
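Until that whitelist is enabled, a possible manual workaround is to skip the transfer service for oversized days and shard the load yourself: list the new objects, batch the URIs into groups below the 10,000 limit, and submit one load job per batch. A rough sketch with the Python clients (bucket, prefix, dataset, and table names are placeholders, and it assumes CSV files):

```python
from google.cloud import bigquery, storage

PROJECT = "my-project"           # placeholder project ID
BUCKET = "my-bucket"             # placeholder bucket with the daily files
PREFIX = "incoming/2023-01-01/"  # placeholder prefix for one day's files
TABLE = "my_dataset.my_table"    # placeholder destination table
BATCH_SIZE = 9_000               # stay safely under the 10,000-URI limit

storage_client = storage.Client(project=PROJECT)
bq_client = bigquery.Client(project=PROJECT)

# Collect the day's file URIs from GCS.
uris = [
    f"gs://{BUCKET}/{blob.name}"
    for blob in storage_client.list_blobs(BUCKET, prefix=PREFIX)
]

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
    autodetect=True,  # or supply an explicit schema
)

# Shard the URIs across multiple load jobs to stay under the per-job limit.
for i in range(0, len(uris), BATCH_SIZE):
    batch = uris[i : i + BATCH_SIZE]
    job = bq_client.load_table_from_uri(batch, TABLE, job_config=job_config)
    job.result()  # wait for each batch to finish before starting the next
    print(f"Loaded {len(batch)} files into {TABLE} (job {job.job_id})")
```

This keeps each load job below the documented per-job file limit, at the cost of orchestrating the listing and batching yourself (e.g. from a Cloud Function or a Composer DAG).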

Nick_Kh