
I'm creating a Dataprep flow that imports a CSV into BigQuery. This works fine, but it takes too long, even for very small files. Is there a way to add more workers to the job? maxNumWorkers is always 1 by default.

Br Cris

i_am_cris
  • How long does this usually take? What's a normal size for your CSV files? The reason I'm asking is that Dataflow and BQ have some fixed startup time of 3 minutes each (at least). If your files are very large, then it will be worth it to add more workers, but if not, then that 6-minute limit will be as low as you can get. – Pablo Sep 04 '18 at 22:09

1 Answer


The first time Dataprep executes a Dataflow job, it uses the default settings. However, you can re-run these jobs with different parameters directly from Dataflow by using its templates. For instance, you can call the REST API and set the numWorkers field to specify the number of workers for the job; when it is unspecified, the service attempts to choose a reasonable default. For more information regarding the REST API, you can review this document.

Keep in mind that re-running from a template has limitations.
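As a rough illustration, here is a minimal sketch of launching the template Dataprep left behind via the Dataflow templates REST endpoint, passing numWorkers in the environment. The project ID, bucket/template path, and job name below are placeholders you would replace with the values from your own original job:

```python
import google.auth
from google.auth.transport.requests import AuthorizedSession

# Hypothetical values: substitute your own project and the template
# path that Dataprep wrote for the original job.
PROJECT = "my-project"
TEMPLATE_GCS_PATH = "gs://my-bucket/dataprep/templates/my-template"

# Application-default credentials with the cloud-platform scope.
credentials, _ = google.auth.default(
    scopes=["https://www.googleapis.com/auth/cloud-platform"]
)
session = AuthorizedSession(credentials)

# Launch the template; numWorkers in the environment sets the initial
# worker count (the service picks a default when it is unspecified).
response = session.post(
    f"https://dataflow.googleapis.com/v1b3/projects/{PROJECT}/templates:launch",
    params={"gcsPath": TEMPLATE_GCS_PATH},
    json={
        "jobName": "dataprep-rerun-with-more-workers",
        "parameters": {},  # copy the parameters from the original job
        "environment": {"numWorkers": 5},
    },
)
response.raise_for_status()
print(response.json()["job"]["id"])
```

Note that the re-run happens outside Dataprep, so any parameters the original job received have to be copied over by hand.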

F10