I'm using a few datasets from a government API in my ML model. The problem is that their server is not that great, to put it nicely: whenever my pipeline pulls all the datasets from their API at once, their server goes down for a few minutes.
This is how their data is represented in our catalog.yml:
external-safra-cana:
  type: api.APIDataSet
  url: https://apisidra.ibge.gov.br/values/t/6588/p/all/v/allxp/c48/39456/n3/all
external-safra-algodao:
  type: api.APIDataSet
  url: https://apisidra.ibge.gov.br/values/t/6588/p/all/v/allxp/c48/39429/n3/all
external-safra-arroz:
  type: api.APIDataSet
  url: https://apisidra.ibge.gov.br/values/t/6588/p/all/v/allxp/c48/39432/n3/all
external-safra-milho1:
  type: api.APIDataSet
  url: https://apisidra.ibge.gov.br/values/t/6588/p/all/v/allxp/c48/39441/n3/all
external-safra-milho2:
  type: api.APIDataSet
  url: https://apisidra.ibge.gov.br/values/t/6588/p/all/v/allxp/c48/39442/n3/all
What I want is this: if the data fails to download, sleep for a few seconds and retry. I could not find anything like that in the documentation. Is there a way to get this behavior from the APIDataSet?
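
To make the behavior I'm after concrete, here is a rough sketch in plain Python. It is not Kedro API: `retry_with_sleep` and its parameters are hypothetical names I made up for illustration, and `load` stands for whatever callable actually fetches one dataset.

```python
import time


def retry_with_sleep(load, attempts=3, delay_seconds=5.0, sleep=time.sleep):
    """Call load() until it succeeds, sleeping between failed attempts.

    Hypothetical helper: `load` is any zero-argument callable that raises
    on a failed download. `sleep` is injectable so the wait can be faked.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return load()
        except Exception:
            # Out of attempts: let the last error propagate to the caller.
            if attempt == attempts:
                raise
            # Give the remote server a moment to recover before retrying.
            sleep(delay_seconds)
```

In a Kedro project I assume `load` could be something like `lambda: catalog.load("external-safra-cana")`, but what I'm really hoping for is a way to get this from the catalog entry itself.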