I'm in the process of converting a Makefile-based data workflow to dvc. I have a Google spreadsheet that I'm using in a data workflow to make it easy to update a few things in a makeshift database. Currently this works with something like this:
# Makefile
data.csv:
	curl -L "https://docs.google.com/spreadsheets/d/MY-GOOGLE-DOC-ID/export?exportFormat=csv" > data.csv
Of course, I can incorporate the same step into my dvc pipeline directly with dvc run, but my understanding is that something like dvc import-url would be more appropriate. However, when I try it I get an error:
$ poetry run dvc import-url https://docs.google.com/spreadsheets/d/MY-GOOGLE-DOC-ID/export?exportFormat=csv data.csv
Importing 'https://docs.google.com/spreadsheets/d/MY-GOOGLE-DOC-ID/export?exportFormat=csv' -> 'data.csv'
ERROR: unexpected error - 'NoneType' object has no attribute 'endswith'
My guess is that this happens because the response from the Google Spreadsheet export URL doesn't have a filename suffix associated with it. Is there a way to work around this problem? Is there a better way to pull data from a Google spreadsheet into a dvc workflow?
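For reference, the fallback I have in mind if dvc import-url can't be made to work is roughly the sketch below. The helper names sheet_export_url and fetch_sheet are placeholders of my own, and the dvc run invocation in the comment assumes a DVC release that still ships dvc run with the -n, -o and --always-changed options; I haven't confirmed that this is the recommended approach.

```shell
#!/bin/sh
# Hypothetical helper: build the CSV export URL for a spreadsheet ID.
sheet_export_url() {
    printf 'https://docs.google.com/spreadsheets/d/%s/export?exportFormat=csv\n' "$1"
}

# Hypothetical helper: download the sheet ($1 = spreadsheet ID, $2 = output file).
# The URL is quoted so the shell never treats '?' as a glob character.
fetch_sheet() {
    curl -fsSL "$(sheet_export_url "$1")" -o "$2"
}

# Intended use as a pipeline stage (assumption: a DVC version with `dvc run`).
# --always-changed is there because the stage has no tracked dependencies,
# so DVC would otherwise consider it up to date after the first run:
#   dvc run -n fetch_sheet --always-changed -o data.csv \
#       "curl -fsSL '$(sheet_export_url MY-GOOGLE-DOC-ID)' -o data.csv"
```

This keeps the download inside the pipeline, but unlike import-url it doesn't let DVC track the remote source itself, which is why I'd still prefer a working import-url if there is one.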