I'm on a team using dvc with git to version-control data files. We are using dvc 1.3.1 with an S3 bucket remote. I'm getting this error when executing dvc fetch or dvc pull on a colleague's branch:
ERROR: failed to fetch data from the cloud - DVC-file 'C:\Users\blah\Documents\repo\data\processed_data.dvc' format error: extra keys not allowed @ data['outs'][0]['size']
When I check the dvc file for a cached file that gives me no problems, I see this:
md5: ded591aacbe363f0518ceb9c3bc1836b
outs:
- md5: efdab20e8b59903b9523cc188ff727e5
  path: completion_header.p
  cache: true
  metric: false
  persist: false
but a problematic file only has this:
outs:
- md5: f4e15187d9a0bbb328e629eabd8d1784.dir
  size: 112007
  nfiles: 3
  path: processed_data
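To make the difference between the two entries explicit, here is a minimal sketch that compares the keys of the two outs entries shown above. The dictionaries are copied by hand from the two files, and the helper name extra_keys is hypothetical (it is not part of dvc). It only shows which keys the problematic entry has that the working one lacks; it does not explain why dvc 1.3.1 rejects them.

```python
# Entries copied by hand from the two .dvc files shown above.
working_out = {
    "md5": "efdab20e8b59903b9523cc188ff727e5",
    "path": "completion_header.p",
    "cache": True,
    "metric": False,
    "persist": False,
}
problem_out = {
    "md5": "f4e15187d9a0bbb328e629eabd8d1784.dir",
    "size": 112007,
    "nfiles": 3,
    "path": "processed_data",
}


def extra_keys(reference, candidate):
    """Return the keys present in candidate but absent from reference (hypothetical helper)."""
    return sorted(set(candidate) - set(reference))


print(extra_keys(working_out, problem_out))  # ['nfiles', 'size']
```

The size key reported here is the one named in the error message (data['outs'][0]['size']); nfiles is the other key that appears only in the problematic file.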
In all cases, files are added to dvc with the command dvc add %dirname%. This is the second time I've seen this on a colleague's branch (2 different people).
Since posting, I have realized that my colleague dvc'd a directory. I have attempted creating the directory first, then calling dvc fetch, but get the same error.