Data Version Control (DVC) is an open-source version control system for ML and data science projects. Use this tag for questions related to DVC usage and workflows.
Questions tagged [dvc]
138 questions
1
vote
1 answer
DVC error: scmrepo.exceptions.SCMError: Empty git repo
I tried initializing DVC and got this error.
I am not actively using Git to track anything, although Git is initialised. I am trying to create some plots:
from dvclive import Live
live = Live("evaluation2/metrics")
Does anyone have any…

Kaiser
1
vote
1 answer
How to change or specify a DVC experiment name?
How do I change the name of the experiment? I tried using dvc exp run -n to name the project and then used Git to push to GitHub. However, the experiment name is still the SHA.
Tried: I tried using dvc exp run -n to name the project and then used Git to push to…

cyMLOps
1
vote
0 answers
DVC get problems with the URL
I'm trying to obtain the diamonds.csv dataset contained here through dvc get, using:
$ dvc get https://github.com/tidyverse/ggplot2 data-raw/diamonds.csv -o data/dataset/diamonds.csv
But I obtained the following error:
Which summarise should be:…

paolopazzo
1
vote
1 answer
How to disable DVC git hooks
DVC has Git hooks which are installed with dvc install. The hooks were working fine but after an error with dvc push and the DVC remote, I cannot git push because before git push gets executed, dvc push runs and generates an error. Which means I…
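A minimal sketch of one way to switch the hook off, assuming the hooks were written to the default .git/hooks directory (that is, `dvc install` was run without the pre-commit tool option and `core.hooksPath` is not set). The helper name `disable_hook` and the `.disabled` suffix are hypothetical choices for illustration. It only renames the file, so the change can be undone by renaming it back.

```python
from pathlib import Path


def disable_hook(repo_root: Path, hook_name: str = "pre-push") -> bool:
    """Rename .git/hooks/<hook_name> so Git no longer runs it.

    Returns True if a hook file was found and renamed, False otherwise.
    Assumption: hooks live in the default .git/hooks directory.
    """
    hook = repo_root / ".git" / "hooks" / hook_name
    if not hook.is_file():
        return False
    # Git only runs hooks whose file name matches exactly, so any
    # suffix is enough to deactivate it without deleting anything.
    hook.rename(hook.with_name(hook_name + ".disabled"))
    return True
```

Calling `disable_hook(Path("."))` from the repository root would deactivate the pre-push hook. For a one-off push, `git push --no-verify` skips the pre-push hook without touching any files.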

Bex T.
1
vote
1 answer
DVC remove unused file from remote repository S3
I'm trying to remove one file, which is no longer tracked by DVC, from an S3 remote.
So, I did:
1. dvc remove .dvc file
2. git add & commit the .gitignore and .dvc files
3. run dvc gc -c --workspace
However, the process of deleting 1 file (13KB) took 6…

wumbow
1
vote
1 answer
dvc push certificate verify fail for gcp
Trying to push data into a bucket in Google Cloud Platform (GCP) from Visual Studio via dvc push
ERROR:
Tried:
gcloud auth login
gcloud auth application-default login
gsutil ls
pip install dvc[gs]
And I am in the corresponding project and bucket…

MLB9
1
vote
1 answer
Experiment tracking for multiple ML independent models using WandB in a single main evaluation
Can you recommend, from your experience, a convenient experiment-tracking and versioning tool for "multiple independent models, but one input -> multi-models -> one output", in order to get a single main evaluation and conveniently compare…

AlexeyPrikhodko
1
vote
1 answer
Python, Using dvc, How does it work? Does it keep all the data files versions? Can it lead to extra cloud charges?
I am considering learning DVC (https://dvc.org/), but before that I have some questions about using DVC with the cloud:
Does DVC save all the different versions of the dataset?
Does DVC support all data file formats (csv, feather)?
Can the usage…

Ilan Geffen
1
vote
1 answer
How to efficiently use S3 remote with DVC among multiple developers with different aws configs?
The DVC remote configuration allows defining a profile for the AWS CLI to use. However, some developers might have their local AWS CLI configuration use different profiles whose names they find helpful.
Is there a way to override the profile used by…
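One possible approach, sketched under assumptions: `dvc remote modify` accepts a `--local` flag that writes the setting to .dvc/config.local, which Git ignores, so each developer can keep a personal `profile` value without changing the shared config. The remote name `storage`, the profile name `alice-dev`, and the helper names below are hypothetical.

```python
import subprocess
from typing import List


def local_profile_command(remote: str, profile: str) -> List[str]:
    """Build the DVC command that sets a per-developer AWS profile."""
    return ["dvc", "remote", "modify", "--local", remote, "profile", profile]


def set_local_profile(remote: str, profile: str, dry_run: bool = True) -> List[str]:
    """Return the command; run it only when dry_run is False.

    Running it requires DVC to be installed and an initialised DVC repo.
    """
    cmd = local_profile_command(remote, profile)
    if not dry_run:
        subprocess.run(cmd, check=True)
    return cmd
```

For example, `set_local_profile("storage", "alice-dev", dry_run=False)` would store the profile only in that developer's local config, leaving the committed .dvc/config unchanged.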

Edmondo
1
vote
2 answers
ERROR: Cannot add 'folder-path', because it is overlapping with other DVC tracked output:
Goal: add, commit, and push all contents of project_model/data/ to dvcstore.
I don't have any .dvc files in my project.
$ dvc add ./project_model/data/
ERROR: Cannot add '/home/me/PycharmProjects/project/project_model/data/images', because it is…

DanielBell99
1
vote
1 answer
DVC imports authentication to blob storage
I'm using DVC to track and version data that is stored locally on the file system and in Azure Blob storage.
My setup is as follows:
DataProject1 uses a local file location as a remote, so it does not require any…

Giuseppe Romagnuolo
1
vote
1 answer
DVC | Permission denied ERROR: failed to reproduce stage: failed to run: .py, exited with 126
Goal: run .py files via dvc.yaml.
There are stages before it, in dvc.yaml, that don't produce the error.
dvc exp run:
(venv) me@ubuntu-pcs:~/PycharmProjects/project$ dvc exp run
Stage 'inference' didn't change, skipping
Running stage 'load_data':
>…
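Exit status 126 conventionally means the shell found the command but could not execute it, which often points to a missing execute bit. The stage's cmd is not shown in the excerpt, so the following is only a sketch, assuming the stage invokes the script directly (for example `./some_script.py`) rather than through `python some_script.py`. The helper name `ensure_executable` is hypothetical.

```python
import stat
from pathlib import Path


def ensure_executable(script: Path) -> bool:
    """Add the owner-execute bit if it is missing.

    Returns True if the file mode was changed, False if the script
    was already executable by its owner.
    """
    mode = script.stat().st_mode
    if mode & stat.S_IXUSR:
        return False
    # Keep all existing permission bits and add execute for the owner.
    script.chmod(mode | stat.S_IXUSR)
    return True
```

An alternative that avoids permissions altogether is to call the interpreter explicitly in the stage command, so the script file itself does not need to be executable.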

DanielBell99
1
vote
1 answer
DVC shows files not tracked in source control in Visual Studio Code
I'm using the DVC extension in VS Code inside a Python project. The problem is that DVC shows files it does not track in the source control panel, as in the following picture.
DVC tracks only the data folder and not the src folder. How can I fix it? Have you…

Will
1
vote
1 answer
Python: SSL certificate verify failed
I have installed DVC on my Ubuntu 18.04 LTS system, and while trying to download the data files from GitHub using DVC, it fails with the error below.
$ dvc get https://github.com/iterative/dataset-registry get-started/data.xml -o data/data.xml…

user4948798
1
vote
0 answers
How to fix DVC error 'FileNotFoundError: [Errno 2] No such file or directory' in GitHub Actions
Trying to pull a folder with test data into a GitHub Actions container, I get
FileNotFoundError: [Errno 2] No such file or directory
I tried running dvc checkout --relink locally, but that did not work. I am using Google Drive for the data repository…

Soerendip