Questions tagged [dvc]

Data Version Control (DVC) is an open-source version control system for ML and data science projects. Use this tag for questions related to DVC usage and workflows.

138 questions
1 vote, 1 answer

DVC error: scmrepo.exceptions.SCMError: Empty git repo

I tried initializing DVC and got this error. I am not actively using Git to track anything, although Git is initialized. I am trying to create some plots: from dvclive import Live; live = Live("evaluation2/metrics") Does anyone have any…
Kaiser • 187 • 8
1 vote, 1 answer

How to change or specify a DVC experiment name?

How do I change the name of an experiment? I tried to use dvc exp run -n to name the project and then used Git to push to GitHub. However, the experiment name is still a SHA. Tried: I tried to use dvc exp run -n to name the project then use git to push to…
cyMLOps • 23 • 3
1 vote, 0 answers

dvc get problems with the URL

I'm trying to obtain the diamonds.csv dataset contained here through dvc get, using $ dvc get https://github.com/tidyverse/ggplot2 \ data-raw/diamonds.csv -o data/dataset/diamonds.csv But I obtained the following error: which, summarized, should be:…
paolopazzo • 277 • 2 • 14
1 vote, 1 answer

How to disable DVC git hooks

DVC has Git hooks, which are installed with dvc install. The hooks were working fine, but after an error with dvc push and the DVC remote, I cannot git push: before git push gets executed, dvc push runs and generates an error. Which means I…
Bex T. • 1,062 • 1 • 12 • 28
1 vote, 1 answer

DVC remove unused file from remote repository S3

I'm trying to remove one file from an S3 remote repository that is no longer tracked by DVC. So I did the following: (1) dvc remove on the .dvc file; (2) git add and commit the .gitignore and .dvc files; (3) ran dvc gc -c --workspace. However, the process of deleting 1 file (13KB) took 6…
wumbow • 11 • 3
1 vote, 1 answer

dvc push certificate verify failed for GCP

Trying to push data into a bucket in Google Cloud Platform (GCP) from Visual Studio via dvc push. ERROR: Tried: gcloud auth login; gcloud auth application-default login; gsutil ls; pip install dvc[gs]. And I am in the corresponding project and bucket…
1 vote, 1 answer

Experiment tracking for multiple ML independent models using WandB in a single main evaluation

Can you recommend, from your experience, a convenient experiment-tracking and versioning tool for the case of "Multi independent models, but one input->multi-models->one output", in order to get a single main evaluation and conveniently compare…
1 vote, 1 answer

Python, Using dvc, How does it work? Does it keep all the data files versions? Can it lead to extra cloud charges?

I am considering learning DVC (https://dvc.org/), but before that I have some questions about using DVC with the cloud: Does DVC save all the different versions of the dataset? Does DVC support all data file formats (CSV, Feather)? Can the usage…
1 vote, 1 answer

How to efficiently use S3 remote with DVC among multiple developers with different aws configs?

The DVC remote configuration allows defining a profile for the AWS CLI to use. However, some developers might have their local AWS CLI configuration use different profiles whose names they find helpful. Is there a way to override the profile used by…
Edmondo • 19,559 • 13 • 62 • 115
1 vote, 2 answers

ERROR: Cannot add 'folder-path', because it is overlapping with other DVC tracked output:

Goal: add, commit, and push all contents of project_model/data/ to dvcstore. I don't have any .dvc files in my project. $ dvc add ./project_model/data/ ERROR: Cannot add '/home/me/PycharmProjects/project/project_model/data/images', because it is…
DanielBell99 • 896 • 5 • 25 • 57
1 vote, 1 answer

DVC imports authentication to blob storage

I'm using DVC to track and version data that is stored locally on the file system and in Azure Blob Storage. My setup is as follows: DataProject1 uses a local file location as a remote, so it does not require any…
Giuseppe Romagnuolo • 3,362 • 2 • 30 • 38
1 vote, 1 answer

DVC | Permission denied ERROR: failed to reproduce stage: failed to run: .py, exited with 126

Goal: run .py files via dvc.yaml. There are stages before it in dvc.yaml that don't produce the error. dvc exp run: (venv) me@ubuntu-pcs:~/PycharmProjects/project$ dvc exp run Stage 'inference' didn't change, skipping Running stage 'load_data': >…
DanielBell99 • 896 • 5 • 25 • 57
1 vote, 1 answer

DVC shows files not tracked in source control in Visual Studio Code

I'm using the DVC extension in VS Code inside a Python project. The problem is that DVC shows files not tracked by DVC in the source control panel, as in the following picture. DVC tracks only the data folder and not the src folder. How can I fix it? Have you…
Will • 1,619 • 5 • 23
1 vote, 1 answer

Python: SSL certificate verify failed

I have installed DVC on my Ubuntu 18.04 LTS system, and while trying to download the data files from GitHub using DVC, it fails with the error below. $ dvc get https://github.com/iterative/dataset-registry get-started/data.xml -o data/data.xml…
user4948798 • 1,924 • 4 • 43 • 89
1 vote, 0 answers

How to fix DVC error 'FileNotFoundError: [Errno 2] No such file or directory' in GitHub Actions

Trying to pull a folder with test data into a GitHub Actions container, I get FileNotFoundError: [Errno 2] No such file or directory. I tried running dvc checkout --relink locally, but that did not work. I am using Google Drive for the data repository…
Soerendip • 7,684 • 15 • 61 • 128