I want to run some experiments in DVC. But when I set the experiment parameter values, DVC deletes the file 'params.yaml', and the experiment is not added to the queue.

A simplified example. Python file 'test.py':

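import json

import numpy as np
import yaml

# Load the parameters of the "test" stage and close the file afterwards.
with open('params.yaml') as params_file:
    params = yaml.safe_load(params_file)["test"]

precision = np.random.random()
recall = params['value']
accuracy = np.random.random()

rows = {'precision': precision,
        'recall': recall,
        'accuracy': accuracy}

# Write the metrics file declared in dvc.yaml.
with open(params['metrics_path'], 'w') as outfile:
    json.dump(rows, outfile)

# Scale the array before converting it to a list;
# `10 * array.tolist()` would repeat the list ten times instead.
fpr = (10 * np.random.random(10)).tolist()
tpr = (10 * np.random.random(10)).tolist()

# Write one {"fpr", "tpr"} point per row for the DVC plot.
with open('plot.json', 'w') as outfile2:
    json.dump(
        {"roc": [{"fpr": f, "tpr": t} for f, t in zip(fpr, tpr)]},
        outfile2,
    )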

params.yaml:

test:
  metrics_path: "scores.json"
  value: 1

dvc.yaml:

stages:
  test:
    cmd: python test.py
    deps:
    - test.py
    params:
    - test.metrics_path
    - test.value
    metrics:
    - scores.json:
        cache: false
    plots:
    - plot.json:
        cache: false
        x: fpr
        y: tpr

This is strange behavior. Is it possible to fix it?

Alimagadov K.
  • Sorry, I cannot reproduce it. What version of `dvc` are you using (output of `dvc doctor`)? It runs as expected when I run `dvc exp run -S test.value=2`, and the result from `dvc exp show` is also correct. – karajan1001 Jun 02 '22 at 09:40
  • DVC version: 2.10.2. The command `dvc exp run -S test.value=2` works correctly on my computer too. But when I use `dvc exp run --queue -S test.value=101` to put an experiment in the queue, DVC deletes the file 'params.yaml'. – Alimagadov K. Jun 02 '22 at 10:30
  • And then, after running `dvc exp run --run-all`, I get the error messages `ERROR: 'dvc.yaml' does not exist` and `ERROR: Failed to reproduce experiment '5920339'`. But this file exists (in the current directory, where the DVC project was initialized)! – Alimagadov K. Jun 02 '22 at 10:42
  • Sorry, I still cannot reproduce it. Are you in a clean repo? – karajan1001 Jun 02 '22 at 13:58
  • The folder with the DVC project contains only the script, the metrics files, and the DVC and Git files: running `ls` in `/Documents/temp` prints `dvc.lock dvc.yaml params.yaml plot.json scores.json test.py`. (I use this simple project as an example, but I have the same problem in another, bigger project.) What do you mean by "clean repo"? – Alimagadov K. Jun 02 '22 at 15:48
  • @AlimagadovK. for now you can `git add params.yaml` (no need to `git commit`) before each `dvc exp run --queue` so it doesn't get deleted, but this may indeed be a bug... Feel free to open a report directly in https://github.com/iterative/dvc/issues ! – Jorge Orpinel Pérez Jun 02 '22 at 17:47
  • @Jorge Orpinel Pérez, thank you! The file 'params.yaml' is not deleted if `git add params.yaml` is called first, but `dvc exp run --run-all` still does not work. I get the message `ERROR: 'dvc.yaml' does not exist` even though this file is in the current DVC project directory. I have reported the strange behavior of `dvc exp run --queue`, but now I need to run the queue with `--run-all`. Is there some way to do it? – Alimagadov K. Jun 03 '22 at 07:43
  • `dvc.yaml` has to be in Git too. Queued experiments only see files and changes that are committed (or at least Git-staged). – Jorge Orpinel Pérez Jun 04 '22 at 08:50

1 Answer

I solved my problem. All files the experiment needs (executable scripts, 'dvc.yaml', 'params.yaml') must be tracked by Git. Once they are, the dvc exp run command works correctly.
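
To check this before queuing, something like the following may help. It is only a minimal sketch, not part of DVC: the helper names `untracked` and `git_tracked_files` are made up for illustration, and it assumes `git` is on the PATH and that it is run from the root of the project. It relies on `git ls-files`, which lists files in the Git index (committed or staged).

```python
import subprocess
from pathlib import Path


def untracked(required, tracked):
    """Return the required paths that are missing from the Git-tracked set.

    Hypothetical helper for illustration; it only compares path strings.
    """
    tracked_set = {Path(p).as_posix() for p in tracked}
    return [p for p in required if Path(p).as_posix() not in tracked_set]


def git_tracked_files(repo_dir="."):
    """List the files in Git's index (committed or staged) via `git ls-files`."""
    result = subprocess.run(
        ["git", "ls-files"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,  # raises CalledProcessError outside a Git repository
    )
    return result.stdout.splitlines()
```

For the example project, `untracked(["dvc.yaml", "params.yaml", "test.py"], git_tracked_files())` returns the files that still need `git add`. An empty list means all three are known to Git.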

Alimagadov K.