
I'm struggling with DVC experiment management. Suppose the following scenario:

I have a params.yaml file:

recommendations:
  k: 66
  q: 5

I run the experiment with dvc exp run -n exp_66, and then I push it with dvc exp push origin exp_66. After this, I modify the params.yaml file:

recommendations:
  k: 99
  q: 5

and then run another experiment with dvc exp run -n exp_99, which I then push with dvc exp push origin exp_99.

Now, after pulling the corresponding branch with Git, I try to pull exp_66 from DVC by running dvc exp pull origin exp_66. The pull succeeds (no error messages), but params.yaml contains k: 99, whereas I would expect k: 66. What am I doing wrong? Does git push have to be executed after dvc push? Apart from that, I also found dvc exp apply exp_66, but I'm not sure what it does (it is suggested that after apply one should execute git add . and then git commit?).

I would really appreciate it if you could write down the workflow for committing, pushing, pulling, and applying different experiments.

  • Hopefully if you read both https://dvc.org/doc/user-guide/experiment-management/sharing-experiments and https://dvc.org/doc/user-guide/experiment-management/persisting-experiments it will help clarify. – Jorge Orpinel Pérez Mar 03 '22 at 18:57

1 Answer


You did everything right. After pulling, dvc exp show will list your experiments, including exp_66. Pulling only downloads the experiment; it does not change the files in your workspace, which is why params.yaml still shows k: 99. To restore an experiment from that list into your workspace, run dvc exp apply exp_66. DVC will then check out the changes that belong to this experiment.
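
The pull-then-apply sequence described above can be sketched as a small shell helper. The function name restore_experiment and its argument order are made up for illustration; only the dvc exp pull and dvc exp apply commands come from the thread.

```shell
# Hypothetical helper (name and arguments are illustrative assumptions):
# fetch an experiment from a Git remote, then restore it into the workspace.
restore_experiment() {
    remote="$1"   # Git remote name, e.g. origin
    exp="$2"      # experiment name, e.g. exp_66

    # Download the experiment from the remote.
    # This step alone leaves the workspace files unchanged.
    dvc exp pull "$remote" "$exp" || return 1

    # Check the experiment's changes (including params.yaml)
    # out into the workspace.
    dvc exp apply "$exp" || return 1
}
```

For the scenario in the question, this would be called as restore_experiment origin exp_66, after which params.yaml should contain the values recorded for that experiment.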

Your workflow seems correct so far. One addition: once you decide that one of the experiments is the one you want to keep in Git history, you can use dvc exp branch {exp_id} {branch_name} to create a separate branch for that experiment. Then you can use regular git commands to save the changes.
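
As a rough sketch of that last step, the helper below turns an experiment into a regular Git branch and publishes it. The function name persist_experiment, the branch name, and the choice to push right away are assumptions for illustration, not a prescribed DVC workflow.

```shell
# Hypothetical helper (name and arguments are illustrative assumptions):
# promote an experiment to a regular Git branch and share it.
persist_experiment() {
    exp="$1"      # experiment name, e.g. exp_66
    branch="$2"   # new branch name, e.g. keep-exp-66
    remote="$3"   # Git remote name, e.g. origin

    # Create a Git branch that contains the experiment.
    dvc exp branch "$exp" "$branch" || return 1

    # Switch to the new branch and sync DVC-tracked files with it.
    git checkout "$branch" || return 1
    dvc checkout || return 1

    # Publish the branch like any other Git branch.
    git push -u "$remote" "$branch" || return 1
}
```

A call such as persist_experiment exp_66 keep-exp-66 origin would leave the experiment on an ordinary branch that can be merged or reviewed with plain git commands.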

don_pablito
  • Thanks, it worked. However, I'm still not sure how dvc does the saving, i.e., how it knows which parameter value corresponds to a certain experiment without my ever running `git` commands. As for your addition: why would I need to use `git` commands if everything works without them? Could you please add an example as an extension to your answer? I will then accept the answer. Once again, thanks a lot! – kevin_was_here Mar 03 '22 at 15:13
  • Also, if you could explain what `dvc push` and `dvc pull` do, and whether they are really necessary in the above workflow... – kevin_was_here Mar 03 '22 at 15:14
  • Ok, so: 1. Regarding the experiments: they actually use a lot of Git capabilities under the hood. DVC experiments are essentially custom Git references (for comparison, tags and branches are references that exist in Git by default). To read more about that, I recommend the blog post about it: https://dvc.org/blog/experiment-refs 2. `dvc push` and `dvc pull` are commands designed for the original DVC capabilities, back when there was no `exp` and one had to use Git branches and tags to version experiments. – don_pablito Mar 03 '22 at 17:02
  • If I understood correctly, `dvc push` and `dvc pull` are essentially not needed, given the above workflow, right? – kevin_was_here Mar 03 '22 at 17:27
  • Yes, when interacting with `exp` the commands within `exp` scope are enough. – don_pablito Mar 03 '22 at 20:17