I'm using `mill` to build a pipeline that
1. cleans up a bunch of CSV files (producing new files)
2. loads them into a database
3. does more work in the database (create views, etc.)
4. runs queries to extract some files.
Should the tasks associated with steps 2 and 3 be producing something analogous to `PathRef`? If so, what? They don't produce a file on disk, but they still should not be repeated unless their inputs change. Similarly, the tasks for step 3 should rerun whenever the tasks for step 2 are rerun.
I see in the documentation for targets that you can return a case class and that re-evaluation depends on the `.hashCode` of the target's return value. But I'm not sure what to do with that information.
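To make the question concrete, here is a minimal sketch of the kind of value I imagine a database task returning. It is plain Scala, not Mill API, and `LoadResult` and its fields are hypothetical names I made up for illustration. It only shows that a case class derives its hash code from its fields, so a value built from a digest of the inputs would compare equal when the inputs are unchanged.

```scala
// Hypothetical marker value a database-loading task could return.
// Plain Scala only; nothing here is Mill API.
final case class LoadResult(table: String, inputDigest: String)

object LoadResultDemo {
  def main(args: Array[String]): Unit = {
    val first   = LoadResult("customers", "abc123")
    val same    = LoadResult("customers", "abc123")
    val changed = LoadResult("customers", "def456")

    // Case classes derive equals/hashCode from their fields,
    // so identical fields give identical hash codes.
    println(first.hashCode == same.hashCode) // true
    // A different input digest makes the value compare unequal.
    println(first == changed)                // false
  }
}
```

Is returning something like this from the step 2 and step 3 tasks the intended approach, or is there a more idiomatic way?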
And a related question: does `mill` hash the code in each task? It seems to do the right thing when I change the code for one task but not the others.