I have an application for which the testing is quite extensive. Essentially, we must run the application a few hundred thousand time on different input. So I have built a custom Gradle task which manages forking processes and reaping finished processes. Each of the thousands of test runs generate a file that goes in a results directory. Full testing can take about a week when distributed across 10 cluster nodes.
The issue is, if I need to stop the testing for whatever reason, then there is currently no way for me to start back up where I left off. Gradle's incremental build and caching features (to my understanding) really only work when tasks finish, and it will just rerun the entire task from scratch if the previous invocation was interrupted (ctrl-c).
I could build in some detection of the results files and only rerun segments for which there is no results file. However, this will not work properly when the application is rebuilt, and then testing legitimately must start from scratch.
So how can I reliably detect which testing segments are up to date when the previous task invocation was interrupted?