
I have an application for which the testing is quite extensive. Essentially, we must run the application a few hundred thousand times on different inputs. So I have built a custom Gradle task which manages forking processes and reaping finished processes. Each of these test runs generates a file that goes in a results directory. Full testing can take about a week when distributed across 10 cluster nodes.

The issue is that if I need to stop the testing for whatever reason, there is currently no way for me to resume where I left off. Gradle's incremental build and caching features (to my understanding) really only work when tasks finish, and Gradle will just rerun the entire task from scratch if the previous invocation was interrupted (for example with Ctrl-C).

I could build in some detection of the results files and only rerun segments for which there is no results file. However, this breaks when the application is rebuilt: the old results files are still there, but testing legitimately must start from scratch.

So how can I reliably detect which testing segments are up to date when the previous task invocation was interrupted?

DBear
  • This sounds like a fun challenge, and certainly one that Gradle is suited for, but it's difficult to answer without more details. What is the task you have written? What are the inputs? I guess that instead of having one single Gradle task that runs everything, it would be best to define a single Gradle task type and [register multiple instances](https://docs.gradle.org/7.6/userguide/tutorial_using_tasks.html#sec:dynamic_tasks), one per combination of inputs. – aSemy Jan 28 '23 at 23:30

1 Answer


Annotated tasks

For any Gradle task, if its output files exist and are unchanged since its last successful run, and its inputs (including the outputs of predecessor tasks) are also unchanged, Gradle will treat that task as up-to-date and not run it again. You tell Gradle about inputs and outputs by annotating properties of the class you write to define the task.

You can make use of this by breaking your custom Gradle testing task into a number of smaller test tasks, each of which declares annotated outputs. The test reports are probably the most suitable outputs. If you then stop the build halfway, the test tasks that already completed and produced a report will not have to be re-run.

A whole application rebuild will always need all tests to be re-run

However, if your whole application is rebuilt then those test tasks will no longer be up-to-date as their predecessor build tasks will not be up-to-date. This makes sense of course: a new application build needs its tests to be run again to check it still works as intended.

Multimodule builds may mean only part of an application needs rebuilding

It may be that there are parts of the application that are not being rebuilt, and test tasks that depend solely on those intact parts of the application. If every task in the chain of predecessors of a previously completed test task is up-to-date, then Gradle will not re-run those tests either.

More test tasks are likely to fall into this category if your application is separated, where appropriate, into different Gradle subprojects in a multimodule build. Each subproject then has its own chain of tasks, which may not have to be re-run if only part of the application's code or other inputs has changed.

Simon Jacobs
  • I believe this could solve my problem; however, it's not an ideal solution because it means Gradle manages the testing parallelism, which is a global parameter for the entire build. As I mentioned in the original question, I split testing across multiple cluster nodes, so that a few hundred test segments can run in parallel. But obviously I don't want to run the _entire_ build with hundreds of threads, since most of the build runs on only a single machine. In fact, part of the build should be run fully serially to avoid running out of memory. – DBear Jan 30 '23 at 16:27
  • I could group multiple test segments into a single task, but then the cluster will not be fully utilized for the last half-hour or so of each task. If there are more segments per task, then there is more duplicated work upon continuing after an interrupt; if there are fewer segments per task, then there is lower cluster utilization. All this could be avoided if 1) there was a good way to determine sub-task up-to-date-ness, or 2) there was a way to put some tasks into a separate parallelism domain. – DBear Jan 30 '23 at 16:34
  • Gradle can run tasks from different subprojects in parallel. But yeah, if you're running hundreds of tests in parallel there doesn't seem to be much alternative but to group many tests into one Gradle task, and then you run up against the fact that Gradle can't mark only part of a task up-to-date, as you say. I'm not sure there is much more you can do at the Gradle level, and it looks like you'll have to write your own code to determine which tests within a given Gradle task to re-run on a partial re-build. – Simon Jacobs Jan 31 '23 at 04:37
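
The "write your own code" route from the last comment could look roughly like the following. This is only a minimal sketch, not something proposed in the thread: the class name `SegmentStamps`, the `<segment>.result` / `<segment>.stamp` file naming, and the choice of a SHA-256 hash of the application binary as the fingerprint are all assumptions made for illustration. The idea is that a segment counts as done only if its results file exists and a stamp file next to it records the fingerprint of the build that produced it, so a rebuild invalidates every old result without deleting anything.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: tracks which test segments are done for a given build.
final class SegmentStamps {
    private SegmentStamps() {}

    /** SHA-256 of the application binary as lowercase hex (assumed fingerprint). */
    static String fingerprint(Path applicationBinary) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256")
                    .digest(Files.readAllBytes(applicationBinary));
            StringBuilder hex = new StringBuilder();
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Call only after the segment's results file has been completely written. */
    static void markDone(Path resultsDir, String segmentId, String fingerprint) {
        Path tmp = resultsDir.resolve(segmentId + ".stamp.tmp");
        Path stamp = resultsDir.resolve(segmentId + ".stamp");
        try {
            Files.write(tmp, fingerprint.getBytes(StandardCharsets.UTF_8));
            // Write-then-rename, so an interrupt cannot leave a half-written stamp.
            Files.move(tmp, stamp, StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** True if the results file exists and its stamp matches the current build. */
    static boolean isUpToDate(Path resultsDir, String segmentId, String fingerprint) {
        Path result = resultsDir.resolve(segmentId + ".result");
        Path stamp = resultsDir.resolve(segmentId + ".stamp");
        if (!Files.isRegularFile(result) || !Files.isRegularFile(stamp)) {
            return false;
        }
        try {
            String recorded = new String(Files.readAllBytes(stamp), StandardCharsets.UTF_8);
            return recorded.trim().equals(fingerprint);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Segments that still need to run for the current build, in input order. */
    static List<String> pendingSegments(Path resultsDir, List<String> segmentIds,
                                        String fingerprint) {
        List<String> pending = new ArrayList<>();
        for (String id : segmentIds) {
            if (!isUpToDate(resultsDir, id, fingerprint)) {
                pending.add(id);
            }
        }
        return pending;
    }
}
```

In the custom task from the question, `pendingSegments` would be called before forking any processes, and `markDone` after each finished process is reaped and its results file is complete. If a segment's outcome also depends on other inputs (its input data, configuration, and so on), those would need to be folded into the fingerprint as well.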