12

I have a Validator interface which provides a isValid(Thing) method, returning a ValidationResult which contains a boolean and a reason message.

I want to create a ValidatorAggregator implementation of this interface which performs an OR across multiple Validators (if any Validator returns a positive result, then the result is positive). If any validator succeeds, I'd like to short-circuit and just return its result. If no validator succeeds, I want to return all of the failure messages.

I can do this succinctly using a stream and findFirst().orElse(...) but using this pattern I lose all the intermediate results if findFirst returns empty:

public ValidationResult isValid(final Thing thing) {
    return validators.stream()
      .map(v -> validator.isValid(thing))
      .filter(ValidationResult::isValid)
      .findFirst()
      .orElseGet(() -> new ValidationResult(false, "All validators failed'));
}

Is there any way to capture the failed results using a stream, or indeed just more succinctly than the below?

public ValidationResult isValid(final Thing thing) {
    final Set<ValidationResult> failedResults = new HashSet<>();
    for (Validator validator : validators) {
        final ValidationResult result = validator.isValid(thing);
        if (result.isValid()) {
            return result;
        }
        failedResults.add(result);
    }
    return new ValidationResult(false, "No successful validator: " + failedResults); 
    // (assume failedResults stringifies nicely)
}

Edit: based on comments, I agree what I'm trying to do is premature optimisation (particularly as these validators are very lightweight). I'll probably go with something similar to Holger's solution of computing all validations and partitioning into successful/unsuccessful results.

This was marked as a dupe of Can you split a stream into two streams? and the partitioningBy answer sort-of is, but I think this question is asking, and the discussion answering, a different problem.

Rik
  • 790
  • 8
  • 11
  • 6
    The non-stream implementation is going to be difficult to beat. – Louis Wasserman Feb 26 '18 at 22:12
  • It will be difficult to get the short circuit behaviour with internal iteration. I reckon you could put something together with Java 9 `takeWhile` and a custom `Collector`, but it'll be messier than your loop... – Boris the Spider Feb 26 '18 at 22:17
  • 3
    your stream solution, even without the feature of saving intermediate results, doesn't look any more concise than the non-stream one to me. – guido Feb 26 '18 at 22:23
  • Possible duplicate of [Can you split a stream into two streams?](https://stackoverflow.com/questions/19940319/can-you-split-a-stream-into-two-streams) – n247s Feb 26 '18 at 22:29
  • 1
    As others mentioned, it will be difficult to get the short circuit behaviour. On the other hand, consider if it's not a case of premature optimisation. – Apokralipsa Feb 27 '18 at 06:30
  • 1
    You can spin a custom collector for this, *but* it's not going to be neither shorter nor short-circuiting like your for loop, so stick to that – Eugene Feb 27 '18 at 08:26
  • 2
    Any variant is a trade-off and may perform unnecessary operations. You should use the variant which serves best the case with the highest likelihood. – Holger Feb 27 '18 at 08:59
  • Thanks for the feedback everyone. I think the key takeaway I've got here is that it might be potentially be possible with some custom collectors, but it's almost certainly not going to be clearer than the for loop solution. In fact, I'm inclined to agree with Apokralipsa: the short-circuiting is almost certainly a premature optimisation, so a simpler implementation can be made instead. – Rik Feb 27 '18 at 20:33

1 Answers1

3

There is no perfect solution that handles all cases with the same efficiency. Even your loop variant, which fulfills the criteria of being short-circuiting and processing the validators only once, has the disadvantage of creating and filling a collection that might turn out to be unnecessary if just one validation succeeds.

The choice depends on the actual costs associated with the operations and the likelihood of having at least one successful validation. If the common case gets handled with the best performance, it may outweigh the solution’s penalties on the handling of the uncommon case.

So

// you may use this if the likelihood of a success is high; assumes
// reasonable costs for the validation and consists (repeatable) results
public ValidationResult isValid(final Thing thing) {
    return validators.stream()
      .map(v -> v.isValid(thing))
      .filter(ValidationResult::isValid)
      .findFirst()
      .orElseGet(() -> new ValidationResult(false, "All validators failed"
        + validators.stream().map(v -> v.isValid(thing)).collect(Collectors.toSet())));
}
// you may use this if the likelihood of a success is
// very low and/or you intent to utilize parallel processing
public ValidationResult isValid(final Thing thing) {
    Map<Boolean,Set<ValidationResult>> results = validators.stream()
        .map(v -> v.isValid(thing))
        .collect(Collectors.partitioningBy(ValidationResult::isValid, Collectors.toSet()));
    return results.get(true).stream().findAny()
        .orElseGet(() -> new ValidationResult(false,
                             "No successful validator: "+results.get(false)));
}
// if chances of successful validation are mixed or unpredictable
// or validation is so expensive that everything else doesn't matter
// stay with the loop
public ValidationResult isValid(final Thing thing) {
    final Set<ValidationResult> failedResults = new HashSet<>();
    for (Validator validator : validators) {
        final ValidationResult result = validator.isValid(thing);
        if (result.isValid()) {
            return result;
        }
        failedResults.add(result);
    }
    return new ValidationResult(false, "No successful validator: " + failedResults);
}

Consider sorting the list so that validators with a higher chance of success are at the beginning…

Holger
  • 285,553
  • 42
  • 434
  • 765
  • Thanks for the different options. I think I'm aligned on a combination of your answer + Apokralipsa's key point that the short-circuiting is probably a premature optimisation. So the second option you provided (or some variation thereof, with/without streams) is likely the best and simplest one: just evaluate them all. – Rik Feb 27 '18 at 20:35
  • I'll mark this as the accepted answer as although it didn't directly solve the question I asked, it did provide several right ways to do what I ultimately wanted to achieve. – Rik Feb 27 '18 at 20:36