
I think most people are now using branch coverage rather than statement coverage as a quality metric, but one metric I've not seen much about is the quality of the tests themselves.

For example, I could write tests which exercise many of the branches in my code, yet none of the tests contain an assert. So while I've executed a lot of my branches, I haven't properly checked the return values. Is there any way to capture this "assert" metric?
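To make that concrete, here is a toy Python illustration (the function and test names are invented): the first test contributes the same branch coverage as the second, but checks nothing.

```python
def classify(n):
    # Two branches: both can be "covered" without any value being checked.
    if n < 0:
        return "negative"
    return "non-negative"

def test_classify_no_asserts():
    # Executes both branches -- full branch coverage -- yet verifies nothing.
    classify(-1)
    classify(1)

def test_classify_with_asserts():
    # Identical coverage, but the return values are actually checked.
    assert classify(-1) == "negative"
    assert classify(1) == "non-negative"
```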

Are people using any metrics on the tests themselves?

John Farrelly
  • Technically, for some languages/frameworks it is possible to build it into your CI pipeline. E.g. for PHP you may use phpcs and create a custom sniff that checks that every test method has at least one explicit assertion (or an implicit one from a mock). – zerkms Dec 10 '14 at 10:18
  • NUnit will output the number of asserts per test. A test with 0 asserts would definitely be a smell. – Kevin Up Dec 10 '14 at 20:47
  • @Kevin Up: I don't know what NUnit is counting, but assertions are not everything to be considered. A meaningful test may contain no assertions but be annotated with `@Test(expected = IllegalArgumentException.class)`. – nrainer Dec 11 '14 at 11:29
  • @Kevin Up: see also http://programmers.stackexchange.com/questions/7823/is-it-ok-to-have-multiple-asserts-in-a-single-unit-test – nrainer Dec 19 '14 at 10:46
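In the spirit of the phpcs sniff and NUnit's assert count mentioned in the comments, an "asserts per test" metric can be sketched with Python's standard `ast` module. This is a heuristic sketch only: it counts bare `assert` statements, so tests that verify via mocks or expected exceptions would (wrongly) score 0.

```python
import ast

def asserts_per_test(source):
    """Map each test_* function in a source string to its assert count.

    Heuristic: only plain ``assert`` statements are counted, so tests that
    check behaviour through mocks or expected exceptions show up as 0.
    """
    tree = ast.parse(source)
    counts = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef) and node.name.startswith("test_"):
            counts[node.name] = sum(
                isinstance(child, ast.Assert) for child in ast.walk(node)
            )
    return counts

sample = """
def test_empty():
    do_something()

def test_checked():
    assert 1 + 1 == 2
    assert True
"""
print(asserts_per_test(sample))  # {'test_empty': 0, 'test_checked': 2}
```

A test flagged with 0 asserts is then a candidate for manual review, exactly as the comments above suggest.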

2 Answers


The blog post What does code coverage really mean? addresses this question. The study results indicate that for unit tests, code coverage is in general a good indication of regression-test reliability. For system tests (which execute large portions of a software system), code coverage is not a useful approximation of reliability.

Mutation testing can be used to evaluate the effectiveness of test cases. The idea is to mutate the source code by introducing faults and to check whether the test cases are capable of detecting them. The usual approach is to apply a mutation operator (e.g. remove a line of code, replace an addition with a subtraction, invert a boolean condition) to a single method, run all tests, and check whether at least one test case fails. A test case that fails was able to reveal the broken code. The downsides of mutation testing are its computational cost and the problem of equivalent mutants distorting the results (equivalent mutants are code chunks that were syntactically mutated but remained semantically unchanged). Pitest is a mutation testing system for Java that is used in industry.

Concerning test cases that do not contain any assertions, Martin Fowler writes:

Although assertion-free testing is mostly a joke, it isn’t entirely useless. [...] Some faults [such as null pointer exceptions] do show up through code execution.
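A small Python illustration of Fowler's point (the class and function names are invented): the assertion-free test below never checks a result, but a test runner would still report it as failed if the code under test raised.

```python
class User:
    def __init__(self, first, last):
        self.first = first
        self.last = last

def full_name(user):
    # Raises AttributeError if user is None -- the Python analogue of a
    # null pointer exception.
    return user.first + " " + user.last

def test_full_name_smoke():
    # Assertion-free: the return value is never checked, yet the test
    # still fails if full_name raises during execution.
    full_name(User("Ada", "Lovelace"))
```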

nrainer

There are conceptual coverage metrics that check the quality of the oracle (assertion), requirement coverage, or even cross-check the oracle against the code exercised (checked coverage: http://onlinelibrary.wiley.com/doi/10.1002/stvr.1497/full). They all promise better fault-finding capability, backed up by experimental results, but I doubt that any of them are implemented in the tools we mostly use.

Taejoon Byun