I am testing an approximation algorithm (FastDTW) against the optimal solution by computing the relative error and comparing it to the errors reported in the paper [1].

The problem is that the errors can get much larger than the ones given in the paper, so without setting the tolerance to "accept all" there is no way to make every test pass.

Is there a way to tell QuickCheck that I expect only n of the tests to pass? I see that there is the function `cover`, but just wrapping the test in it does not seem to work as expected.

Alternatively, I could run the test several times manually and pass if at least n of the runs succeed, but I hope this can be achieved through QuickCheck itself.
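
For reference, a minimal sketch of that manual approach (dfeuer suggests the same in the comments): bundle a whole list of generated inputs into one property and check the pass rate yourself. `passRateProp` is a hypothetical helper, not part of QuickCheck; the test function passed in would be the `actualTest` from the edit below.

```haskell
import Test.QuickCheck

-- Hypothetical helper: generate 100 input pairs inside a single property
-- and require that at least `threshold` (a fraction) of them pass `test`.
passRateProp :: ([Double] -> [Double] -> Bool) -> Double -> Property
passRateProp test threshold =
  forAll (vectorOf 100 arbitrary) $ \pairs ->
    let passed = length [ () | (xs, ys) <- pairs, test xs ys ]
        rate   = fromIntegral passed / fromIntegral (length pairs) :: Double
    in  counterexample ("pass rate: " ++ show rate) (rate >= threshold)

-- usage: quickCheck (passRateProp actualTest 0.9)
```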

**Edit** in response to Carsten:

I wrapped it like this:

```haskell
actualTest :: [Double] -> [Double] -> Bool
actualTest = ... -- runs dtw and fastDtw, compares errors against goal

coverTest :: Property
coverTest = cover True percentage label actualTest
```

But I am not sure about the first parameter (the class condition?) or the label one. Thinking about it more, I guess `cover` is used to ensure that at least a certain percentage of test cases satisfy a certain condition.
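
If that reading is right, it may be worth noting for later readers: in QuickCheck ≥ 2.12 (which postdates this question) the signature changed to `cover :: Testable prop => Double -> String -> Bool -> prop -> Property`, and the new `checkCoverage` combinator turns a coverage shortfall into a real test failure. A minimal sketch of what I am after, assuming that newer API and the `actualTest` above:

```haskell
import Test.QuickCheck

-- Sketch against QuickCheck >= 2.12. The inner property is always True;
-- the real requirement is that at least 90% of the generated cases
-- satisfy `actualTest xs ys`. `checkCoverage` keeps generating cases
-- until it is statistically confident the 90% holds (and fails otherwise).
errorBoundProp :: [Double] -> [Double] -> Property
errorBoundProp xs ys =
  checkCoverage $
    cover 90 "relative error within bound" (actualTest xs ys) True
```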

[1] Stan Salvador and Philip Chan, *FastDTW: Toward Accurate Dynamic Time Warping in Linear Time and Space* — http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.432.4253&rep=rep1&type=pdf#page=64

  • how did you wrap it in `cover`? – Random Dev Jul 20 '16 at 13:58
  • You can generate an arbitrary *list* of cases, count how many pass, and set your own threshold. – dfeuer Jul 20 '16 at 14:46
  • @dfeuer yep ... that's what I am doing now. But this doesn't seem to be ... *elegant* :-) – fho Jul 20 '16 at 16:01
  • No, it's not so elegant, but QuickCheck wasn't designed for testing Monte Carlo algorithms! To really test what you're after, you shouldn't really be counting cases that lie within fixed error bounds anyway. I'm no statistics master, but I think you should probably be looking at some more global error measurement (something in the general nature of the mean of the sum of the squares of the errors, say). The paper itself may give some insight into what that global measurement should be, or you can consult your local statistics expert. – dfeuer Jul 20 '16 at 18:07
  • There is also another problem: Values in `Arbitrary` instances are constructed to cover various corner cases and produce relatively small test cases, so they don't have uniform (or otherwise "nice") statistical distribution. So it's difficult to test any statistical properties of algorithms that take such values as inputs. My guess would be that a proper test should start with well defined input distribution and use some kind of [statistical hypothesis testing](https://en.wikipedia.org/wiki/Statistical_hypothesis_testing). – Petr Jul 21 '16 at 17:57
