In the PST package, one can estimate the prediction quality of individual sequences using the log-loss, e.g.:
R> ex2 <- c("a-a-b", "a-b-a-a-b", "b-b-b-b-a")
R> ex2 <- seqdef(ex2)
R> predict(S1.p1, ex2, output = "logloss")
logloss
[1] 0.9183
[2] 0.7311
[3] 0.9600
How do I compare these log-loss values statistically? Is there a way to show that 0.9183 is significantly different from 0.9600?