I'm trying to detect anomalies using Markov chains. I have a training dataset containing a sequence of events, which I used to build a transition probability matrix. I then build a second matrix from a test dataset. I'm looking for a way to compare these two matrices in order to spot anomalies. Example: say the transition from event A to event C occurs 0 times in the training data, so its probability in the matrix is 0. If that transition does occur in the test dataset, its probability there will be larger than 0. This is something I'd like to detect.
I tried simply subtracting the two matrices and reporting every entry larger than 0, but this doesn't work well: a probability of 0 in training and 0.1 in test is more relevant (and more anomalous) than a probability of 0.7 in training and 0.6 in test, and plain subtraction doesn't reflect that. Worse, it treats a change from 0.5 to 0.7 as more anomalous than a change from 0.0 to 0.1.
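To make the problem concrete, here is a small sketch of the subtraction approach applied to the numbers above. The DataFrame layout and the labels are made up for illustration; only the probabilities come from my description.

```python
import pandas as pd

# Illustrative (train, test) probability pairs taken from the text above.
pairs = pd.DataFrame(
    {"train": [0.0, 0.7, 0.5], "test": [0.1, 0.6, 0.7]},
    index=["0.0 vs 0.1", "0.7 vs 0.6", "0.5 vs 0.7"],
)

# Plain subtraction: test probability minus training probability.
pairs["diff"] = pairs["test"] - pairs["train"]

# Report everything larger than 0, largest difference first.
flagged = pairs[pairs["diff"] > 0].sort_values("diff", ascending=False)
print(flagged)
```

Sorted by raw difference, the 0.5 vs 0.7 row comes out ahead of the 0.0 vs 0.1 row, which is the opposite of the ranking I want.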
Also, a probability of 0.3 in training and 0.6 in test matters more (because it doubled) than 0.7 in training and 1 in test (because maybe the other events simply did not occur in the test set, which is fine). By the way, I use pandas crosstab and Series to calculate the transition matrix.
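For context, this is roughly what the setup looks like. It is a minimal sketch rather than my exact code: the helper name `transition_matrix`, the toy sequences and the state list are all made up for illustration, and rows for events that never occur as a source are simply left at 0.

```python
import pandas as pd

def transition_matrix(events, states):
    """Row-normalised transition matrix built with pd.crosstab (illustrative helper)."""
    s = pd.Series(list(events))
    current = s.iloc[:-1].reset_index(drop=True)  # event at step t
    nxt = s.iloc[1:].reset_index(drop=True)       # event at step t + 1
    counts = pd.crosstab(current, nxt)
    # Force both matrices onto the same states so they can be compared cell by cell.
    counts = counts.reindex(index=states, columns=states, fill_value=0)
    row_sums = counts.sum(axis=1)
    # Avoid division by zero for states that never appear as a source.
    return counts.div(row_sums.where(row_sums > 0, 1), axis=0)

states = ["A", "B", "C"]            # assumed full set of events
train_events = list("ABABABBA")     # toy training sequence, no A -> C
test_events = list("ABACAB")        # toy test sequence, contains A -> C

train_P = transition_matrix(train_events, states)
test_P = transition_matrix(test_events, states)

# Transitions never seen in training that do show up in test.
new_transitions = (train_P == 0) & (test_P > 0)
print(new_transitions)
```

In this toy data the A to C cell is 0 in training and positive in test, which is the kind of case I want to rank highest.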