I have checked the documentation of the package and found an example of how they fitted a DTMC on data.frame objects using the following code:

library(markovchain)
data(holson)
singleMc<-markovchainFit(data=holson[,2:12],name="holson")

The data I apply the code to is structured essentially the same way as the holson data, except that there are 10 states, the numbers 1 to 10. The values in my Excel file are integers, not character strings. When I run the code on my data, it returns a transition matrix whose states are listed in the order (1, 10, 2, 3, 4, 5, 6, 7, 8, 9), so the state following 1 in the matrix is 10.

It appears to me that R treats 10 as a character string that falls between 1 and 2 (lexicographic sorting?). How can I fix this so that the package recognizes 10 as the state following 9?

EDIT: Here is an example

library(markovchain)
set.seed(12)
Test <- data.frame(entity = LETTERS[1:100],
                   Time1 = round(runif(n = 100, min = 1, max = 10)),
                   Time2 = round(runif(n = 100, min = 1, max = 10)),
                   Time3 = round(runif(n = 100, min = 1, max = 10)))
Test_Fit <- markovchainFit(data=Test[,2:4] , name="Test_FIT")
Est_Test_Fit <- Test_Fit$estimate
Est_Test_Fit@transitionMatrix
  • I suspect it’s not the `markovchain` package that’s the problem, but your input dataset. Take a look at it in the console. I suspect your `state` variable is character not numeric. If not, please post a MRE so we can investigate. – Limey Feb 07 '22 at 14:25
  • Thank you for the comment! I edited my post to include an example. I'm still rather new to R so sorry if its inconvenient. – Hermann Josef Feb 08 '22 at 06:23

1 Answer

I am not familiar with the markovchain package. I tend to use r4jags.

Reading the markovchain manual, it seems that a call to markovchainFit should be preceded by a call to createSequenceMatrix (see the example code on page 11 of the manual). The first parameter of createSequenceMatrix is "... a n x n matrix or a character vector or a list". Thus, contrary to my comment above, it seems that markovchain expects the state labels to be character rather than numeric. Character labels sort lexicographically, which would explain why `"10"` lands before `"2"`. Given your question, it seems that your states are ordered rather than merely categorical, so the ordering `"1"`, `"10"`, `"2"` is a problem for you.

The solution would be to convert your numeric state labels to character labels whose sort order matches the numeric order before calling markovchainFit/createSequenceMatrix. Here are two possible ways of doing this:
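
To see the effect outside the package, here is a minimal base R sketch. The variable name `labels` is only illustrative, and the output shown assumes a typical locale.

```r
# State labels as character strings, which is how the package appears to treat them.
labels <- as.character(1:10)

# Character sorting compares strings position by position, so "10" lands before "2".
sort(labels)
# "1" "10" "2" "3" "4" "5" "6" "7" "8" "9"

# Sorting the underlying numbers gives the order you actually want.
sort(as.numeric(labels))
# 1 2 3 4 5 6 7 8 9 10
```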

charState <- LETTERS[state]

which will give you state labels of "A" to "J", or

charState <- sprintf("%02i", state)

which produces "01", "02", ... , "10".
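
Applied to a data frame shaped like the one in your question, the zero-padding approach might look like the sketch below. I have not checked this against every version of markovchain, so treat it as an assumption that markovchainFit accepts character columns in the same way it accepts the holson data. The names `Padded` and `Padded_Fit` are made up for this example.

```r
library(markovchain)

set.seed(12)
# Same shape as the example in the question: three columns of states 1 to 10.
Test <- data.frame(Time1 = round(runif(n = 100, min = 1, max = 10)),
                   Time2 = round(runif(n = 100, min = 1, max = 10)),
                   Time3 = round(runif(n = 100, min = 1, max = 10)))

# Zero-pad every state column so that character order matches numeric order.
# as.integer() guarantees that "%02i" receives whole numbers.
Padded <- as.data.frame(
  lapply(Test, function(col) sprintf("%02i", as.integer(col))),
  stringsAsFactors = FALSE
)

Padded_Fit <- markovchainFit(data = Padded, name = "Padded_FIT")

# The state labels should now appear as "01", "02", ..., with "10" last.
Padded_Fit$estimate@states
Padded_Fit$estimate@transitionMatrix
```

The padding width of 2 is enough for up to 99 states; with more states you would widen the format, for example `"%03i"`.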

By the way, did you run your test code before adding it to your question? Rows 27 to 100 of your Test dataframe have entity equal to NA, which I suspect is not what you intended. In addition, I suspect your columns Time1 to Time3 are misnamed because I believe they contain states of the process rather than times at which the process(es) was/were in a given state.
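
The NA values come from indexing `LETTERS` past its 26 elements. A quick check, plus one hypothetical way to build 100 distinct entity labels (the `"E001"` naming scheme is my own invention, not something from your data):

```r
# LETTERS holds only the 26 capital letters, so positions 27 to 100 return NA.
length(LETTERS)
sum(is.na(LETTERS[1:100]))

# Hypothetical replacement: 100 unique labels "E001" to "E100".
entity <- sprintf("E%03i", 1:100)
head(entity)
```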

Limey
  • Thank you for the helpful comment! I was using the markovchain package simply as it is done on page 37 with the Holson example. I don't quite grasp the difference though. Truthfully I have rushed that test code to get my point across and stopped checking it when I got that transition matrix... The NA is probably because I named them with Letters haha Thanks again for the feedback!!! – Hermann Josef Feb 09 '22 at 04:32