I am following this tutorial on creating a Markov chain for attribution modelling in R.

However, when I run it I get a compilation error. A solution was proposed on this page, but it still gives the same error. I have also tried two different workarounds to get past the compilation error, but both result in runtime errors, and the output looks nothing like the one in the tutorial. The workarounds are documented at the bottom.

Any help with this would be highly appreciated.

library(dplyr)
library(reshape2)
library(ggplot2)
library(ggthemes)
library(ggrepel)
library(RColorBrewer)
library(ChannelAttribution)
library(markovchain)

##### simple example #####
# creating a data sample
df1 <- data.frame(path = c('c1 > c2 > c3', 'c1', 'c2 > c3'), conv = c(1, 0, 0), conv_null = c(0, 1, 1))

# calculating the model
mod1 <- markov_model(df1,
                var_path = 'path',
                var_conv = 'conv',
                var_null = 'conv_null',
                out_more = TRUE)

# extracting the results of attribution
df_res1 <- mod1$result

# extracting a transition matrix
df_trans1 <- mod1$transition_matrix
df_trans1 <- dcast(df_trans1, channel_from ~ channel_to, value.var = 'transition_probability')

### plotting the Markov graph ###
df_trans <- mod1$transition_matrix

# adding dummies in order to plot the graph
df_dummy <- data.frame(channel_from = c('(start)', '(conversion)', '(null)'),
                   channel_to = c('(start)', '(conversion)', '(null)'),
                   transition_probability = c(0, 1, 1))
df_trans <- rbind(df_trans, df_dummy)

# ordering channels
df_trans$channel_from <- factor(df_trans$channel_from,
                            levels = c('(start)', '(conversion)', '(null)', 'c1', 'c2', 'c3'))
df_trans$channel_to <- factor(df_trans$channel_to,
                            levels = c('(start)', '(conversion)', '(null)', 'c1', 'c2', 'c3'))
df_trans <- dcast(df_trans, channel_from ~ channel_to, value.var = 'transition_probability')

# creating the markovchain object
trans_matrix <- matrix(data = as.matrix(df_trans[, -1]),
                   nrow = nrow(df_trans[, -1]), ncol = ncol(df_trans[, -1]),
                   dimnames = list(c(as.character(df_trans[, 1])),
                                   c(colnames(df_trans[, -1]))))
trans_matrix[is.na(trans_matrix)] <- 0
trans_matrix1 <- new("markovchain", transitionMatrix = trans_matrix)

# plotting the graph
plot(trans_matrix1, edge.arrow.size = 0.35)

The error occurs on this line:

trans_matrix1 <- new("markovchain", transitionMatrix = trans_matrix)

Error:
Aggregation function missing: defaulting to length
Error in validObject(.Object) : 
  invalid class “markovchain” object: Error! Rows of transition matrix do not some one

EDIT:

I have tried the following, both give runtime errors:

  1. user2554330's solution of dividing each row by its sum, using

trans_matrix <- trans_matrix/rowSums(trans_matrix)

  2. Manually deleting the last row and column, which are both NA, so that all columns sum to 1
DataKing
  • I think that's meant to say "do not sum to 1" – John Haugeland Nov 29 '20 at 16:36
  • It probably is, but that is extracted straight from the console, so I will leave it as it is in case it makes it easier for people to find the same issue. – DataKing Nov 29 '20 at 16:41
  • ya, i mean, i actually went to go fix it in the source package, only to realize it wasn't on github or gitlab – John Haugeland Nov 29 '20 at 18:08
  • I followed the advice that was on the previous question related to this issue, which was to download the developer version, there is more detail on that in the link in the question, so I am seeing slightly different output. If you could help at all it would be really appreciated. – DataKing Nov 29 '20 at 21:15

1 Answer


The message contains an English error: instead of "do not some one" it should say "do not sum to one". The matrix you are using looks like this:

> trans_matrix
             (start) (conversion) (null) NA
(start)            1            0      0  2
(conversion)       0            1      0  0
(null)             0            0      1  0
<NA>               0            1      2  2

To fix this, you could divide each row by its sum using

trans_matrix <- trans_matrix/rowSums(trans_matrix)

After that there's no error. Whether that's the right matrix, I don't know.
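As a minimal sketch of what that division does, the following rebuilds the matrix shown above by hand in base R and normalises it. The label "NA" for the fourth row and column is only a stand-in string for illustration, and the name normalised is hypothetical; nothing here comes from the tutorial or from the markovchain package.

```r
# Matrix copied from the console output above. The fourth row and column are
# given the string label "NA" purely for illustration.
labels <- c("(start)", "(conversion)", "(null)", "NA")
trans_matrix <- matrix(c(1, 0, 0, 2,
                         0, 1, 0, 0,
                         0, 0, 1, 0,
                         0, 1, 2, 2),
                       nrow = 4, byrow = TRUE,
                       dimnames = list(labels, labels))

# Raw row totals: 3, 1, 1, 5. Two of the rows do not sum to one.
rowSums(trans_matrix)

# Dividing a matrix by a vector of length nrow recycles the vector down each
# column, so every element in row i is divided by the sum of row i.
normalised <- trans_matrix / rowSums(trans_matrix)

# Every row of the normalised matrix now sums to one (up to floating point).
all.equal(unname(rowSums(normalised)), rep(1, 4))
```

Two caveats apply. A row whose sum is zero would turn into NaN under this division, so it is worth checking rowSums() first. Also, normalising only satisfies the validity check: the whole-number entries and the NA row, together with the "defaulting to length" warning, suggest the dcast step produced counts rather than probabilities, so the normalised matrix may still differ from the tutorial's.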

user2554330
  • Thank you for your help. It's probably because I am using a development patch created by someone whose first language is not English. I have made that change; however, it now seems to output a different matrix from the one in the tutorial I am following: https://www.analyzecore.com/2016/08/03/attribution-model-r-part-1/ My apologies for the lack of understanding here. Rather than posting another question, I don't suppose you can tell from looking what is causing the output to be different? I have also tried trans_matrix <- trans_matrix[-4,-4], but to no avail. – DataKing Nov 29 '20 at 15:50
  • That message also appears in the version of the package on CRAN. I'd suggest you report the correction to the maintainer of the package, and ask them for help: they're more likely than me to spot where you're doing something wrong. – user2554330 Nov 29 '20 at 20:16
  • I have messaged the maintainer. – DataKing Nov 29 '20 at 21:15