Estimating probabilities for model subset after grouping

Question

The data used is available here (the file is called "figshare.txt").

I estimated the transition probabilities for a Markov model where the observations were grouped by location (group_by(km)).

data <- data %>% group_by(km) %>% summarize(pp_chain=list(pp)) %>% as.data.frame 
pp_chains <- data$pp_chain; names(pp_chains) <- data$km
fit <- markovchainFit(pp_chains)

The output (summarized here) shows the probability estimates for the model overall:

print(fit$estimate)
          0          1
0 0.9116832 0.08831677
1 0.5250852 0.47491476

The output I am after would be more specific: a separate matrix of transition probabilities for each location (km).

It would look something like this:

km = 80
          0          1
0 0.7116832 0.28831677
1 0.1250852 0.17491476

km = 81
          0          1
0 0.8116832 0.18831677
1 0.4250852 0.37491476

km = 83
          0          1
0 0.6116832 0.38831677
1 0.3250852 0.27491476

Does anyone know how to extract the Markov model estimates for each location (km) individually after the model is run?

I know dplyr is your flavor, maybe base R's `by` can help: `by(data, data$km, FUN=function(df) markovchainFit(df$pp))` — Parfait, Feb 12 '18 at 02:09
@Parfait That also gives me the solution I was looking for. Thank you! — Blundering Ecologist, Feb 12 '18 at 04:00

score 1 · Accepted Answer · answered Feb 12 '18 at 02:31

Would a simple lapply() solution suffice? As far as I understand, each sequence is treated separately, i.e. there are no complicated interdependencies between them?

library(dplyr)
library(markovchain)

data <- read.table(paste0("https://ndownloader.figshare.com/files",
  "/10412271?private_link=ace5b44bc12394a7c46d"), header=TRUE, sep="\t")

data <- data %>% group_by(km) %>% summarize(pp_chain=list(pp)) %>% as.data.frame 
pp_chains <- data$pp_chain; names(pp_chains) <- data$km

est <- lapply(pp_chains, function(x) markovchainFit(x)$estimate)
head(est, 3)

# $`80`
#           0         1
# 0 0.8470588 0.1529412
# 1 0.7222222 0.2777778

# $`81`
#           0         1
# 0 0.6976378 0.3023622
# 1 0.2107574 0.7892426

# $`83`
#           0          1
# 0 0.9706840 0.02931596
# 1 0.4210526 0.57894737
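
For intuition about what each per-location matrix contains, here is a minimal base R sketch that does not use markovchain. It assumes the default maximum-likelihood fit, where each entry is the number of observed i -> j transitions divided by the number of transitions leaving state i. The `toy` data frame and the `transition_matrix` helper are hypothetical names made up for illustration, not part of the original data or of the package.

```r
# Toy data standing in for the figshare file (made-up values):
# one row per observation, in time order within each location.
toy <- data.frame(
  km = c(80, 80, 80, 80, 80, 81, 81, 81, 81, 81),
  pp = c(0, 0, 1, 0, 1, 1, 1, 0, 1, 1)
)

# Hypothetical helper: count transitions from each state to the next one
# and divide every row by its total.
# Note: a state that is never left would produce a row of NaN.
transition_matrix <- function(x, states = c(0, 1)) {
  from <- factor(head(x, -1), levels = states)  # state at step t
  to   <- factor(tail(x, -1), levels = states)  # state at step t + 1
  counts <- table(from, to)
  prop.table(counts, margin = 1)                # row-normalize
}

# One matrix per location, named by km, like the lapply() result above.
est_by_km <- lapply(split(toy$pp, toy$km), transition_matrix)
print(est_by_km[["80"]])
```

In the toy sequence for km 80 (0, 0, 1, 0, 1) there are three transitions out of state 0, one to 0 and two to 1, so the first row is 1/3 and 2/3. Splitting by km before counting is what keeps each location's chain separate.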

That answer does offer me the solution I was looking for. Thank you! — Blundering Ecologist, Feb 12 '18 at 03:54
