I am new to this community, to R, and to programming in general. (Thanks in advance for your patience!) I am working on a project that involves Bayesian networks.
Straight to the question. The following code was posted on this site in response to a question titled "NA/NaN values in bnlearn package R":
rm(list=ls())
library(catnet)  # provides cnDiscretize, cnNew, cnSetProb, cnProb
### generate random data (not simply independent binomials)
set.seed(123)
n.obs <- 10
a1 <- rbinom(n.obs,1,.3)
a2 <- runif(n.obs)
a3 <- floor(-3*log(.25+3*a2/4))
a3[a3>=2] <- NA
a2 <- floor(2*a2)
my.data <- data.frame(a1,a2,a3 )
### discretize data into proper categories
my.data <- cnDiscretize(my.data,numCategories=2)
my.data
##    a1 a2 a3
## 1   1  2  1
## 2   2  1  2
## 3   1  2  1
## 4   2  2  2
## 5   2  1 NA
## 6   1  2  1
## 7   1  1 NA
## 8   2  1 NA
## 9   1  1 NA
## 10  1  2  1
## say we want a2 conditional on a1,a3
## first generate a network with a1,a3 ->a2
cnet <- cnNew(
nodes = c("a1", "a2", "a3"),
cats = list(c("1","2"), c("1","2"), c("1","2")),
parents = list(NULL, c(1,3), NULL)
)
## set the empirical probabilities from data=my.data
cnet2 <- cnSetProb(cnet,data=my.data)
## to get the conditional probability table
cnProb(cnet2,which='a2')
## $a2
##          a1        a3         0         1
## A 0.0000000 0.0000000 0.0000000 1.0000000
## B 0.0000000 1.0000000 0.5712826 0.4287174
## A 1.0000000 0.0000000 0.0000000 1.0000000
## B 1.0000000 1.0000000 0.5685786 0.4314214
However, when I copy, paste, and run the code, I get a different result (see below).
rm(list=ls())
library(catnet)  # provides cnDiscretize, cnNew, cnSetProb, cnProb
### generate random data (not simply independent binomials)
set.seed(123)
n.obs <- 10
a1 <- rbinom(n.obs,1,.3)
a2 <- runif(n.obs)
a3 <- floor(-3*log(.25+3*a2/4))
a3[a3>=2] <- NA
a2 <- floor(2*a2)
my.data <- data.frame(a1,a2,a3 )
### discretize data into proper categories
my.data <- cnDiscretize(my.data,numCategories=2)
my.data
##    a1 a2 a3
## 1   1  2  1
## 2   2  1  2
## 3   1  2  1
## 4   2  2  2
## 5   2  1 NA
## 6   1  2  1
## 7   1  1 NA
## 8   2  1 NA
## 9   1  1 NA
## 10  1  2  1
## say we want a2 conditional on a1,a3
## first generate a network with a1,a3 ->a2
cnet <- cnNew(
nodes = c("a1", "a2", "a3"),
cats = list(c("1","2"), c("1","2"), c("1","2")),
parents = list(NULL, c(1,3), NULL)
)
## set the empirical probabilities from data=my.data
cnet2 <- cnSetProb(cnet,data=my.data)
## to get the conditional probability table
cnProb(cnet2,which='a2')
## $a2
##    a1  a3   1   2
## A 1.0 1.0 0.0 1.0
## B 1.0 2.0 0.5 0.5
## A 2.0 1.0 0.5 0.5
## B 2.0 2.0 0.5 0.5
Could someone explain why my results are different? I ask because I am trying to understand how catnet handles missing data.
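For comparison, here is a rough base R sketch of what I would expect if rows containing NA were simply dropped before counting. It does not use catnet at all, so it says nothing definitive about what cnSetProb does internally. The data frame is typed in from the printout above rather than regenerated, and the uniform fallback for parent configurations with no observations is my own assumption.

```r
# Discretized data copied by hand from the printout above (not regenerated).
my.data <- data.frame(
  a1 = c(1, 2, 1, 2, 2, 1, 1, 2, 1, 1),
  a2 = c(2, 1, 2, 2, 1, 2, 1, 1, 1, 2),
  a3 = c(1, 2, 1, 2, NA, 1, NA, NA, NA, 1)
)

# Assumption: complete-case analysis, i.e. drop every row that has an NA.
cc <- my.data[complete.cases(my.data), ]

# Count a2 within each (a1, a3) parent configuration.
# Explicit factor levels keep configurations that never occur in the data.
counts <- table(
  a1 = factor(cc$a1, levels = 1:2),
  a3 = factor(cc$a3, levels = 1:2),
  a2 = factor(cc$a2, levels = 1:2)
)

# Normalise over a2 for each (a1, a3); empty configurations give NaN (0/0).
cpt <- prop.table(counts, margin = c(1, 2))

# Assumption: use a uniform distribution where no data were observed.
cpt[is.nan(cpt)] <- 1 / 2

# One row per (a1, a3) configuration, one column per a2 category.
ftable(cpt, row.vars = c("a1", "a3"))
```

Only six of the ten rows are complete. With those, this gives 0 and 1 for a1 = 1, a3 = 1 and 0.5 and 0.5 for the other three configurations, which matches the table I get. That is consistent with incomplete rows being ignored in my run, but I cannot tell from this alone whether that is what catnet is actually doing.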
Best,
John