0

I tried to calculate BDe Score in R without using the built in function of different R packages.

library(bnlearn)
library(tidyverse)

# Load the ALARM network
# load("http://www.bnlearn.com/bnrepository/alarm/alarm.bif.gz")
alarmNetwork_ls <- read.bif("alarm.bif.gz")

# Load the ALARM data
data("alarm")

# Select a subset of the data for testing
test_data <- alarm[sample(nrow(alarm), 1000), ]

# The functions above match on names;
# the name of one of the nodes in the network is "LVFAILURE",
# but this name in the alarm dataset is "LVF".
# We fixed the column name using the code below.

test_data <- test_data %>% 
  rename(
    HISTORY = HIST, 
    HREKG = HREK,
    HRSAT = HRSA,
    PRESS = PRSS,
    EXPCO2 = ECO2,
    MINVOL = MINV,
    MINVOLSET = MVS,
    HYPOVOLEMIA = HYP,
    ANAPHYLAXIS = APL,
    INSUFFANESTH = ANES,
    PULMEMBOLUS = PMB,
    INTUBATION = INT,
    KINKEDTUBE = KINK,
    DISCONNECT = DISC,
    LVEDVOLUME = LVV,
    STROKEVOLUME = STKV,
    CATECHOL = CCHL,
    LVFAILURE = LVF,
    ERRLOWOUTPUT = ERLO,
    ERRCAUTER = ERCA,
    SHUNT = SHNT,
    PVSAT = PVS,
    ARTCO2 = ACO2,
    VENTALV = VALV,
    VENTLUNG = VLNG,
    VENTTUBE = VTUB,
    VENTMACH = VMCH
  )

# calculate log-likelihood of data under the network
log_likelihood <- function(data, bn) {
  n <- nrow(data)
  nodes <- nodes(bn)
  parents <- parents(bn)
  logprob <- rep(0, n)
  for (i in 1:n) {
    prob <- 1
    for (j in 1:length(nodes)) {
      node <- nodes[[j]]
      node_name <- node$name
      node_parents <- parents[[j]]
      if (length(node_parents) == 0) {
        prob_node <- cpquery(bn, node_name, list(), data[i,])
      } else {
        parent_values <- data[i,node_parents]
        prob_node <- cpquery(bn, node_name, list(parents = parent_values), data[i,])
      }
      prob <- prob * prob_node
    }
    logprob[i] <- log(prob)
  }
  return(sum(logprob))
}

# calculate number of parameters in the model
num_params <- function(bn) {
  nodes <- nodes(bn)
  parents <- parents(bn)
  n_params <- 0
  for (i in 1:length(nodes)) {
    node <- nodes[[i]]
    node_states <- length(node$levels[[1]])
    n_parents <- length(parents[[i]])
    n_params <- n_params + node_states * (n_parents + 1)
  }
  return(n_params)
}

# calculate BDe score
BDe_score <- function(data, bn) {
  n <- nrow(data)
  LL <- log_likelihood(data, bn)
  d <- ncol(data)
  k <- num_params(bn)
  score <- LL - 0.5 * log(n) * k
  return(score)
}

# test function on alarm data and network
BDe_score(test_data, alarmNetwork_ls)

I tried ro run the above code but got follwing error:

Error in check.nodes(nodes = node, graph = x, max.nodes = 1) :
no node specified.

I know there are several R packages to calculate BDe score but can anyone help me to resolve my issue without using those built-in functions? Or if anyone can help me to code the proposition 18.2 of Probabilistic Graphical Models: Principles and Techniques Book by Daphne Koller and Nir Friedman

Marco
  • 2,368
  • 6
  • 22
  • 48
  • 3
    Are we supposed to know what a BDe score is? (Rhetorical question obviously) Please cite online literature or websites where there is a definition and hopefully some background on this obscure entity. – IRTFM Feb 20 '23 at 03:07
  • Here is some reference related to BDe score. chrome-extension://efaidnbmnnnibpcajpcglclefindmkaj/https://www.cs.helsinki.fi/u/bmmalone/probabilistic-models-spring-2014/ScoringFunctions.pdf – Tanvir Peyal Feb 20 '23 at 18:23
  • Probabilistic Graphical Models: Principles and Techniques Book by Daphne Koller and Nir Friedman Proposition 18.2 explain how to calculate Bayesian Score for Bayesian Networks in detail. chrome-extension://efaidnbmnnnibpcajpcglclefindmkaj/http://mcb111.org/w06/KollerFriedman.pdf – Tanvir Peyal Feb 20 '23 at 18:32
  • A quick use of `traceback()` after running your functions, indicates that the error comes from `parents(bn)` in your `log_likelihood` function. Please see the [dcos](https://www.bnlearn.com/documentation/man/mb.html) for how to call the function e.g. you need to supply a node – user20650 Feb 20 '23 at 19:04
  • After searching Google and finding that both the bnstruct and r.blip packages have implementations of the BDeu scoring function, and reading the first citation above which says that the BDeu function is just the empirical vrsion of the BDe function, I'm wondering why you are trying to build this from scratch? – IRTFM Feb 20 '23 at 19:14
  • @IRTFM this is a homework assignment from my professor. – Tanvir Peyal Feb 21 '23 at 16:48
  • 2
    All the more reason that I should not supply an answer. R and all of its CRAN hosted packages are open source. I don’t see how it’s any different to crib from an existing package than it would be to crib from an SO answer. “Luke, use the source. “ – IRTFM Feb 22 '23 at 04:38

0 Answers0