As part of a machine learning class assignment, I am implementing a Naive Bayes classifier without using any external library.

My training data set X has 800 rows, each with 8 features and one binary label. For each class I have calculated a 1×8 vector of means and a 1×8 vector of standard deviations (one entry per feature), along with the priors for the two classes.

To assess the accuracy of the classifier on the training data set, I want to generate a matrix Y with the same dimensions as the feature matrix (i = 800, j = 8), in which each element y_ij is given by

y_ij = dnorm(x_ij, mean = mean_j, sd = sd_j, log = TRUE)

I have tried sweep, apply, and lapply without success. I am stuck, and unfortunately the problem is my unfamiliarity with R rather than my understanding of the algorithm. Any help is greatly appreciated.

Paco Cruz

1 Answer

There's probably a better data setup for this, but if you already have X and two vectors of means and sds, xmean and xsd, you can use sapply. Here's a reproducible example:

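set.seed(1)                     # make the example reproducible
X <- matrix(rnorm(30), 10, 3)
xmean <- colMeans(X)            # per-column means
xsd <- apply(X, 2, sd)          # per-column standard deviations
# One column of log-densities per feature; the result has the same dimensions as X
Y <- sapply(seq_len(ncol(X)), function(j) dnorm(X[, j], xmean[j], xsd[j], log = TRUE))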
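
Since the question mentions sweep and the end goal is training accuracy, here is a rough sketch of how the same log-density matrix could be built with sweep and then turned into predictions. Everything in it is an assumption for illustration: the toy data, the labels y, and the names log_dens, classes, score, pred, and accuracy are not from the question, and the rule of summing log-densities plus the log-prior is just the usual Gaussian Naive Bayes decision rule.

```r
# Sketch only: toy data and all names below are illustrative assumptions.
set.seed(1)
n <- 20
y <- rep(c(0, 1), each = n / 2)                 # hypothetical binary labels
X <- matrix(rnorm(n * 3, mean = 2 * y), n, 3)   # rows of class 1 are shifted by 2

# Column-wise Gaussian log-density using sweep():
# log dnorm(x, m, s) = log dnorm((x - m) / s) - log(s)
log_dens <- function(X, mu, sigma) {
  Z <- sweep(sweep(X, 2, mu, "-"), 2, sigma, "/")   # standardise each column
  sweep(dnorm(Z, log = TRUE), 2, log(sigma), "-")   # correct for the scale
}

# One score column per class: summed log-likelihood plus log-prior
classes <- c(0, 1)
score <- sapply(classes, function(k) {
  Xk <- X[y == k, , drop = FALSE]                   # training rows of class k
  Y <- log_dens(X, colMeans(Xk), apply(Xk, 2, sd))  # same dimensions as X
  rowSums(Y) + log(mean(y == k))
})

# Predict the class with the larger score and compare with the labels
pred <- classes[max.col(score, ties.method = "first")]
accuracy <- mean(pred == y)
```

With the real data, X would be the 800 x 8 feature matrix and y the label column; log_dens(X, mu, sigma) should return the same matrix as the sapply version for the same means and sds.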

twedl