Sweep a log-dnorm across a training set matrix to find log-likelihood

Question

As part of a machine learning class assignment, I am implementing a Naive Bayes classifier without using any external library.

My training data set X has 800 rows, each with 8 features and one binary label. For each class I have calculated length-8 vectors of the per-feature mean and sd, along with the priors for the two classes.

In order to assess accuracy of the classifier on the training dataset, I want to generate a matrix Y with the same dimensions (i=800, j=8) in which each element y_ij is given as

y_ij = dnorm(x_ij, mean = mean_j, sd = sd_j, log = TRUE)

I have tried sweep, apply, and lapply without success. I am stuck, and unfortunately this is a matter of my unfamiliarity with R rather than of understanding the algorithm. Help is greatly appreciated.

In `dnorm(x_ij, ...` is the `x_ij` the original value? And you want a new 800x8 matrix `Y` with new values `y_ij = dnorm(x_ij, ...`? — twedl, Jan 25 '18 at 18:29
I have low reputation so my vote doesn't count, but this is exactly what I needed. Thanks. — Paco Cruz, Jan 26 '18 at 01:42
np. you can probably accept the answer in lieu of an upvote if it helped. — twedl, Jan 26 '18 at 01:44

twedl · Accepted Answer · 2018-01-25T23:04:13.840

There's probably a better data setup for this, but if you already have X and two vectors of means and sds, xmean and xsd, you can use sapply over the column indices; it returns a matrix with the same dimensions as X. Here's a reproducible example:

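X <- matrix(rnorm(30), 10, 3)   # toy data: 10 rows, 3 features
xmean <- colMeans(X)            # per-column means
xsd <- apply(X, 2, sd)          # per-column standard deviations
# one column of log-densities per feature; sapply binds them into a 10 x 3 matrix
Y <- sapply(seq_len(ncol(X)), function(j) dnorm(X[, j], xmean[j], xsd[j], log = TRUE))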
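
Since the question mentions sweep, here is a rough sketch of how the same matrix could be built with it. The helper name `log_dnorm_sweep` is made up for illustration and is not part of the original answer; it simply writes out the Gaussian log-density, -0.5 * z^2 - log(sd) - 0.5 * log(2 * pi), column by column, and checks the result against the sapply version.

```r
# Hypothetical helper (not from the original answer): column-wise Gaussian
# log-density built from sweep() calls instead of a loop over columns.
log_dnorm_sweep <- function(X, mu, sigma) {
  # standardise each column: z_ij = (x_ij - mu_j) / sigma_j
  Z <- sweep(sweep(X, 2, mu, "-"), 2, sigma, "/")
  # subtract the per-column normalising term log(sigma_j) + 0.5 * log(2 * pi)
  sweep(-0.5 * Z^2, 2, log(sigma) + 0.5 * log(2 * pi), "-")
}

set.seed(42)                     # arbitrary seed, only for reproducibility
X <- matrix(rnorm(30), 10, 3)
xmean <- colMeans(X)
xsd <- apply(X, 2, sd)

Y_sweep <- log_dnorm_sweep(X, xmean, xsd)
Y_sapply <- sapply(seq_len(ncol(X)),
                   function(j) dnorm(X[, j], xmean[j], xsd[j], log = TRUE))

# both approaches should give the same 10 x 3 matrix
stopifnot(isTRUE(all.equal(Y_sweep, Y_sapply)))

# under the naive independence assumption, a row's log-likelihood is the
# sum of its per-feature log-densities
row_loglik <- rowSums(Y_sweep)
```

For the classifier itself, one would presumably run this once per class with that class's means and sds, add the log prior to `row_loglik`, and predict whichever class scores higher.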

edited Jan 25 '18 at 23:04

answered Jan 25 '18 at 22:48

twedl
