R LDA (linear discriminant analysis) how to get / compute LDA scores from LDA coefficients / loadings

Question

I am using the lda function in R to fit a model. After fitting a model, I would like to use the fits to then put into a computer program to classify based on inputs. I only see the coefficients of linear discriminants and group means. I thought I need the intercept and coefficients to get the actual score for each group.

scrameri · Answer 1 · 2021-09-18T23:36:36.100

The short answer is:

iris.lda <- lda(x = iris[, 1:4], grouping = iris$Species)
iris.lda$scores <- predict(iris.lda)$x

The exact procedure is a bit hidden in getAnywhere(predict.lda) because that function needs to handle many more things, such as looking for the original data in the R environment. If you'd like to understand how you'd do it manually in R or on a sheet of paper, the procedure of getting actual LDA scores from LDA coefficients boils down to the following 4 steps:

1) Calculate the group means for each variable

# tapply solution  
grmeans <- sapply(1:4, function(x) tapply(iris[,x], INDEX = iris$Species, FUN = mean))

# # dplyr solution
# library(dplyr)
# grmeans <- iris %>% group_by(Species) %>% summarize_all(list(mean))

grmeans 
#            [,1]  [,2]  [,3]  [,4]
#setosa     5.006 3.428 1.462 0.246
#versicolor 5.936 2.770 4.260 1.326
#virginica  6.588 2.974 5.552 2.026

# check group means
iris.lda <- lda(iris[, 1:4], iris$Species)
all.equal(unname(iris.lda$means), unname(grmeans)) # it's the same!
#[1] TRUE

2) Calculate the mean of group means for each variable

center <- colMeans(grmeans)
# center <- apply(grmeans, 2, mean) # apply solution

center
#[1] 5.843333 3.057333 3.758000 1.199333

3) Center the data at the mean of group means

iris.c <- scale(x = iris[, 1:4], center = center, scale = FALSE)

The x argument in scale can be the original data, or any new data one wants to project (predict) into the fitted discriminant space. However, one always has to use the centering vector defined by the original data (used for LDA model fitting, center in our example) to center new data accordingly.

4) Multiply the centered data with the LD coefficients (= loadings) stored in $scaling to get the actual scores

iris.lda <- lda(iris[, 1:4], iris$Species)
iris.lda$scores <- iris.c %*% iris.lda$scaling
# iris.lda$scores <- predict(iris.lda)$x # equivalent

This last step is a matrix multiplication, which corresonds to calculating all the linear combinations of variable terms and associated loadings (coefficients), as in multiple linear regression:

# example for the first centered observation (row)
(ex <- iris.c[1,])
#Sepal.Length  Sepal.Width Petal.Length  Petal.Width 
#  -0.7433333    0.4426667   -2.3580000   -0.9993333 

(coef <- iris.lda$scaling[,"LD1"])
#Sepal.Length  Sepal.Width Petal.Length  Petal.Width 
#   0.8293776    1.5344731   -2.2012117   -2.8104603 

(score <-  unname(coef[1] * ex[1] + coef[2] * ex[2] + coef[3] * ex[3] + coef[4] * ex[4])) 
#[1] 8.0618

all.equal(score, unname(iris.lda$scores[1,"LD1"])) # it's the same!
#[1] TRUE

The scores for other LD functions are calculated in the same way (replace "LD1" with "LD2" etc. in the code above).

All these steps are done by the predict.lda function:

all.equal(predict(iris.lda)$x, iris.lda$scores) # it's the same!
#[1] TRUE

Summary: The LDA scores can be computed using predict(iris.lda)$x. They simply consist of the linear combinations of centered variables (centered at $means) and LDA coefficients (loadings, stored in $scaling).

One can think of an LDA score as the fitted value of a multiple linear regression

# y = a + b1*x1 + b2*x2 + ...

where

# y           LDA score (`predict(iris.lda)$x`)
# a = 0       (no intercept since the data is centered)
# b1, b2, ... LDA coefficients (`$scaling`)
# x1, x2, ... centered data (at mean of group `$means`)

score 0 · Answer 2 · answered Jul 08 '21 at 20:19

0

R returns more information than it prints out on the console. Always read the manual page of a function, e.g. lda to see what information is returned in the "Value" section of the manual page. The "See also" section usually lists other functions that may be useful. Here is a simple example using the iris data set which is included with R:

library(MASS)
data(iris)
iris.lda <- lda(iris[, 1:4], iris$Species)
iris.pred <- predict(iris.lda)
xtabs(~iris$Species+iris.pred$class)
#             iris.pred$class
# iris$Species setosa versicolor virginica
#   setosa         50          0         0
#   versicolor      0         48         2
#   virginica       0          1        49

Out of 150 specimens, the discriminant analysis made only 3 mistakes. Two versicolor specimens were classified as virginica and one specimen of virginica was classified as versicolor.

answered Jul 08 '21 at 20:19

dcarlson

10,936
2
15
18

I'm curious about the equation here for "predict" which you're using above. I understand the rest of it, I just need the coefficients to recreate the predict equation. – Schatzi121 Jul 08 '21 at 20:37
If you just want to classify new data, use `predict` function by including the new data in the `newdata=` argument. It will return the a classification, the posterior probabilities, and the LD coordinates for each new observation. If you want to reconstruct the equations used to compute these you will have to look at the source code `getAnywhere(predict.lda)`. – dcarlson Jul 08 '21 at 20:56
I should have added that the linear discriminant coefficients are `iris.lda$scaling`. More details on the function are available in Venables & Ripley [Modern Applied Statistics with S](https://www.researchgate.net/publication/224817420_Modern_Applied_Statistics_With_S/link/5f5481f092851c250b96c0a9/download) which is available at ResearchGate. – dcarlson Jul 08 '21 at 21:23
Okay, I looked through the getAnywhere(predict.lda) and I see why no one seems to be able to answer about the prediction equation because it is quite complex. I was hoping there would be a much easier way to do this, like with regular linear regression where it is very straightforward. – Schatzi121 Jul 09 '21 at 02:26
And there are k prediction equations, one for each group. Each provides the posterior probability that the observation belongs to that group. The k posterior probabilities add to 1. – dcarlson Jul 09 '21 at 03:43
Yes, I understand that part, and now I see the whole prediction equation. You can write the prediction equation as one formula, with matrices, so you don't need k prediction equations. – Schatzi121 Jul 09 '21 at 04:42

R LDA (linear discriminant analysis) how to get / compute LDA scores from LDA coefficients / loadings

2 Answers2