
I am new to R. I am trying to fit an LDA model to my dataset, but it has 33 attributes and I have to choose only some of them. This is what I did:

library(MASS)  # lda()

fit <- lda(G3 ~ school + sex + age + address + famsize + Pstatus + Medu + Fedu +
               Mjob + Fjob + reason + guardian + traveltime + studytime + failures +
               schoolsup + famsup + paid + activities + nursery + higher + internet +
               romantic + famrel + freetime + goout + Dalc + Walc + health + absences +
               G1 + G2, data = d1)

G3 has 20 different groups.

I know that in linear regression I can look at the p-value of each attribute and keep the ones with the best p-values. Can someone tell me what I can do in the LDA case?
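
One thing I have seen mentioned is stepwise selection based on Wilks' lambda (the klaR package), which reports F statistics and p-values per variable, roughly the LDA analogue of the p-value screening I do in linear regression. A rough sketch of what I have in mind (this assumes klaR is installed and that d1 contains only G3 and the predictors above; categorical predictors may need to be converted to numeric/dummy variables first, and the niveau cutoff is just an illustrative choice):

library(MASS)  # lda()
library(klaR)  # greedy.wilks() for Wilks'-lambda-based forward selection

# Forward selection: keep only variables whose contribution is
# significant at the chosen level (niveau)
gw <- greedy.wilks(G3 ~ ., data = d1, niveau = 0.05)
print(gw)  # selected variables with F statistics and p-values

# Refit the LDA using only the selected variables
fit_sel <- lda(gw$formula, data = d1)

# klaR::stepclass(G3 ~ ., data = d1, method = "lda") is a
# cross-validation-based alternative to the Wilks' lambda criterion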

  • this is probably a question better suited for stats.stackexchange.com. I might try random subsets of the columns, as is done in the random forest algorithm. You might also be interested in LASSO regression (the glmnet package; see the sketch after these comments) – shuckle Jun 09 '18 at 14:57
  • I was looking for some variable or value from this particular algorithm that could indicate that a specific attribute is better for predicting this variable. – Silva Jun 09 '18 at 15:04
  • see this https://stackoverflow.com/questions/23900932/linear-discriminant-analysis-variable-importance – shuckle Jun 09 '18 at 15:08
  • Thank you, but I don't know how to use that. It uses the data() function, but my data set is in an Excel table. It says it cannot find the data set. – Silva Jun 09 '18 at 15:19
  • Look at Eric Czech's comment on the answer from that link. I did a bit of research too, and it seems like there isn't a straightforward way to "get a number" from lda itself about whether a variable is important or not – shuckle Jun 09 '18 at 15:30
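
Following the glmnet suggestion in the first comment, here is a rough sketch of how the LASSO could be used to screen variables (this assumes the glmnet package is installed and that d1 contains only G3 plus the predictors listed in the question; everything else is illustrative):

library(glmnet)

# glmnet needs a numeric matrix, so expand factors into dummy variables
x <- model.matrix(G3 ~ ., data = d1)[, -1]  # drop the intercept column
y <- as.factor(d1$G3)                       # 20 classes -> multinomial model

# Cross-validated multinomial LASSO; variables whose coefficients are
# shrunk to zero are effectively dropped
set.seed(1)
cvfit <- cv.glmnet(x, y, family = "multinomial", type.measure = "class")
coef(cvfit, s = "lambda.min")  # nonzero coefficients = retained variables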
