I have run LDA in MATLAB using the fitcdiscr and predict functions.

However, I have a feeling there may be some bugs in my code, and as a sanity check I would like to identify which features are being weighted most heavily in the classification.

Can this be done?

Adriaan
JP1

1 Answer

Your fitted object has a Coeffs field containing all the relevant information: http://uk.mathworks.com/help/stats/classificationdiscriminant-class.html

In particular, if you fit a linear discriminant, there will be a Linear field, which holds the linear coefficients used for the projection. However, bear in mind that the coefficient values of a linear model are not feature importances; there is much more to consider. A weight can be large because the feature takes small values, or because the values have a highly biased distribution. If you need feature selection, use feature selection methods (such as L1-regularized models); otherwise you can easily draw the wrong conclusions from your data.
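
As a rough illustration of both points, here is a minimal sketch of reading the coefficients and of how they depend on feature scale. It assumes the Statistics and Machine Learning Toolbox and uses its bundled fisheriris data set, which is not from the question; the factor of 100 is an arbitrary choice to mimic a change of units.

```matlab
% Sketch only: fisheriris and the factor 100 are illustrative assumptions.
load fisheriris                        % meas (150x4 features), species (labels)
mdl = fitcdiscr(meas, species);        % DiscrimType defaults to 'linear'

% Coeffs is a K-by-K struct array; Coeffs(i,j) describes the boundary
% between class i and class j, which satisfies Const + x*Linear = 0.
c12 = mdl.Coeffs(1,2);
K = c12.Const;                         % scalar offset
L = c12.Linear;                        % one coefficient per feature

% Express the first feature in units 100 times smaller and refit.
measScaled = meas;
measScaled(:,1) = 100 * meas(:,1);
mdlScaled = fitcdiscr(measScaled, species);
Ls = mdlScaled.Coeffs(1,2).Linear;

% The first coefficient shrinks by roughly the same factor, although the
% feature carries exactly the same information as before.
disp([L, Ls]);
```

The second column differs from the first only in its first entry, which is why raw coefficient magnitudes cannot be compared across features measured on different scales.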

lejlot
  • Thanks! I have rescaled each of my features so that each one has a standard deviation and mean of 1. Am I correct in saying the weights now correspond to importance? Or is rescaling not allowed? – JP1 Jan 11 '16 at 10:43
  • Well, it is way more complex than that. What you have done is reasonable and corresponds to some statistical testing, although if the features do not follow a normal distribution this is still just a heuristic. – lejlot Jan 11 '16 at 11:51
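
The rescaling heuristic discussed in the comments could be sketched as follows. This assumes the rescaling is a z-score (zero mean, unit standard deviation per feature), which may differ from what the asker actually did, and it again uses the toolbox's fisheriris data rather than the asker's data. The ranking covers a single pair of classes and is a heuristic, not a feature importance measure.

```matlab
% Sketch only: z-scoring and the fisheriris data are assumptions.
load fisheriris
Z = zscore(meas);                      % each column: mean 0, standard deviation 1
mdlZ = fitcdiscr(Z, species);

% Coefficients of the boundary between the first two classes.
Lz = mdlZ.Coeffs(1,2).Linear;

% Rank the features by absolute coefficient, largest first.
[~, order] = sort(abs(Lz), 'descend');
featureNames = {'SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth'};
disp(featureNames(order));
```

With more than two classes, each Coeffs(i,j) pair yields its own ranking, so the ordering for one pair need not hold for the others.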