
I am working on LDA (linear discriminant analysis); for reference, see http://www.ccs.neu.edu/home/vip/teach/MLcourse/5_features_dimensions/lecture_notes/LDA/LDA.pdf .


My idea for semi-supervised LDA: I use the labeled data $X\in R^{d\times N}$ to compute all terms in $S_w$ and $S_b$. I also have unlabeled data $Y\in R^{d\times M}$, which can additionally be used to estimate the covariance term $XX^T$ in $S_w$ by $\frac{N}{N+M}(XX^T+YY^T)$; intuitively, this should give a better covariance estimate.
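A minimal sketch of this pooled estimate, assuming NumPy, samples stored as columns, and that $X$ and $Y$ have already been centered with a common mean. The function name `pooled_scatter` is hypothetical and only illustrates the formula above.

```python
import numpy as np

def pooled_scatter(X, Y):
    """Return N/(N+M) * (X X^T + Y Y^T).

    X: (d, N) labeled data, Y: (d, M) unlabeled data, columns are samples.
    Assumption: both matrices were centered with the same mean beforehand.
    """
    N, M = X.shape[1], Y.shape[1]
    # The factor N/(N+M) rescales the pooled scatter back to the scale of X X^T.
    return (N / (N + M)) * (X @ X.T + Y @ Y.T)
```

The result stays on the same scale as $XX^T$, so it can replace that term in $S_w$ without changing the relative weight of the other terms.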

Implementation of the different LDA variants: For all compared methods I also add a scaled identity matrix to $S_w$; the scaling parameter is tuned separately for each method. I divide the training data into two parts, labeled $X\in R^{d\times N}$ and unlabeled $Y\in R^{d\times M}$, with $N/M$ ranging from $0.5$ to $0.05$. I run my semi-supervised LDA on three real datasets.
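The setup above could be sketched roughly as follows, assuming NumPy and a plain random split (not stratified by class). The helper names `split_labeled_unlabeled` and `regularize`, and the parameter name `lam` for the scaling parameter, are hypothetical.

```python
import numpy as np

def split_labeled_unlabeled(data, labels, ratio, seed=0):
    """Randomly split the columns of `data` so that N / M is about `ratio`.

    data: (d, N + M) training samples as columns, labels: (N + M,) array.
    Returns labeled X, its labels, and unlabeled Y (labels of Y are dropped).
    """
    rng = np.random.default_rng(seed)
    total = data.shape[1]
    # N / M = ratio and N + M = total  =>  N = total * ratio / (1 + ratio)
    n_labeled = int(round(total * ratio / (1.0 + ratio)))
    perm = rng.permutation(total)
    lab, unlab = perm[:n_labeled], perm[n_labeled:]
    return data[:, lab], labels[lab], data[:, unlab]

def regularize(S_w, lam):
    """Add the scaled identity lam * I to the within-class scatter S_w."""
    return S_w + lam * np.eye(S_w.shape[0])
```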

How to do classification: The eigenvectors of $S_w^{-1}S_b$ are used as the transformation matrix $\Phi$, then [image]
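A minimal NumPy sketch of the projection step, under stated assumptions: $S_w$ is passed in so that either the standard or the pooled estimate can be used, the scaled identity is added before inversion, and the final decision rule is nearest class centroid in the projected space. The decision rule is an assumption, since the formula in the image is not available, and all function names are hypothetical.

```python
import numpy as np

def within_scatter(X, y):
    """Standard within-class scatter from labeled data X (d, N), labels y (N,)."""
    S = np.zeros((X.shape[0], X.shape[0]))
    for c in np.unique(y):
        Xc = X[:, y == c]
        Xc = Xc - Xc.mean(axis=1, keepdims=True)
        S += Xc @ Xc.T
    return S

def lda_fit(X, y, S_w, lam=1e-3, n_components=None):
    """Return Phi (top eigenvectors of (S_w + lam I)^{-1} S_b), classes, centroids."""
    classes = np.unique(y)
    d = X.shape[0]
    mu = X.mean(axis=1, keepdims=True)
    S_b = np.zeros((d, d))
    centroids = []
    for c in classes:
        Xc = X[:, y == c]
        mc = Xc.mean(axis=1, keepdims=True)
        S_b += Xc.shape[1] * (mc - mu) @ (mc - mu).T
        centroids.append(mc)
    # Whiten with the Cholesky factor L of the regularized S_w, so the problem
    # becomes a symmetric eigenproblem for A = L^{-1} S_b L^{-T}.
    L = np.linalg.cholesky(S_w + lam * np.eye(d))
    A = np.linalg.solve(L, np.linalg.solve(L, S_b).T)
    A = (A + A.T) / 2.0  # remove round-off asymmetry
    w, U = np.linalg.eigh(A)
    V = np.linalg.solve(L.T, U)  # eigenvectors of (S_w + lam I)^{-1} S_b
    k = n_components or len(classes) - 1
    Phi = V[:, np.argsort(w)[::-1][:k]]
    return Phi, classes, np.hstack(centroids)

def lda_predict(Phi, classes, centroids, X_test):
    """Assumed rule: assign each test column to the nearest projected centroid."""
    Z = Phi.T @ X_test      # (k, n_test)
    C = Phi.T @ centroids   # (k, n_classes)
    d2 = ((Z[:, :, None] - C[:, None, :]) ** 2).sum(axis=0)
    return classes[np.argmin(d2, axis=1)]
```

Because `lda_fit` takes $S_w$ as an argument, the two methods can be compared by changing only that input and the value of `lam`.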

Experiment results: 1) On the test data, the classification accuracy of my semi-supervised LDA trained on $X$ and $Y$ is always slightly worse than that of standard LDA trained only on $X$. 2) Also, on one real dataset, the optimal scaling parameter that gives the best classification accuracy can be very different for the two methods.

Could you tell me the reason and suggest how to make my semi-supervised LDA work? I have checked my code. Many thanks.

olivia
  • I use the image dataset MNIST and two other datasets, UCI-ISOLET and UCI-HAR. For a specific dataset, I assume its subsets Y and X have the same distribution. They have roughly the same proportions of the different classes. One of my guesses is that the mean vectors (the empirical centroid of each class) have a great impact on LDA classification; however, my semi-supervised LDA cannot use Y to calculate the mean vectors. – olivia Feb 09 '17 at 15:37
  • My semi-supervised LDA just follows the common-sense idea that more data gives a better covariance matrix estimate, which should improve the performance of any learning model that relies on the covariance matrix. – olivia Feb 09 '17 at 15:37
  • can you share your data sample and code? – Sandipan Dey Feb 10 '17 at 17:43
  • @SandipanDeyS https://1drv.ms/f/s!AmWu_azY3DGdkFOJ6r4KCoJTozqB – olivia Feb 11 '17 at 11:23
