I'm working on a classification problem related to the marking of IP/TCP packets. The classes are Best Effort and Non Best Effort, and I'm using Python. I have selected these features: protocol, packet length, source and destination port numbers, source and destination IP addresses, the Don't Fragment flag, and the ECN field. The goal is to predict a classification for the DSCP field.
My idea is to apply dimensionality reduction to shrink the feature space I'm working in and see whether it improves my results (using algorithms like Random Forest, Naive Bayes, and SVM).
So far I have only used PCA, which creates new axes; from them I can see what percentage each original variable contributes to each new component.
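A minimal sketch of this PCA step, assuming scikit-learn and using synthetic stand-in data rather than the real packet capture. The feature names and the choice of three components are illustrative assumptions, not taken from the actual dataset:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for the packet features; the real data is not shown here.
feature_names = ["protocol", "length", "src_port", "dst_port", "df_flag", "ecn"]
rng = np.random.default_rng(0)
X = rng.normal(size=(500, len(feature_names)))
X[:, 1] += 2.0 * X[:, 2]  # correlate two columns so PCA has some structure to find

# PCA is sensitive to feature scale, so standardize first.
X_scaled = StandardScaler().fit_transform(X)

pca = PCA(n_components=3)  # 3 is an arbitrary choice for illustration
X_reduced = pca.fit_transform(X_scaled)

# Fraction of the total variance captured by each new axis.
print(pca.explained_variance_ratio_)

# Loadings: row i holds the weight of each original feature in component i.
for i, component in enumerate(pca.components_):
    weights = ", ".join(f"{n}={w:+.2f}" for n, w in zip(feature_names, component))
    print(f"PC{i + 1}: {weights}")
```

Here `explained_variance_ratio_` says how much variance each component keeps, while `components_` shows how the original variables combine into each new axis.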
However, I have also read about LDA, which uses the labels to maximize the between-class variance relative to the within-class variance, so it is a supervised method, while PCA is unsupervised.
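One way to compare the two empirically is to put each reducer in front of the same classifier and cross-validate. The sketch below assumes scikit-learn and uses `make_classification` as a synthetic two-class stand-in for the Best Effort / Non Best Effort data; the component counts and the Random Forest settings are illustrative assumptions:

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic two-class data standing in for the real packet features.
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=4, n_redundant=2, random_state=0
)


def classifier():
    # Same classifier in every pipeline so that only the reduction step differs.
    return RandomForestClassifier(n_estimators=50, random_state=0)


candidates = {
    "no reduction": make_pipeline(StandardScaler(), classifier()),
    "PCA": make_pipeline(StandardScaler(), PCA(n_components=3), classifier()),
    # With two classes, LDA can produce at most one component.
    "LDA": make_pipeline(
        StandardScaler(), LinearDiscriminantAnalysis(n_components=1), classifier()
    ),
}

# The reducer is fitted inside each fold, so the test labels never leak into LDA.
scores = {}
for name, model in candidates.items():
    scores[name] = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: mean accuracy = {scores[name]:.3f}")
```

Because LDA yields at most (number of classes - 1) components, a two-class problem collapses to a single axis. Keeping the reducer inside the pipeline matters for LDA in particular, since fitting it on the full dataset before cross-validation would let label information from the test folds influence the projection.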
Finally, what do you suggest, and how should I proceed? I do not know when I should use PCA to improve classification results and when I should use LDA, or whether it is correct to use them at all.