
My question is somewhat theoretical.

I have a dataset with 100+ columns, and every EDA method I use produces a cluttered, unreadable plot. How can I get more interpretable plots and tables with such data?

2 Answers

1

@Zine

Try using only the variables you need in the visualizations.

You can also use Principal Component Analysis (PCA) to reduce the number of variables. It combines the original columns into a few components that retain most of the variance in the data, although some information is lost. For your reference, here are some links for learning PCA:

1. https://www.sartorius.com/en/knowledge/science-snippets/what-is-principal-component-analysis-pca-and-how-it-is-used-507186
2. https://www.geeksforgeeks.org/ml-principal-component-analysispca/
3. https://www.machinelearningplus.com/machine-learning/principal-components-analysis-pca-better-explained/
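As a rough illustration of the PCA suggestion, here is a minimal sketch using scikit-learn. The data is synthetic (a random 200 x 100 array standing in for your dataset), and the choice of two components and of standardizing first are assumptions for the example, not requirements.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a wide dataset: 200 rows, 100 columns
# driven by 3 hidden factors plus a little noise (assumption).
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
X = latent @ rng.normal(size=(3, 100)) + 0.1 * rng.normal(size=(200, 100))

# PCA is sensitive to column scale, so standardize first.
X_scaled = StandardScaler().fit_transform(X)

# Project the 100 columns onto the first 2 principal components.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X_scaled)

# Fraction of the total variance captured by each component.
explained = pca.explained_variance_ratio_
print(X_2d.shape)
print(explained, explained.sum())
```

`X_2d` can then be passed to an ordinary scatter plot, and `explained.sum()` indicates how much of the original variance that 2D picture actually represents.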

0

Try dimensionality reduction methods like PCA and t-SNE. If your main goal is to visualize the data, go with t-SNE. PCA, however, also tells you how much of the data's variance you preserve when reducing dimensions.
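To make the difference concrete, here is a small sketch with scikit-learn. The data is made up (three clusters in 100 dimensions), and the 90% variance threshold and `perplexity=30` are arbitrary choices for the example.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

# Hypothetical data: 3 clusters of 50 points each in 100 dimensions.
rng = np.random.default_rng(0)
centers = rng.normal(scale=5.0, size=(3, 100))
labels = np.repeat(np.arange(3), 50)
X = centers[labels] + rng.normal(size=(150, 100))

# PCA: cumulative explained variance shows how many components
# are needed to keep a given share of the variance (here 90%).
pca = PCA().fit(X)
cumulative = np.cumsum(pca.explained_variance_ratio_)
n_keep = int(np.searchsorted(cumulative, 0.90) + 1)
print("components needed for 90% variance:", n_keep)

# t-SNE: a 2D embedding meant purely for visualization.
# It has no explained-variance measure, and distances between
# far-apart groups in the embedding should not be over-interpreted.
tsne = TSNE(n_components=2, perplexity=30, random_state=0)
embedding = tsne.fit_transform(X)
print(embedding.shape)
```

The `embedding` array can be scatter-plotted and colored by any column of interest, while `n_keep` gives a rough sense of how many PCA components are worth keeping.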

Here's a link which can explain the different Dimensionality reduction techniques and their results.