I'm running an experiment on blood test data, trying to predict the probability that a patient has a certain disease. From the blood test results I have derived over 2000 features, and I'm looking for a good way to eliminate the features that don't help. Is there a more general way to find the unnecessary features? I'm using XGBoost and HistGradientBoosting models for the prediction.
I've tried using feature importance, but as I increase the number of patients in the dataset, the set of important features changes. I've heard about a package called SHAP, but my computer has no internet access and getting the package will take time.
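To make the instability concrete: one check that needs no extra packages is to refit the model on several subsamples of the patients and keep only the features that land near the top every time. Below is a minimal sketch in plain Python, assuming the importance scores from each refit have already been collected into one list per run. The names `stable_features`, `top_k`, and `min_fraction` are hypothetical, and the numbers are toy values, not real results.

```python
from collections import Counter

def stable_features(importance_runs, top_k, min_fraction=0.8):
    """Return indices of features that rank in the top_k
    in at least min_fraction of the runs.

    importance_runs: list of equal-length score lists, one per refit
    (hypothetical input; scores can come from any importance measure).
    """
    if not importance_runs:
        return []
    counts = Counter()
    for scores in importance_runs:
        # Rank feature indices by descending score; ties broken by index.
        ranked = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
        counts.update(ranked[:top_k])
    needed = min_fraction * len(importance_runs)
    return sorted(i for i, c in counts.items() if c >= needed)

# Toy example: 3 refits, 4 features (made-up scores for illustration).
runs = [
    [0.50, 0.30, 0.15, 0.05],
    [0.45, 0.10, 0.35, 0.10],
    [0.55, 0.25, 0.05, 0.15],
]
print(stable_features(runs, top_k=2, min_fraction=0.6))  # prints [0, 1]
```

Features that never make the cut across subsamples would be candidates for removal, while features that drift in and out are the ones whose importance depends on which patients are included.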