Model accuracy is very low. How can I improve it?

Question

I have a dataset with a lot of missing data. There are around 20,000 records for training and 5,000 records for testing, on which the model's performance is validated. The dataset has around 120 features. I identified clusters in the data based on certain features and replaced missing values with the median within each cluster, so most of the missing values are treated. Where I could not find a cluster, I replaced missing values with zero.

I tested the model's performance: random forest and XGBoost perform almost the same on this data, with XGBoost about 0.5% more accurate. I tried selecting the best features with RFE (recursive feature elimination) and found that the maximum accuracy I could obtain is 80%. I also observed that training accuracy is 80% and validation accuracy is 100%.

How can I reduce overfitting? Is my missing-data imputation being done wrongly? I know the model accuracy can go up to 90%. I'm not sure what I am doing wrong here. What should be done to boost my accuracy?

score 0 · Answer 1 · answered Apr 23 '20 at 17:40


More data, feature selection, feature engineering... Look at your data and fill in the missing fields; you may find new correlations in the data. There's no simple answer. Be creative.
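As one concrete way to "fill missing fields", here is a minimal sketch of the cluster-median imputation the question describes, with two changes that are my assumptions rather than anything the asker confirmed: the medians are learned from the training rows only and then reused on the test rows, and rows with no usable cluster fall back to the global median instead of zero. The column names (`cluster`, `income`), the toy rows, and the helper functions are all hypothetical, and plain Python is used only to keep the example self-contained.

```python
from collections import defaultdict
from statistics import median


def fit_group_medians(rows, group_key, feature):
    """Learn per-group and global medians of `feature` from the given rows.

    Assumes at least one row has an observed (non-None) value.
    """
    by_group = defaultdict(list)
    all_values = []
    for row in rows:
        value = row.get(feature)
        if value is None:
            continue  # missing values do not contribute to any median
        all_values.append(value)
        group = row.get(group_key)
        if group is not None:
            by_group[group].append(value)
    return {
        "groups": {g: median(vs) for g, vs in by_group.items()},
        "global": median(all_values),
    }


def impute(rows, group_key, feature, medians):
    """Return copies of rows with missing values filled and flagged."""
    filled = []
    for row in rows:
        new_row = dict(row)  # leave the input rows untouched
        missing = new_row.get(feature) is None
        if missing:
            # Unknown or absent cluster: fall back to the global median.
            new_row[feature] = medians["groups"].get(
                new_row.get(group_key), medians["global"]
            )
        # Indicator column so the model can still see that a value was imputed.
        new_row[feature + "_was_missing"] = int(missing)
        filled.append(new_row)
    return filled


# Hypothetical toy data standing in for the real training and test sets.
train = [
    {"cluster": "a", "income": 10.0},
    {"cluster": "a", "income": 30.0},
    {"cluster": "a", "income": None},
    {"cluster": "b", "income": 100.0},
    {"cluster": None, "income": None},
]
test = [{"cluster": "b", "income": None}, {"cluster": "c", "income": None}]

medians = fit_group_medians(train, "cluster", "income")  # training rows only
train_filled = impute(train, "cluster", "income", medians)
test_filled = impute(test, "cluster", "income", medians)
```

Whether this helps depends on the data. The point of the sketch is that statistics computed on the training set are reused unchanged on the test set, and that a hard-coded zero is replaced by a value that actually occurs in the feature's range plus a flag the model can learn from.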


newblack

