I'm new to Sparkling Water and machine learning.
I've built a GBM model with two datasets that I split manually into train and test. The task is classification with all numeric attributes (the response column is converted to enum type). The code is in Scala:
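import _root_.hex.tree.gbm.GBM
import _root_.hex.tree.gbm.GBMModel.GBMParameters

// Configure the GBM: training frame, validation frame, response column
val gbmParams = new GBMParameters()
gbmParams._train = train
gbmParams._valid = test
gbmParams._response_column = "response"
gbmParams._ntrees = 50
gbmParams._max_depth = 6

// Launch training and block until the model is built
val gbm = new GBM(gbmParams)
val gbmModel = gbm.trainModel.get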
In the model summary I get four different confusion matrices: one on the train data and one on the test data from before the individual trees are built, and two more after. In the first two, every row is predicted as class 1. This is the one for the test data:
CM: Confusion Matrix (vertical: actual; across: predicted):
          0    1   Error       Rate
0         0  500  1,0000  500 / 500
1         0  300  0,0000    0 / 300
Totals    0  800  0,6250  500 / 800
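As a sanity check, the error rates in this matrix can be reproduced from the counts alone. The following is a minimal sketch in plain Scala, not the H2O API; the names `cm`, `rowErrors`, and `totalError` are hypothetical and only illustrate the arithmetic.

```scala
// Counts copied from the test-data confusion matrix above.
// Rows are actual classes, columns are predicted classes.
val cm = Array(
  Array(0L, 500L), // actual 0: predicted 0, predicted 1
  Array(0L, 300L)  // actual 1: predicted 0, predicted 1
)

// Per-class error: share of each row that falls off the diagonal.
val rowErrors = cm.zipWithIndex.map { case (row, i) =>
  (row.sum - row(i)).toDouble / row.sum
}

// Overall error: all off-diagonal counts divided by the total count.
val total = cm.map(_.sum).sum
val correct = cm.indices.map(i => cm(i)(i)).sum
val totalError = (total - correct).toDouble / total

println(rowErrors.mkString(", ")) // 1.0, 0.0
println(totalError)               // 0.625
```

So the matrix is internally consistent: all 800 test rows are assigned to class 1, which gets every actual 1 right and every actual 0 wrong.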
The second confusion matrix, for the train data, is similar: every row is predicted as class 1. The third and fourth confusion matrices, reported after the trees are built, give normal results, with values distributed across all cells of the matrix.
I need to interpret the first and second matrices. Why does Sparkling Water produce them? Can I work with these results, or are they just an intermediate step?
Thank you.