My task is to detect defective items in a factory, i.e. to classify every item as either fine or defective. Because defective items are very rare, one class dominates the data: fine goods make up 99.7% of it. Training accuracy is 0.9971 and validation accuracy is 0.9970, which sounds amazing, but the model simply predicts class 0 (fine goods) for everything, so it never flags a single defective item. How can I solve this? I have checked other questions and tried their suggestions, but I still have the same situation. The data set has 122,400 rows and 5 x features.
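The 99.7% figure comes from simply counting the raw labels, roughly like this (a minimal sketch; y is the same label column used in my code below, before any encoding):

import numpy as np

# count the raw labels before encoding
classes, counts = np.unique(y, return_counts=True)
print(dict(zip(classes, counts)))   # something like {0: ~122000, 1: ~400}
print(counts / counts.sum())        # roughly [0.997, 0.003]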
On the test set, my confusion matrix comes out like this:
array([[30508,     0],
       [   92,     0]], dtype=int64)
which shows the model does a terrible job where it matters: overall accuracy is 30508/30600 ≈ 0.997, yet none of the 92 defective items in the test set is detected.
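Accuracy alone hides this completely, so I also look at per-class metrics; a minimal check with scikit-learn's classification_report (using the same y_test / y_pred produced by my code below):

from sklearn.metrics import classification_report

print(classification_report(y_test.argmax(axis=1),
                            y_pred.argmax(axis=1),
                            digits=4))
# class 0 (fine): precision ~0.997, recall 1.0
# class 1 (defects): precision, recall and f1 are all 0.0 (support 92)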
My code is below:
# imports added for completeness
from sklearn.preprocessing import LabelEncoder, OneHotEncoder, StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam

# x, y hold the 122400 x 5 feature matrix and the raw labels (loaded earlier)

# encode the labels and one-hot them for the 2-unit softmax output
le = LabelEncoder()
y = le.fit_transform(y)
ohe = OneHotEncoder(sparse=False)
y = y.reshape(-1, 1)
y = ohe.fit_transform(y)

# standardize the features
scaler = StandardScaler()
x = scaler.fit_transform(x)

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.25, random_state=777)

# DNN modelling
epochs = 15
batch_size = 128
Learning_rate_optimizer = 0.001

model = Sequential()
model.add(Dense(5,
                kernel_initializer='glorot_uniform',
                activation='relu',
                input_shape=(5,)))
model.add(Dense(5,
                kernel_initializer='glorot_uniform',
                activation='relu'))
model.add(Dense(8,
                kernel_initializer='glorot_uniform',
                activation='relu'))
model.add(Dense(2,
                kernel_initializer='glorot_uniform',
                activation='softmax'))

model.compile(loss='binary_crossentropy',
              optimizer=Adam(lr=Learning_rate_optimizer),
              metrics=['accuracy'])

history = model.fit(x_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    verbose=1,
                    validation_data=(x_test, y_test))

y_pred = model.predict(x_test)
confusion_matrix(y_test.argmax(axis=1), y_pred.argmax(axis=1))
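For what it's worth, the direction other answers seem to point at is class weighting (or resampling). Below is a rough sketch of how I understand class weighting would plug into my fit call; the weights are just inverse class frequencies, nothing I have tuned. Is this the right direction for an imbalance this extreme?

import numpy as np

# sketch only: inverse-frequency class weights passed to Keras fit()
train_labels = y_train.argmax(axis=1)
counts = np.bincount(train_labels)
class_weight = {i: len(train_labels) / (2.0 * c) for i, c in enumerate(counts)}

history = model.fit(x_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    class_weight=class_weight,
                    verbose=1,
                    validation_data=(x_test, y_test))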
Thank you