
I'm new to TensorFlow. I followed some tutorials with a provided dataset and wanted to try something on my own. I decided to try classifying Magic: The Gathering sets. Each card has a set symbol on it, printed in different colors: black, gold and so on.

The colors don't matter, only the different symbols. So I created a dataset for 3 different sets (so 3 different symbols) and ended up with around 15,000 images like this. Some are slightly rotated, some have an X and Y offset, just to get some variation into the data.
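Roughly this kind of variation could also be produced on the fly with Keras' built-in augmentation parameters; this is just a sketch of the idea, not the script I actually used to pre-generate the images:

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Sketch only: small random rotations plus X/Y shifts, similar to the
# variation in my pre-generated dataset.
augmenting_generator = ImageDataGenerator(
    rescale=1./255,
    rotation_range=10,        # rotate up to +/- 10 degrees
    width_shift_range=0.1,    # X offset as a fraction of the image width
    height_shift_range=0.1)   # Y offset as a fraction of the image height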

Then I adapted the image classification tutorial from the TensorFlow website. Instead of two classes I wanted to try three:

from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

batch_size = 250
epochs = 3
IMG_HEIGHT = 55
IMG_WIDTH = 55

train_image_generator = ImageDataGenerator(rescale=1./255)
validation_image_generator = ImageDataGenerator(rescale=1./255)

train_data_gen = train_image_generator.flow_from_directory(batch_size=batch_size,
                                                           directory=train_dir,
                                                           shuffle=True,
                                                           target_size=(IMG_HEIGHT, IMG_WIDTH),
                                                           class_mode='binary')

val_data_gen = validation_image_generator.flow_from_directory(batch_size=batch_size,
                                                              directory=validation_dir,
                                                              target_size=(IMG_HEIGHT, IMG_WIDTH),
                                                              class_mode='binary')

model = Sequential([
    Conv2D(16, 3, padding='same', activation='relu', input_shape=(IMG_HEIGHT, IMG_WIDTH, 3)),
    MaxPooling2D(),
    Conv2D(32, 3, padding='same', activation='relu'),
    MaxPooling2D(),
    Conv2D(64, 3, padding='same', activation='relu'),
    MaxPooling2D(),
    Flatten(),
    Dense(512, activation='relu'),
    Dense(1, activation='sigmoid')
])

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

history = model.fit_generator(
    train_data_gen,
    steps_per_epoch=total_train // batch_size,
    epochs=epochs,
    validation_data=val_data_gen,
    validation_steps=total_val // batch_size,
    callbacks=[cp_callback]
)

But my loss is negative and I don't get good accuracy after training. What did I mess up? Is the model used in the tutorial not suited to my use case? Or is there an error in my code because I used three classes instead of two?

c111
1 Answer


The model from the tutorial was built for binary classification (only two classes, cat or dog). You, on the other hand, want to classify 3 classes, not 2. Therefore you have to adapt the architecture a little bit. Your last layer should be:

Dense(3, activation='softmax')

Three neurons because you have three classes, and softmax activation because you want the outputs to be valid probabilities over those classes. To compile the model, use categorical_crossentropy instead of binary_crossentropy and make sure your labels are one-hot encoded. For that, pass class_mode='categorical' to the .flow_from_directory() calls of your ImageDataGenerators; it then yields one-hot labels automatically. This is also where your negative loss comes from: with class_mode='binary' the generator produces the integer labels 0, 1 and 2, but binary_crossentropy assumes labels of 0 or 1, so the label 2 breaks that assumption and the loss is no longer bounded below by zero.
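Putting the changes together, the relevant parts of your code would look roughly like this (a sketch reusing the variable names from your snippet, so train_dir, IMG_HEIGHT etc. are assumed to be defined as before; apply the same class_mode change to val_data_gen):

train_data_gen = train_image_generator.flow_from_directory(batch_size=batch_size,
                                                           directory=train_dir,
                                                           shuffle=True,
                                                           target_size=(IMG_HEIGHT, IMG_WIDTH),
                                                           class_mode='categorical')  # yields one-hot labels

model = Sequential([
    Conv2D(16, 3, padding='same', activation='relu', input_shape=(IMG_HEIGHT, IMG_WIDTH, 3)),
    MaxPooling2D(),
    Conv2D(32, 3, padding='same', activation='relu'),
    MaxPooling2D(),
    Conv2D(64, 3, padding='same', activation='relu'),
    MaxPooling2D(),
    Flatten(),
    Dense(512, activation='relu'),
    Dense(3, activation='softmax')  # one neuron per class
])

model.compile(optimizer='adam',
              loss='categorical_crossentropy',  # matches the one-hot labels
              metrics=['accuracy'])

Alternatively, class_mode='sparse' together with loss='sparse_categorical_crossentropy' also works if you prefer integer labels over one-hot vectors.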

Tinu