
I am new to TensorFlow and am trying to create a convolutional neural network for binary classification that can distinguish between a normal face and the face of someone who is having a stroke.

The images for my dataset are contained within a directory called CNNImages, which contains two subdirectories: RegularFaces and Strokes. Within each subdirectory are the PNG images I'm trying to feed into the neural network.

Following the approach suggested in this reference: https://towardsdatascience.com/build-your-own-convolution-neural-network-in-5-mins-4217c2cf964f, I've successfully used Spyder to create the neural network itself, which works when run with mnist.load_data().

However, I am having trouble using tf.data.Dataset to load my own images into the neural network. When I try to train the network on the image dataset I created, it raises a ValueError stating "too many values to unpack (expected 2)". I believe I'm either consuming the dataset incorrectly or messed something up when creating it.

import tensorflow as tf
import os
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
import numpy as np

os.chdir("/Users/Colin/CNNImages")

files = tf.data.Dataset.list_files("/Users/Colin/CNNImages/*/*.png")

def load_images(path):
    image = tf.io.read_file(path)
    image = tf.io.decode_jpeg(image)
    image = tf.image.convert_image_dtype(image, tf.float32)
    image = tf.image.resize(image, (128, 128))

    parts = tf.strings.split(path, os.path.sep)
    bool_values = tf.equal(parts[-2], 'strokes')
    indices = tf.cast(bool_values, tf.int32)
    return image, indices

ds = files.map(load_images).batch(1)

next(iter(ds))

"""
Above: Image Formatter
Below: CNN
"""


batch_size = 128
num_classes = 2
epochs = 12

# input image dimensions
img_rows, img_cols = 128, 128

# the data, split between train and test sets
(x_train, y_train), (x_test, y_test) = ds

x_train = x_train.reshape(869,128,128,3)
x_test = x_test.reshape(217,128,128,3)

print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=(28,28,1)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])

model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

When I run (x_train, y_train), (x_test, y_test) = ds, I receive the ValueError stating "too many values to unpack (expected 2)". Did I mess that line up, or did I design my tf dataset improperly?


2 Answers


Each element of your tf.data.Dataset is a single (img, label) tuple, not the nested ((x_train, y_train), (x_test, y_test)) structure returned by mnist.load_data(); that is why unpacking it raises "too many values to unpack (expected 2)". If you want a validation split, you should use take and skip to create it. You also can't reshape the Dataset or apply NumPy functions to it the way you do later in the script.
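
For example, iterating the dataset yields one pair at a time (a minimal sketch, assuming ds is built as in the question):

for img, label in ds.take(1):
    print(img.shape)  # e.g. (1, 128, 128, 3) with batch(1)
    print(label)      # the 0/1 label tensor for that image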


To create a train/validation split on the dataset, use skip and take:

# number of elements in the validation dataset
n_elem_validation_ds = 267
val_ds = ds.take(n_elem_validation_ds)
train_ds = ds.skip(n_elem_validation_ds)
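
One caveat: take and skip split in pipeline order, and list_files shuffles file names by default, so the order can change between iterations and the two splits may overlap across epochs. To keep them disjoint, pin the file order and shuffle once with a fixed seed; a sketch, where 1086 is the 869 + 217 total from the question:

files = tf.data.Dataset.list_files("/Users/Colin/CNNImages/*/*.png", shuffle=False)
ds = files.map(load_images).batch(1)
ds = ds.shuffle(1086, seed=42, reshuffle_each_iteration=False)
val_ds = ds.take(n_elem_validation_ds)
train_ds = ds.skip(n_elem_validation_ds)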

To apply functions to your dataset, use map:

# convert class vectors to binary class matrices
helper_categorical = lambda x: keras.utils.to_categorical(x, num_classes)
ds = ds.map(lambda img, label: (img, helper_categorical(label)))

Note: you can skip the keras.utils.to_categorical call entirely and use sparse_categorical_crossentropy as the loss function instead.
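
With sparse_categorical_crossentropy, the integer labels produced by load_images can be used unchanged; a minimal sketch of the compile call, keeping the question's optimizer and softmax output layer:

# integer 0/1 labels work directly with this loss; no one-hot encoding needed
model.compile(loss='sparse_categorical_crossentropy',
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])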


To fit your model on the dataset, simply pass the tf.data.Dataset to the fit function:

model.fit(train_ds, validation_data=val_ds)
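
Note that batch_size should not be passed to fit when the input is a tf.data.Dataset (Keras rejects it for this input type); the batch size comes from the pipeline itself, so replace the batch(1) from the question with something larger, e.g.:

ds = files.map(load_images).batch(128)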

To go further, you should read the following guide: tf.data: Build TensorFlow input pipelines (https://www.tensorflow.org/guide/data).

Lescurel
    When I add the line `helper_categorical = lambda x: keras.utils.to_categorical(x, num_classes)` to my code, the following error appears: "TypeError: __array__() takes 1 positional argument but 2 were given". To fix this error, do I just change `(x, num_classes)` to `(num_classes)`? I tried doing so, but received the following error: "ValueError: too many values to unpack (expected 2)". – Colin Tree Feb 19 '21 at 20:10

Try this:

    helper_categorical = lambda x: keras.utils.to_categorical(x[1])    
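
keras.utils.to_categorical is a NumPy helper, so it can fail on the symbolic tensors that Dataset.map passes in (hence the TypeError mentioned in the comment above). A tensor-native alternative that should work inside map is tf.one_hot:

    ds = ds.map(lambda img, label: (img, tf.one_hot(label, num_classes)))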
Naveen