
I am doing multi-class classification with 5 classes, using TensorFlow with Keras. My code is:

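# imports assumed; the original post did not show them
import numpy
import pandas
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import to_categorical
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline

# fix the random seed (seed value taken from the asker's later comment)
seed = 7
numpy.random.seed(seed)

# load dataset
dataframe = pandas.read_csv("Data5Class.csv", header=None)
dataset = dataframe.values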
# split into input (X) and output (Y) variables
X = dataset[:,0:47].astype(float)
Y = dataset[:,47]
print("Load Data.....")

# one-hot encode the integer labels
encoder = to_categorical(Y)

def create_larger():  
    model = Sequential()
    print("Create Dense Ip & HL 1 Model ......")
    model.add(Dense(47, input_dim=47, kernel_initializer='normal', activation='relu'))
    print("Add Dense HL 2 Model ......")
    model.add(Dense(40, kernel_initializer='normal', activation='relu'))
    print("Add Dense output Model ......")
    model.add(Dense(5, kernel_initializer='normal', activation='sigmoid'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

estimators = []
estimators.append(('rnn', KerasClassifier(build_fn=create_larger, epochs=60, batch_size=10, verbose=0)))
pipeline = Pipeline(estimators)
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
results = cross_val_score(pipeline, X, encoder, cv=kfold)
print("Accuracy: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))

The CSV file I use as input contains the data together with the labels. The labels are 0, 1, 2, 3, 4, representing 5 different classes.

  1. Since the labels are already integers, do I need to use the LabelEncoder() function in my code?
  2. I have also used the to_categorical(Y) function. Should I use it, or should I just pass the Y variable containing these labels to the classifier for training?
  3. I got this error: Supported target types are: ('binary', 'multiclass'). Got 'multilabel-indicator' instead. It occurred when I passed the encoder variable in results = cross_val_score(pipeline, X, encoder, cv=kfold), where encoder holds the to_categorical(Y) data. How do I solve this error?

Vivek Kumar
Amulya Dixit
  • No need to use LabelEncoder(). But usage of `to_categorical()` depends on the loss function used in the Keras model. Show that code. – Vivek Kumar Apr 17 '18 at 04:53
  • Yes. I have used categorical_crossentropy loss function. The code is like this: model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy']) – Amulya Dixit Apr 17 '18 at 05:05
  • You cannot use the Keras estimator with StratifiedKFold in cross_val_score. You need to write a custom cross-validation for that, in which the data passed to StratifiedKFold is the original `y`; after splitting, it should be encoded with `to_categorical()` and passed to Keras. – Vivek Kumar Apr 17 '18 at 09:21
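
The custom cross-validation described in the comment above could be sketched roughly as follows. This is only an illustration under assumptions: `cross_validate_onehot`, `fit_and_score`, and `centroid_fit_and_score` are hypothetical names, `np.eye(n_classes)[y]` stands in for `to_categorical()`, and a nearest-centroid scorer on synthetic data stands in for the Keras model so the sketch runs without Keras installed.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold


def cross_validate_onehot(X, y_int, fit_and_score, n_classes, n_splits=10, seed=7):
    """Split on the ORIGINAL integer labels, one-hot encode only after splitting."""
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    eye = np.eye(n_classes, dtype="float32")  # stand-in for to_categorical()
    scores = []
    for train_idx, test_idx in skf.split(X, y_int):  # integer y goes to the splitter
        Y_train = eye[y_int[train_idx]]  # one-hot targets for training
        Y_test = eye[y_int[test_idx]]    # one-hot targets for evaluation
        scores.append(fit_and_score(X[train_idx], Y_train, X[test_idx], Y_test))
    return np.array(scores)


def centroid_fit_and_score(X_tr, Y_tr, X_te, Y_te):
    """Hypothetical stand-in for a model: nearest class centroid, returns accuracy."""
    centroids = (Y_tr.T @ X_tr) / Y_tr.sum(axis=0)[:, None]
    dists = ((X_te[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    pred = dists.argmin(axis=1)
    return float((pred == Y_te.argmax(axis=1)).mean())


# With Keras, fit_and_score might instead look like this (assumes create_larger
# from the question, compiled with metrics=['accuracy']):
# def keras_fit_and_score(X_tr, Y_tr, X_te, Y_te):
#     model = create_larger()
#     model.fit(X_tr, Y_tr, epochs=60, batch_size=10, verbose=0)
#     return model.evaluate(X_te, Y_te, verbose=0)[1]  # [loss, accuracy]

# Synthetic demo data (not the asker's CSV): 5 classes, 47 features.
rng = np.random.RandomState(7)
y_demo = np.repeat(np.arange(5), 20)
X_demo = rng.normal(size=(100, 47)) + y_demo[:, None]

scores = cross_validate_onehot(X_demo, y_demo, centroid_fit_and_score, n_classes=5)
print("Accuracy: %.2f%% (%.2f%%)" % (scores.mean() * 100, scores.std() * 100))
```

The point of the design is that StratifiedKFold only ever sees the 1-D integer labels, while the model only ever sees the one-hot targets.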

1 Answer


As mentioned in the Keras documentation:

Note: when using the categorical_crossentropy loss, your targets should be in categorical format (e.g. if you have 10 classes, the target for each sample should be a 10-dimensional vector that is all-zeros except for a 1 at the index corresponding to the class of the sample). In order to convert integer targets into categorical targets, you can use the Keras utility to_categorical:

from keras.utils.np_utils import to_categorical
categorical_labels = to_categorical(int_labels, num_classes=None)

So you need to apply the to_categorical() function to your y before training. There is no need to use LabelEncoder if y already contains integer labels.
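
To make the target format concrete, here is a small NumPy sketch of the one-hot encoding that to_categorical() is described as producing. The helper `one_hot` is a hypothetical stand-in written for illustration, not the Keras function itself, and the label values are made up.

```python
import numpy as np


def one_hot(int_labels, num_classes=None):
    """Hypothetical NumPy stand-in for the one-hot encoding described above."""
    int_labels = np.asarray(int_labels, dtype=int)
    if num_classes is None:
        num_classes = int_labels.max() + 1  # infer from the largest label
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype="float32")[int_labels]


y = np.array([0, 1, 2, 3, 4, 2])      # integer labels, as in the question
Y_onehot = one_hot(y, num_classes=5)  # one row per sample, one column per class

print(Y_onehot.shape)  # (6, 5)
print(Y_onehot[5])     # sample with label 2: a 1 at index 2, zeros elsewhere
```

Taking `Y_onehot.argmax(axis=1)` recovers the original integer labels, which is handy when another library expects them in that form.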

Vivek Kumar
  • Yes Sir. I tried using only to_categorical(Y) in my code, but when I run it, it shows this error: ValueError: Supported target types are: ('binary', 'multiclass'). Got 'multilabel-indicator' instead. – Amulya Dixit Apr 17 '18 at 05:27
  • Therefore, I did not include to_categorical(Y). Will it give correct accuracy results then? – Amulya Dixit Apr 17 '18 at 05:39
  • @AmulyaDixit On which code are you getting the error "Got 'multilabel-indicator' instead."? Please edit the question. And no, you will not get correct results if you don't include to_categorical. – Vivek Kumar Apr 17 '18 at 06:08
  • `seed = 7; numpy.random.seed(seed); dataframe = pandas.read_csv("Data.csv", header=None); dataset = dataframe.values; X = dataset[:,0:47].astype(float); Y = dataset[:,47]` – Amulya Dixit Apr 17 '18 at 06:28
  • @AmulyaDixit Don't post here. Edit the question. – Vivek Kumar Apr 17 '18 at 06:28
  • @Ashery Mbilinyi refile for the correction and paste string below OP Vivek original answer. How you did it violates the answer. Your improvement was at the wrong place. **UPDATE :** For Keras 2.2.4 use `from keras.utils import to_categorical`. – ZF007 Jun 19 '19 at 14:09
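
For readers hitting the 'multilabel-indicator' error discussed in the comments above, the following sketch shows how scikit-learn classifies the two label formats. It assumes that the error comes from the target-type check scikit-learn applies to `y` when stratifying, and it uses `np.eye(5)[y]` as a stand-in for the output of to_categorical(); the label values are made up.

```python
import numpy as np
from sklearn.utils.multiclass import type_of_target

y_int = np.array([0, 1, 2, 3, 4, 0, 1, 2, 3, 4])  # integer class labels
y_onehot = np.eye(5, dtype="float32")[y_int]      # stand-in for to_categorical(y_int)

kind_int = type_of_target(y_int)        # how scikit-learn sees the integer labels
kind_onehot = type_of_target(y_onehot)  # how it sees the one-hot matrix

print(kind_int)     # multiclass
print(kind_onehot)  # multilabel-indicator

# The integer labels can be recovered from the one-hot matrix when needed.
y_back = y_onehot.argmax(axis=1)
```

So the one-hot matrix is what triggers the message: the stratified splitter wants the 1-D integer labels, while the Keras model with categorical_crossentropy wants the one-hot form.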