I am using Keras for multiclass text classification. The dataset contains 25,000 Arabic tweets with 10 class labels. I use this code:

from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout

model = Sequential()
model.add(Dense(512, input_shape=(10902,)))  # 10902 = no. of words
model.add(Activation('relu'))
model.add(Dropout(0.3))
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.3))
model.add(Dense(10))  # one output per class
model.add(Activation('softmax'))
model.summary()
# also tried: categorical_crossentropy
model.compile(loss='sparse_categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])
..
history = model.fit(X_train, y_train,
                    batch_size=100,
                    epochs=30,
                    verbose=1,
                    validation_split=0.5)

Summary:

Layer (type)                 Output Shape              Param #   
=================================================================
dense_23 (Dense)             (None, 512)               5582336   
_________________________________________________________________
activation_22 (Activation)   (None, 512)               0         
_________________________________________________________________
dropout_15 (Dropout)         (None, 512)               0         
_________________________________________________________________
dense_24 (Dense)             (None, 512)               262656    
_________________________________________________________________
activation_23 (Activation)   (None, 512)               0         
_________________________________________________________________
dropout_16 (Dropout)         (None, 512)               0         
_________________________________________________________________
dense_25 (Dense)             (None, 10)                5130      
_________________________________________________________________
activation_24 (Activation)   (None, 10)                0         
=================================================================
Total params: 5,850,122
Trainable params: 5,850,122
Non-trainable params: 0

But I get the error: could not convert string to float: 'food', where food is a class name.

When I change the loss to categorical_crossentropy, I get the error: Error when checking target: expected activation_24 to have shape (10,) but got array with shape (1,)

Update

nd = data.replace(['ads', 'Politic', 'eco', 'food', 'health', 'porno', 'religion', 'sports', 'tech', 'tv'],
                  [1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
model = Sequential()
model.add(Dense(512, input_shape=(10902,10)))#no. of words
model.add(Activation('relu'))
model.add(Dropout(0.3))
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.3))
model.add(Dense(10))
model.add(Activation('softmax'))
model.summary()
model.compile(loss='categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])
y_train = keras.utils.to_categorical(y_train)
history = model.fit(X_train, y_train,
                    batch_size=100,
                    epochs=30,
                    verbose=1,
                    validation_split=0.5)
Ahmed
  • You can't pass strings into a neural network. Consider using one-hot encoding to transform the strings into numbers, or use an embedding layer. – Primusa Nov 29 '18 at 21:23
  • You mean I should replace each class name with a number? – Ahmed Nov 29 '18 at 21:40

1 Answer

You correctly used Dense(10) at the end, in order to produce ten results, one for each class.

But your target y_train must also be shaped with 10 classes.

It should have shape (numberOfTweets, 10).

For this you should:

  • If you have an array of integer class indices, transform it using the Keras function: y_train = to_categorical(y_train).
  • If you have the labels as strings, you must first transform them into indices, and then use to_categorical.
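
The two steps above can be sketched roughly as follows. This is a minimal illustration, not the asker's actual pipeline: the helper name encode_labels, the mapping LABEL_TO_INDEX, and the sample labels are all made up for the example, and the class order is an arbitrary choice. It uses NumPy to build the one-hot rows so that it runs without Keras; keras.utils.to_categorical(indices, num_classes=10) should produce the same values.

```python
import numpy as np

# Class names taken from the question; their order (and so each index)
# is an arbitrary choice made for this sketch.
CLASS_NAMES = ['ads', 'Politic', 'eco', 'food', 'health',
               'porno', 'religion', 'sports', 'tech', 'tv']
# Hypothetical mapping: class name -> index in 0..9
LABEL_TO_INDEX = {name: i for i, name in enumerate(CLASS_NAMES)}


def encode_labels(labels):
    """Turn class-name strings into a (len(labels), 10) one-hot array."""
    indices = np.array([LABEL_TO_INDEX[label] for label in labels])
    # Row i of the identity matrix is the one-hot vector for class i.
    # keras.utils.to_categorical(indices, num_classes=10) gives the same values.
    return np.eye(len(CLASS_NAMES), dtype='float32')[indices]


# Hypothetical sample labels, for illustration only.
y_sample = encode_labels(['food', 'tv', 'ads'])
print(y_sample.shape)
```

Note that the indices start at 0. If the labels were numbered 1 to 10 instead, to_categorical without an explicit num_classes would infer 11 classes, and the target would no longer match Dense(10). Alternatively, the integer indices 0 to 9 can be kept as they are and used with the sparse_categorical_crossentropy loss, which does not need one-hot targets.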
Daniel Möller
  • Does shape (numberOfTweets, 10) contain the number of tweets or the number of words? I changed the class labels to integers with nd=data.replace(['ads', 'Politic', 'eco', 'food', 'health', 'porno', 'religion', 'sports', 'tech','tv'], [1, 2, 3, 4, 5,6,7,8,9,10]) and added y_train=keras.utils.to_categorical(y_train), but now I get the error: Error when checking input: expected dense_22_input to have 3 dimensions, but got array with shape (15058, 10902) – Ahmed Nov 30 '18 at 12:42
  • Please update your code, this error shows you're not using the same model you posted here. – Daniel Möller Nov 30 '18 at 12:53