
I have a dataset which consists of patient complaints as input and multiple diagnoses as output.

I tokenized every word of the input text and padded the sequences, so an input looks like [22,10,4,5,0,0,0,0], and the output is the set of diagnoses as a multi-hot encoding [1,0,0,0,0,1,...] over all 850 distinct diagnoses.
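For reference, a minimal sketch of this preprocessing (the token indices, sequence length, and diagnosis indices below are made-up placeholders, not from my data):

```python
import numpy as np

# Hypothetical tokenized complaints (word indices per patient)
sequences = [[22, 10, 4, 5], [7, 3]]
max_len = 8

# Right-pad each sequence with zeros up to max_len
X = np.zeros((len(sequences), max_len), dtype="int32")
for i, seq in enumerate(sequences):
    X[i, : len(seq)] = seq

num_diagnoses = 850
# Hypothetical diagnosis indices per patient (multi-label targets)
label_ids = [[0, 5], [2]]
y = np.zeros((len(label_ids), num_diagnoses), dtype="float32")
for i, ids in enumerate(label_ids):
    y[i, ids] = 1.0  # set a 1 at every diagnosis the patient has
```

So `X` has shape `(num_patients, max_len)` and `y` has shape `(num_patients, 850)`, with possibly several 1s per row.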

I am trying to train my model:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dropout, Dense

model = Sequential()
# Embedding layer: input_length is the padded sequence length,
# not the number of training samples
model.add(Embedding(vocab_size, output_dim=858, input_length=X_train.shape[1]))
model.add(LSTM(256, return_sequences=True))
# Adding a dropout layer for regularization
model.add(Dropout(0.5))
model.add(LSTM(128))
model.add(Dropout(0.5))
# Dense output layer with sigmoid activation: one independent
# probability per diagnosis (multi-label)
model.add(Dense(858, activation='sigmoid'))
model.summary()
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

But the accuracy does not get higher than 0.3 (30%) over 50 epochs.

Dataset:

Text: cough, fever, nausea. - Target: X1.0 Y.10 C.2.5 C3.5 (each code represents a diagnosis, so I converted the codes to a one-hot encoding across all distinct diagnoses)
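The conversion from code lists to target vectors looks roughly like this (a sketch using scikit-learn's `MultiLabelBinarizer`; the two example patients below are invented):

```python
# Turn per-patient diagnosis-code lists into multi-hot target vectors
from sklearn.preprocessing import MultiLabelBinarizer

# Hypothetical diagnosis codes per patient
targets = [["X1.0", "Y.10", "C.2.5", "C3.5"], ["X1.0"]]

mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(targets)  # shape: (n_patients, n_distinct_codes)
# mlb.classes_ gives the column order, so each column maps back to one code
```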
