
I have a dataset which consists of patient complaints as input and multiple diagnoses as output.

I tokenized every word of the input text and padded the sequences, so an input looks like [22,10,4,5,0,0,0,0], and the output is the set of diagnoses as a multi-hot encoding [1,0,0,0,0,1,...] over all 850 distinct diagnoses.
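For reference, a minimal sketch of this preprocessing (the token indices, sequence length, and diagnosis indices below are made-up placeholders, not from my data):

```python
import numpy as np

# Hypothetical tokenized complaints (word indices per patient)
sequences = [[22, 10, 4, 5], [7, 3]]
max_len = 8

# Right-pad each sequence with zeros up to max_len
X = np.zeros((len(sequences), max_len), dtype="int32")
for i, seq in enumerate(sequences):
    X[i, : len(seq)] = seq

num_diagnoses = 850
# Hypothetical diagnosis indices per patient (multi-label targets)
label_ids = [[0, 5], [2]]
y = np.zeros((len(label_ids), num_diagnoses), dtype="float32")
for i, ids in enumerate(label_ids):
    y[i, ids] = 1.0  # set a 1 at every diagnosis the patient has
```

So `X` has shape `(num_patients, max_len)` and `y` has shape `(num_patients, 850)`, with possibly several 1s per row.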

I am trying to train my model:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dropout, Dense

model = Sequential()
# Embedding layer: input_length is the padded sequence length,
# not the number of training samples
model.add(Embedding(vocab_size, output_dim=858, input_length=X_train.shape[1]))
model.add(LSTM(256, return_sequences=True))
# Adding a dropout layer for regularization
model.add(Dropout(0.5))
model.add(LSTM(128))
model.add(Dropout(0.5))
# Dense output layer with sigmoid activation: one independent
# probability per diagnosis (multi-label)
model.add(Dense(858, activation='sigmoid'))
model.summary()
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

But the accuracy does not get higher than 0.3 (30%) over 50 epochs.

Dataset:

Text: cough, fever, nausea. - Target: X1.0 Y.10 C.2.5 C3.5 (each code represents a diagnosis, so I converted the codes to a one-hot encoding across all distinct diagnoses)
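The conversion from code lists to target vectors looks roughly like this (a sketch using scikit-learn's `MultiLabelBinarizer`; the two example patients below are invented):

```python
# Turn per-patient diagnosis-code lists into multi-hot target vectors
from sklearn.preprocessing import MultiLabelBinarizer

# Hypothetical diagnosis codes per patient
targets = [["X1.0", "Y.10", "C.2.5", "C3.5"], ["X1.0"]]

mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(targets)  # shape: (n_patients, n_distinct_codes)
# mlb.classes_ gives the column order, so each column maps back to one code
```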
