I am trying to train a custom sentiment-analysis model with the Python Flair library. I have created the dev, test, and train CSV files successfully. When I run the following code to train the model:

from flair.datasets import ClassificationCorpus
from flair.embeddings import WordEmbeddings, FlairEmbeddings, DocumentLSTMEmbeddings
from flair.models import TextClassifier
from flair.trainers import ModelTrainer
from pathlib import Path

# Specify the directory where the dataset files are located
data_dir = Path('/Users/michaelscoleri/Desktop/NitroTrading/Coding/Sentiment Analysis/CustomModel')

# Use ClassificationCorpus
corpus = ClassificationCorpus(data_dir,
                              test_file=data_dir / 'test.csv',
                              dev_file=data_dir / 'dev.csv',
                              train_file=data_dir / 'train.csv')

word_embeddings = [WordEmbeddings('glove'),
                   FlairEmbeddings('news-forward-fast'),
                   FlairEmbeddings('news-backward-fast')]

document_embeddings = DocumentLSTMEmbeddings(word_embeddings,
                                             hidden_size=512,
                                             reproject_words=True,
                                             reproject_words_dimension=256)

classifier = TextClassifier(document_embeddings,
                            label_dictionary=corpus.make_label_dictionary(label_type='class'),
                            label_type='class',
                            multi_label=False)

trainer = ModelTrainer(classifier, corpus)

# Training and evaluation
trainer.train(data_dir, max_epochs=10)

# Save the model's final state (during training the trainer itself writes best-model.pt to data_dir)
best_model_path = data_dir / 'final-model.pt'
classifier.save(best_model_path)

It produces the following output:

michaelscoleri@Michaels-MBP Sentiment Analysis % /opt/homebrew/bin/python3.9 "/Users/michaelscoleri/Desktop/NitroTrading/Coding/Sentiment Analysis/CustomModel/trainModel.py"
2023-06-06 15:42:45,457 Reading data from /Users/michaelscoleri/Desktop/NitroTrading/Coding/Sentiment Analysis/CustomModel
2023-06-06 15:42:45,457 Train: /Users/michaelscoleri/Desktop/NitroTrading/Coding/Sentiment Analysis/CustomModel/train.csv
2023-06-06 15:42:45,457 Dev: /Users/michaelscoleri/Desktop/NitroTrading/Coding/Sentiment Analysis/CustomModel/dev.csv
2023-06-06 15:42:45,457 Test: /Users/michaelscoleri/Desktop/NitroTrading/Coding/Sentiment Analysis/CustomModel/test.csv
2023-06-06 15:42:45,458 Initialized corpus /Users/michaelscoleri/Desktop/NitroTrading/Coding/Sentiment Analysis/CustomModel (label type name is 'class')
/Users/michaelscoleri/Desktop/NitroTrading/Coding/Sentiment Analysis/CustomModel/trainModel.py:20: DeprecationWarning: Call to deprecated method __init__. (The functionality of this class is moved to 'DocumentRNNEmbeddings') -- Deprecated since version 0.4.
  document_embeddings = DocumentLSTMEmbeddings(word_embeddings,
2023-06-06 15:42:47,600 Computing label dictionary. Progress:
209it [00:00, 6323.01it/s]
2023-06-06 15:42:47,644 Dictionary created for label 'class' with 22 values: 1.0 (seen 19 times), 0.7 (seen 12 times), 0.5 (seen 11 times), -0.1 (seen 11 times), 0.2 (seen 11 times), -1.0 (seen 10 times), 0.3 (seen 10 times), 0.1 (seen 10 times), -0.7 (seen 10 times), -0.9 (seen 10 times), -0.3 (seen 10 times), 0.0 (seen 10 times), -0.6 (seen 10 times), -0.4 (seen 9 times), -0.8 (seen 9 times), -0.2 (seen 9 times), 0.9 (seen 8 times), -0.5 (seen 8 times), 0.6 (seen 8 times), 0.4 (seen 7 times)
2023-06-06 15:42:47,647 ----------------------------------------------------------------------------------------------------
2023-06-06 15:42:47,648 Model: "TextClassifier(
  (embeddings): DocumentLSTMEmbeddings(
    (embeddings): StackedEmbeddings(
      (list_embedding_0): WordEmbeddings(
        'glove'
        (embedding): Embedding(400001, 100)
      )
      (list_embedding_1): FlairEmbeddings(
        (lm): LanguageModel(
          (drop): Dropout(p=0.25, inplace=False)
          (encoder): Embedding(275, 100)
          (rnn): LSTM(100, 1024)
        )
      )
      (list_embedding_2): FlairEmbeddings(
        (lm): LanguageModel(
          (drop): Dropout(p=0.25, inplace=False)
          (encoder): Embedding(275, 100)
          (rnn): LSTM(100, 1024)
        )
      )
    )
    (word_reprojection_map): Linear(in_features=2148, out_features=256, bias=True)
    (rnn): GRU(256, 512)
    (dropout): Dropout(p=0.5, inplace=False)
  )
  (decoder): Linear(in_features=512, out_features=22, bias=True)
  (dropout): Dropout(p=0.0, inplace=False)
  (locked_dropout): LockedDropout(p=0.0)
  (word_dropout): WordDropout(p=0.0)
  (loss_function): CrossEntropyLoss()
  (weights): None
  (weight_tensor) None
)"
2023-06-06 15:42:47,648 ----------------------------------------------------------------------------------------------------
2023-06-06 15:42:47,648 Corpus: "Corpus: 209 train + 27 dev + 26 test sentences"
2023-06-06 15:42:47,648 ----------------------------------------------------------------------------------------------------
2023-06-06 15:42:47,648 Parameters:
2023-06-06 15:42:47,648  - learning_rate: "0.100000"
2023-06-06 15:42:47,648  - mini_batch_size: "32"
2023-06-06 15:42:47,648  - patience: "3"
2023-06-06 15:42:47,648  - anneal_factor: "0.5"
2023-06-06 15:42:47,648  - max_epochs: "10"
2023-06-06 15:42:47,648  - shuffle: "True"
2023-06-06 15:42:47,648  - train_with_dev: "False"
2023-06-06 15:42:47,648  - batch_growth_annealing: "False"
2023-06-06 15:42:47,648 ----------------------------------------------------------------------------------------------------
2023-06-06 15:42:47,648 Model training base path: "/Users/michaelscoleri/Desktop/NitroTrading/Coding/Sentiment Analysis/CustomModel"
2023-06-06 15:42:47,648 ----------------------------------------------------------------------------------------------------
2023-06-06 15:42:47,648 Device: cpu
2023-06-06 15:42:47,648 ----------------------------------------------------------------------------------------------------
2023-06-06 15:42:47,648 Embeddings storage mode: cpu
2023-06-06 15:42:47,648 ----------------------------------------------------------------------------------------------------
2023-06-06 15:42:48,148 epoch 1 - iter 1/7 - loss 3.09790397 - time (sec): 0.50 - samples/sec: 64.10 - lr: 0.100000
2023-06-06 15:42:48,696 epoch 1 - iter 2/7 - loss 3.13132584 - time (sec): 1.05 - samples/sec: 61.12 - lr: 0.100000
2023-06-06 15:42:49,175 epoch 1 - iter 3/7 - loss 3.15598075 - time (sec): 1.53 - samples/sec: 62.89 - lr: 0.100000
2023-06-06 15:42:49,672 epoch 1 - iter 4/7 - loss 3.17525667 - time (sec): 2.02 - samples/sec: 63.24 - lr: 0.100000
2023-06-06 15:42:50,200 epoch 1 - iter 5/7 - loss 3.16638331 - time (sec): 2.55 - samples/sec: 62.70 - lr: 0.100000
2023-06-06 15:42:50,699 epoch 1 - iter 6/7 - loss 3.15750388 - time (sec): 3.05 - samples/sec: 62.94 - lr: 0.100000
2023-06-06 15:42:51,166 epoch 1 - iter 7/7 - loss 3.15797413 - time (sec): 3.52 - samples/sec: 59.42 - lr: 0.100000
2023-06-06 15:42:51,166 ----------------------------------------------------------------------------------------------------
2023-06-06 15:42:51,166 EPOCH 1 done: loss 3.1580 - lr 0.100000
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00,  2.16it/s]
2023-06-06 15:42:51,630 Evaluating as a multi-label problem: False
2023-06-06 15:42:51,636 DEV : loss 3.030161142349243 - f1-score (micro avg)  0.1481
2023-06-06 15:42:51,641 BAD EPOCHS (no improvement): 0
2023-06-06 15:42:51,642 saving best model
Traceback (most recent call last):
  File "/Users/michaelscoleri/Desktop/NitroTrading/Coding/Sentiment Analysis/CustomModel/trainModel.py", line 33, in <module>
    trainer.train(data_dir, max_epochs=10)
  File "/opt/homebrew/lib/python3.9/site-packages/flair/trainers/trainer.py", line 855, in train
    self.model.save(base_path / "best-model.pt", checkpoint=save_optimizer_state)
  File "/opt/homebrew/lib/python3.9/site-packages/flair/nn/model.py", line 104, in save
    model_state = self._get_state_dict()
  File "/opt/homebrew/lib/python3.9/site-packages/flair/models/text_classification_model.py", line 63, in _get_state_dict
    "document_embeddings": self.embeddings.save_embeddings(use_state_dict=False),
  File "/opt/homebrew/lib/python3.9/site-packages/flair/embeddings/base.py", line 103, in save_embeddings
    params = self.to_params()
  File "/opt/homebrew/lib/python3.9/site-packages/flair/embeddings/base.py", line 91, in to_params
    raise NotImplementedError()
NotImplementedError

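Training starts successfully and the first epoch completes, but the run fails when the trainer tries to save the best model: the call to self.model.save(...) in Flair's trainer.py ends in a NotImplementedError raised by to_params() in flair/embeddings/base.py. How can I fix this so that the model is saved?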
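Reading the traceback, the failure looks like a common pattern: the save path calls to_params() on the embeddings object, the base class raises NotImplementedError there, and the deprecated DocumentLSTMEmbeddings class apparently never overrides it. The sketch below reproduces that mechanism in plain Python. All class and function names in it are hypothetical stand-ins, not Flair's actual source, and it is only meant to illustrate why training can run while saving fails.

```python
class Embeddings:
    """Hypothetical stand-in for a base embeddings class (not Flair's code)."""

    def to_params(self):
        # The base class cannot know the subclass's constructor arguments,
        # so it leaves this method for subclasses to implement.
        raise NotImplementedError()

    def save_embeddings(self):
        # Saving needs the parameters required to rebuild the object later.
        return {"class": type(self).__name__, "params": self.to_params()}


class LegacyLSTMEmbeddings(Embeddings):
    """Stand-in for a deprecated class that never overrides to_params()."""

    def __init__(self, hidden_size):
        self.hidden_size = hidden_size


class RNNEmbeddings(Embeddings):
    """Stand-in for a maintained replacement that does override to_params()."""

    def __init__(self, hidden_size, rnn_type="GRU"):
        self.hidden_size = hidden_size
        self.rnn_type = rnn_type

    def to_params(self):
        return {"hidden_size": self.hidden_size, "rnn_type": self.rnn_type}


def try_save(embeddings):
    """Return the saved state, or None if the class cannot be serialized."""
    try:
        return embeddings.save_embeddings()
    except NotImplementedError:
        return None
```

If that reading is right, swapping DocumentLSTMEmbeddings for DocumentRNNEmbeddings (for example with rnn_type='LSTM' and the same hidden_size, reproject_words, and reproject_words_dimension arguments) may let the save step succeed, since the deprecation warning in the log names DocumentRNNEmbeddings as the replacement. This is untested against the Flair version shown in the traceback.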

thecodeman