I am getting errors. The most recent one is: ImportError: cannot import name 'LightningDistributedModule' from 'pytorch_lightning.overrides'.
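A quick way to check whether an installed package still exposes a given name is sketched below. It assumes the error comes from a version mismatch between the installed packages, which I have not confirmed. The helper name `module_exposes` is made up for this illustration, and it only uses the standard-library `importlib`.

```python
import importlib


def module_exposes(module_name: str, attribute: str) -> bool:
    """Return True if module_name can be imported and defines attribute.

    Hypothetical helper for diagnosis only; it is not part of NeMo or
    PyTorch Lightning.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module itself is missing or fails to import.
        return False
    return hasattr(module, attribute)


if __name__ == "__main__":
    # Check the exact name mentioned in the ImportError.
    found = module_exposes("pytorch_lightning.overrides", "LightningDistributedModule")
    print("LightningDistributedModule available:", found)
```

If this prints False, the installed pytorch_lightning does not provide the name that the import chain expects, and comparing the installed versions of nemo and pytorch_lightning would be the next thing to look at.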
I'm trying to load a pre-trained model and then train it further on other files. The locations of these files are listed in ShareFiles.txt, one per line. I'd like the code to work through that file one line at a time: load the link, open the file, train the model on it, and then loop back to the next line.
This is what I have so far for my code:
import nemo
import mimetypes
import torch
import pytorch_lightning as pl
from pathlib import Path
from nemo.collections.nlp.modules.language_model import MegatronForSequenceClassification
from nemo.core.classes.module import LightningDistributedModule
# Create a Megatron model
model = nemo.collections.nlp.modules.language_model.MegatronForSequenceClassification.from_pretrained('megatron-bert-345m-uncased')
# Load the pretrained Megatron model
model.load_pretrained('megatron-bert-345m-uncased')
# Read the list of links from a text file
with open('ShareFiles.txt', 'r') as f:
    links = f.readlines()
# Go through each link in the list
for link in links:
    # Remove the newline character from the end of the link
    link = link.strip()
    # Get the content type of the file
    content_type = mimetypes.guess_type(link)[0]
    # Open the file
    with open(link, 'r') as f:
        # Train the model on the file
        model.train(f, content_type=content_type)
# Save the model
model.save_pretrained('my_megatron_model')
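Leaving the model aside, the loop I am after can be sketched roughly as follows. This is only an outline under assumptions: `iter_links`, `train_on_links`, and the `train_on_file` callback are hypothetical names for illustration, the callback stands in for whatever the real training call turns out to be, and each entry is simply passed along as a string (a local path or a URL) without being downloaded.

```python
import mimetypes
from typing import Callable, Iterator, Optional, Tuple


def iter_links(list_path: str) -> Iterator[Tuple[str, Optional[str]]]:
    """Yield (link, guessed MIME type) for each non-blank line of the list file."""
    with open(list_path, "r", encoding="utf-8") as list_file:
        for raw_line in list_file:
            link = raw_line.strip()
            if not link:
                continue  # skip blank lines
            # guess_type only inspects the name or URL; it does not open the file
            yield link, mimetypes.guess_type(link)[0]


def train_on_links(
    list_path: str,
    train_on_file: Callable[[str, Optional[str]], None],
) -> int:
    """Call the hypothetical train_on_file callback once per entry; return the count."""
    count = 0
    for link, content_type in iter_links(list_path):
        train_on_file(link, content_type)
        count += 1
    return count
```

Reading the list lazily keeps only one entry in memory at a time, and passing the training step in as a callback keeps the loop independent of the model API, which is the part I have not figured out yet.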