I have trained a ResNet model and saved its weights to a .pt file as shown below.
## This is file 1 ##
model = resnet50()
model.to(device)
optimizer = Adam(model.parameters(), eps=1e-08, lr = 0.001, weight_decay=1e-4, betas=(0.9, 0.999))
criterion = nn.CrossEntropyLoss()
scheduler = lr_scheduler.MultiplicativeLR(optimizer, lr_lambda=lmbda)
model.train()
train_model(model, criterion, optimizer, scheduler, num_epochs=num_epochs)
torch.save(model.state_dict(), 'myresnet.pt')
model.eval()
loss, acc, y_pred, y_true = test_model(model, criterion)
Training reached a validation accuracy of 95%.
Then, I tested the model on a separate test set, where it achieved an accuracy of 93%.
After these steps, I closed my code files.
Later, I created a new, empty script and loaded the saved weights (the .pt file) for further use:
## This is file 2 ##
model = models.resnet50()
state_dict = torch.load('myresnet.pt')
model.load_state_dict(state_dict)
model.eval()
model.to(device)
loss, acc, y_pred, y_true = test_model(model, criterion)
Problem
After loading the .pt file and evaluating on the test set only, the test accuracy dropped sharply to 20.6%.
What I tried
Initially, I suspected that the .pt file was corrupted, so I reran my code multiple times, but the result did not change.
I then copied all the code from file 2 and appended it to file 1; run that way, it gives the expected accuracy.
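One way to rule out a corrupted file more directly than rerunning the script is to compare a checksum of 'myresnet.pt' taken right after saving with one taken just before loading. The sketch below uses only the Python standard library; `file_sha256` is a hypothetical helper name introduced here for illustration, not part of PyTorch.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as f:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Assumed usage:
#   file 1, right after torch.save(...):  print(file_sha256('myresnet.pt'))
#   file 2, right before torch.load(...): print(file_sha256('myresnet.pt'))
```

If the two digests match, the file on disk is byte-for-byte what was saved, and the cause is more likely to lie elsewhere in the pipeline.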
Why does this happen? Does it have something to do with the dataloader?
My dataloader is below.
batch_size = 4
image_size = [32, 32]
random_seed = int(time.time()//1000)
random.seed(random_seed)
def random_ratio_3d(): return [randrange(0, 100)/100, randrange(0, 100)/100, randrange(0, 100)/100]
tmp_mean, tmp_std = random_ratio_3d(), random_ratio_3d()
#data_train_path = 'data/train/'
data_test_path = 'data/test/'
#train_dataset = ImageFolder(data_train_path, Compose([Resize(image_size), ToTensor(), Normalize(mean=tmp_mean, std=tmp_std)]))
test_dataset = ImageFolder(data_test_path, Compose([Resize(image_size), ToTensor(), Normalize(mean=tmp_mean, std=tmp_std)]))
#train_idx, valid_idx = train_test_split(list(range(len(train_dataset))), test_size=0.2, random_state=random_seed)
datasets = {}
#datasets['train'] = Subset(train_dataset, train_idx)
#datasets['valid'] = Subset(train_dataset, valid_idx)
datasets['test'] = test_dataset
dataloaders, batch_num = {}, {}
num_workers = 6 # half of cpu core number
#dataloaders['train'] = DataLoader(datasets['train'], batch_size=batch_size, shuffle=True, num_workers=num_workers)
#dataloaders['valid'] = DataLoader(datasets['valid'],batch_size=batch_size, shuffle=True, num_workers=num_workers)
dataloaders['test'] = DataLoader(datasets['test'], batch_size=batch_size, shuffle=True, num_workers=num_workers)
#batch_num['train'], batch_num['valid'], batch_num['test'] = len(dataloaders['train']), len(dataloaders['valid']), len(dataloaders['test'])
batch_num['test'] = len(dataloaders['test'])
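One detail of the loader above that may be relevant: `random_seed` is derived from the wall clock, so `tmp_mean` and `tmp_std` passed to `Normalize` are only reproducible while `int(time.time()//1000)` stays the same, which is a window of roughly 1000 seconds. The following is a minimal sketch of that behaviour using only the standard library. `loader_seed` and `norm_stats_for` are hypothetical helper names, and the timestamps are made-up values chosen for illustration; whether this fully accounts for the accuracy drop is not confirmed here.

```python
import random
from random import randrange


def random_ratio_3d():
    # Same draw as in the loader: three values from {0.00, 0.01, ..., 0.99}.
    return [randrange(0, 100) / 100 for _ in range(3)]


def loader_seed(timestamp):
    """Hypothetical helper: the seed the loader would use at `timestamp`."""
    return int(timestamp // 1000)


def norm_stats_for(timestamp):
    """Hypothetical helper: (tmp_mean, tmp_std) as computed at `timestamp`."""
    random.seed(loader_seed(timestamp))
    return random_ratio_3d(), random_ratio_3d()


t_train = 1_700_000_000.0  # made-up time of the training session

# Ten seconds later the seed is unchanged, so the statistics are identical.
print(loader_seed(t_train), loader_seed(t_train + 10))
print(norm_stats_for(t_train) == norm_stats_for(t_train + 10))

# An hour later the seed differs, so a new mean/std pair is drawn.
print(loader_seed(t_train + 3600))
print(norm_stats_for(t_train + 3600))
```

Under this reading, a script started in a later session would normalize the test images with different statistics than the ones the model was trained with, while code appended to file 1 would reuse the values already drawn in that run.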