
I am generating node embeddings from metapaths with MetaPath2Vec. I loaded a heterogeneous graph dataset, which I built by following the PyTorch Geometric documentation. Here is the dataset summary:

HeteroData(
  Publication={
    x=[6662, 8],
    y_index=[6662, 1],
    y=[6662, 1]
  },
  Venue={
    x=[2290, 8],
    y_index=[2290, 1],
    y=[2290, 1]
  },
  Author={
    x=[11884, 8],
    y_index=[11884, 1],
    y=[11884, 1]
  },
  (Venue, cite, Venue)={ edge_index=[2, 15089] },
  (Publication, cite, Publication)={ edge_index=[2, 15089] },
  (Publication, in, Venue)={ edge_index=[2, 6354] }
)

This is the metapath and the model definition:

metapath = [
    ('Venue', 'cite', 'Venue'),
    ('Publication', 'cite', 'Publication'),
    ('Publication', 'in', 'Venue'),
]
model = MetaPath2Vec(data.edge_index_dict, embedding_dim=128,
                     metapath=metapath, walk_length=50, context_size=3,
                     walks_per_node=3, num_negative_samples=1,
                     sparse=True).to(device)
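One thing that may be worth checking here is whether the metapath forms a connected chain. The sketch below assumes that each edge type's destination node type should match the next edge type's source node type, which is how a metapath walk moves from one step to the next. The helper names `find_metapath_breaks` and `is_cycle` are hypothetical and not part of PyTorch Geometric.

```python
from typing import List, Tuple

EdgeType = Tuple[str, str, str]  # (source node type, relation, destination node type)


def find_metapath_breaks(metapath: List[EdgeType]) -> List[int]:
    """Return every index i where metapath[i] does not lead into metapath[i + 1]."""
    breaks = []
    for i, (current, following) in enumerate(zip(metapath[:-1], metapath[1:])):
        # A walk leaving `current` ends on current[2]; the next step must start there.
        if current[2] != following[0]:
            breaks.append(i)
    return breaks


def is_cycle(metapath: List[EdgeType]) -> bool:
    """True if the metapath ends on the node type it started from."""
    return metapath[0][0] == metapath[-1][2]


# The metapath from the question, copied as-is.
metapath = [
    ("Venue", "cite", "Venue"),
    ("Publication", "cite", "Publication"),
    ("Publication", "in", "Venue"),
]

breaks = find_metapath_breaks(metapath)
for i in breaks:
    print(f"Step {i} ends on {metapath[i][2]!r} but step {i + 1} "
          f"starts on {metapath[i + 1][0]!r}")
print("Forms a cycle:", is_cycle(metapath))
```

With this metapath, the check reports a break between the first and second steps, because a walk that ends on a `Venue` node cannot continue along a `Publication` to `Publication` edge. Whether this explains the zero accuracy is not certain, but it is a cheap thing to rule out.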

When I run the training and test functions, I get an accuracy of zero. Below is the code for my train and test functions.

def train(epoch, log_steps=50, eval_steps=100):
    model.train()

    total_loss = 0
    for i, (pos_rw, neg_rw) in enumerate(loader):
        optimizer.zero_grad()
        loss = model.loss(pos_rw.to(device), neg_rw.to(device))
        loss.backward()
        optimizer.step()

        total_loss += loss.item()
        if (i + 1) % log_steps == 0:
            print((f'Epoch: {epoch}, Step: {i + 1:05d}/{len(loader)}, '
                   f'Loss: {total_loss / log_steps:.4f}'))
            total_loss = 0

        if (i + 1) % eval_steps == 0:
            acc = test()
            print((f'Epoch: {epoch}, Step: {i + 1:05d}/{len(loader)}, '
                   f'Acc: {acc:.4f}'))

@torch.no_grad()
def test(train_ratio=0.3):
    model.eval()
    z = model('Publication', batch=data.y_index_dict['Publication'].flatten())
    
    y = data.y_dict['Publication'].flatten()

    perm = torch.randperm(z.size(0))
    train_perm = perm[:int(z.size(0) * train_ratio)]
    test_perm = perm[int(z.size(0) * train_ratio):]

    return model.test(z[train_perm], y[train_perm], z[test_perm],
                      y[test_perm], max_iter=1500)

for epoch in range(1, 60):
    train(epoch)
    print('train')
    acc = test()
    print(f'Epoch: {epoch}, Accuracy: {acc:.4f}')

We expect this approach to reach an accuracy of around 40-50%.

ted
Did you get an answer on this? I have the same problem and was looking at the values of y and z in test(). I still find the variable names confusing because I'm new to PyTorch. It looks like z holds the embedding vectors from the model, but y is just an ordinal index. Maybe that's the problem? – MikeB2019x Mar 31 '23 at 15:35
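The suspicion about y can be checked with a quick look at the label distribution. This is a minimal sketch with made-up label lists; `summarize_labels` is a hypothetical helper, not a PyTorch Geometric function. If every node carries a distinct label value, as an ordinal index would, then no label in the test split ever appears in the training split, and a classifier fitted on the embeddings would be expected to score at or near zero.

```python
from collections import Counter
from typing import Dict, Sequence


def summarize_labels(labels: Sequence[int]) -> Dict[str, float]:
    """Count the classes and compute the accuracy of always predicting the most common one."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    return {
        "num_samples": len(labels),
        "num_classes": len(counts),
        "majority_baseline": max(counts.values()) / len(labels),
    }


# Hypothetical example 1: y is an ordinal index, so every value is unique.
ordinal_labels = list(range(10))
print(summarize_labels(ordinal_labels))

# Hypothetical example 2: y holds a small set of class ids that repeat.
class_labels = [0, 0, 0, 1, 1, 2, 2, 2, 2, 0]
print(summarize_labels(class_labels))
```

For the dataset in the question, the input could be obtained with something like `data['Publication'].y.flatten().tolist()`. If `num_classes` comes out close to `num_samples`, the labels are probably identifiers rather than classes.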
