I'm using PyTorch Geometric's (PTG) NeighborLoader to subset a larger graph to the nodes of interest and their connections (just 1 or 2 hops). https://pytorch-geometric.readthedocs.io/en/latest/_modules/torch_geometric/loader/neighbor_loader.html
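import torch_geometric as ptg  # alias assumed from the usage below

n_hops = 1
train_loader = ptg.loader.NeighborLoader(
    data,
    replace=False,
    num_neighbors=[-1] * n_hops,  # -1 includes all neighbors at each hop
    input_nodes=logons_user3106_train,  # list of nodes of one type of interest
)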
This question is essentially a duplicate of this: Pytorch Geometric gives me an 'edge_index' error
I am using a pretty simple PTG autoencoder, but I cannot iterate through the data loader because of a CUDA error: ValueError: Encountered a CUDA error. Please ensure that all indices in 'edge_index' point to valid indices in the interval [0, 26) in your node feature matrix and try again.
epochs = 1
model = GAE(GCNEncoder(num_features, out_channels))
model = model.to(device)
for epoch in range(1, epochs + 1):
    for batch in train_loader:
        print(batch.x.shape)
        print(np.unique(batch.edge_index[0]))
        print(np.unique(batch.edge_index[1]))
        print(batch.edge_index[0].max())
        batch_ext = batch.clone()  # keep a CPU copy for inspection
        batch = batch.to(device)
        loss = train()
        auc, ap = test(batch.edge_index)
        print(' Step loss: {:.4f}, AUC: {:.4f}, AP: {:.4f}'.format(loss, auc, ap))
    print('\n')
    print('Epoch: {:03d}, loss: {:.4f}, AUC: {:.4f}, AP: {:.4f}'.format(epoch, loss, auc, ap))
If I take the batch the error happened on and look at it, all indices in batch.edge_index seem to point to valid indices in my node feature matrix (batch.x).
Node features size: torch.Size([26, 13])
Unique values in edge index[0]: [ 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
25]
Unique values in edge index[1]: [0]
max edge index value: tensor(25)
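The check I did by eye above can also be written as a small helper. This is only a minimal sketch: the name edge_index_in_range is my own and not part of PTG, and it uses NumPy on a CPU copy of the data rather than any library validation routine. The toy arrays mimic the failing batch (26 nodes, sources 1 to 25 all pointing at node 0).

```python
import numpy as np


def edge_index_in_range(edge_index, num_nodes):
    """Return True if every entry of a [2, num_edges] edge index lies in [0, num_nodes)."""
    edge_index = np.asarray(edge_index)
    if edge_index.size == 0:
        return True  # no edges, so nothing can be out of range
    return bool(edge_index.min() >= 0 and edge_index.max() < num_nodes)


# Toy data shaped like the failing batch: 26 nodes, sources 1..25 all pointing at node 0.
sources = np.arange(1, 26)
targets = np.zeros(25, dtype=np.int64)
toy_edge_index = np.stack([sources, targets])

print(edge_index_in_range(toy_edge_index, 26))  # True: largest index 25 fits in 26 nodes
print(edge_index_in_range(toy_edge_index, 25))  # False: index 25 is out of range for 25 nodes
```

On the real batch this would be called as edge_index_in_range(batch_ext.edge_index.numpy(), batch_ext.x.shape[0]). Given the values printed above (indices 0 to 25, 26 feature rows), it evaluates to True, which is why the error message confuses me.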
In fact, I can take the CPU copy of the failing batch (batch_ext) and index the node feature matrix with either row of the edge index without issue:
for i in batch_ext.edge_index[0]:
    print(batch_ext.x[i])
# OR
for i in batch_ext.edge_index[1]:
    print(batch_ext.x[i])
This prints vectors like:
tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0.])
tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1.])
...
Does anyone have suggestions? Is there any other info I can provide? Thanks so much, I can't figure out how to debug this any further.