I have one graph, defined by 4 matrices: x (node features), y (node labels), edge_index (edge list) and edge_attr (edge features). I want to create a dataset in PyTorch Geometric with this single graph and perform node-level classification. It seems that just wrapping these 4 matrices into a data object fails, for some reason.
I have created a dataset containing the attributes:
Data(edge_attr=[3339730, 1], edge_index=[2, 3339730], x=[6911, 50000], y=[6911, 1])
representing a single graph. If I try to slice this object, like:
train_dataset, test_dataset = dataset[:5000], dataset[5000:]
I get the error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-11-feb278180c99> in <module>
3 # train_dataset, test_dataset = torch.utils.data.random_split(dataset, [train_size, test_size])
4
----> 5 train_dataset, test_dataset = dataset[:5000], dataset[5000:]
6
7 # Create dataloader for training and test dataset.
~/anaconda3/envs/py38/lib/python3.8/site-packages/torch_geometric/data/data.py in __getitem__(self, key)
92 def __getitem__(self, key):
93 r"""Gets the data of the attribute :obj:`key`."""
---> 94 return getattr(self, key, None)
95
96 def __setitem__(self, key, value):
TypeError: getattr(): attribute name must be string
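As far as I can tell from the traceback, `__getitem__` on this object simply forwards the key to `getattr`, so a slice ends up being used as an attribute name. Below is a minimal sketch that reproduces the same kind of error without PyTorch Geometric. `MiniData` is a hypothetical stand-in I wrote for illustration, not the library's class; only the body of its `__getitem__` is taken from the traceback above.

```python
class MiniData:
    """Hypothetical stand-in; NOT the real torch_geometric class."""

    def __init__(self, **kwargs):
        # Store each keyword argument as an attribute (x, y, ...).
        for name, value in kwargs.items():
            setattr(self, name, value)

    def __getitem__(self, key):
        # Same body as the line shown in the traceback.
        return getattr(self, key, None)


# Toy values standing in for the real tensors (assumption for illustration).
data = MiniData(x=[[0.0], [1.0]], y=[0, 1])

# A string key works: it is looked up as an attribute name.
print(data["x"])

# A slice is not a valid attribute name, so getattr raises TypeError.
error_message = None
try:
    data[:1]
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

So indexing here seems to select an attribute by name rather than a subset of nodes, which would explain why the slice fails.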
What am I doing wrong in the data construction?