I'm creating a graph-based autoencoder for point-clouds.
The original point-cloud's shape is [3, 1024]
- 1024 points, each of which has 3 coordinates
A point-cloud is turned into an undirected graph using the following steps (a sketch follows the list):
- each point becomes a node.
- a node's features are its 3 coordinates.
- for each node, its 5 nearest nodes are found and connected to it by edges.
- an edge's feature is the Euclidean distance between the two nodes it connects.
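Roughly, the construction looks like this (a minimal sketch, assuming torch_geometric's `knn_graph` helper, which requires torch-cluster; `cloud_to_graph` is just an illustrative name):

```python
import torch
from torch_geometric.data import Data
from torch_geometric.nn import knn_graph
from torch_geometric.utils import to_undirected

def cloud_to_graph(cloud: torch.Tensor) -> Data:
    # cloud: [3, 1024] -> node features: [1024, 3]
    pos = cloud.t().contiguous()
    # Directed 5-NN edges, symmetrized to get an undirected graph.
    edge_index = to_undirected(knn_graph(pos, k=5, loop=False))
    # Edge feature: Euclidean distance between the endpoints, shape [E, 1].
    row, col = edge_index
    edge_attr = (pos[row] - pos[col]).norm(dim=-1, keepdim=True)
    return Data(x=pos, edge_index=edge_index, edge_attr=edge_attr)
```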
I use pytorch-geometric to construct the network and the Chamfer distance from pytorch3d as the loss function.
The architecture of my network is the following:
The encoder: GAT (3->16) -> GAT (16->24) -> GAT (24->36) -> shape [32*1024, 36]
The decoder: GAT (36->24) -> GAT (24->16) -> GAT (16->3) -> shape [32*1024, 3]
(32 is the batch size; pytorch-geometric stacks the 32 graphs of 1024 nodes into one big graph.) All these layers take both node features and edge features. Besides that, I use Dropout and ReLU between the layers; a sketch is below.
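In code, the model looks roughly like this (a sketch, not my exact code; it assumes a pytorch-geometric version where `GATConv` accepts `edge_dim`, so the 1-dimensional edge features are actually used):

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import GATConv

class GraphAutoencoder(torch.nn.Module):
    def __init__(self, p_drop: float = 0.2):
        super().__init__()
        # Encoder: 3 -> 16 -> 24 -> 36
        self.enc1 = GATConv(3, 16, edge_dim=1)
        self.enc2 = GATConv(16, 24, edge_dim=1)
        self.enc3 = GATConv(24, 36, edge_dim=1)
        # Decoder: 36 -> 24 -> 16 -> 3
        self.dec1 = GATConv(36, 24, edge_dim=1)
        self.dec2 = GATConv(24, 16, edge_dim=1)
        self.dec3 = GATConv(16, 3, edge_dim=1)
        self.p_drop = p_drop

    def forward(self, x, edge_index, edge_attr):
        for conv in (self.enc1, self.enc2, self.enc3, self.dec1, self.dec2):
            x = F.relu(conv(x, edge_index, edge_attr))
            x = F.dropout(x, p=self.p_drop, training=self.training)
        # No activation on the last layer: the outputs are raw coordinates.
        return self.dec3(x, edge_index, edge_attr)
```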
After that, I reshape the original graphs to [32, 1024, 3] and the predicted graphs to [32, 1024, 3], and feed both to the Chamfer loss from pytorch3d. I get some sort of result, but if I visualize the reconstructions I can see that the network did not learn anything.
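For reference, the loss step is essentially this (a sketch; `model` and `batch` are assumed to come from the network above and a pytorch-geometric `DataLoader` with `batch_size=32`):

```python
from pytorch3d.loss import chamfer_distance

pred = model(batch.x, batch.edge_index, batch.edge_attr)  # [32*1024, 3]
pred = pred.view(32, 1024, 3)
target = batch.x.view(32, 1024, 3)
# chamfer_distance returns (loss, loss_normals); the second term is None
# when no normals are passed.
loss, _ = chamfer_distance(pred, target)
loss.backward()
```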
The question is: since the Chamfer distance compares only node features, does the network even make use of the adjacency matrix and the edge features? Do I just need some tuning, or does this model not make sense in the first place?
PS: At this point, I do not care about the architecture or whether 5 nearest neighbors is enough.