
I've computed a distance matrix and I'm trying two approaches to visualize it. This is my distance matrix:

delta =
[[ 0.          0.71370845  0.80903791  0.82955157  0.56964983  0.          0.        ]
 [ 0.71370845  0.          0.99583115  1.          0.79563006  0.71370845
   0.71370845]
 [ 0.80903791  0.99583115  0.          0.90029133  0.81180111  0.80903791
   0.80903791]
 [ 0.82955157  1.          0.90029133  0.          0.97468433  0.82955157
   0.82955157]
 [ 0.56964983  0.79563006  0.81180111  0.97468433  0.          0.56964983
   0.56964983]
 [ 0.          0.71370845  0.80903791  0.82955157  0.56964983  0.          0.        ]
 [ 0.          0.71370845  0.80903791  0.82955157  0.56964983  0.          0.        ]]

Labeling the points from 1 to 7, point 1 is at distance 0 from points 6 and 7 and farthest from point 4.
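To make that claim easy to check, here is a small sketch that rebuilds the matrix as a NumPy array and reads off the nearest and farthest neighbors of point 1. The variable names `row`, `nearest`, and `farthest` are only illustrative; the values are copied from the matrix above.

```python
import numpy as np

# Distance matrix copied from the question (7 points, labels 1..7).
delta = np.array([
    [0.0, 0.71370845, 0.80903791, 0.82955157, 0.56964983, 0.0, 0.0],
    [0.71370845, 0.0, 0.99583115, 1.0, 0.79563006, 0.71370845, 0.71370845],
    [0.80903791, 0.99583115, 0.0, 0.90029133, 0.81180111, 0.80903791, 0.80903791],
    [0.82955157, 1.0, 0.90029133, 0.0, 0.97468433, 0.82955157, 0.82955157],
    [0.56964983, 0.79563006, 0.81180111, 0.97468433, 0.0, 0.56964983, 0.56964983],
    [0.0, 0.71370845, 0.80903791, 0.82955157, 0.56964983, 0.0, 0.0],
    [0.0, 0.71370845, 0.80903791, 0.82955157, 0.56964983, 0.0, 0.0],
])

# A distance matrix should be symmetric with a zero diagonal.
assert np.allclose(delta, delta.T)
assert np.allclose(np.diag(delta), 0.0)

row = delta[0]  # distances from point 1 to every point
# Labels (1-based) of the other points at distance exactly 0 from point 1.
nearest = [j + 1 for j in range(len(row)) if j != 0 and row[j] == 0.0]
# Label (1-based) of the point farthest from point 1.
farthest = int(np.argmax(row)) + 1

print(nearest, farthest)  # [6, 7] 4
```

Note that points 1, 6 and 7 have identical rows, so any faithful 2D layout should draw them on top of each other.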

At first I tried to use the tSNE dimensionality reduction:

import numpy
from sklearn import manifold
from matplotlib import pyplot as plt
from matplotlib.lines import Line2D

# delta is the 7x7 distance matrix shown above, as a numpy array
simulations = delta.shape[0]  # number of points to plot

# Embed the precomputed distances in 2D
model = manifold.TSNE(n_components=2, random_state=0, metric='precomputed')
coords = model.fit_transform(delta)

# One color per point
cmap = plt.get_cmap('Set1')
colors = [cmap(i) for i in numpy.linspace(0, 1, simulations)]

plt.figure(figsize=(7, 7))
plt.scatter(coords[:, 0], coords[:, 1], marker='o', c=colors, s=50, edgecolor='None')

# Build legend entries labeled 1..7
markers = []
labels = [str(n+1) for n in range(simulations)]
for i in range(simulations):
    markers.append(Line2D([0], [0], linestyle='None', marker="o", markersize=10, markeredgecolor="none", markerfacecolor=colors[i]))
lgd = plt.legend(markers, labels, numpoints=1, bbox_to_anchor=(1.17, 0.5))
plt.tight_layout()
plt.axis('equal')
plt.show()

This produces this plot:

enter image description here

As the plot shows, point 1 is not placed close to 6 and 7; instead, it appears closest to 4.

Then, unsure whether the reduction had gotten stuck in a local minimum, I tried drawing a graph instead:

import networkx as nx

plt.figure(figsize=(7, 7))

dt = [('len', float)]
A = delta
A = A.view(dt)

G = nx.from_numpy_matrix(A) 
pos = nx.spring_layout(G)

nx.draw_networkx_nodes(G, pos, node_color=colors, node_size=50)

lgd = plt.legend(markers, labels, numpoints=1, bbox_to_anchor=(1.17, 0.5))
plt.tight_layout()
plt.axis('equal')
plt.show()

As can be seen, the same thing happens. If I keep rerunning this second method, I end up with different layouts:

enter image description here

Here I get closer to what I would expect. However, none of these behaviors seems right: the distances should be respected no matter how the graph is initialized.
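One way to make "the distances should be respected" measurable is to compare the pairwise distances in a 2D layout against the target matrix. The sketch below is only an illustration: `pairwise_distances` and `normalized_stress` are hypothetical helper names, and the score is a simple stress-like ratio (0 means the layout reproduces the distances exactly), not a quantity computed by t-SNE or `spring_layout` themselves.

```python
import numpy as np

def pairwise_distances(coords):
    """Euclidean distance between every pair of rows in coords."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def normalized_stress(coords, delta):
    """Root of summed squared distance errors, relative to the target distances."""
    d = pairwise_distances(coords)
    delta = np.asarray(delta, dtype=float)
    iu = np.triu_indices_from(delta, k=1)  # each pair counted once
    return np.sqrt(((d[iu] - delta[iu]) ** 2).sum() / (delta[iu] ** 2).sum())

# Toy check with three points on a line (made-up data, not the question's matrix).
pts = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0]])
target = pairwise_distances(pts)

perfect = normalized_stress(pts, target)       # layout matches the target exactly
doubled = normalized_stress(pts * 2, target)   # every distance is twice too large
print(perfect, doubled)
```

With the question's data, one could call `normalized_stress(coords, delta)` on the output of each method and each rerun to see which layouts actually stay close to the matrix.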

So I'm wondering what I'm missing in order to achieve a good representation of this distance matrix.

Thanks.

enter image description here

pceccon
  • Hi, did you find an answer to your question? I am stuck with the same problem... Even if I change my matrix, the t-SNE visualization doesn't seem to change... – HugoLasticot Apr 26 '16 at 10:03
  • I changed to MDS, @HugoLasticot. I wasn't able to figure out what was happening with the t-SNE method. – pceccon Apr 27 '16 at 18:31
  • I tried a similar example the first time I experimented with t-SNE, with similar results. Things worked fine when I increased the number of data points to around 100. Since this is a probabilistic algorithm, you need sufficiently many points to get a good picture. – TorsionSquid Feb 03 '18 at 03:26
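Following up on the MDS suggestion in the comments, here is a minimal sketch of classical (Torgerson) MDS written with NumPy only. This is an assumption about one way to do it, not necessarily the variant the commenter used, and `classical_mds` is a hypothetical helper name. It has no random initialization, so the layout is the same on every run, up to the sign of each axis.

```python
import numpy as np

def classical_mds(delta, n_components=2):
    """Embed a distance matrix via eigendecomposition of the double-centered squared distances."""
    delta = np.asarray(delta, dtype=float)
    n = delta.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    B = -0.5 * J @ (delta ** 2) @ J            # double-centered Gram matrix
    B = (B + B.T) / 2                          # remove floating-point asymmetry
    eigvals, eigvecs = np.linalg.eigh(B)       # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    lam = np.clip(eigvals[order], 0.0, None)   # drop negative (non-Euclidean) parts
    return eigvecs[:, order] * np.sqrt(lam)

# Distance matrix copied from the question.
delta = np.array([
    [0.0, 0.71370845, 0.80903791, 0.82955157, 0.56964983, 0.0, 0.0],
    [0.71370845, 0.0, 0.99583115, 1.0, 0.79563006, 0.71370845, 0.71370845],
    [0.80903791, 0.99583115, 0.0, 0.90029133, 0.81180111, 0.80903791, 0.80903791],
    [0.82955157, 1.0, 0.90029133, 0.0, 0.97468433, 0.82955157, 0.82955157],
    [0.56964983, 0.79563006, 0.81180111, 0.97468433, 0.0, 0.56964983, 0.56964983],
    [0.0, 0.71370845, 0.80903791, 0.82955157, 0.56964983, 0.0, 0.0],
    [0.0, 0.71370845, 0.80903791, 0.82955157, 0.56964983, 0.0, 0.0],
])

coords = classical_mds(delta)
# Embedded distances from point 1 to points 6 and 4 (0-based indices 5 and 3).
d_1_6 = np.linalg.norm(coords[0] - coords[5])
d_1_4 = np.linalg.norm(coords[0] - coords[3])
print(d_1_6, d_1_4)
```

Because rows 1, 6 and 7 of the matrix are identical, this projection places those three points at the same coordinates. A 2D picture still cannot reproduce every distance exactly when the data needs more dimensions, so the remaining points are only approximately placed.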

0 Answers