I have a precomputed distance matrix that I have run through scikit-learn's MDS algorithm. All of the data points are needed, so I cannot drop any of them. The matrix has been scaled to the range 0-1. I want to turn the analysis into a plot, so n_components can be at most 3.
I've tried modifying several parameters (n_components, random_state, n_init), but I am unable to get the Stress-1 (normalized stress) value below 0.25, which is considered a 'poor' fit.
When I set n_components very high (n_components = 100), the stress score drops to 0.01. Would I be able to take these 100 dimensions and reduce them to 3 using PCA, perhaps?
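To make the PCA idea concrete, here is a minimal sketch of the projection step. It assumes the 100-dimensional MDS output is available as an array; `X_100` below is a random stand-in rather than real data, and `pca_project` is a hypothetical helper written with plain NumPy (centering followed by an SVD). Nothing here suggests that the projected configuration would have lower stress than fitting MDS directly with n_components=3; the stress would have to be measured again on the 3-D result.

```python
import numpy as np

def pca_project(X, n_components=3):
    """Project X onto its first principal components (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # center each column
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = (S ** 2) / (S ** 2).sum()        # variance ratio per component
    return Xc @ Vt[:n_components].T, explained[:n_components]

# Stand-in for the 100-dimensional MDS embedding (assumption: 50 samples)
rng = np.random.default_rng(0)
X_100 = rng.normal(size=(50, 100))

X_3, ratio = pca_project(X_100, n_components=3)
print(X_3.shape)      # shape of the reduced coordinates
print(ratio.sum())    # fraction of variance kept by the 3 components
```

If the three components keep only a small fraction of the variance, most of the structure that brought the stress down in 100 dimensions is lost in the projection.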
Any suggestions on how to improve the fit? Should I try a different tool instead?
Here is the code:
import pandas as pd
from sklearn.manifold import MDS

# Pre-computed distance matrix (square, scaled to 0-1)
df = pd.read_excel('./FTM_fingerprint_FULL_dissimilarity_matrix_MORGAN_1024_2.xlsx', index_col=0, header=0)

# Multidimensional scaling on the precomputed dissimilarities
mds1 = MDS(random_state=1, dissimilarity='precomputed', n_init=16, n_components=3, eps=1e-9)
X_transform = mds1.fit_transform(df)
print(X_transform)

# Stress score of the fitted embedding
stress = mds1.stress_
print(stress)
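Since the 0.25 threshold refers to Stress-1, it may be worth checking the value independently of `stress_`, which, depending on the scikit-learn version and the `normalized_stress` setting, may be the raw (unnormalized) stress. Below is a minimal sketch of Kruskal's Stress-1 computed by hand with NumPy. The function name `stress_1` is hypothetical, and the denominator follows Kruskal's definition (sum of squared embedded distances); some implementations divide by the squared dissimilarities instead, so values can differ slightly between tools.

```python
import numpy as np

def stress_1(dissimilarities, embedding):
    """Kruskal's Stress-1 for a metric embedding (hypothetical helper)."""
    delta = np.asarray(dissimilarities, dtype=float)
    X = np.asarray(embedding, dtype=float)
    # Pairwise Euclidean distances between the embedded points
    diff = X[:, None, :] - X[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    # Use each pair once (upper triangle, diagonal excluded)
    iu = np.triu_indices_from(delta, k=1)
    return np.sqrt(((delta[iu] - d[iu]) ** 2).sum() / (d[iu] ** 2).sum())

# Toy check: three points on a line and their exact distance matrix
pts = np.array([[0.0], [1.0], [2.0]])
D = np.abs(pts - pts.T)

print(stress_1(D, pts))        # perfect embedding
print(stress_1(D, 2 * pts))    # every embedded distance doubled
```

With the code above, the same function could be called as `stress_1(df.to_numpy(), X_transform)`. The pairwise-difference array uses memory proportional to n * n * n_components, so this is only meant for moderately sized matrices.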
Thanks