
I have a dataset of 590,000 records after preprocessing, and I want to find clusters in it. The data is string data (for now, assume I have a single column with 590,000 unique values). I am using a custom-defined distance measure, so I need to calculate a distance matrix of size 590,000 × 590,000. Using some partitioning logic I computed the distance matrix in pieces, but I cannot merge those partitions into one big matrix due to memory constraints. Does anyone have an idea of how to resolve this? I picked DBSCAN for the clustering. Is there any way to use deep learning methods instead? Any other ideas?
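For scale, a quick back-of-the-envelope calculation (assuming 8-byte floats, which is what NumPy uses by default) shows why the full matrix cannot be materialized in memory:

```python
# Rough memory footprint of a dense 590,000 x 590,000 distance matrix.
n = 590_000
bytes_per_entry = 8  # float64
total_bytes = n * n * bytes_per_entry

print(f"{total_bytes / 1024**4:.1f} TiB")  # ≈ 2.5 TiB for the full matrix
# Even storing only the upper triangle halves this, still far beyond RAM.
```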

Vas

1 Answer


Use a manageable sample first.

I doubt the results will be good enough to warrant any effort on scaling a method that does not work on the sample anyway.

Has QUIT--Anony-Mousse