Auto-encoder based unsupervised clustering

Question

I am trying to cluster a dataset using an encoder and since I am new in this field I cant tell how to do it.My main issue is how to define the loss function since the dataset is unlabeled and up to know, what I have seen from bibliography they define as loss function the distance between the desired output and the predicted output.My question is since that I dont have a desired output how should I implement this?

score 0 · Answer 1 · answered Oct 25 '17 at 10:07

You can use an auto encoder to pre-train your convolutional layers, like it described in my question here with usage of convolutional autoencoder for images

As you can see form code, loss function is Adam with metrics accuracy and dice coefficient, I think you can use accuracy only, since dice coefficient is image-specific

I’m not sure how it will work for you, because you hadn’t provided your idea how you will transform your bibliography lists to vector, perhaps you will create a list for bibliography id’s sorted by the cosine distance between them

For example, you can use a set of vector with cosine distances to each item in a bibliography list above for each reference in your dataset and use it as input for autoencoder

After encoder will be trained, you can remove the decoder part from your model output and use as an input for one of unsupervised clustering algorithms, for example, k-mean. You can find details about them here

Auto-encoder based unsupervised clustering

1 Answers1