Questions tagged [unsupervised-learning]

Unsupervised learning refers to machine learning contexts in which there is no prior 'training' period in which the learning agent is trained on objects of known type. As such, unsupervised learning includes such disciplines as mathematical clustering, whereby data is segmented into clusters based on the minimisation or maximisation of mathematical properties rather than by classifying against known labels.

Unsupervised learning (of which clustering is the best-known example) refers to machine learning algorithms for which no labels are available for the training data, so the model tries to learn the underlying structure (manifold) of the data. As such, unsupervised learning includes such disciplines as mathematical clustering, whereby data is segmented into clusters based on the minimization or maximization of mathematical properties rather than by classifying against known labels.

618 questions
3
votes
1 answer

Supervised learning vs. offline (batch) reinforcement learning

Most materials (e.g., David Silver's online course) I can find offer discussions about the relationship between supervised learning and reinforcement learning. However, it is actually a comparison between supervised learning and online reinforcement…
AntiInsect
  • 345
  • 2
  • 8
3
votes
2 answers

Training a K-means algorithm in Python to predict the dominant color of an image

I am trying to create a model that will predict the dominant color in the image using K-means clustering. I have the data all set up, but I am unsure how I can proceed after fitting the model. Thanks from sklearn.cluster import KMeans import…
3
votes
0 answers

Unsupervised clustering of words in R without knowing k

As a beginner in NLP, I am trying to find the best way to cluster single words with unsupervised clustering, specifically where the number of clusters k is not known in advance. I have a group of words that contains clusters of words that are very…
iskandarblue
  • 7,208
  • 15
  • 60
  • 130
3
votes
1 answer

How to tune / choose the preference parameter of AffinityPropagation?

I have a large dictionary of "pairwise similarity matrices" that would look like the following: similarity['group1']: array([[1. , 0. , 0. , 0. , 0. ], [0. , 1. , 0.09 , 0.09 , 0. …
3
votes
1 answer

Using Python for asymmetric calculation of Jaccard distance

I have some SAS code that I am trying to convert to Python. I am having difficulties calculating the Jaccard distance on asymmetric data, where the zeros should be ignored in the calculation. I do find some examples on Jaccard but they do not…
Geir
  • 55
  • 4
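For readers landing on this question from the tag page, here is a minimal sketch of the Jaccard distance on asymmetric binary data, using only the standard library. It is not taken from the question or its answer. The function name `jaccard_distance` is hypothetical, and returning 0.0 for two all-zero vectors is an assumption made here, since the ratio is undefined in that case.

```python
def jaccard_distance(a, b):
    """Jaccard distance between two equal-length binary vectors.

    Positions where both vectors are 0 do not contribute to the
    numerator or the denominator, so joint absences are ignored.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    # Count positions where both are non-zero (intersection).
    both = sum(1 for x, y in zip(a, b) if x and y)
    # Count positions where at least one is non-zero (union).
    either = sum(1 for x, y in zip(a, b) if x or y)
    if either == 0:
        # Assumption: two all-zero vectors are treated as identical.
        return 0.0
    return 1.0 - both / either


if __name__ == "__main__":
    print(jaccard_distance([1, 1, 0, 0], [1, 0, 1, 0]))
```

Padding both vectors with extra shared zeros leaves the result unchanged, which is the sense in which the measure is asymmetric.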
3
votes
1 answer

Deciding on the kernel type parameter in Kernel PCA

I am new to machine learning and I am trying to do unsupervised learning with k-means clustering (even though I have read that k-means does not work very well with categorical data). I encoded my categorical variables and tried to apply kernel PCA since I have…
3
votes
2 answers

K-Means: assign clusters to new data points

I've implemented a k-means clustering algorithm in Python, and now I want to label new data with the clusters I got with my algorithm. My approach is to iterate through every data point and every centroid to find the minimum distance and the…
efsee
  • 579
  • 1
  • 10
  • 22
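The approach described in the excerpt above (compare each new point with every centroid and keep the nearest) can be sketched roughly as follows. This is an illustration, not the asker's code: the function name `assign_clusters` is hypothetical, Euclidean distance is assumed, and points and centroids are assumed to be plain tuples of equal dimension.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def assign_clusters(new_points, centroids):
    """Return, for each new point, the index of its nearest centroid."""
    labels = []
    for point in new_points:
        # Pick the centroid index with the smallest distance to this point.
        nearest = min(range(len(centroids)),
                      key=lambda j: dist(point, centroids[j]))
        labels.append(nearest)
    return labels


if __name__ == "__main__":
    centroids = [(0.0, 0.0), (10.0, 10.0)]
    print(assign_clusters([(1.0, 1.0), (9.0, 9.0)], centroids))
```

The centroids stay fixed here. New points are only labelled, so the fitted clusters themselves do not change.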
3
votes
1 answer

apcluster in R: Memory limitation

I am trying to run a clustering exercise in R. The algorithm that I used is apcluster(). The script that I used is: s1 <- negDistMat(df, r=2, method="euclidean") apcluster <- apcluster(s1) My data set has around 0.1 million rows. When I…
3
votes
1 answer

Comparing HDBSCAN labels with soft cluster results

I'm getting the soft clusters from a dataset using HDBSCAN as follows: clusterer = hdbscan.HDBSCAN(min_cluster_size=10, prediction_data=True) clusterer.fit(data) soft_clusters = hdbscan.all_points_membership_vectors(clusterer) closest_clusters =…
3
votes
1 answer

Does or will H2O provide any pretrained vectors for use with h2o word2vec?

H2O recently added word2vec to its API. It is great to be able to easily train your own word vectors on a corpus you provide yourself. However, even greater possibilities exist from using big data and big computers, of the type that software…
Geoffrey Anderson
  • 1,534
  • 17
  • 25
3
votes
1 answer

Machine Learning - one class classification/novelty detection/anomaly assessment?

I need a machine learning algorithm that will satisfy the following requirements: The training data are a set of feature vectors, all belonging to the same, "positive" class (as I cannot produce negative data samples). The test data are some…
3
votes
2 answers

Is a genetic algorithm a form of unsupervised learning?

I have a pretty simple question. However, I have searched extensively and am unable to find the answer. Is a genetic algorithm considered to be a form of unsupervised learning? I know that the algorithm evolves independently, however the fitness of…
3
votes
1 answer

Co-clustering algorithm in Python

Are there implementations available for any co-clustering algorithms in Python? The scikit-learn package has k-means and hierarchical clustering but seems to be missing this class of clustering.
2
votes
2 answers

How do I evaluate clustering?

I am still researching how to evaluate clusters formed using clustering (unsupervised learning). I tried googling but the measures I get are too theoretical. It would be great if people could share the mechanisms they are using to evaluate the clusters…
2
votes
3 answers

Unsupervised Learning in R? Classify Matrices - what is the right package?

Recently I watched a lot of Stanford's hilarious Open Classroom video lectures. Particularly the part about unsupervised Machine Learning got my attention. Unfortunately it stops where it might get even more interesting. Basically I am looking to…
Matt Bannert
  • 27,631
  • 38
  • 141
  • 207