Questions tagged [unsupervised-learning]

Unsupervised learning refers to machine learning contexts in which there is no prior 'training' period during which the learning agent is trained on objects of known type. As such, unsupervised learning includes disciplines such as mathematical clustering, whereby data is segmented into clusters based on the minimisation or maximisation of mathematical properties rather than on an attempt to classify by understanding the right context.

Unsupervised learning (of which clustering is the most common example) refers to machine learning algorithms for which no labels are available for the training data, so the model tries to learn the underlying structure (manifold) of the data. As such, unsupervised learning includes disciplines such as mathematical clustering, whereby data is segmented into clusters based on the minimization or maximization of mathematical properties rather than on an attempt to classify by understanding the right context.

618 questions
2
votes
0 answers

How to concatenate phone HMM models into a composite word or sentence HMM model

I want to do embedded training for speech recognition. To begin with, I want to use monophones with 3 states. As the paper describes, I can concatenate all the phones in one word or sentence to make a composite HMM model, and do embedded…
YonF
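A minimal sketch of the concatenation step asked about above, under stated assumptions: each phone is a left-to-right HMM given only by the transition matrix of its emitting states, the only exit is from the last state, and the exit probability is whatever mass is missing from that last row. The function name `concat_phone_hmms` and the example probabilities are hypothetical, not taken from the paper the question mentions.

```python
import numpy as np

def concat_phone_hmms(phone_mats):
    """Chain left-to-right phone HMMs into one composite transition matrix.

    Assumption: the last row of each phone matrix sums to less than 1,
    and the missing mass is the probability of leaving the phone.
    """
    sizes = [m.shape[0] for m in phone_mats]
    total = sum(sizes)
    composite = np.zeros((total, total))
    offset = 0
    for i, m in enumerate(phone_mats):
        n = sizes[i]
        # Copy the phone's own transitions onto the block diagonal.
        composite[offset:offset + n, offset:offset + n] = m
        if i < len(phone_mats) - 1:
            # Route the exit mass to the first state of the next phone.
            exit_prob = 1.0 - m[-1].sum()
            composite[offset + n - 1, offset + n] = exit_prob
        offset += n
    return composite

# Hypothetical 3-state monophone; the last state exits with probability 0.2.
phone = np.array([[0.6, 0.4, 0.0],
                  [0.0, 0.7, 0.3],
                  [0.0, 0.0, 0.8]])

# A three-phone "word" becomes a single 9-state left-to-right HMM.
word = concat_phone_hmms([phone, phone, phone])
```

Because the composite states still point back at the same phone models, statistics accumulated during embedded training would be pooled per phone state rather than per composite state; that tying step is not shown here.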
2
votes
1 answer

Applications of unsupervised learning using k-means clustering or association rules

I have been looking into some applications for unsupervised learning, but have only found some hypothetical applications on the internet, for example unsupervised learning could be used for, say, fraud detection. For example, for supervised learning…
Barry B Benson
2
votes
1 answer

Unsupervised Naive Bayes - how does it work?

So as I understand it, to implement an unsupervised Naive Bayes, we assign random probability to each class for each instance, then run it through the normal Naive Bayes algorithm. I understand that, through each iteration, the random estimates get…
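The procedure described in this question is essentially expectation-maximization applied to a naive Bayes mixture. A rough sketch for binary features follows; the function name, the Laplace smoothing constants, and the toy data are assumptions made for illustration, not part of the question.

```python
import numpy as np

def unsupervised_bernoulli_nb(X, n_classes, n_iter=50, seed=0):
    """EM for a Bernoulli naive Bayes mixture (no labels needed)."""
    rng = np.random.default_rng(seed)
    n_samples, _ = X.shape
    # Start from random soft class assignments; each row sums to 1.
    resp = rng.dirichlet(np.ones(n_classes), size=n_samples)
    for _ in range(n_iter):
        # M-step: the usual naive Bayes estimates, weighted by resp.
        class_weight = resp.sum(axis=0)
        prior = (class_weight + 1.0) / (n_samples + n_classes)
        theta = (resp.T @ X + 1.0) / (class_weight[:, None] + 2.0)
        # E-step: posterior class probabilities under current parameters.
        log_joint = (np.log(prior)
                     + X @ np.log(theta).T
                     + (1 - X) @ np.log(1 - theta).T)
        log_joint -= log_joint.max(axis=1, keepdims=True)  # stability
        resp = np.exp(log_joint)
        resp /= resp.sum(axis=1, keepdims=True)
    return prior, theta, resp

# Toy data: two groups of identical binary rows.
X = np.array([[1, 1, 1, 0, 0, 0]] * 10 + [[0, 0, 0, 1, 1, 1]] * 10)
prior, theta, resp = unsupervised_bernoulli_nb(X, n_classes=2)
labels = resp.argmax(axis=1)
```

The random initial assignments only break the symmetry between classes. Each iteration re-estimates the parameters from the current soft assignments and then recomputes the assignments, which is why the initial random estimates get progressively replaced.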
2
votes
1 answer

Is image segmentation using neural networks always supervised?

Is there a crucial distinction between semantic and just normal image segmentation with neural networks? Is non-semantic segmentation some type of unsupervised pixel-clustering method?
hirschme
2
votes
2 answers

Python AUC Calculation for Unsupervised Anomaly Detection (Isolation Forest, Elliptic Envelope, ...)

I am currently working on anomaly detection algorithms. I have read papers comparing unsupervised anomaly detection algorithms based on AUC values. For example, I have anomaly scores and anomaly classes from Elliptic Envelope and Isolation Forest. How can I compare…
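One possible way to get comparable AUC values is sketched below. It assumes ground-truth anomaly labels are available for evaluation (AUC cannot be computed from the detectors' own predicted classes alone) and uses synthetic data. In scikit-learn, `score_samples` returns higher values for more normal points, so the sign is flipped to obtain an anomaly score.

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope
from sklearn.ensemble import IsolationForest
from sklearn.metrics import roc_auc_score

# Synthetic data (an assumption for illustration): 300 inliers, 15 outliers.
rng = np.random.default_rng(0)
inliers = rng.normal(0.0, 1.0, size=(300, 2))
outliers = rng.uniform(-8.0, 8.0, size=(15, 2))
X = np.vstack([inliers, outliers])
y_true = np.r_[np.zeros(300), np.ones(15)]  # 1 marks a true anomaly

detectors = {
    "IsolationForest": IsolationForest(random_state=0),
    "EllipticEnvelope": EllipticEnvelope(random_state=0),
}

auc = {}
for name, detector in detectors.items():
    detector.fit(X)  # unsupervised fit: labels are not used here
    # Negate so that larger values mean "more anomalous".
    anomaly_score = -detector.score_samples(X)
    auc[name] = roc_auc_score(y_true, anomaly_score)
```

Using the continuous scores rather than the thresholded `predict` output keeps the comparison independent of each detector's contamination setting.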
2
votes
1 answer

Clustering time events

I have a question about performing clustering with clouds of points in which one dimension - representing time - is somewhat protected. To make it super clear, consider this video. With the naked eye one may see some dense clouds flying around like…
2
votes
1 answer

What is the best algorithm to perform unsupervised text classification (clustering) using Python scikit-learn?

I tried CountVectorizer + KMeans but I don't know the number of clusters. Calculating the number of clusters in KMeans took a lot of time when I used the gap statistic method. NMF requires determining the number of components beforehand too.
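As a cheaper alternative to the gap statistic mentioned in this question, the number of clusters can be chosen by comparing silhouette scores over a small range of candidates. This is a swapped-in heuristic, not the asker's method, and the tiny corpus and candidate range below are assumptions for illustration.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import silhouette_score

# Tiny illustrative corpus (hypothetical documents).
docs = [
    "cats purr and sleep all day",
    "my cat chased another cat",
    "kittens and cats love to sleep",
    "stock prices fell sharply today",
    "the stock market rallied today",
    "investors sold stock as prices fell",
    "the team won the football match",
    "a late goal decided the football match",
    "the football team scored a goal",
]

X = CountVectorizer().fit_transform(docs)

scores = {}
for k in range(2, 5):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    # Silhouette lies in [-1, 1]; higher means tighter, better separated clusters.
    scores[k] = silhouette_score(X, labels)

best_k = max(scores, key=scores.get)
```

Each candidate costs one KMeans fit plus one pairwise-distance pass, with no reference datasets to simulate, which is where the time saving over the gap statistic comes from.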
2
votes
1 answer

Ensemble Learning in Unsupervised Learning

I have a question regarding the current literature on ensemble learning (more specifically in unsupervised learning). From what I have read in the literature, ensemble learning applied to unsupervised learning basically boils down to Clustering…
2
votes
1 answer

Using weights from an autoencoder to initialize a neural network in TensorFlow

I built an autoencoder using Python and TensorFlow. To build it, I used the TensorFlow tutorial on how to build an autoencoder to read the MNIST data set of handwritten digits. I used it to find features of CGRA compositions. So far I…
2
votes
1 answer

Evaluating the model as you train with scikit's LatentDirichletAllocation class

I am experimenting with the LatentDirichletAllocation() class in scikit-learn, and the evaluate_every parameter has the following description: "How often to evaluate perplexity. Only used in fit method. set it to 0 or negative number to not…
neelshiv
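For context on the parameter discussed in this question, here is a small sketch of fitting with a positive `evaluate_every` and reading the perplexity afterwards. The corpus, the number of topics, and the iteration settings are assumptions chosen only to keep the example quick.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Tiny illustrative corpus (hypothetical documents).
docs = [
    "the cat sat on the mat",
    "dogs and cats are pets",
    "my dog chased the cat",
    "stocks rose as markets rallied",
    "the market fell and stocks dropped",
    "investors watch the stock market",
]

X = CountVectorizer().fit_transform(docs)

lda = LatentDirichletAllocation(
    n_components=2,
    max_iter=20,
    evaluate_every=5,  # check perplexity every 5 iterations during fit
    random_state=0,
)
lda.fit(X)

# Perplexity of the fitted model on the training data; lower is better.
final_perplexity = lda.perplexity(X)
```

With a positive `evaluate_every`, fitting may stop before `max_iter` once the change in perplexity drops below `perp_tol`, so `lda.n_iter_` reports how many passes were actually run.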
2
votes
1 answer

How to find optimal number of clusters in hierarchical clustering using Gap statistic?

I want to run hierarchical clustering with single linkage to cluster documents with 300 features and 1500 observations. I want to find the optimal number of clusters for this problem. The link below uses the following code to find the number of clusters…
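A rough sketch of how the gap statistic could be combined with SciPy's hierarchical clustering is given below. It is not the code from the link in the question. The function names, the uniform bounding-box reference distribution, the number of reference sets, and the toy data are all assumptions made for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def log_within_dispersion(X, Z, k):
    """log of the pooled within-cluster sum of squares for a k-cluster cut."""
    labels = fcluster(Z, t=k, criterion="maxclust")
    w = 0.0
    for c in np.unique(labels):
        pts = X[labels == c]
        w += ((pts - pts.mean(axis=0)) ** 2).sum()
    return np.log(w)

def gap_statistic_k(X, k_max=6, n_refs=10, method="single", seed=0):
    rng = np.random.default_rng(seed)
    lo, hi = X.min(axis=0), X.max(axis=0)
    ks = np.arange(1, k_max + 1)
    # The tree is built once and then cut at every candidate k.
    Z = linkage(X, method=method)
    log_w = np.array([log_within_dispersion(X, Z, k) for k in ks])
    ref_log_w = np.empty((n_refs, k_max))
    for b in range(n_refs):
        # Reference data: uniform over the bounding box of X.
        X_ref = rng.uniform(lo, hi, size=X.shape)
        Z_ref = linkage(X_ref, method=method)
        ref_log_w[b] = [log_within_dispersion(X_ref, Z_ref, k) for k in ks]
    gap = ref_log_w.mean(axis=0) - log_w
    s = ref_log_w.std(axis=0) * np.sqrt(1.0 + 1.0 / n_refs)
    # Smallest k with gap(k) >= gap(k+1) - s(k+1); fall back to k_max.
    for i in range(k_max - 1):
        if gap[i] >= gap[i + 1] - s[i + 1]:
            return int(ks[i]), gap
    return int(ks[-1]), gap

# Toy data: three blobs in 2-D.
rng = np.random.default_rng(1)
centers = np.array([[0.0, 0.0], [6.0, 0.0], [0.0, 6.0]])
X = np.vstack([c + rng.normal(0.0, 0.5, size=(20, 2)) for c in centers])
best_k, gap = gap_statistic_k(X)
```

Since one linkage tree serves every candidate k, the main cost is building one extra tree per reference dataset, so `n_refs` is the knob to lower if runtime becomes a problem.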
2
votes
1 answer

RandomForest score method ValueError

I am trying to find the score of a given data set with respect to some training data. I have written the following code: from sklearn.ensemble import RandomForestClassifier import numpy as np randomForest = RandomForestClassifier(n_estimators =…
2
votes
1 answer

How should zero standard deviation in one of the features be handled in a multivariate Gaussian distribution?

I am using a multivariate Gaussian distribution to analyze abnormality. This is how the training set looks: 19-04-16 05:30:31 1 0 0 377816 305172 5567044 0 0 0 14 62 75 0 0 100 0 0
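One common workaround for a feature with zero variance is to add a small constant to the diagonal of the covariance matrix so that it stays invertible. The sketch below assumes this approach; the function names, the value of `eps`, and the toy data are illustrative choices, not taken from the question.

```python
import numpy as np

def fit_gaussian(X, eps=1e-6):
    """Estimate mean and a regularized covariance (eps added to the diagonal)."""
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False) + eps * np.eye(X.shape[1])
    return mu, cov

def log_density(x, mu, cov):
    """Log of the multivariate Gaussian density at x."""
    d = mu.size
    diff = x - mu
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)  # squared Mahalanobis distance
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)

# Toy training set: the middle feature is constant (zero standard deviation).
rng = np.random.default_rng(0)
X = np.column_stack([
    rng.normal(10.0, 2.0, size=200),
    np.zeros(200),
    rng.normal(-3.0, 1.0, size=200),
])

mu, cov = fit_gaussian(X)

# A point that deviates only in the constant feature is heavily penalized.
x_shifted = mu.copy()
x_shifted[1] += 1.0
score_at_mean = log_density(mu, mu, cov)
score_shifted = log_density(x_shifted, mu, cov)
```

The size of `eps` sets how strongly any deviation in the constant feature is flagged. The alternative is to drop constant features before fitting and check them with a separate rule.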
2
votes
1 answer

How to use feature selection and dimensionality reduction in Unsupervised learning?

I've been working on classifying emails from two authors. I've been successful in doing this using supervised learning along with TFIDF vectorization of text, PCA and SelectPercentile feature selection. I used the scikit-learn package to achieve…
2
votes
0 answers

Error sampling from GMM using sklearn.mixture.GMM

I'm using sklearn.mixture.GMM to fit some data and am having trouble sampling from the GMM for one item in the dataset. In over 1000 instances of the data it works fine, but in the case below (data_not_working) I get an error when running the…