Questions tagged [unsupervised-learning]

Unsupervised learning refers to machine learning contexts in which there is no prior 'training' period during which the learning agent is trained on objects of known type. As such, unsupervised learning includes disciplines such as mathematical clustering, whereby data is segmented into clusters based on the minimisation or maximisation of mathematical properties rather than on an attempt to classify items against known, correct categories.

Unsupervised learning (of which clustering is a common example) refers to machine learning algorithms for which no labels are available for the training data, so the model tries to learn the underlying structure (manifold) of the data. As such, unsupervised learning includes disciplines such as mathematical clustering, whereby data is segmented into clusters based on the minimization or maximization of mathematical properties rather than on an attempt to classify items against known, correct categories.
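As an illustration of clustering "based on the minimization of a mathematical property", here is a minimal sketch of k-means (Lloyd's algorithm), which tries to minimize the within-cluster sum of squared distances (often called inertia). It is not taken from any question under this tag. The function name `kmeans`, the toy `points` data, and the choice to seed the centroids with the first `k` points are all assumptions made for the example, and it needs Python 3.8+ for `math.dist`.

```python
import math


def kmeans(points, k, iterations=100):
    """Lloyd's algorithm: alternate an assignment step and an update step."""
    # Assumption: seed the centroids with the first k points (simple, deterministic).
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
    # Inertia: the quantity k-means tries to minimize.
    inertia = sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))
    return labels, centroids, inertia


# Toy, unlabeled data: two visually obvious groups, interleaved.
points = [(1.0, 1.0), (8.0, 8.0), (1.2, 0.8), (8.2, 7.9), (0.8, 1.1), (7.9, 8.1)]
labels, centroids, inertia = kmeans(points, k=2)
print(labels, centroids, inertia)
```

No labels are supplied at any point; the grouping comes only from the distance-based objective. In practice a library implementation with better initialization (for example k-means++) is usually preferable to seeding with the first `k` points.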

618 questions

3 answers

When to use supervised or unsupervised learning?

What are the fundamental criteria for using supervised or unsupervised learning? When is one better than the other? Are there specific cases when you can only use one of them? Thanks

asked Jul 04 '17 at 13:49

Daniel Amaya

1 answer

Affinity propagation preference parameter

I've had encouraging results clustering a set of entity names using scikit-learn's affinity propagation implementation, with a modified Jaro-Winkler distance as the similarity metric, but my clusters are still too numerous (i.e., too many false…

python scikit-learn cluster-analysis unsupervised-learning

asked Apr 24 '17 at 14:08

nitrl

1 answer

unsupervised semantic clustering of phrases

I have about a thousand potential survey items as a vector of strings that I want to reduce to a few hundred. Normally when we talk about data reduction, we have actual data. I administer the items to participants and use factor analysis, PCA, or…

r cluster-analysis text-mining unsupervised-learning

asked Jun 16 '14 at 11:14

Eric Green

1 answer

Drawing clustered graphs in Python

I already have a way of clustering my graph, so the process of clustering isn't the issue here. What I want to do is, once we have all the nodes clustered - to draw the clustered graph in Python, something like this: I looked into networkx, igraph…

python graph cluster-analysis graph-drawing unsupervised-learning

asked May 22 '13 at 14:37

Belphegor

1 answer

principal component analysis (PCA) in R: which function to use?

Can anyone explain what the major differences between the prcomp and princomp functions are? Is there any particular reason why I should choose one over the other? In case this is relevant, the type of application I am looking at is a quality…

r linear-algebra pca unsupervised-learning

asked Jan 10 '13 at 00:57

AndraD

1 answer

Semi-supervised Naive Bayes with NLTK

I have built a semi-supervised version of NLTK's Naive Bayes in Python based on the EM (expectation-maximization) algorithm. However, in some iterations of EM I am getting negative log-likelihoods (the log-likelihoods of EM must be positive in every…

python machine-learning nltk naivebayes unsupervised-learning

asked Oct 23 '12 at 13:55

SUP

1 answer

replace the silhouette with the Inertia

I have a problem. I am working with k-means and would like to find the optimal number of clusters. Unfortunately, my data set is too large to apply silhouette. Is there an option to adapt this code and replace the silhouette with the inertia? MVC from…

python machine-learning cluster-analysis k-means unsupervised-learning

asked Jun 03 '22 at 09:06

Test

1 answer

Why are the grpreg and gglasso libraries in R giving different results for group LASSO?

I have been trying to do unsupervised feature selection using LASSO (by removing class column). The dataset includes categorical (factor) and continuous (numeric) variables. Here is the link. I built a design matrix using model.matrix() which…

r feature-selection unsupervised-learning lasso-regression

asked Feb 27 '20 at 17:58

Mehmet Yildirim

2 answers

Clustering images based on their similarity

I am facing the problem of clustering images based on their similarity, without knowing the number of clusters. Ideally I would like to achieve something that resembles this http://cs231n.github.io/assets/cnnvis/tsne.jpeg…

machine-learning image-processing computer-vision cluster-analysis unsupervised-learning

asked Oct 19 '19 at 11:01

Bartek Wójcik

1 answer

Passing Target/Label data to Scikit-learn GridSearchCV's fit method for OneClassSVM

From my understanding, One-Class SVMs are trained without target/label data. One answer at Use of OneClassSVM with GridSearchCV suggests passing Target/Label data to GridSearchCV's fit method when the classifier is the OneClassSVM. How does the…

scikit-learn svm unsupervised-learning gridsearchcv one-class-classification

asked Oct 01 '19 at 01:40

user3731622

1 answer

How to get nearest neighbours in fasttext for unsupervised learning models (cbow, skipgram)?

The examples (related to word representations) on the official fastText website (fasttext.cc) suggest that it is possible to calculate the nearest neighbors on vectors derived with the cbow (or skip-gram) model (in short, on unsupervised learning models).…

python nearest-neighbor unsupervised-learning fasttext

asked Sep 12 '19 at 09:38

IsidoraG

1 answer

BERT performing worse than word2vec

I am trying to use BERT for a document ranking problem. My task is pretty straightforward. I have to do a similarity ranking for an input document. The only issue here is that I don’t have labels - so it’s more of a qualitative analysis. I am on my…

machine-learning deep-learning word2vec unsupervised-learning bert-language-model

asked Apr 21 '19 at 21:30

user3741951

2 answers

How to programmatically determine the column indices of principal components using FactoMineR package?

Given a data frame containing mixed variables (i.e. both categorical and continuous) like, digits = 0:9 # set seed for reproducibility set.seed(17) # function to create random string createRandString <- function(n = 5000) { a <- do.call(paste0,…

r cluster-analysis pca feature-selection unsupervised-learning

asked Jul 17 '18 at 10:54

mnm

1 answer

Implementation of Excess-Mass or Mass-Volume curves

I am looking for an implementation of Excess-Mass or Mass-Volume curves which are used for the evaluation of unsupervised anomaly detection algorithms. I'd prefer an implementation in Python but I could re-write it from any other language. Thank…

algorithm implementation unsupervised-learning anomaly-detection

asked Dec 05 '17 at 09:24

Stergios

2 answers

Unsupervised loss function in Keras

Is there any way in Keras to specify a loss function which does not need to be passed target data? I attempted to specify a loss function which omitted the y_true parameter like so: def custom_loss(y_pred): But I got the following error: Traceback…

machine-learning keras unsupervised-learning

asked Jun 26 '17 at 13:40

Nick Bishop
