Questions tagged [unsupervised-learning]

Unsupervised learning refers to machine learning contexts in which there is no prior 'training' period in which the learning agent is trained on objects of known type. As such, unsupervised learning includes such disciplines as mathematical clustering, whereby data is segmented into clusters based on the minimisation or maximisation of mathematical properties and not on an attempt to classify by understanding the right context.

Unsupervised learning refers to machine learning algorithms in which no labels are available for the training data and the model tries to learn the underlying structure of the data. As such, unsupervised learning includes such disciplines as mathematical clustering, whereby data is segmented into clusters based on the minimization or maximization of mathematical properties and not on an attempt to classify by understanding the right context.
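As a rough illustration of clustering by minimizing a mathematical property, here is a minimal sketch of k-means, which alternates between assigning points to the nearest centroid and moving each centroid to the mean of its points, reducing the within-cluster sum of squared distances. The function names, the toy data, and the choice to pass the initial centroids explicitly are assumptions made for this example; library implementations such as scikit-learn's KMeans use more careful initialization and stopping rules.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, initial_centroids, iterations=10):
    """Plain k-means on unlabeled points (illustrative sketch only).

    initial_centroids is supplied by the caller to keep the example
    deterministic; this is a simplification, not how libraries initialize.
    """
    centroids = list(initial_centroids)
    k = len(centroids)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # leave an empty cluster's centroid where it is
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels


# Toy data: two well-separated groups, with no labels provided.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.1)]
centroids, labels = kmeans(points, initial_centroids=[points[0], points[-1]])
print(labels)
```

No labels are used at any point: the grouping comes only from the distances between the points themselves.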

618 questions
1
vote
3 answers

Research paper with definitions of Supervised and Unsupervised Learning

I am looking for research papers or books that give a good, basic definition of supervised and unsupervised learning, so that I can quote these definitions in my project. Thank you so much.
1
vote
0 answers

Unsupervised Neural Network to Maximize a Function?

Suppose I have vectors of dimension 1 x N {X_1...X_n} and {X_1' ...X_n'} where each X and X' are related but the relation cannot be modeled by a function. I want to train a neural network by feeding it X_i and outputting Y_i with dimension N…
Y.Z.
  • 19
  • 2
1
vote
1 answer

Difference between Mini-Batch K-Means and Sequential/Online K-Means

I am trying out examples of K-Means and its variants using the scikit-learn module sklearn.cluster. What is the difference between mini-batch K-Means clustering and online/sequential K-Means clustering? I could not find the implementation of online…
1
vote
1 answer

KNN outlier detection in R

I am trying to run a script I was given to perform outlier detection using a weighted KNN outlier score, but I keep getting the following error: "Error in apply(kNNdist(x = dat, k = k), 1, mean) : dim(X) must have a positive length". The script I…
ARH
  • 127
  • 6
1
vote
1 answer

Centroids of K-means clustering

I was trying to cluster cities and ran into a problem: I wanted each centroid to be located in a city, but they ended up in a desert area. I want to know if it is possible to require that the centroids be points in the input data. In…
Bruno Mello
  • 4,448
  • 1
  • 9
  • 39
1
vote
1 answer

What is the best algorithm to cluster this data?

Can someone help me find a good clustering algorithm that will cluster this into 3 clusters without defining the number of clusters? I have tried many algorithms in their basic form. Nothing seems to work properly. clustering =…
Eshaka
  • 974
  • 1
  • 14
  • 38
1
vote
2 answers

Clustering and Distance/Dissimilarity Matrix based on String/Integer sequences in Python

I have a customer's data based on his stay in the shop. The shop has 4 zones: zones 1, 2, 3 and 4. Every 2 minutes, I get his reading as 10 numbers based on which zone he is in.…
1
vote
1 answer

Backward algorithm Hidden Markov Model, 0th index (termination step) yields wrong result

I'm implementing a backward HMM algorithm in PyTorch. I used this link as reference. This link contains the results of the numerical example used (I am attempting to implement that and compare my generated results to it). Page 3, section 2. Backward…
1
vote
1 answer

scikit-learn KMeans clustering overflow error

While finding the KMeans elbow, I get an overflow error: elbow=[] for i in range(30): model = KMeans(n_clusters=i) model.fit(feature_matrix) …
1
vote
3 answers

Why are data not split in training and testing for unsupervised learning algorithms?

We know that prediction and classification problems can split data according to a training ratio (generally a 70-30 or 80-20 split), where the training data is passed to a model to be fit and its output is tested against the test data. Let's say if I…
1
vote
1 answer

Creating a Dataset to input only Images

I need a dataset object that contains only images for unsupervised learning in the Chainer framework. I am trying to use DatasetMixin for this purpose. Images is a list containing images. class SimpleDataset(dataset.DatasetMixin): def __init__(self,…
TulakHord
  • 422
  • 7
  • 15
1
vote
2 answers

How can I apply clustering by condition in Python

I have a dataset of about 50,000 samples with 2 features, where the first is binary and the second is continuous. I would like to use a clustering method in Python to create 2 categories. PS: I couldn't specify when the…
Nirmine
  • 91
  • 10
1
vote
1 answer

Use Kmodes in Python with a big csv file

I would like some assistance with a problem I have. I have a big csv file (6239292, 5) and want to perform an unsupervised machine learning technique (kmodes). My code is this: import numpy as np import pandas as pd print("initialising") syms =…
Gerasimos
  • 279
  • 2
  • 8
  • 17
1
vote
1 answer

Finding loss mask of variable length in keras tensorflow

I am trying to build a loss function that captures the functionality below, masking the output values once 'end of sequence' is encountered. Given a tensor of shape [BatchSize,MaxSequenceLength,OutputNodes], consider the example below: batch size =…
Ankith
  • 277
  • 2
  • 4
  • 13
1
vote
1 answer

Keras repeat elements throwing ValueError List argument 'indices' to 'SparseConcat' Op with length 0 shorter than minimum length 2

I am trying to implement the code for Unsupervised Aspect Extraction from the code available here. Link to the paper. While implementing the Attention class in ml_layers.py, I am getting an error in the call function at the line y = K.repeat_elements(y, self.steps,…
Nitin
  • 2,572
  • 5
  • 21
  • 28