Questions tagged [unsupervised-learning]

Unsupervised learning refers to machine learning contexts in which there is no prior 'training' period in which the learning agent is trained on objects of known type. As such, unsupervised learning includes such disciplines as mathematical clustering, whereby data is segmented into clusters based on the minimisation or maximisation of mathematical properties and not on an attempt to classify by understanding the right context.

Unsupervised learning (of which clustering is the most common example) refers to machine learning algorithms in which no labels are available for the training data and the model tries to learn the underlying structure (manifold) of the data. As such, unsupervised learning includes such disciplines as mathematical clustering, whereby data is segmented into clusters based on the minimization or maximization of mathematical properties and not on an attempt to classify by understanding the right context.
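As an illustration of "segmenting data by minimizing a mathematical property", here is a minimal sketch of k-means (Lloyd's algorithm), which tries to minimize the within-cluster sum of squared distances. The function names (`kmeans`, `inertia`) and the choice of the first `k` points as starting centroids are assumptions made for this example, not part of any particular library; real implementations usually use smarter initialisation.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, k, max_iterations=100):
    # Assumption for this sketch: the first k points seed the centroids.
    centroids = list(points[:k])
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # leave an empty cluster's centroid where it is
                centroids[j] = mean_point(members)
    return centroids, labels


def inertia(points, centroids, labels):
    """Within-cluster sum of squared distances (the quantity being minimized)."""
    return sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))


points = [(1.0, 1.0), (8.0, 8.0), (1.2, 0.8), (8.2, 7.9), (0.8, 1.1), (7.9, 8.3)]
centroids, labels = kmeans(points, k=2)
print(labels)
print(inertia(points, centroids, labels))
```

No labels are supplied anywhere: the grouping comes purely from the geometry of the points, which is what makes this unsupervised.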

618 questions
5 votes, 1 answer

How to prepare a dataset for speech recognition

I need to train a bidirectional LSTM model to recognize discrete speech (individual numbers from 0 to 9). I have recorded speech from 100 speakers. What should I do next? (Suppose I am splitting them into individual .wav files containing one number…
5 votes, 1 answer

scipy.optimize + kmeans clustering

I have the following setup for kmeans clustering algorithm that I am implementing for a project: import numpy as np import scipy import sys import random import matplotlib.pyplot as plt import operator class KMeansClass: #takes in an npArray…
anonuser0428
5 votes, 8 answers

K-Means algorithm

I'm trying to program a k-means algorithm in Java. I have calculated a number of arrays, each of them containing a number of coefficients. I need to use a k-means algorithm in order to group all this data. Do you know any implementation of this…
dedalo
4 votes, 4 answers

Selecting an appropriate similarity metric & assessing the validity of a k-means clustering model

I have implemented k-means clustering to find the clusters among 300 objects. Each of my objects has about 30 dimensions. The distance is calculated using the Euclidean metric. I need to know: how would I determine if my algorithm works…
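One commonly used internal check for a clustering like this is the silhouette coefficient, which compares how close each object is to its own cluster against the nearest other cluster. The sketch below is a plain-Python version using the Euclidean metric mentioned in the question; the function names and the toy data are assumptions for illustration, and it is only one of several possible validity measures.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def silhouette_score(points, labels):
    """Mean silhouette over all points; values near 1 suggest compact, well-separated clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")

    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention: a singleton cluster scores 0
            continue
        # a: mean distance to the other members of the point's own cluster
        # (the distance from p to itself is 0, hence the division by len - 1).
        a = sum(euclidean(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(euclidean(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Toy data (hypothetical): two obvious groups, labelled sensibly and badly.
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
good_score = silhouette_score(toy_points, [0, 0, 1, 1])
bad_score = silhouette_score(toy_points, [0, 1, 0, 1])
print(good_score, bad_score)
```

A score close to 1 suggests tight, well-separated clusters, a score near 0 suggests overlapping clusters, and a negative score suggests many points sit closer to another cluster than to their own. Comparing the score across several values of k is a common way to use it.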
4 votes, 1 answer

Implement CVAE for a single image

I have a multi-dimensional, hyper-spectral image (channels, width, height = 15, 2500, 2500). I want to compress its 15 channel dimensions into 5 channels. So, the output would be (channels, width, height = 5, 2500, 2500). One simple way to do this is to…
4 votes, 1 answer

Interpreting K-Means cluster_centers_ output

I am having difficulty interpreting the results of the cluster_centers_ array output. Consider the following MWE: from sklearn.cluster import KMeans from sklearn.datasets import load_iris import numpy as np # Load the data iris = load_iris() X, y =…
John Stud
4 votes, 2 answers

Is there any supervised clustering algorithm or a way to apply prior knowledge to your clustering?

In my case I have a dataset of letters and symbols, detected in an image. The detected items are represented by their coordinates, type (letter, number etc), value, orientation and not the actual bounding box of the image. My goal is, using this…
4 votes, 0 answers

Clustering algorithms: HDBSCAN in R vs HDBSCAN in Python?

For working with exploratory data, which would be the best clustering method? Currently I use HDBSCAN. The problem is that the results I get from HDBSCAN in R are different from the results obtained via HDBSCAN in Python. R version: …
4 votes, 1 answer

Example Request: unsupervised deep learning in python

Context: I'm relatively new to neural nets and would like to learn about clustering methods that are able to make class predictions after learning a representation. Some tutorials online for autoencoders/RBMs/deep belief networks typically have a…
Quetzalcoatl
4 votes, 2 answers

How to change node labels of dendrogram plot

I did a hierarchical cluster for a project. I have 300 observations each of 20 variables. I indexed all the variables so that each variable is between 0 and 1, a larger value being better. I used the following code to create a cluster plot. d_data…
4 votes, 1 answer

Implementing Face Recognition using Local Descriptors (Unsupervised Learning)

I'm trying to implement a face recognition algorithm using Python. I want to be able to receive a directory of images and compute pair-wise distances between them, where short distances should hopefully correspond to the images belonging to the…
4 votes, 1 answer

How to train and fine-tune fully unsupervised deep neural networks?

In scenario 1, I had a multi-layer sparse autoencoder that tries to reproduce my input, so all my layers are trained together with randomly initialised weights. Without a supervised layer, on my data this didn't learn any relevant information (the code…
4 votes, 1 answer

Add regression layer to caffe

I have implemented a smile detection system based on deep learning. The bottom layer is the output of the system and has 10 outputs according to the amount of the person's smile. I want to convert these ten outputs into a single numeric output in the range…
4 votes, 3 answers

Document Clustering in python using SciKit

I recently started working on document clustering using the SciKit module in Python. However, I am having a hard time understanding the basics of document clustering. What I know: document clustering is typically done using TF/IDF. Which…
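To make the TF/IDF idea concrete, here is a minimal plain-Python sketch that turns documents into TF-IDF vectors and compares them with cosine similarity, which is the kind of representation a clustering algorithm would then group. It uses the textbook weighting tf × log(N / df) with whitespace tokenisation; the function names and sample documents are assumptions for illustration. scikit-learn's `TfidfVectorizer` applies a smoothed variant with L2 normalisation by default, so its numbers will differ.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return (vocabulary, vectors) using tf * log(N / df) weighting."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vocabulary = sorted(doc_freq)
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        length = max(len(tokens), 1)  # guard against empty documents
        vectors.append([
            (counts[term] / length) * math.log(n_docs / doc_freq[term])
            for term in vocabulary
        ])
    return vocabulary, vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical sample documents.
docs = ["cats chase mice", "cats chase dogs", "stocks rise today"]
vocabulary, vectors = tfidf_vectors(docs)
sim_related = cosine_similarity(vectors[0], vectors[1])
sim_unrelated = cosine_similarity(vectors[0], vectors[2])
print(sim_related, sim_unrelated)
```

Documents that share informative terms end up with a higher cosine similarity, so a distance-based method such as k-means tends to place them in the same cluster.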
3 votes, 1 answer

How does Tensorflow's Decision Forests handle categorical data?

I'm evaluating two different unsupervised ML algorithms, Isolation Forest and an LSTM Autoencoder model, to identify anomalies in a large time series dataset. This dataset includes mostly categorical data such as IP addresses, cloud subscription IDs, tenant…