Questions tagged [cluster-analysis]

Cluster analysis is the process of grouping "similar" objects into groups known as "clusters", along with the analysis of these results.

Cluster analysis is the task of grouping objects into subsets (called clusters) so that observations in the same cluster are similar in some sense, while observations in different clusters are dissimilar.

In machine learning and data mining, clustering is a method of unsupervised learning used to discover hidden structure in unlabeled data, and is commonly used in exploratory data analysis. Popular algorithms include k-means, expectation maximization (EM), spectral clustering, correlation clustering and hierarchical clustering.

Related topics: classification, pattern-recognition, knowledge discovery, taxonomy. Not to be confused with cluster computing.

NOTE: If you want to use this tag for a question not directly concerning implementation, then consider posting on Cross Validated, Data Science, or Artificial Intelligence instead; otherwise you're probably off-topic. Please choose one site only and do not cross-post to more than one; see "Is cross-posting a question on multiple Stack Exchange sites permitted if the question is on-topic for each site?"

6244 questions

2 answers

How to perform clustering without removing rows where NA is present in R

I have data that contains some NA values. What I want to do is perform clustering without removing the rows where NA is present. I understand that the Gower distance measure in daisy allows for this situation. But why does my code below…

r cluster-analysis bioconductor

asked Dec 07 '13 at 05:31

neversaint

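As a rough illustration of why a Gower-style dissimilarity can tolerate missing values, the sketch below skips any variable that is missing in either record and averages over the rest. It is a minimal Python sketch for numeric columns only, not the implementation of `daisy`; the helper names `gower_distance`, `column_ranges` and `_missing` are made up for this example, and skipping constant columns is a simplification of this sketch.

```python
import math

def _missing(v):
    """True for None or NaN (the stand-ins for NA in this sketch)."""
    return v is None or (isinstance(v, float) and math.isnan(v))

def column_ranges(rows):
    """Range (max - min) of each column, ignoring missing values."""
    ranges = []
    for col in zip(*rows):
        present = [v for v in col if not _missing(v)]
        ranges.append(max(present) - min(present) if present else 0.0)
    return ranges

def gower_distance(a, b, ranges):
    """Average range-scaled absolute difference over variables
    that are present in both records."""
    total, used = 0.0, 0
    for x, y, r in zip(a, b, ranges):
        # Skip variables missing in either record; this sketch also
        # skips constant columns (range 0) to avoid dividing by zero.
        if _missing(x) or _missing(y) or r == 0:
            continue
        total += abs(x - y) / r
        used += 1
    return total / used if used else math.nan

rows = [
    [1.0, 10.0, None],
    [2.0, None, 5.0],
    [3.0, 30.0, 7.0],
]
ranges = column_ranges(rows)
dist = [[gower_distance(a, b, ranges) for b in rows] for a in rows]
```

The resulting `dist` matrix has no missing entries as long as every pair of rows shares at least one observed variable, so it can be handed to any clustering method that accepts a precomputed dissimilarity matrix.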

3 answers

Kmeans matlab "Empty cluster created at iteration 1" error

I'm using this script to cluster a set of 3D points using the kmeans matlab function but I always get this error "Empty cluster created at iteration 1". The script I'm using: [G,C] = kmeans(XX, K, 'distance','sqEuclidean', 'start','sample'); XX…

matlab cluster-analysis k-means

asked Aug 02 '13 at 05:54

Tak


4 answers

Correlation clustering in R

I'd like to use correlation clustering and I figure R is a good place to start. I can present the data to R as a set of large, sparse vectors or as a table with a pre-computed dissimilarity matrix. My questions are: are there existing R functions…

r cluster-analysis nlp

asked Sep 23 '09 at 23:03

daveb


3 answers

Clustering words into groups

This is a Homework question. I have a huge document full of words. My challenge is to classify these words into different groups/clusters that adequately represent the words. My strategy to deal with it is using the K-Means algorithm, which as you…

cluster-analysis k-means text-analysis

asked Dec 07 '12 at 18:53

Parijat Kalia


1 answer

Clustering and Bayes classifiers Matlab

So I am at a crossroads on what to do next. I set out to learn and apply some machine learning algorithms on a complicated dataset and I have now done this. My plan from the very beginning was to combine two possible classifiers in an attempt to…

matlab cluster-analysis classification bayesian fuzzy-c-means

asked Jul 19 '12 at 18:19

G Gr


1 answer

Plotting output of kmeans (PyCluster impl)

How does one plot the output of kmeans clustering in Python? I am using the PyCluster package. allUserVector is an n by m dimensional vector, basically n users with m features. import Pycluster as pc import numpy as np clusterid,error,nfound =…

python cluster-analysis k-means

asked Mar 23 '12 at 22:01

Maxwell

2 answers

Markov Clustering Algorithm

I've been working through the following example of the details of the Markov Clustering algorithm: http://www.cs.ucsb.edu/~xyan/classes/CS595D-2009winter/MCL_Presentation2.pdf I feel like I have accurately represented the algorithm but I am not…

javascript cluster-analysis markov

asked Jan 06 '12 at 20:50

methodin


2 answers

Combining different similarities to build one final similarity

I'm pretty new to data mining and recommendation systems, and I'm now trying to build some kind of recommender system for users that have these parameters: city, education, interest. To calculate the similarity between them I'm going to apply cosine similarity and…

cluster-analysis data-mining distance similarity

asked Nov 20 '11 at 13:09

Leg0

5 answers

Clustering 2d integer coordinates into sets of at most N points

I have a number of points on a relatively small 2-dimensional grid, which wraps around in both dimensions. The coordinates can only be integers. I need to divide them into sets of at most N points that are close together, where N will be quite a…

algorithm cluster-analysis

asked Nov 16 '11 at 01:38

Ben

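Whatever grouping strategy is used for points on a wrapping grid, the distance has to account for the wrap-around. The Python sketch below shows one way to compute it; the function name `torus_distance_sq` and the `width`/`height` parameters are assumptions for illustration, and coordinates are assumed to lie in `0..width-1` and `0..height-1`.

```python
def torus_distance_sq(p, q, width, height):
    """Squared Euclidean distance between integer points p and q
    on a grid that wraps around in both dimensions."""
    dx = abs(p[0] - q[0])
    dy = abs(p[1] - q[1])
    # Going "the other way round" may be shorter than the direct path.
    dx = min(dx, width - dx)
    dy = min(dy, height - dy)
    return dx * dx + dy * dy

# On a 10 x 10 grid, opposite corners are actually neighbours.
corner_gap = torus_distance_sq((0, 0), (9, 9), 10, 10)
```

Returning the squared distance keeps everything in integers, which is enough when distances are only compared with each other.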

2 answers

k-means: Same clusters for every execution

Is it possible to get the same kmeans clusters on every execution for a particular data set, just as we can use a fixed seed for a random value? Is it possible to stop the randomness in clustering?

r statistics cluster-analysis k-means

asked Sep 21 '11 at 13:57

Robin
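The randomness in k-means usually comes from how the initial centers are chosen, so fixing the seed of the random number generator makes runs repeatable. The following is a minimal pure-Python sketch of Lloyd's algorithm under that assumption; the function `kmeans` and its `seed` and `iterations` parameters are illustrative and do not correspond to any particular library's API.

```python
import random

def kmeans(points, k, seed=0, iterations=100):
    """Plain Lloyd's algorithm on a list of equal-length tuples."""
    rng = random.Random(seed)        # private RNG: same seed, same start
    centers = rng.sample(points, k)  # the only random step
    labels = None
    for _ in range(iterations):
        # Assign every point to its nearest center.
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2
                                  for a, b in zip(p, centers[c])))
            for p in points
        ]
        if new_labels == labels:     # assignments stable: converged
            break
        labels = new_labels
        # Move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:              # keep the old center if a cluster empties
                centers[c] = tuple(sum(col) / len(members)
                                   for col in zip(*members))
    return labels, centers

points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
          (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
run1 = kmeans(points, 2, seed=42)
run2 = kmeans(points, 2, seed=42)
```

With the same data and the same seed, `run1` and `run2` are identical; a different seed may relabel the clusters or, on harder data, converge to a different solution.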

3 answers

Looking for collective intelligence .Net / C# resources

Firstly, I realise that this is a very similar question to this one: Which are the good open source libraries for Collective Intelligence in .net/java? ... but all the answers to that one were Java centric so I am asking again, this time looking…

c# .net algorithm cluster-analysis collective-intelligence

asked Apr 10 '09 at 10:23

Steve


5 answers

How to summarize a list of combination

I have a list of two-element combinations like the one below. cbnl <- list( c("A", "B"), c("B", "A"), c("C", "D"), c("E", "D"), c("F", "G"), c("H", "I"), c("J", "K"), c("I", "H"), c("K", "J"), c("G", "F"), c("D", "C"), c("E", "C"), c("D", "E"), c("C",…

r list cluster-analysis

asked Dec 15 '21 at 12:44

kabocha

5 answers

How to cluster objects (without coordinates)

I have a list of opaque objects. I am only able to calculate the distance between them (not true, just setting the conditions for the problem): class Thing { public double DistanceTo(Thing other); } I would like to cluster these objects. I…

algorithm language-agnostic cluster-analysis

asked Mar 28 '09 at 00:54

Frank Krueger

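When only pairwise distances are available, methods that never compute a mean or centroid still apply. One simple option, sketched below in Python, is single-linkage clustering cut at a fixed threshold, which is the same as taking connected components of the graph that links every pair closer than the threshold. The names `threshold_clusters`, `distance` and `threshold` are assumptions for this example; `distance` stands in for a method such as the question's `DistanceTo`.

```python
def threshold_clusters(items, distance, threshold):
    """Group items so that any two items closer than `threshold`
    end up in the same cluster (single linkage, fixed cut)."""
    parent = list(range(len(items)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Only the distance function is used; no coordinates are needed.
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if distance(items[i], items[j]) <= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i, item in enumerate(items):
        groups.setdefault(find(i), []).append(item)
    return list(groups.values())

# Numbers stand in for opaque objects here.
clusters = threshold_clusters([1, 2, 3, 10, 11, 25],
                              lambda a, b: abs(a - b),
                              threshold=1.5)
```

This compares every pair once, so it costs a quadratic number of distance calls, and the threshold has to be chosen for the data at hand.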

2 answers

Kubernetes increase resources for all deployments

I am new to Kubernetes. I have a K8 cluster with multiple deployments (more than 150), each having more than 4 pods scaled. I have a requirement to increase resource limits for all deployments in the cluster; and I'm aware I can increase this…

kubernetes deployment yaml cluster-analysis kubectl

asked Jul 26 '21 at 13:55

Aniruddha Salve

1 answer

HDBSCAN difference between parameters

I'm confused about the difference between the following parameters in HDBSCAN: min_cluster_size, min_samples, cluster_selection_epsilon. Correct me if I'm wrong. For min_samples, if it is set to 7, then clusters formed need to have 7 or more…

machine-learning scikit-learn cluster-analysis hierarchical-clustering hdbscan

asked Jun 09 '21 at 05:22

HR1
