Questions tagged [hierarchical-clustering]

Hierarchical clustering is a clustering technique that generates clusters at multiple hierarchical levels, thereby generating a tree of clusters. Hierarchical clustering provides advantages to analysts with its visualization potential.

Examples

Common methods include DIANA (DIvisive ANAlysis), which performs top-down clustering: it usually starts from the entire data set as a single cluster and repeatedly divides it until each data point resides in its own cluster or a user-defined stopping condition is met.

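Another widely known method is AGNES (AGglomerative NESting), which performs the opposite of DIANA: it works bottom-up, starting with each data point in its own cluster and repeatedly merging the closest clusters.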
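The bottom-up idea can be sketched in a few lines of plain Python. This is a naive illustration, not the AGNES implementation from any particular library: the function name `agglomerative`, the 1-D sample data, the single-link merge rule and the stopping condition (a target number of clusters) are all assumptions made for the example.

```python
def agglomerative(points, num_clusters):
    """Naive bottom-up clustering of 1-D points (illustrative only).

    Starts with one cluster per point and repeatedly merges the two
    clusters with the smallest single-link (minimum pairwise) distance
    until `num_clusters` clusters remain.
    """
    clusters = [[p] for p in points]
    merges = []  # history of (cluster_a, cluster_b, distance)
    while len(clusters) > num_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters, merges


# Hypothetical sample data, chosen only for illustration.
data = [1.0, 2.0, 6.0, 7.0, 15.0]
clusters, merges = agglomerative(data, num_clusters=2)
print(clusters)
for a, b, d in merges:
    print(f"merged {a} and {b} at distance {d}")
```

The recorded merge history is what a dendrogram draws: each merge becomes one join in the tree, at a height equal to the merge distance. A divisive method such as DIANA would build the same kind of tree from the top down instead.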

Distance metric & some advantages

There are a multitude of ways to compute the distance between clusters, which the clustering techniques use to decide which clusters to divide or merge. For example, complete-link and single-link distances take the maximum and minimum, respectively, of the pairwise distances between the points of two clusters.
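As a minimal sketch of the two linkage criteria mentioned above, assuming Euclidean distance between points (the function names and the sample clusters are made up for the illustration):

```python
from itertools import product
from math import dist  # Euclidean distance, Python 3.8+


def single_link(cluster_a, cluster_b):
    """Minimum pairwise distance between points of the two clusters."""
    return min(dist(p, q) for p, q in product(cluster_a, cluster_b))


def complete_link(cluster_a, cluster_b):
    """Maximum pairwise distance between points of the two clusters."""
    return max(dist(p, q) for p, q in product(cluster_a, cluster_b))


# Hypothetical 2-D clusters, chosen only for illustration.
cluster_a = [(0.0, 0.0), (1.0, 0.0)]
cluster_b = [(4.0, 0.0), (6.0, 0.0)]

print(single_link(cluster_a, cluster_b))    # closest pair: (1, 0) and (4, 0)
print(complete_link(cluster_a, cluster_b))  # farthest pair: (0, 0) and (6, 0)
```

Single link tends to produce elongated, chained clusters, while complete link favours compact ones, so the choice of linkage can change the resulting tree noticeably.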

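Hierarchical clustering provides advantages to analysts with its visualization potential, given that its output is a hierarchical classification of a dataset. Such trees (hierarchies), typically drawn as dendrograms, can be utilized in many ways; for example, they can be cut at different levels to obtain coarser or finer clusterings.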

Other non-hierarchical clustering techniques

Other clustering methodologies include, but are not limited to, partitioning techniques (such as k-means and PAM) and density-based techniques (such as DBSCAN), which are known for their ability to discover clusters of unusual (e.g., non-circular) shapes.

Suggested learning sources to look into

  • Han, Kamber and Pei's Data Mining book, whose lecture slides and companion material can be found here.
  • Wikipedia has an entry on the topic here.
1187 questions
-2
votes
2 answers

Clustering using Representatives (CURE)

I need a numerical example which demonstrates the working of clustering using CURE algorithm. https://www.cs.ucsb.edu/~veronika/MAE/summary_CURE_01guha.pdf
-2
votes
2 answers

How to do community detection in an edge-weighted network/graph?

My general problem is: how to do community detection in a weighted undirected social network/graph? The dataset that I want to cluster looks like this: DrugA, DrugB, Weight; x,y,6; y,z,9; y,p,5; x,p,3. In my dataset I have multiple nodes of drugs and the…
-2
votes
1 answer

Difference Between Two Clusters

So, I have a task to do, but I need advice on how to do it. My data points are: 1,2,9,6,4 and I need to compute the distance between clusters. I need to use Euclidean distance. My answer was: {1,1} = 0, {1,2} = 1, {1,9} = 8. Am I doing this correctly or not?
blockByblock
  • 366
  • 2
  • 4
  • 14
-2
votes
2 answers

Clustering in R

I used hclust to cluster my data and cutree to set the number of clusters to 3. Is there any way that I can examine each of the clusters? By examine I mean to list out the cases/observations that are in e.g. the first cluster. I tried all the…
BigData
  • 73
  • 1
  • 1
  • 4
-2
votes
1 answer

How to cluster latitude-longitude data based on fixed radius from centroid as the only constraint?

I have around 200k latitude & longitude data points. How can I cluster them so that each cluster has latitude & longitude points strictly within radius = 1 km from the centroid only? I tried the leadercluster algorithm/package in R but even though I…
-2
votes
1 answer

Drawing complex heat map in R

I am trying to draw a heat map using heatmap.2 with dendrograms from hierarchical cluster analysis. However, I need to use different methods for each of the dendrograms. For the y axis, I need to use Ward's method with binary distance. And for my x axis, Ward…
-2
votes
1 answer

How can I get the center of a cluster? Data points

Hello, assume that I have the following variables inside a single cluster: 1.1 0 1.9 0 0 1.3 0.6 0.6 0.6 Now how can I find the centroid of this cluster?
Furkan Gözükara
  • 22,964
  • 77
  • 205
  • 342
-2
votes
1 answer

clustering weather stations by historical temperature data

I have very limited knowledge of machine learning. I'm looking for a certain clustering algorithm that can help me to group data points together by some historical data of those points. Think of this example: There are n weather stations (for…
-2
votes
1 answer

WEKA HierarchicalClusterer class always returns 2 clusters

Here is my code: import weka.clusterers.ClusterEvaluation; import weka.clusterers.HierarchicalClusterer; import weka.clusterers.EM; import weka.core.converters.CSVLoader; import weka.core.converters.ConverterUtils.DataSource; import…
London guy
  • 27,522
  • 44
  • 121
  • 179
-3
votes
1 answer

How to make Hierarchical Cluster Heatmap in R?

I have this data set, I want to make Hierarchical Cluster Heatmap in R. Please help me structure(list(Location = c("Karnaphuli River", "Sangu River", "Kutubdia Channel", "Moheshkhali Channel", "Bakkhali River", "Naf River", "St. Martin's Island",…
Kazi
  • 67
  • 7
-3
votes
1 answer

How to perform clustering of points when the distance between any two points is given?

I have a set of, let's say, 100 points, and the distance of each point from every other point is given. This means I have a 100x100 dataset giving me the distance of each of the 100 points from all the other points. I want to form clusters from this dataset…
-3
votes
1 answer

Number of iterations in k-modes clustering in R

I've been trying to perform clustering using NBClust library. My set included categorical and numerical variables and I have one-hot encoded categorical ones. The results obtained with this method made sense but I have been told that if set includes…
Blazej Kowalski
  • 367
  • 1
  • 6
  • 16
-3
votes
1 answer

Obtain the Clustered Documents of DBSCAN

I attempted to use DBSCAN (from scikit-learn) to cluster text documents. I use TF-IDF (TfidfVectorizer in sklearn) to create the feature of each document. However, I have not found a way to obtain (print) the documents that are clustered by DBSCAN.…
-3
votes
1 answer

DBSCAN in one dimension, finding core points

One of the practice quiz questions (not homework) asks how many core points there are among one-dimensional points with a given Eps and MinPts. I thought DBSCAN should be used for only two dimensions. Any guidance is much appreciated. Question
-3
votes
2 answers

Nearest algorithm according to which the humans analyze the data?

How do people analyze data? Which algorithm is nearest to the way humans analyze data? Can I say that people group data similarly to the single-link algorithm, based on these test cases?