I want to draw a dendrogram from hierarchical clustering with the Minkowski distance on a dataset from the eurostat package. I want the values shown in my current dendrogram to be replaced with country names, as in the example I am trying to reproduce.
Due to project requirements, I can only use base R packages and/or ggplot2.
Use this code to recreate my situation:
install.packages("eurostat")
install.packages("dplyr")
install.packages("ggplot2")
library(eurostat)
library(dplyr)
library(ggplot2)
member_states <- c("AT", "BE", "BG", "HR", "CY", "CZ",
                   "DK", "EE", "FI", "FR", "DE", "GR",
                   "HU", "IE", "IT", "LV", "LT", "LU",
                   "MT", "NL", "PL", "PT", "RO", "SK",
                   "SI", "ES", "SE", "EL")
hicp <- get_eurostat("prc_hicp_manr", time_format = "date")
hicp_filtered <- hicp %>%
  filter(time >= as.Date("2000-02-01") & time <= as.Date("2022-09-01")) %>%
  filter(coicop == "CP00") %>%
  filter(geo %in% member_states) %>%
  mutate(geo = case_when(
    geo == "AT" ~ "Austria",
    geo == "BE" ~ "Belgium",
    geo == "BG" ~ "Bulgaria",
    geo == "HR" ~ "Croatia",
    geo == "CY" ~ "Cyprus",
    geo == "CZ" ~ "Czech Republic",
    geo == "DK" ~ "Denmark",
    geo == "EE" ~ "Estonia",
    geo == "FI" ~ "Finland",
    geo == "FR" ~ "France",
    geo == "DE" ~ "Germany",
    geo == "GR" ~ "Greece",
    geo == "HU" ~ "Hungary",
    geo == "IE" ~ "Ireland",
    geo == "IT" ~ "Italy",
    geo == "LV" ~ "Latvia",
    geo == "LT" ~ "Lithuania",
    geo == "LU" ~ "Luxembourg",
    geo == "MT" ~ "Malta",
    geo == "NL" ~ "Netherlands",
    geo == "PL" ~ "Poland",
    geo == "PT" ~ "Portugal",
    geo == "RO" ~ "Romania",
    geo == "SK" ~ "Slovakia",
    geo == "SI" ~ "Slovenia",
    geo == "ES" ~ "Spain",
    geo == "SE" ~ "Sweden",
    geo == "EL" ~ "Greece",
    TRUE ~ geo
  ))
data <- hicp_filtered[, c(3,4,5)]
data_widened <- reshape(
  transform(data, id = ave(seq_along(geo), geo, FUN = seq_along)),
  idvar = c("id", "time"),
  direction = "wide",
  timevar = "geo"
)
To perform the clustering, I tried the following code:
distance_matrix <- dist(data_widened[3:29, ], method = "minkowski", p = 1.5)
hc <- hclust(distance_matrix, method = "ward.D2")
plot(hc)
How can I replace those weird values with country names and align the clusters on my plot so that it looks like the desired form?
Thanks in advance.
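
For reference, here is a minimal sketch of one possible direction, using only base R and synthetic data in place of the Eurostat download. It assumes the country columns are named values.<Country>, which is how reshape() names them with direction = "wide"; the toy data frame wide and its numbers are made up for illustration only. Since dist() computes distances between rows, the country columns are transposed so that each country becomes a row, and the row names then serve as the leaf labels. Setting hang = -1 in plot() draws all labels at the same baseline.

```r
# Synthetic stand-in for data_widened: one row per month, one column per country.
# (Hypothetical values; the real data would come from the question's code.)
set.seed(1)
wide <- data.frame(
  time = seq(as.Date("2000-02-01"), by = "month", length.out = 12),
  values.Austria  = rnorm(12, mean = 2),
  values.Belgium  = rnorm(12, mean = 2),
  values.Bulgaria = rnorm(12, mean = 5),
  values.Croatia  = rnorm(12, mean = 3)
)

# Keep only the country columns and transpose, so each row is one country.
country_cols <- grep("^values\\.", names(wide))
m <- t(as.matrix(wide[, country_cols]))

# Strip the "values." prefix so the row names are plain country names.
rownames(m) <- sub("^values\\.", "", rownames(m))

# dist() works row-wise and takes its labels from the row names.
distance_matrix <- dist(m, method = "minkowski", p = 1.5)
hc <- hclust(distance_matrix, method = "ward.D2")

# hang = -1 makes all leaf labels hang down from height 0.
plot(hc, hang = -1, main = "Cluster dendrogram (synthetic data)",
     xlab = "", sub = "")
```

On the real data, the same steps would presumably apply to the country columns of data_widened, rather than to a row subset such as data_widened[3:29, ].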